D01 What is HPC? |
Before you can use larger resources, you need to understand how
they differ from your own computer. |
>What are the scales of computing? |
|
>HPC Intro |
Triton cluster intro |
D20 Modules and software |
Using and installing software on a cluster is different from
your own computer, because hundreds of people are sharing it.
Modules are the solution. |
>How do you use modules?
>How do you find software? |
>Lmod introduction |
>Triton tutorials for intro:
modules,
applications,
>Lmod user guide |
>Software and applications,
>modules |
D21 Batch systems |
On a cluster, you have to share resources with others. Slurm
is one batch queuing system that makes this possible. |
>What role does the batch system fill?
>How does one submit to the batch system? |
>Slurm basics
>interactive jobs
>batch jobs |
Triton tutorials:
>interactive,
>serial,
>array |
Triton tutorials:
interactive,
serial,
array |
D22 HPC Storage |
Storage turns out to be just as important as computing power.
Several storage locations are available, each with different
advantages. |
>Why is storage so important?
>How can you monitor input/output (I/O) performance?
>How do you best handle your data? |
>HPC I/O principles |
>Storage basics. |
Triton tutorials:
storage basics. More advanced:
lustre,
local storage,
small files |
D23 Parallel computing |
The point of a cluster is to run things in parallel. Shared
memory (OpenMP) and message passing (MPI) are the most common models.
Learn how to run them, not write them. |
>What are the main models of parallel code?
>How are they run on clusters?
>How do you figure out what your code uses? |
|
>Parallel jobs. |
Triton tutorials:
parallel. |
D24 Advanced shell scripting and automation |
Hands-on shell scripting, putting everything together to
automate large computations on the cluster. |
|
|
Various courses exist;
finishing the Linux shell tutorial
is a good start. The Advanced Bash-Scripting Guide is a classic. |
|
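
As a rough illustration of how the topics above fit together (modules, Slurm array jobs, and shell scripting), the sketch below writes a batch script and submits it only if `sbatch` is available. It is a minimal sketch under stated assumptions, not taken from the course materials: the job name, the module name `example-software`, the program `./process`, the input file naming, and the resource values are all hypothetical placeholders.

```shell
#!/bin/bash
# Sketch: generate a Slurm array-job script, then submit it if sbatch exists.
# Every name below (files, job name, module, program) is a hypothetical placeholder.

n_tasks=10                    # number of array tasks (assumed value)
last=$((n_tasks - 1))         # array indices run from 0 to n_tasks - 1
job_script="array_job.sh"     # hypothetical output file name

# Write the batch script. ${last} is expanded now by this shell;
# \${SLURM_ARRAY_TASK_ID} is escaped so it stays literal in the file
# and is filled in by Slurm when each array task runs.
cat > "$job_script" <<EOF
#!/bin/bash
#SBATCH --job-name=demo-array
#SBATCH --time=00:10:00
#SBATCH --mem=500M
#SBATCH --array=0-${last}
#SBATCH --output=demo_%a.out

# Load software through the module system (module name is a placeholder).
module load example-software

# Each array task processes its own input file.
srun ./process input_\${SLURM_ARRAY_TASK_ID}.dat
EOF

# Submit only where Slurm is installed; elsewhere just report the file.
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$job_script"
else
    echo "sbatch not found; wrote $job_script without submitting"
fi
```

Generating the script rather than writing it by hand keeps the array range tied to one variable, which helps when the number of inputs changes between runs.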