...

  • Do not perform substantial computation on the login nodes.
    • They are closely monitored, and you will get warnings from the TACC admin folks!
    • Code is usually developed and tested somewhere other than TACC, and only moved over when pretty solid.
  • Do not perform significant network access from your batch jobs.
    • Instead, stage your data onto $SCRATCH from a login node before submitting your job (see the staging sketch below).
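
For example, here is a minimal staging sketch run from a login node. The server name and directory names are hypothetical; substitute your own data source and project layout.

Code Block
languagebash
titleStaging data onto $SCRATCH from a login node
cd $SCRATCH                        # your scratch area
mkdir -p my_project/data           # hypothetical project directory
# copy input files from a remote server (hostname and path are made up)
rsync -avP user@remote.host.org:/path/to/fastq/ my_project/data/
# or copy from another TACC file system, e.g. your work area
cp $WORK/some_inputs/*.fastq.gz my_project/data/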

...

Lonestar6 and Stampede2 overview and comparison

Here is a comparison of the configurations of ls6 and stampede2. As you can see, stampede2 is the larger cluster, launched in 2017, but ls6, launched this year, has fewer but more powerful nodes.


                            ls6                            stampede2

login nodes                 4                              6
                            128 cores each                 28 cores each
                            256 GB memory                  128 GB memory

standard compute nodes      560                            4,200 KNL (Knights Landing)
                            128 cores per node               • 68 cores per node (272 virtual)
                            256 GB memory                    • 96 GB memory
                                                           1,736 SKX (Skylake)
                                                             • 48 cores per node (96 virtual)
                                                             • 192 GB memory

GPU nodes                   16 total                       --
                            128 cores per node
                            256 GB memory
                            2x NVIDIA A100 GPUs
                            w/ 40 GB RAM onboard

batch system                SLURM                          SLURM

maximum job run time        48 hours, normal queue         96 hours on KNL nodes, normal queue
                            2 hours, development queue     48 hours on SKX nodes, normal queue
                                                           2 hours, development queue

Note the use of the term virtual core above for stampede2. Compute cores are standalone processors – mini CPUs, each of which can execute separate sets of instructions. However, modern cores may also have hyper-threading enabled, where a single core can appear as more than one virtual processor to the operating system (see https://en.wikipedia.org/wiki/Hyper-threading for more on hyper-threading). For example, stampede2 nodes have 2 or 4 hyperthreads (HTs) per core, so on KNL nodes, with 4 HTs for each of the 68 physical cores, each node has a total of 272 virtual cores.
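
If you are curious how this looks on a particular node, standard Linux tools report the layout. The counts you see will depend on the node type you are logged into.

Code Block
languagebash
titleInspecting physical vs. virtual cores
lscpu | grep -E '^(CPU\(s\)|Thread|Core|Socket)'   # sockets, cores per socket, threads (HTs) per core
nproc                                              # total virtual cores visible to the OS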

User guides for ls6 and stampede2 can be found at:

Unfortunately, the TACC user guides are aimed towards a different user community – the weather modelers and aerodynamic flow simulators who need very fast matrix manipulation and other high performance computing (HPC) features. The usage pattern for bioinformatics – generally running 3rd party tools on many different datasets – is rather a special case for HPC. TACC calls our type of processing "parameter sweep jobs" and has a special process for running them, using their launcher module.
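
To give a feel for what a parameter sweep looks like, here is a rough sketch: the launcher module reads a plain-text file of independent commands, one per line, and spreads them across the cores of your batch job. The file name, tool invocations, and environment variable below are illustrative only; check the TACC launcher documentation for the exact batch script to use.

Code Block
languagebash
titleSketch of a launcher-style parameter sweep
# commands.txt – one independent command per line (hypothetical tools and files)
#   bwa mem ref.fa sample1.fq > sample1.sam
#   bwa mem ref.fa sample2.fq > sample2.sam
#   bwa mem ref.fa sample3.fq > sample3.sam

# inside your SLURM batch script (sketch; variable names may differ by launcher version)
module load launcher
export LAUNCHER_JOB_FILE=commands.txt
$LAUNCHER_DIR/paramrun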

...

To determine where the shell will find a particular program, use the which command. Note that which tells you where it looked if it cannot find the program.

Code Block
languagebash
titleUsing which to search $PATH
which rsync
which cat

which bwa # not yet available to you

The module system

The module system is an incredibly powerful way to have literally thousands of software packages available, some of which are incompatible with each other, without causing complete havoc. The TACC staff builds the desired package from source code in well-known locations that are NOT on your $PATH. Then, when a module is loaded, its binaries are added to your $PATH.
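
A typical interaction with the module system might look like the sketch below. The module names actually available differ between clusters (for example, whether bwa is provided directly or through a containers module), so treat the load lines as examples rather than a recipe.

Code Block
languagebash
titleExploring and loading modules
module avail              # list modules that can be loaded right now
module spider bwa         # search all modules for "bwa"
module load bwa           # example: put bwa's binaries on your $PATH (name may differ)
which bwa                 # the shell can now find it
module list               # show what is currently loaded
module unload bwa         # remove it from your environment again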

...