...
Tip | ||
---|---|---|
| ||
The Tab key is one of your best friends in Linux. Hitting it invokes "shell completion", which is as close to magic as it gets!
|
...
As you can see, there are a lot of locations on the $PATH. That's because when you load modules at TACC (such as the module load lines in the common login script), that mechanism makes the programs available to you by putting their installation directories on your $PATH. We'll learn more about modules shortly.
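To make the long $PATH easier to read, a minimal sketch like the following prints one directory per line, in the order the shell searches them. The helper name path_entries is hypothetical (it is not a TACC or standard command); it uses only standard tools.

```shell
# Print each $PATH entry on its own line, in search order.
# path_entries is a hypothetical helper name used only for illustration.
path_entries() {
    # Use the first argument if one is given, otherwise the current $PATH.
    printf '%s\n' "${1:-$PATH}" | tr ':' '\n'
}

# List the directories currently on your $PATH.
path_entries

# Show which location on $PATH supplies a given program.
command -v ls
```

Running path_entries before and after a module load line is one way to see which installation directory that module added.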
...
Code Block | ||||
---|---|---|---|---|
| ||||
##########
# SECTION 3 -- controlling the prompt
# for NGS course
if [[ -n "$PS1" ]]; then
    PS1='ls5:\w$ '
fi |
File systems at TACC
...
TACC storage areas and Linux commands to access data (all commands to be executed at TACC except laptop-to-TACC copies, which must be executed on your laptop) |
Local file systems
...
On lonestar5 these local file systems have the following characteristics:
 | Home | Work | Scratch |
---|---|---|---|
quota | 10 GB | 1024 GB = 1 TB | 12+ PB (basically infinite) |
policy | backed up | not backed up, not purged | not backed up, purged if not accessed recently (~10 days) |
access command | cd | cdw | cds |
environment variable | $HOME | $STOCKYARD (root of the shared Work file system); $WORK (different sub-directory for each cluster) | $SCRATCH |
root file system | /home | /work | /scratch |
use for | Small files such as scripts that you don't want to lose. | Medium-sized artifacts you don't want to copy over all the time. For example, custom programs you install (these can get large), or annotation files used for analysis. | Large files accessed from batch jobs. Your starting files will be copied here from somewhere else, and your final results files will be copied elsewhere (e.g. stockyard, corral, or your BRCF POD). |
When you first login, the system gives you information about disk quota and your compute allocation quota:
Code Block |
---|
--------------------- Project balances for user abattenh -----------------------
| Name Avail SUs Expires | Name Avail SUs Expires |
| CancerGenetics 821054856 20152018-09-30 | human_brains A-cm10 456341096 20152018-0612-3031 |
| UT-2015-05-18 10000 2100 2 0152019-0603-3031 | genomeAnalysis 29324 2500 20162019-03-31 |
------------------------ Disk quotas for user abattenh -------------------------
| Disk Usage (GB) Limit %Used File Usage Limit %Used |
| /home1 0.0 510.0 0.0312 178 91 1500001000000 0.1201 |
| /work 54538.85 1024.0 5.35 52.59 61053 3000000 2.04 |
| /scratch 2621 3725.9 3000000 0.09 | -----------------------------0 0.00 4137 0 0.00 |
------------------------------------------------------------------------------- |
Changing TACC directories
...
Tip |
---|
The cd (change directory) command with no arguments takes you to your home directory on any Linux/Unix system. The cdw and cds commands are specific to the TACC environment. |
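As a rough illustration of the tip above, the sketch below shows cd with no arguments returning to $HOME. The go_work and go_scratch functions are hypothetical stand-ins written under the assumption that cdw and cds simply change to $WORK and $SCRATCH; the actual TACC definitions may differ.

```shell
# cd with no arguments returns to $HOME on any Linux/Unix system.
cd /tmp
cd
pwd    # prints your home directory

# Assumption: cdw and cds behave roughly like these definitions.
# go_work and go_scratch are hypothetical names, not TACC commands.
go_work()    { cd "$WORK"; }
go_scratch() { cd "$SCRATCH"; }
```

Off TACC, $WORK and $SCRATCH are normally unset, so you would have to set them yourself before trying the two stand-in functions.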
Stockyard (shared Work)
TACC compute clusters now share a common Work file system called stockyard. So files in your Work area do not have to be copied, for example from ls5 to stampede2 – they can be accessed directly from either cluster.
Note that there are two environment variables pertaining to the shared Work area:
- $STOCKYARD - This refers to the root of your shared Work area
  - e.g. /work/01063/abattenh
- $WORK - Refers to a sub-directory of the shared Work area that is different for different clusters, e.g.:
  - /work/01063/abattenh/lonestar on lonestar5
  - /work/01063/abattenh/stampede2 on stampede2
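The relationship between the two variables can be sketched as follows, using the example paths listed above. The helper work_dir_for is hypothetical and only mirrors the directory layout described here; on a real TACC login both variables are already set for you.

```shell
# Example value taken from the list above; TACC sets this for you at login.
STOCKYARD=/work/01063/abattenh

# work_dir_for is a hypothetical helper that mirrors the layout described
# above: the cluster-specific Work directory sits under $STOCKYARD.
work_dir_for() {
    printf '%s/%s\n' "$STOCKYARD" "$1"
}

work_dir_for lonestar     # /work/01063/abattenh/lonestar
work_dir_for stampede2    # /work/01063/abattenh/stampede2
```

Because both cluster directories sit under the same $STOCKYARD root, a file written on one cluster is visible from the other.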
A mechanism for purchasing larger stockyard allocations (above the 1 TB basic quota) is in development.
The UT Austin BioInformatics Team, a loose group of researchers, maintains a common directory area on stockyard.
Code Block | ||||
---|---|---|---|---|
| ||||
ls /work/projects/BioITeam |
Files we will use in this course are in a sub-directory there:
Code Block | ||||
---|---|---|---|---|
| ||||
ls /work/projects/BioITeam/courses/Core_NGS_Tools |
Corral
Corral is a gigantic (multiple PB) storage system (spinning disk) where researchers can store data. UT researchers may request up to 5 TB of corral storage through the normal TACC allocation request process. Additional space on corral can be rented for ~$85/TB/year.
The UT Austin BioInformatics Team also has an older, common directory area on corral.
Code Block | ||||
---|---|---|---|---|
| ||||
ls /corral-repl/utexas/BioITeam |
Files we will use in this course are in a sub-directory there:
Code Block | ||||
---|---|---|---|---|
| ||||
ls /corral-repl/utexas/BioITeam/core_ngs_tools |
A couple of things to keep in mind regarding corral:
- corral is a great place to store data in between analyses.
  - Store your permanent, original sequence data on corral
  - Copy the data you want to work with from corral to $SCRATCH
  - Run your analyses (batch jobs)
  - Copy your results back to corral
- Batch jobs should read from $SCRATCH rather than directly from corral.
  - This is because corral is a network file system, like Samba or NFS.
  - Since stampede has so many compute nodes, it doesn't have the network bandwidth that would allow simultaneous access to corral.
- Occasionally corral can become unavailable. This can cause any command that tries to access corral data to hang.
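The store, copy, run, copy-back cycle in the list above can be sketched as follows. To keep the example runnable anywhere, CORRAL_DIR and SCRATCH_DIR are hypothetical temporary stand-ins for a corral project directory and $SCRATCH, and a trivial line count stands in for a real batch job.

```shell
# Stand-in directories so the sketch runs anywhere; at TACC you would use
# your real corral project directory and $SCRATCH instead.
CORRAL_DIR=$(mktemp -d)     # hypothetical stand-in for a corral directory
SCRATCH_DIR=$(mktemp -d)    # hypothetical stand-in for $SCRATCH

# 1. Permanent, original data lives on corral.
printf 'ACGT\nTTGA\n' > "$CORRAL_DIR/reads.txt"

# 2. Copy the data you want to work with to scratch.
cp "$CORRAL_DIR/reads.txt" "$SCRATCH_DIR/"

# 3. Run the analysis against the scratch copy (here, a trivial line count).
wc -l < "$SCRATCH_DIR/reads.txt" | tr -d ' ' > "$SCRATCH_DIR/line_count.txt"

# 4. Copy the results back to corral.
cp "$SCRATCH_DIR/line_count.txt" "$CORRAL_DIR/"
```

The point of the pattern is that the analysis step only ever touches the scratch copy, so it does not depend on corral being reachable while the job runs.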
...
Ranch
Ranch is a gigantic (multiple PB) tape archive system where researchers can archive data. UT researchers may request large (multi-TB) ranch storage allocations through the normal TACC allocation request process.
...