OK. So you just read the latest issue of Bioinformatics (or did a Google search) and have discovered some new pieces of software that promise to slice and dice your data in new, interesting, and useful ways. Most often, these tools will be designed to run in a Linux environment. Unfortunately, the helpful support staff at TACC may not have had time to test these tools and make a proper module out of them (or maybe they didn't want to make 1,000+ modules for every piece of bioinformatics software out there). Perhaps there is a TACC module, but it was made a month or two back when the software was at version 1.01 and now it's at version 1.03, which has a bug fix or some nifty new bell and whistle.
The bottom line is that you are going to find yourself in a situation where module spider will come up empty and you're on your own installing a piece of software that you are dying to try out on TACC.
Unfortunately, there is no double-click installer for TACC. Fortunately, a majority of the better and more mature programs out there (but by no means all bioinformatics software) can be fairly easily installed. If these instructions fail, you might need to find your nearest Linux guru. Or, you might try to consult Google and tinker with things a bit.
The overall steps for installing a program on a Linux system are:
- Download the executable or source code
- Compile or make the project (if installing from source code)
- Set up your $PATH to find the new executable
Note: Most Linux installs will work similarly on Mac OS X, with just a few additional preliminary steps (install Xcode, maybe some extra libraries, etc.). With more work, it is possible to set up a Linux-like environment in Windows as well. Both of these topics are outside the scope of what we are going to cover here.
Case 1: Installing a precompiled binary (executable)
For programs that are already compiled (converted from high-level source code in a language like C into machine-specific code), you are often given several choices and need to download the version that matches the CPU architecture of your machine.
You can get your CPU architecture with this command:
Output might be something like i386 (for my MacBook) or x86_64 (for Lonestar).
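As a minimal sketch, one widely available way to print the architecture is `uname -m`. The labels suggested in the `case` branches are illustrative assumptions; download pages name their builds in different ways.

```shell
# Ask the kernel for the machine hardware name.
arch=$(uname -m)
echo "CPU architecture: $arch"

# Map the result to the kind of download to look for (labels are illustrative).
case "$arch" in
    x86_64)    echo "Look for downloads labeled x86_64 or amd64" ;;
    i386|i686) echo "Look for 32-bit x86 downloads" ;;
    *)         echo "Look for downloads matching $arch" ;;
esac
```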
Example: Install SSAHA2 precompiled binary
The website for the SSAHA2 read mapper has links to download executables compiled for several different architectures. Using commands that you have learned in earlier lessons, download the correct one to Lonestar and place it under the directory
You can often right-click to copy the URL of a link on a website and then use wget to download it directly to TACC.
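A download along these lines could be wrapped in a small helper. This is only a sketch: `fetch_binary` is a hypothetical function name, and the URL in the usage comment is a placeholder rather than a real download link.

```shell
# Hypothetical helper: download a file from a copied URL into a directory.
fetch_binary() {
    url="$1"    # URL copied from the website
    dest="$2"   # directory to save into, e.g. $HOME/local/bin
    mkdir -p "$dest" || return 1
    # -P tells wget which directory to save the file in.
    wget -P "$dest" "$url"
}

# Usage (placeholder URL; substitute the link you copied):
# fetch_binary "https://example.org/downloads/tool.tgz" "$HOME/local/bin"
```

If the download is a compressed archive, it still has to be unpacked (for example with `tar -xzf`) before the executable can be run.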
How the shell finds executables: $PATH
Now, you might want to tell your login shell that it should look for executable files in this new directory, $HOME/local/bin. This will allow you to use the executable as a one-word command like you are used to:
Instead of writing out the entire path to the executable to run it, like in one of these examples:
Assuming you are using the bash shell, you can do this by editing your $HOME/.profile_user configuration file. These files are basically just bash scripts that are run whenever you log in. You want to add a line that looks like this to the top of
This sets the environment variable PATH to its old value with your new directory prepended (the : separates multiple paths). This means the shell will look for executables in this new location first, and then in all of the standard locations after that. For more information on environment variables, see the Bash Beginner's Guide.
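A minimal sketch of such a line, assuming the new directory is $HOME/local/bin as in this lesson:

```shell
# Prepend the new directory; keep $PATH on the right so the standard locations remain.
export PATH="$HOME/local/bin:$PATH"

# Print the first entry to confirm the new directory is searched first.
echo "${PATH%%:*}"
```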
Important! For this change to take effect, you must log out and log in again to force the shell to re-read the $HOME/.profile_user file. Alternatively, you can use one of these commands to re-read it at any time:
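As a sketch of re-reading the file without logging out, assuming the bash shell and the $HOME/.profile_user file named above: the `.` command (spelled `source` in bash) runs a script inside the current shell, so any variables it sets persist.

```shell
profile="$HOME/.profile_user"

if [ -f "$profile" ]; then
    # "." is the portable spelling of bash's "source".
    . "$profile"
else
    echo "No $profile found; nothing to re-read"
fi
```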
If your path is not working, or you're curious about where else your shell is looking for commands and in what order, then you might want to see the value of your $PATH variable.
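One way to inspect the search path, sketched here, is to print it with one directory per line. The shell searches the directories from top to bottom.

```shell
# Split the colon-separated list into one directory per line.
printf '%s\n' "$PATH" | tr ':' '\n'

# The first entry is the directory that is searched first.
first_dir=${PATH%%:*}
echo "Searched first: $first_dir"
```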
Warning! If you forget to include $PATH on the right side in the above example, then you will tell your shell to not look in the usual places for executables any more. This means that ls and other common external commands will no longer work without typing out their whole paths, e.g. /bin/ls (shell builtins such as cd are unaffected). This can be extremely confusing!
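The effect can be reproduced safely in a subshell, so that the broken setting never reaches your login session. This sketch assumes that $HOME/local/bin does not itself contain an ls executable.

```shell
# Run in a subshell: the PATH without the standard locations is discarded afterwards.
if ( PATH="$HOME/local/bin"; ls >/dev/null 2>&1 ); then
    result="ls found"
else
    result="ls not found"
fi
echo "With the standard locations dropped: $result"

# Typing out the whole path still works, because no search is needed.
/bin/ls "$HOME" >/dev/null && echo "/bin/ls still works"
```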
Handling multiple versions
If you install for yourself a newer version of a command that is already available on TACC, then you might get confused about which version you are running when you type the command. You can see the whole path to the executable that will be run when you type a one-word command using the
Many tools will also have a --version flag, or output their version information in a header when they are run. This can help you be sure that you are running the version that you think you are.
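As a hedged sketch, the check below uses bash itself as a stand-in for whichever tool you installed. `command -v` is a shell builtin that prints the path that would be run; `which` is a common external alternative.

```shell
tool=bash   # stand-in; substitute the command you installed

# Print the full path of the executable that a one-word command resolves to.
resolved=$(command -v "$tool")
echo "$tool resolves to: $resolved"

# Many tools report their version with --version; show only the first line.
"$tool" --version | head -n 1
```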
Case 2: Install from the source code
Note on TACC compilers
There are multiple compilers available on TACC:
- icc: the default compiler. Preferred for optimizing speed of compiled executables.
- gcc: the GNU compiler collection. Tends to be more compatible.
Be aware that if you compile libraries and then programs that link to them, you must generally compile all components with the same compiler.
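Before building, it can help to check which of these compilers your shell can currently find. This is a small sketch; which ones appear depends on the modules you have loaded.

```shell
found=""
for compiler in icc gcc; do
    # command -v prints the full path when the compiler is on PATH.
    if location=$(command -v "$compiler"); then
        echo "$compiler -> $location"
        found="$found $compiler"
    else
        echo "$compiler is not on PATH"
    fi
done
echo "Compilers available:${found:- none}"
```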
If you run into an error during compilation, try the gcc compiler by loading its module. You may get a message like this:
So, follow the directions:
You will need to do this to get breseq to compile in the next example.
Example: Install breseq from a source code archive
breseq is a tool developed by the Barrick lab. You might use it in a later lesson. It is a good example of a tool that can be downloaded and compiled.
breseq uses the common GNU build system install sequence. If you install other GNU tools, then the same ./configure; make; make install command sequence will often be used.
The extra option passed to ./configure sets where the executable and any other files associated with the program will be installed. If you leave off this flag, then it will try to install them in a system-wide location. You must have administrator privileges to do this and would generally have to substitute sudo make install for the last step to get this to work. That won't work on TACC! (sudo means "super-user do".)
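The sequence can be sketched as a small helper. `build_from_source` is a hypothetical function name, the directory in the usage comment is a placeholder, and `--prefix` is the standard GNU configure option for choosing the install location; a given tool's own instructions take precedence.

```shell
# Hypothetical helper: run the GNU build sequence with a user-writable prefix.
build_from_source() {
    src_dir="$1"   # unpacked source directory
    prefix="$2"    # install location, e.g. $HOME/local (executables land in $prefix/bin)
    # The subshell keeps the cd from changing the caller's working directory.
    ( cd "$src_dir" &&
      ./configure --prefix="$prefix" &&
      make &&
      make install )
}

# Usage (placeholder directory name):
# build_from_source "$HOME/src/some-tool-1.0" "$HOME/local"
```

Chaining the steps with `&&` stops the sequence at the first failure, so a broken configure step never reaches make install.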
For some other tools, the instructions may tell you to skip straight to make, or you might also have to install some other programs or libraries that the tool you want to use needs to run. Generally, you can find this information in the online documentation or an INSTALL file in the root of the downloaded code.
Example: Install the latest version of Bowtie2
There is a newer version of Bowtie2 available than the one loaded into a module on TACC. You might want to use it because it includes some new bug fixes. You can download either a source code version to compile using the above instructions or a binary version of bowtie2. Try to get this running on your own.
Bowtie2 is composed of multiple executables. You will need to copy or move all of them into $HOME/local/bin to have a functioning Bowtie2 install. (Be sure that both
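Copying every executable out of an unpacked directory can be sketched as follows. `install_executables` is a hypothetical helper, and the demonstration uses throwaway stand-in files rather than real Bowtie2 binaries.

```shell
# Hypothetical helper: copy all executable regular files from one directory to another.
install_executables() {
    src="$1"    # unpacked tool directory
    dest="$2"   # destination, e.g. $HOME/local/bin
    mkdir -p "$dest" || return 1
    for f in "$src"/*; do
        # Skip subdirectories and files without the executable bit.
        if [ -f "$f" ] && [ -x "$f" ]; then
            cp "$f" "$dest"/
        fi
    done
}

# Demonstration on temporary directories with one executable and one plain file.
demo=$(mktemp -d)
mkdir "$demo/src"
printf '#!/bin/sh\necho tool\n' > "$demo/src/tool"
printf 'notes\n' > "$demo/src/README"
chmod +x "$demo/src/tool"
install_executables "$demo/src" "$demo/bin"
ls "$demo/bin"
```

In real use the call would look like `install_executables <unpacked directory> "$HOME/local/bin"`, with the directory name depending on the version you downloaded.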
In other lessons we'll cover various deviations and elaborations on these two procedures in order to install specific programs, R modules, Perl modules, Python modules, etc.