R and R Studio Server versions

R versioning is a difficult issue, especially now that many important single-cell packages are available only in newer R versions, while some older but still popular R packages are not available in those newer versions. This section describes the versioning issues in both the system R and the R Studio Server web application.

As of September 2020, the "default" system version of R on compute servers is R 3.6.1 – this is the version that is invoked if you type R from the command line, and the version used by all R Studio Server instances.

We also have multiple R versions installed "side by side", including R 3.5.3 and R 3.6.1, which can be invoked by typing R-3.5.3 and R-3.6.1 from the command line. However, the non-default versions are not available in R Studio Server, because its R version can be set to only one value system-wide and cannot be specified per user.
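To confirm which of these commands exist on a given compute server, a quick check along the following lines may help. This is only a sketch: the function name report_r_versions is made up for illustration, and the command names passed to it are the ones mentioned above, which may differ on your server.

```shell
# Report which R commands are on the PATH and the version each one announces.
# report_r_versions is a hypothetical helper name, not a system command.
report_r_versions() {
    local cmd
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            # The first line of `R --version` output names the R version.
            printf '%s: %s\n' "$cmd" "$("$cmd" --version 2>&1 | head -n 1)"
        else
            printf '%s: not found on PATH\n' "$cmd"
        fi
    done
}

# Command names assumed from the text above; adjust as needed.
report_r_versions R R-3.5.3 R-3.6.1
```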

We have also installed many popular add-on packages in all R versions (e.g. tidyverse, ggplot2, DESeq2); however, be aware that not all packages are available in all R versions.

If you need a GUI environment to access versions of R other than 3.6.1, the following approach provides maximum per-user flexibility. Use the R Studio Server web application for R 3.6.1-compatible workflows. For workflows requiring other R versions, install the R Studio desktop application on your own desktop or laptop computer, using an underlying R version such as 3.5 or 3.6. You can then access files on shared storage by mounting your Work area file system via Samba (see Samba remote file system access for more information). The main drawback of this workflow is that typical personal computers do not have as much RAM as POD compute servers, and some R tasks can be memory intensive. In such cases, test the code in R Studio on your desktop computer, using smaller data sets if necessary, then run the "full" workflow from the POD compute server command line using the appropriate R version.
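The last step, running a tested script non-interactively with a specific R version, might look like the sketch below. The helper name run_r_script and the file names analysis.R and analysis.log are placeholders for illustration; the --no-save, --no-restore and -f options are standard R command-line options.

```shell
# Run an R script with a chosen R command, capturing all output in a log file.
# run_r_script is a hypothetical helper; the arguments are:
#   $1 = R command (e.g. R-3.5.3), $2 = script path, $3 = log file path
run_r_script() {
    r_cmd=$1
    script=$2
    log=$3
    if ! command -v "$r_cmd" >/dev/null 2>&1; then
        echo "R command not found: $r_cmd" >&2
        return 1
    fi
    if [ ! -f "$script" ]; then
        echo "script not found: $script" >&2
        return 1
    fi
    # --no-save / --no-restore avoid writing or loading a .RData workspace.
    "$r_cmd" --no-save --no-restore -f "$script" > "$log" 2>&1
}

# Example call (commented out because the names are placeholders):
# run_r_script R-3.5.3 analysis.R analysis.log
```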

Understanding R add-on packages

The libraries/add-on packages available in any given R version depend on the configured package installation directories, which can be listed in the R environment via the .libPaths() function. Typically, each user has a local package installation directory with packages they have installed. This local directory is searched first, followed by one or more system directories where we have installed add-on packages system-wide.

User-local package installation directories are typically under the user's ~/R directory (e.g. /stor/home/<user_name>/R). If a user has installed packages under multiple versions of R, there will be sub-directories for the different versions (e.g. ~/R/x86_64-pc-linux-gnu-library/3.4, ~/R/x86_64-pc-linux-gnu-library/3.6). Users can list the contents of these directories to see what packages they have installed locally.

To see what packages are installed system-wide for a given R version, users can list the contents of that version's system package installation directories, which are the non-local entries reported by .libPaths().
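One way to inspect package installation directories from the shell is sketched below. It assumes that an installed package appears as a sub-directory containing a DESCRIPTION file, and it uses Rscript, which reports the library paths of the default R version only. The helper name list_r_packages is made up for illustration.

```shell
# List the packages found in each given library directory.
# An installed R package is a sub-directory containing a DESCRIPTION file.
list_r_packages() {
    for lib in "$@"; do
        [ -d "$lib" ] || continue
        echo "== $lib"
        for desc in "$lib"/*/DESCRIPTION; do
            [ -f "$desc" ] || continue
            basename "$(dirname "$desc")"
        done
    done
}

# Ask the default R for its library search path (one directory per line),
# then list the packages in each directory. Skipped if Rscript is unavailable.
if command -v Rscript >/dev/null 2>&1; then
    Rscript -e 'cat(.libPaths(), sep = "\n")' | while IFS= read -r lib; do
        list_r_packages "$lib"
    done
fi
```

The directories are printed in search order, so a user's local directory (if any) appears before the system-wide ones.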

Local/Global package installation conflicts

Globally-installed R add-on packages may be updated during system maintenance. This can sometimes cause problems when users invoke R tools with many dependencies (e.g. DESeq2), some of which have been updated system-wide, but others of which have been locally installed and are not at a compatible level. The resulting error messages can be rather obscure, but typically show up after system maintenance has been performed. For example:

Error in checkSlotAssignment(object, name, value) : 
  assignment of an object of class “NULL” is not valid for slot ‘NAMES’ in an object of class “DESeqDataSet”; is(value, "characterORNULL") is not TRUE

To determine whether this is due to a Local/Global package conflict, users can make their local installation directory invisible to R as follows, then see whether the error goes away:

mv ~/R ~/R.bak

If this resolves the issue, the user may later find that they need to re-install other packages that were previously installed locally. To see which packages were installed locally, check the renamed ~/R.bak/x86_64-pc-linux-gnu-library/3.x directory, where 3.x is the R version being used.

If this produces a different error indicating that one or more locally installed packages are missing, the user can re-install them and then see if the problem is resolved.

Finally, if renaming the local R installation directory does not resolve the issue, the problem may lie with the globally installed packages, so please Contact Us.
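When investigating such a conflict, it can help to know which packages exist in both a local and a system-wide installation directory, since the local copy is the one R finds first. The sketch below compares two directories by package name only; it does not compare versions. The helper name shared_packages is hypothetical, and the system directory path must be taken from .libPaths() on your server.

```shell
# Print the names of packages present in both library directories.
#   $1 = local library directory, $2 = system library directory
shared_packages() {
    local_lib=$1
    system_lib=$2
    for pkg_dir in "$local_lib"/*/; do
        [ -d "$pkg_dir" ] || continue
        pkg=$(basename "$pkg_dir")
        # A package of the same name in the system directory is a
        # candidate for a Local/Global conflict.
        if [ -d "$system_lib/$pkg" ]; then
            echo "$pkg"
        fi
    done
}

# Example call (commented out; the second path is a placeholder):
# shared_packages ~/R/x86_64-pc-linux-gnu-library/3.6 /path/to/system/library
```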

Troubleshooting other R issues

In addition to the Local/Global package conflict issue described above, other issues can arise involving R Studio Server (or less commonly, command-line R).

R Studio Server becomes unresponsive

One common problem is that R Studio Server may become unresponsive, even with repeated attempts to establish a new session. To troubleshoot this sort of issue, close the R Studio Server application and make some R-associated files and directories invisible to R like this:

mv ~/.rstudio ~/.rstudio.bak
mv ~/.RData ~/.RData.bak  

Note that .RData files may be in different directories. For example, if you are working in an R Project you have set up, there may be an .RData file in the project directory.

Large .RData files can be extremely slow to load from both R and R Studio Server. If you must save R data this way, consider renaming the .RData file to a different name so that it can be loaded explicitly only when needed, instead of always when R is invoked.
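To locate .RData files and see how large they are, something like the following sketch can be used. The helper name find_rdata and the search depth of 3 are arbitrary choices for illustration; -maxdepth is supported by GNU find, which is assumed here.

```shell
# Show the size and path of each .RData file found under a directory,
# searching at most three levels deep.
find_rdata() {
    find "$1" -maxdepth 3 -type f -name '.RData' -exec du -h {} +
}

# Search the Home directory; unreadable sub-directories are reported, not fatal.
find_rdata "$HOME" || echo "some directories could not be read" >&2
```

A large file found this way can then be renamed with mv (for example to my_analysis.RData, a made-up name) and loaded explicitly with R's load function when it is needed.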

Disk quota exceeded

Another type of problem can arise when a user's 100 GB Home directory quota has been exceeded. This can produce errors when trying to start R Studio Server or R, perform work in R, or even install additional packages. For example, you may see a "Cannot connect to service" message after logging in to R Studio Server. Or, if an R session has been established and saving a new file would exceed the Home directory quota, users will often (but not always) see an error like the following:

cannot create file '/stor/home/abattenh/output.tsv', reason 'Disk quota exceeded'

To determine the status of your Home directory quota, use SSH to log in to one of your POD's compute servers. A message such as the one below will be displayed:

Quota Report for abattenh
Mount Point          Used            Total      Last Checked
stor/home/abattenh   52G (51%)       100G       Mon 21 Sep 2020 11:32:02 AM CDT

If this issue arises, you should Contact Us to help relocate some of your Home directory contents to your Work or Scratch area. Just moving them yourself does not resolve the problem because Home directories have frequent snapshots taken that preserve copies of deleted files, and it requires a systems administrator to remove these snapshots (see Home directories for more information).

This issue can arise because R's default input/output directory is the user's home directory – but large files should not be stored or created there due to the 100 GB quota. Instead, R processing of large files should take place in the user's Work or Scratch area (e.g. /stor/work/<user's group name> or /stor/scratch/<user's group name>; users can find out which group(s) they belong to by typing the groups command on the command line). Users can navigate to Work or Scratch area directories using R's setwd function or using R Studio Server's file browser (e.g. via "Session" menu → "Set Working Directory" → "Choose Directory", or when a new R Project is created). Note that R Studio Server's file browser dialog will default to the user's Home directory, and the full path of the desired Work or Scratch area must be typed in.
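To identify which files or directories account for most of the space in a Home directory, for example before asking for help relocating them, a sketch like the following may be useful. The helper name largest_entries is hypothetical. Sizes are reported in kilobytes and reflect current contents only, so space held by snapshots is not expected to appear.

```shell
# List the largest top-level entries (including hidden ones) in a directory.
#   $1 = directory to inspect, $2 = number of entries to show (default 10)
largest_entries() {
    dir=$1
    count=${2:-10}
    # du -sk prints the size in kilobytes followed by the path.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + 2>/dev/null |
        sort -rn | head -n "$count"
}

largest_entries "$HOME" 10
```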