IF YOU HAVE YOUR OWN DATA:

  • Make sure data is on ls5 and in your scratch directory
  • Make sure genome/transcriptome files are on ls5 and in your scratch directory. If you need to download reference files, some options include:
    • igenomes: With one download, you can get genome fasta files, annotation, and bowtie and bwa (slightly older version) indexes for many organisms.
    • Ensembl: Fasta files (genome and transcriptome), gtf files (annotation) for multiple organisms.
  • Make sure the files are readable
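
One way to check the readability point above is sketched below. The helper name list_unreadable and the directory name my_data are illustrative assumptions, not part of the course setup. The function prints every file under a directory that your account cannot read, so empty output means all files are readable.

```shell
# list_unreadable is a hypothetical helper: print each file under the
# given directory that the current user cannot read.
list_unreadable() {
    find "$1" -type f | while IFS= read -r f; do
        [ -r "$f" ] || echo "$f"
    done
}

# Example use (my_data is a placeholder directory name):
#   list_unreadable my_data
# If anything is listed and you own the files, this grants yourself read access:
#   chmod -R u+rX my_data
```

The capital X in u+rX adds execute permission only to directories (and files that are already executable), which is what you need to be able to enter them.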

IF YOU DON'T HAVE YOUR OWN DATA:

 You have several choices for datasets to play with, depending on the stage in the analysis that you would like to work on.

  1. RAW DATA:  Start with 6 raw fastq files from the following dataset:
    Bottomley et al. mouse dataset SRA026846.1 (http://dx.doi.org/10.1371/journal.pone.0017820)
    Single end RNA-Seq data generated on Illumina GAIIx for 2 strains of mice (B6 and D2) to detect differential striatal gene expression between the two inbred mouse strains.
    We have provided three B6 and three D2 fastq files for you to work with.

    Get the data
     
    cds
    cd my_rnaseq_course
    cp -r /corral-repl/utexas/BioITeam/rnaseq_course_2016/day_4_bottomley_raw_data . &
    
  2. MAPPED FILES: Start with 6 bam files (mapped to the mouse MM9 genome) from the following dataset:
    Bottomley et al. mouse dataset SRA026846.1 (http://dx.doi.org/10.1371/journal.pone.0017820)
    Single end RNA-Seq data generated on Illumina GAIIx for 2 strains of mice (B6 and D2) to detect differential striatal gene expression between the two inbred mouse strains.
    We have provided three B6 and three D2 bam files for you to work with.

    Get the data
     
    cds
    cd my_rnaseq_course
    cp -r /corral-repl/utexas/BioITeam/rnaseq_course_2016/day_4_bottomley_mapped_data . &
     
  3. GENE COUNT DATA: Start with a table providing per-gene read counts for 3 treated and 4 untreated Drosophila samples from the following dataset:
    Brooks et al, 2011 dataset GSE18508 (PMID: 20921232)
    The experiment studied the effects of RNAi knockdown of Pasilla, the Drosophila melanogaster ortholog of mammalian NOVA1 and NOVA2, on the transcriptome. Treated samples have been RNAi depleted of mRNAs encoding RNA binding proteins and untreated samples have not.

    Get the data
     
    cds
    cd my_rnaseq_course
    cp -r /corral-repl/utexas/BioITeam/rnaseq_course_2016/day_4_brooks_gene_count_data . &
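
The cp commands above end in &, so they run in the background and the prompt returns before the copy is done. A minimal sketch of how you might wait for such a copy and confirm that files arrived is shown below. The helper name copy_and_check is an assumption made for illustration; it takes a source directory and an existing destination directory.

```shell
# copy_and_check is a hypothetical helper: copy a directory in the
# background, wait for it, and print how many files were copied.
copy_and_check() {
    src=$1
    dest=$2
    cp -r "$src" "$dest" &        # backgrounded copy, as in the commands above
    wait $!                        # block until that copy has exited
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "copy failed with status $status" >&2
        return "$status"
    fi
    # cp -r into an existing directory creates dest/<name of src>
    find "$dest/$(basename "$src")" -type f | wc -l | tr -d ' '
}

# Example use with one of the course directories:
#   copy_and_check /corral-repl/utexas/BioITeam/rnaseq_course_2016/day_4_brooks_gene_count_data .
```

If you have already started a copy by hand, typing wait with no arguments in the same shell blocks until all of its background jobs finish.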
     

REMINDERS

How to request an interactive session on one of ls5's compute nodes?

idev -m 120 -q development -A UT-2015-05-18

How to submit jobs? See Submitting Jobs to LS5.


How to submit jobs in a sequence?

You can make a job dependent on another job, so that it starts only after the first job has completed successfully, using this command:


sbatch --dependency=afterok:<job-ID> launcher.sge
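
To fill in <job-ID> without copying it by hand, the ID can be read from the "Submitted batch job <ID>" line that sbatch prints on submission. The following is a minimal sketch under that assumption; extract_job_id is a hypothetical helper, and first.slurm and second.slurm are placeholder file names rather than files provided with the course.

```shell
# extract_job_id is a hypothetical helper: read sbatch output on stdin
# and print the numeric ID from the "Submitted batch job <ID>" line.
extract_job_id() {
    awk '/Submitted batch job/ {print $4}'
}

# Chain two submissions, but only where sbatch and both (placeholder)
# job files exist, so the snippet is harmless on other machines.
if command -v sbatch >/dev/null 2>&1 && [ -f first.slurm ] && [ -f second.slurm ]; then
    first_id=$(sbatch first.slurm | extract_job_id)
    sbatch --dependency=afterok:"$first_id" second.slurm
fi
```

With afterok, the second job starts only if the first one exits with status 0; if the first job fails, the second stays pending until you cancel it.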


