Data wrangling best practices summary

keep FASTQ files compressed

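You may be tempted to uncompress your sequencing files to manipulate them more directly. Resist the temptation: uncompressed FASTQ files take several times more disk space, and many tools can read the compressed files as they are.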
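As a minimal sketch of working with a compressed file directly, Python's standard gzip module can stream a gzip-compressed FASTQ file without writing an uncompressed copy to disk. The function name count_fastq_reads is a hypothetical helper, and the sketch assumes the common four-lines-per-record FASTQ layout.

```python
import gzip


def count_fastq_reads(path):
    """Count reads in a gzip-compressed FASTQ file.

    Assumes each record spans exactly four lines (header, sequence,
    separator, qualities); wrapped multi-line records are not handled.
    """
    lines = 0
    # "rt" opens the compressed file as text and decompresses on the fly.
    with gzip.open(path, "rt") as handle:
        for _ in handle:
            lines += 1
    return lines // 4
```

The file stays compressed on disk; only the lines being read are decompressed in memory.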

arrange adequate storage space

back up analysis artifacts regularly

distinguish between types of data

Artifacts from different stages of the analysis will have different archival requirements.

While a project is active you will want to keep more intermediate artifacts for reference. Many of these can be deleted after publication.

track your analysis steps

Your analyses should be reproducible by others, so you need to keep the equivalent of a lab notebook to document your protocols.
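One possible way to keep such a notebook is to run each command through a small wrapper that records what was run and when. This is only a sketch: the function name run_and_log, the default log file name analysis_log.txt, and the tab-separated log layout are all assumptions made for illustration, not an established convention.

```python
import datetime
import shlex
import subprocess


def run_and_log(command, log_path="analysis_log.txt"):
    """Run a command and append a one-line record of it to a log file.

    `command` is a list of arguments, e.g. ["gzip", "-t", "reads.fastq.gz"].
    The log file name and line format are hypothetical choices.
    """
    started = datetime.datetime.now().isoformat(timespec="seconds")
    result = subprocess.run(command)
    # Append rather than overwrite, so the log accumulates like a notebook.
    with open(log_path, "a") as log:
        log.write(f"{started}\t{shlex.join(command)}\texit={result.returncode}\n")
    return result.returncode
```

For example, run_and_log(["gzip", "-t", "sample.fastq.gz"]) would check the integrity of a (hypothetical) sample.fastq.gz and leave a timestamped line in the log recording the exact command and its exit code.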