FastQC MultiQC - Andy Dittman Preliminary RNA-seq Data on Hyak
2026
FastQC
MultiQC
RNA-seq
hyak
Author
Sam White
Published
May 19, 2026
INTRO
RNA-seq data from Andy Dittman at NOAA came in on 20260508 (Notebook entry). The data is an initial run of a subset of samples to see how the data looks and to make sure it is of good quality. The initial GitHub Issue indicated that the RNA Integrity Number (RIN) for these samples was not ideal, so he wanted to see how a subset of the data looked before proceeding with the rest of the samples. I ran FastQC and MultiQC on the raw reads to assess their quality.
Analysis was run on Hyak using the following Roberts Lab Apptainer image:
srlab-R4.4-bioinformatics-container-c3d3116.sif (built on 20260410)
Overall, the FastQC and MultiQC reports indicate that the raw reads are of good quality, consistent with what we usually see with RNA-seq data. Additionally, none of the sequencing … There is a bit of “wonkiness” in the per base sequence content plot beyond the first 15-20bp for a number of samples. Knowing these samples had low RINs, this is not unexpected. Still, it makes me wonder whether those long stretches after the initial 15-20bp are due to adapter contamination and/or polyG sequence (polyG is common in data from Illumina two-color instruments, where a base with no signal is called as G, so it shows up when the insert is short and the read runs through the adapter and past the end of the template). I will look into this in the next steps of the analysis: running FastP on the raw reads to trim adapters and polyG sequences, then re-running FastQC and MultiQC on the trimmed reads to see whether the “wonkiness” in the per base sequence content plot is resolved.
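Before trimming, a rough tally of reads that end in a run of Gs can give a first hint about whether polyG is involved. The function below is only a sketch, not part of the analysis that was run: the name `count_polyg_tails` and the default run length of 10 are my own assumptions, and it expects uncompressed, four-line-per-record FASTQ on stdin.

```shell
# Sketch only: count reads whose 3' end is a run of at least N G's.
# Assumptions (not from the original analysis): function name, default
# N of 10, and strict 4-line FASTQ records arriving on stdin.
count_polyg_tails() {
  local min_run="${1:-10}"
  awk -v n="${min_run}" '
    BEGIN {
      # Build a string of n G characters to compare against read tails.
      tail = sprintf("%" n "s", "")
      gsub(/ /, "G", tail)
    }
    # Line 2 of every 4-line record is the sequence.
    NR % 4 == 2 {
      total++
      if (length($0) >= n && substr($0, length($0) - n + 1) == tail) {
        polyg++
      }
    }
    # Print: reads with a polyG tail, then total reads (tab-separated).
    END { printf "%d\t%d\n", polyg + 0, total + 0 }
  '
}

# Example use on the first 100,000 reads of one file (file name is hypothetical):
#   zcat sample_R1.fastq.gz | head -n 400000 | count_polyg_tails 10
```

A high proportion of polyG tails would point toward polyG rather than adapter read-through, but the per base sequence content plot after trimming is still the real test.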
Code is below.
VARIABLES
# DIRECTORIES
raw_reads_dir <- "/mmfs1/gscratch/scrubbed/samwhite/data/dittman_grc_rnaseq_1/"
output_dir <- "/mmfs1/gscratch/scrubbed/samwhite/outputs/20260519-FastQC-MultiQC-dittman-grc-rnaseq/"

# FILES
fastq_pattern <- "*.fastq.gz"

# PROGRAMS
gfatools <- c("/srlab/programs/gfatools/gfatools")
hifiasm <- c("/srlab/programs/miniforge3-24.7.1-0/envs/hifiasm-0.25.0_env/bin/hifiasm")

# SETTINGS
threads <- "32"

# Export these as environment variables for bash chunks.
Sys.setenv(
  fastq_pattern = fastq_pattern,
  raw_reads_dir = raw_reads_dir,
  output_dir = output_dir,
  threads = threads
)
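Since the bash chunk depends entirely on these exported variables, a small guard at the top of the chunk can stop it before anything runs with an empty path. This is a sketch under my own assumptions: `require_vars` is a hypothetical helper name, and it uses `printenv`, so it only sees variables that were actually exported to the environment.

```shell
# Sketch only: fail early if any expected environment variable is unset or empty.
# require_vars is a hypothetical helper, not part of the original notebook.
require_vars() {
  local name missing=0
  for name in "$@"; do
    # printenv prints the value of an exported variable; empty output means
    # the variable is either not exported or set to an empty string.
    if [ -z "$(printenv "${name}")" ]; then
      echo "ERROR: environment variable '${name}' is unset or empty" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Example use at the top of the bash chunk:
#   require_vars raw_reads_dir output_dir fastq_pattern threads || exit 1
```

The `mkdir: cannot create directory ‘’` and `rm: cannot remove '/*.zip'` messages in the output further down are consistent with these variables being empty when that chunk was executed, which is the situation this check is meant to catch.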
FASTQC/MULTIQC
# Make output directory if it doesn't exist
mkdir --parents "${output_dir}"

############ RUN FASTQC ############

# Create array of raw FastQs
# NOTE: fastq_pattern is left unquoted so the shell expands the glob
raw_fastqs_array=("${raw_reads_dir%/}"/${fastq_pattern})

echo "Beginning FastQC on raw reads..."
echo ""

# Run FastQC, passing each FastQ as its own argument
fastqc \
--threads "${threads}" \
--outdir "${output_dir}" \
--quiet \
"${raw_fastqs_array[@]}"

echo "FastQC on raw reads complete!"
echo ""

############ END FASTQC ############

############ RUN MULTIQC ############
echo "Beginning MultiQC on raw FastQC..."
echo ""

multiqc "${output_dir}" \
--interactive \
-o "${output_dir}"

echo ""
echo "MultiQC on raw FastQs complete."
echo ""

############ END MULTIQC ############

echo "Removing FastQC zip files."
echo ""
rm "${output_dir%/}"/*.zip
echo "FastQC zip files removed."
echo ""
mkdir: cannot create directory ‘’: No such file or directory
Beginning FastQC on raw reads...
bash: line 19: fastqc: command not found
FastQC on raw reads complete!
Beginning MultiQC on raw FastQC...
This is MultiQC v1.14
For more help, run 'multiqc --help' or visit http://multiqc.info
╭─ Error ──────────────────────────────────────────────────────────────────────╮
│ Option '-o' requires an argument. │
╰──────────────────────────────────────────────────────────────────────────────╯
MultiQC on raw FastQs complete.
Removing FastQC zip files.
rm: cannot remove '/*.zip': No such file or directory
FastQC zip files removed.
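The `fastqc: command not found` line in the output above shows that the program was not on the `PATH` when this chunk was executed, yet the script carried on and reported completion. The source does not say why the program was missing. A minimal pre-flight check could make that failure obvious; `check_tools` is a hypothetical helper name, and this is a sketch rather than what was actually run.

```shell
# Sketch only: report every named program that is missing from PATH.
# check_tools is a hypothetical helper, not part of the original notebook.
check_tools() {
  local tool missing=0
  for tool in "$@"; do
    # command -v succeeds only if the shell can resolve the program name.
    if ! command -v "${tool}" >/dev/null 2>&1; then
      echo "ERROR: '${tool}' not found on PATH" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Example use before the FastQC step:
#   check_tools fastqc multiqc || exit 1
```

Combined with `|| exit 1`, this stops the chunk before it prints "complete" messages for steps that never ran.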