TrimGalore/FastQC/MultiQC - Auto-trim C.virginica MBD BS-seq FASTQ data – Sam’s Notebook

Author

Sam White

Published

April 9, 2018

Yesterday, I ran FastQC/MultiQC on the Crassostrea virginica MBD BS-seq data from ZymoResearch. Steven wanted the reads trimmed to see how that changes the QC results.

I ran TrimGalore (using the built-in FastQC option) and followed up with MultiQC for a summary of the FastQC reports.
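For reference, here is a minimal sketch of what that kind of run can look like. It is not the actual job script (that's linked below); the function name `autotrim_and_qc` and the folder arguments are hypothetical, and it assumes `trim_galore`, FastQC and `multiqc` are all on the PATH.

```shell
# Hypothetical helper (not the real job script): auto-trim FASTQ files with
# TrimGalore, run FastQC on the trimmed reads, then summarize with MultiQC.
# Usage: autotrim_and_qc <trim_out_dir> <fastqc_out_dir> <fastq> [<fastq> ...]
autotrim_and_qc() {
  local trim_out="$1"
  local fastqc_out="$2"
  shift 2

  # Passing --fastqc_args makes TrimGalore run FastQC on the trimmed output,
  # here with the FastQC reports written to their own folder.
  mkdir -p "${trim_out}" "${fastqc_out}" &&
  trim_galore \
    --output_dir "${trim_out}" \
    --fastqc_args "--outdir ${fastqc_out}" \
    "$@" &&
  # MultiQC scans the FastQC folder and writes its summary report there.
  multiqc --outdir "${fastqc_out}" "${fastqc_out}"
}
```

A call might look like `autotrim_and_qc trimmed trimmed/fastqc *.fastq.gz 2> stderr.log`, which also captures standard error in a file. The folder and file names in that call are placeholders.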

TrimGalore job script:

20180409_trimgalore_autotrim_Cvirginica_MBD.sh

Standard error was redirected on the command line to this file:

20180409_trimgalore_autotrim_Cvirginica_MBD/stderr.log

MD5 checksums were generated on the resulting trimmed FASTQ files:

20180409_trimgalore_autotrim_Cvirginica_MBD/checksums.md5
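Generating a checksum file like this one can be sketched as below. The helper name `write_checksums` is hypothetical, and the sketch assumes GNU `md5sum` and trimmed files ending in `.fq.gz` (TrimGalore's usual naming for gzipped input).

```shell
# Hypothetical helper: write MD5 checksums for all trimmed FASTQ files in a
# folder to checksums.md5 inside that folder.
# Assumes GNU coreutils md5sum and files named *.fq.gz.
# Runs in a subshell (parentheses) so the caller's working directory is unchanged.
write_checksums() (
  cd "$1" && md5sum -- *.fq.gz > checksums.md5
)
```

Running it from inside the folder keeps the recorded file names relative, so the same checksum file can be checked again wherever the folder ends up.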

All data was copied to my folder on Owl.

Checksums for the FASTQ files were verified after the transfer.
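A post-transfer check of this kind can be sketched as follows. The helper name `verify_checksums` is hypothetical; it assumes GNU `md5sum` and that `checksums.md5` was copied along with the FASTQ files and lists them by relative name.

```shell
# Hypothetical helper: re-compute MD5s in the destination folder and compare
# them with the checksums.md5 file that was copied along with the data.
# Assumes GNU coreutils md5sum. Returns non-zero if any file does not match.
verify_checksums() (
  cd "$1" && md5sum --check --quiet checksums.md5
)
```

With `--quiet`, matching files print nothing; only mismatches or unreadable files are reported.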

Results:

Output folder:

20180409_trimgalore_autotrim_Cvirginica_MBD/

FastQC output folder:

20180409_trimgalore_autotrim_Cvirginica_MBD/20180409_fastqc_trimgalore_autotrim_Cvirginica_MBD/

MultiQC output folder:

20180409_trimgalore_autotrim_Cvirginica_MBD/20180409_fastqc_trimgalore_autotrim_Cvirginica_MBD/multiqc_data/

MultiQC HTML report:

20180409_trimgalore_autotrim_Cvirginica_MBD/20180409_fastqc_trimgalore_autotrim_Cvirginica_MBD/multiqc_data/multiqc_report.html

Overall, the auto-trim didn’t change things much. Steven’s specific concern is the variability in the first 15bp of the reads (seen in the Per Base Sequence Content section of the MultiQC report). Trimming reduced it, but not by much. Next, I’ll do a separate TrimGalore run with a hard trim of the first 14bp of each read and see how that looks.
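A hard trim like that could be sketched with TrimGalore's `--clip_R1` option, which removes a fixed number of bases from the 5' end of each read. This is only a sketch of a run that hasn't happened yet: the helper name `hardtrim_5prime` and its arguments are hypothetical, and it assumes `trim_galore` and FastQC are on the PATH.

```shell
# Hypothetical helper: TrimGalore run that also clips a fixed number of bases
# from the 5' end of every read, with FastQC run on the trimmed output.
# Usage: hardtrim_5prime <out_dir> <clip_bp> <fastq> [<fastq> ...]
hardtrim_5prime() {
  local out_dir="$1"
  local clip_bp="$2"
  shift 2

  mkdir -p "${out_dir}" &&
  trim_galore \
    --clip_R1 "${clip_bp}" \
    --output_dir "${out_dir}" \
    --fastqc \
    "$@"
}
```

For the 14bp trim, the call would be along the lines of `hardtrim_5prime hardtrim_out 14 *.fastq.gz` (folder and file names are placeholders). If the reads are paired-end, the call would also need `--paired` and `--clip_R2`.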