Yesterday, I ran TrimGalore/FastQC/MultiQC on the Crassostrea virginica MBD BS-seq data from ZymoResearch with the default settings (i.e. “auto-trim”). There was still some variability in the first ~15bp of the reads and Steven wanted to see how a hard trim would change things.
I ran TrimGalore (using the built-in FastQC option) with a hard trim of the first 14bp of each read, then followed up with MultiQC to summarize the FastQC reports.
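For anyone wanting to reproduce this kind of run, here is a rough Python sketch of how such a TrimGalore command could be assembled. It is not the actual job script. It assumes paired-end reads, and the file names and output directory are hypothetical placeholders. The `--paired`, `--clip_R1`, `--clip_R2`, `--fastqc`, and `--output_dir` options are standard TrimGalore flags.

```python
def build_trim_galore_cmd(read1, read2, clip=14, out_dir="trimmed"):
    """Return an argv list for a paired-end TrimGalore run with a 5' hard trim.

    Assumptions (not from the original job script): paired-end input,
    the same clip length on both reads, and a hypothetical output directory.
    """
    return [
        "trim_galore",
        "--paired",
        "--clip_R1", str(clip),   # remove the first `clip` bp of Read 1
        "--clip_R2", str(clip),   # remove the first `clip` bp of Read 2
        "--fastqc",               # run FastQC on the trimmed output
        "--output_dir", out_dir,
        read1,
        read2,
    ]


# Hypothetical file names, for illustration only.
cmd = build_trim_galore_cmd("sample_R1.fastq.gz", "sample_R2.fastq.gz")
print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` on a machine where TrimGalore is installed. Building it as a list avoids shell-quoting issues with file names.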
TrimGalore job script:
Standard error was redirected on the command line to this file:
MD5 checksums were generated on the resulting trimmed FASTQ files:
All data was copied to my folder on Owl.
Checksums for FASTQ files were verified post-data transfer (data not shown).
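The checksum generation and post-transfer verification can be sketched in Python roughly as follows. This is an illustration only, not the commands I actually ran. It assumes a checksum file in the usual `md5sum` layout (hash, whitespace, file name), and the function names are made up for this example.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(checksum_file, data_dir):
    """Compare files in data_dir against an md5sum-style checksum file.

    Returns the list of file names whose current digest does not match.
    """
    mismatches = []
    for line in Path(checksum_file).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # md5sum marks binary mode with '*'
        if md5sum(Path(data_dir) / name) != expected:
            mismatches.append(name)
    return mismatches
```

An empty list from `verify_checksums()` means every file listed in the checksum file matched after the transfer.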
Results:
Output folder:
FastQC output folder:
MultiQC output folder:
MultiQC HTML report:
OK, this trimming definitely took care of the variability seen in the first ~15bp of all the reads.
However, I noticed that the last 2bp of the Read 1 seqs all look wonky. I’m guessing I should probably trim those off, too…
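To make concrete what combining the 14bp 5' hard trim with a 2bp 3' trim would do to a single read, here is a small Python sketch. The function name and the toy read are hypothetical, and this is not how TrimGalore does it internally. In TrimGalore itself the relevant option for the 3' end of Read 1 would presumably be `--three_prime_clip_R1`, though I haven't settled on that yet.

```python
def hard_clip(seq, qual, five_prime=0, three_prime=0):
    """Remove a fixed number of bases from each end of a read.

    The quality string is clipped identically so both stay the same length.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    end = max(len(seq) - three_prime, 0)  # guard against over-clipping
    return seq[five_prime:end], qual[five_prime:end]


# Toy 20bp read: clip 14bp from the 5' end and 2bp from the 3' end.
toy_seq = "ACGT" * 5
toy_qual = "I" * 20
clipped_seq, clipped_qual = hard_clip(toy_seq, toy_qual, five_prime=14, three_prime=2)
print(clipped_seq, len(clipped_seq))
```

A read shorter than the total amount clipped simply comes back empty, so a real pipeline would still need a minimum-length filter afterwards.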