Last week, I ran the two raw FASTQ files through FastQC. As expected, FastQC detected “errors”. These errors are due to the presence of adapter sequences, barcodes, and the use of a restriction enzyme (ApeKI) in library preparation. In summary, it’s not surprising that FastQC was not please with the data because it’s expecting a “standard” library prep that’s already been trimmed and demultiplexed.
However, just for comparison, I ran the demultiplexed files through FastQC. The Jupyter notebook is linked (GitHub) and embedded below. I recommend viewing the Jupyter notebook on GitHub for easier viewing.
Results:
Pretty much the same, but with slight improvements due to removal of adapter and barcode sequences. The restriction site still leads to FastQC to report errors, which is expected.
Links to all of the FastQC output files are linked at the bottom of the notebook.
Jupyter notebook (GitHub): 20170306_docker_fastqc_demultiplexed_bgi_oly_gbs.ipynb