We’re in the process of organizing files for a manuscript on the geoduck genome assembly/annotation we’ve done. As part of that, we need to upload the StringTie BAM file that was used with GenSAS for the Pgenerosa_v074 annotation to the Open Science Framework (OSF) repository for this project. Unfortunately, at 73GB, the file far exceeds OSF’s individual file size limit (5GB). So, I split it into 5GB chunks. See the following notebook for deets:
Jupyter Notebook (GitHub):
TL;DR:
Use the Bash split command to split the file into the desired chunk sizes.
Reassemble the chunks into a full-size BAM using the Bash cat command.
Run md5sum on the original BAM and the reassembled BAM to confirm the two files are the same.
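For quick reference, here's a minimal sketch of those three steps. It is not the exact code from the notebook: the file names are hypothetical, and it creates a small random stand-in file in a temporary directory so it can be run anywhere. The chunk size is given in plain bytes (1 MiB here); for the real 73GB BAM you'd swap in whatever byte count keeps each chunk under the 5GB limit.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "${workdir}"

# Hypothetical names; the stand-in file is random bytes, not a real BAM.
bam="example.bam"
reassembled="reassembled.bam"
chunk_bytes=1048576   # 1 MiB for this demo; use a larger value for real data

# Create a ~3.5 MB stand-in for the original file.
head -c 3500000 /dev/urandom > "${bam}"

# 1. Split into fixed-size chunks (default suffixes: aa, ab, ac, ...).
split -b "${chunk_bytes}" "${bam}" "${bam}.part_"

# 2. Reassemble. The shell expands the glob in sorted order,
#    which matches the order split wrote the chunks.
cat "${bam}.part_"* > "${reassembled}"

# 3. Compare checksums of the original and reassembled files.
orig_md5=$(md5sum "${bam}" | awk '{print $1}')
new_md5=$(md5sum "${reassembled}" | awk '{print $1}')

if [[ "${orig_md5}" == "${new_md5}" ]]; then
  echo "MD5 match"
else
  echo "MD5 MISMATCH" >&2
  exit 1
fi
```

The key detail is that the chunk suffixes sort alphabetically, so a simple glob passed to cat puts the pieces back together in the right order.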
RESULTS
Output folder:
Will upload split files to OSF repository.