INTRO
The data we received on 20250804 arrived as two pools and had not been demultiplexed. Steven assigned the task of demultiplexing to me (GitHub Issue).
MATERIALS & METHODS
I installed the PacBio software lima (GitHub repo) on Raven in a mamba/conda environment:
mamba create -n pacbio_lima_env lima -c bioconda -y
Using this environment, I demultiplexed the data into BAM files.
A second environment, using the PacBio software pbtk, was created for BAM to FastQ conversion:
mamba create -n pacbio_pbtk_env pbtk -c bioconda -y
I used the bam2fastq software in the pbtk package to perform the conversions.
Although labelled as BAMs, these are unaligned reads, not alignment files. I believe the BAM format allows for better compression and for the use of SAM tags (PacBio methylation tag explanation) to annotate the data (e.g. marking methylation).
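The per-BAM conversion described above could be scripted as a loop over the demultiplexed BAMs. This is a minimal sketch, not the exact commands that were run: it assumes one subdirectory per barcode pair under the bams directory, and the function name convert_demux_bams and the DRY_RUN variable are hypothetical helpers, not part of pbtk.

```shell
#!/usr/bin/env bash
# Sketch: convert every demultiplexed BAM under a bams/ directory to FastQ.
# convert_demux_bams and DRY_RUN are illustrative names, not part of pbtk.

convert_demux_bams() {
  local bams_dir="$1"    # e.g. .../demultiplexed/bams
  local fastqs_dir="$2"  # e.g. .../demultiplexed/fastqs
  local bam prefix

  mkdir -p "${fastqs_dir}"

  # Assumed layout: one subdirectory per barcode pair, each holding one BAM.
  for bam in "${bams_dir}"/*/*.bam; do
    [ -e "${bam}" ] || continue              # skip if the glob matched nothing
    # Output prefix = FastQ directory + BAM basename without the .bam suffix.
    prefix="${fastqs_dir}/$(basename "${bam}" .bam)"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "bam2fastq -o ${prefix} ${bam}"   # print the command only
    else
      bam2fastq -o "${prefix}" "${bam}"      # writes ${prefix}.fastq.gz
    fi
  done
}
```

Running it with DRY_RUN=1 first (for example, DRY_RUN=1 convert_demux_bams path/to/bams path/to/fastqs) prints the commands without executing them, which makes it easy to check the input and output paths before converting anything.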
POOL1
BAMs
Created BAM outputs in /home/shared/16TB_HDD_01/sam/data/S_namaychush/Pool1/hifi_reads/demultiplexed/bams:
lima ../../m84082_250614_081210_s3.hifi_reads.bam \
../../barcodes.fasta \
m84082_250614_081210_s3.hifi_reads.demux.bam \
--hifi-preset SYMMETRIC \
--split-named \
--split-subdirs
The command splits the demultiplexed BAMs into their own subdirectories.
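lima also writes a .lima.counts file alongside the demultiplexed BAMs, which gives a quick view of how reads were distributed across barcodes. The sketch below assumes that file is tab-separated with a header row containing the columns IdxFirstNamed, IdxCombinedNamed and Counts; that layout should be checked against the installed lima version. The function name summarize_lima_counts is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: per-barcode read counts and percentages from a lima.counts file.
# Assumes a tab-separated file with a header row containing the columns
# IdxFirstNamed, IdxCombinedNamed and Counts (verify for your lima version).

summarize_lima_counts() {
  local counts_file="$1"
  awk -F'\t' '
    NR == 1 {
      # Locate the needed columns by header name instead of position.
      for (i = 1; i <= NF; i++) col[$i] = i
      if (!("IdxFirstNamed" in col) || !("IdxCombinedNamed" in col) || !("Counts" in col)) {
        print "unexpected header in " FILENAME > "/dev/stderr"
        bad = 1
        exit 1
      }
      next
    }
    {
      n++
      pair[n]  = $(col["IdxFirstNamed"]) "--" $(col["IdxCombinedNamed"])
      count[n] = $(col["Counts"])
      total   += count[n]
    }
    END {
      if (bad || total == 0) exit 1
      # One line per barcode pair: name, read count, share of all counted reads.
      for (i = 1; i <= n; i++)
        printf "%s\t%d\t%.1f%%\n", pair[i], count[i], 100 * count[i] / total
    }
  ' "${counts_file}"
}
```

Called on a pool's .lima.counts file, this prints one line per barcode pair, which makes a strongly unbalanced pool easy to spot.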
FastQs
Created FastQ outputs in /home/shared/16TB_HDD_01/sam/data/S_namaychush/Pool1/hifi_reads/demultiplexed/fastqs:
bam2fastq \
-o /home/shared/16TB_HDD_01/sam/data/S_namaychush/Pool1/hifi_reads/demultiplexed/fastqs/{basename of input BAM} \
/home/shared/16TB_HDD_01/sam/data/S_namaychush/Pool1/hifi_reads/demultiplexed/bams/{BAM_directory}/{individual_BAM}

cat ../../barcodes.fasta
>bc2041
TATGATCACTGAGTAT
>bc2071
CGAGTCTAGCGAGTAT
>bc2072
TATCAGTAGTGAGTAT
POOL2
BAMs
Created BAM outputs in /home/shared/16TB_HDD_01/sam/data/S_namaychush/Pool2/hifi_reads/demultiplexed/bams:
lima \
../../m84082_250614_101514_s4.hifi_reads.bam \
../../barcodes.fasta \
m84082_250614_101514_s4.hifi_reads.demux.bam \
--hifi-preset SYMMETRIC \
--split-named \
--split-subdirs
The command splits the demultiplexed BAMs into their own subdirectories.
FastQs
Created FastQ outputs in /home/shared/16TB_HDD_01/sam/data/S_namaychush/Pool2/hifi_reads/demultiplexed/fastqs:
bam2fastq \
-o /home/shared/16TB_HDD_01/sam/data/S_namaychush/Pool2/hifi_reads/demultiplexed/fastqs/{basename of input BAM} \
/home/shared/16TB_HDD_01/sam/data/S_namaychush/Pool2/hifi_reads/demultiplexed/bams/{BAM_directory}/{individual_BAM}

cat ../../barcodes.fasta
>bc2069
TCTATGACATGAGTAT
>bc2070
TACTGCTCACGAGTAT
>bc2073
ATCACTAGTCGAGTAT
RESULTS
Data was rsync’d to Owl:
https://owl.fish.washington.edu/nightingales/S_namaycush/LakeTrout/Pool1/hifi_reads/demultiplexed/
https://owl.fish.washington.edu/nightingales/S_namaycush/LakeTrout/Pool2/hifi_reads/demultiplexed/
Pool1 Directory Layout
This is truncated to focus on the demultiplexing.
├── [4.0K] hifi_reads
│ ├── [ 76] barcodes.fasta
│ ├── [4.0K] demultiplexed
│ │ ├── [4.0K] bams
│ │ │ ├── [4.0K] bc2041--bc2041
│ │ │ │ ├── [9.0G] m84082_250614_081210_s3.hifi_reads.demux.bc2041--bc2041.bam
│ │ │ │ ├── [ 35M] m84082_250614_081210_s3.hifi_reads.demux.bc2041--bc2041.bam.pbi
│ │ │ │ └── [2.2K] m84082_250614_081210_s3.hifi_reads.demux.bc2041--bc2041.consensusreadset.xml
│ │ │ ├── [4.0K] bc2071--bc2071
│ │ │ │ ├── [7.6G] m84082_250614_081210_s3.hifi_reads.demux.bc2071--bc2071.bam
│ │ │ │ ├── [ 36M] m84082_250614_081210_s3.hifi_reads.demux.bc2071--bc2071.bam.pbi
│ │ │ │ └── [2.2K] m84082_250614_081210_s3.hifi_reads.demux.bc2071--bc2071.consensusreadset.xml
│ │ │ ├── [4.0K] bc2072--bc2072
│ │ │ │ ├── [5.5G] m84082_250614_081210_s3.hifi_reads.demux.bc2072--bc2072.bam
│ │ │ │ ├── [ 25M] m84082_250614_081210_s3.hifi_reads.demux.bc2072--bc2072.bam.pbi
│ │ │ │ └── [2.2K] m84082_250614_081210_s3.hifi_reads.demux.bc2072--bc2072.consensusreadset.xml
│ │ │ ├── [5.9K] m84082_250614_081210_s3.hifi_reads.demux.consensusreadset.xml
│ │ │ ├── [1.6K] m84082_250614_081210_s3.hifi_reads.demux.json
│ │ │ ├── [ 156] m84082_250614_081210_s3.hifi_reads.demux.lima.counts
│ │ │ ├── [1.8G] m84082_250614_081210_s3.hifi_reads.demux.lima.report
│ │ │ └── [ 845] m84082_250614_081210_s3.hifi_reads.demux.lima.summary
│ │ └── [4.0K] fastqs
│ │   ├── [8.1G] m84082_250614_081210_s3.hifi_reads.demux.bc2041--bc2041.fastq.gz
│ │   ├── [6.8G] m84082_250614_081210_s3.hifi_reads.demux.bc2071--bc2071.fastq.gz
│ │   └── [4.9G] m84082_250614_081210_s3.hifi_reads.demux.bc2072--bc2072.fastq.gz
│ ├── [ 21G] m84082_250614_081210_s3.hifi_reads.bam
│ └── [ 93M] m84082_250614_081210_s3.hifi_reads.bam.pbi
Pool2 Directory Layout
├── [4.0K] hifi_reads
│ ├── [ 75] barcodes.fasta
│ ├── [4.0K] demultiplexed
│ │ ├── [4.0K] bams
│ │ │ ├── [4.0K] bc2069--bc2069
│ │ │ │ ├── [3.5G] m84082_250614_101514_s4.hifi_reads.demux.bc2069--bc2069.bam
│ │ │ │ ├── [ 14M] m84082_250614_101514_s4.hifi_reads.demux.bc2069--bc2069.bam.pbi
│ │ │ │ └── [2.2K] m84082_250614_101514_s4.hifi_reads.demux.bc2069--bc2069.consensusreadset.xml
│ │ │ ├── [4.0K] bc2070--bc2070
│ │ │ │ ├── [6.9G] m84082_250614_101514_s4.hifi_reads.demux.bc2070--bc2070.bam
│ │ │ │ ├── [ 32M] m84082_250614_101514_s4.hifi_reads.demux.bc2070--bc2070.bam.pbi
│ │ │ │ └── [2.2K] m84082_250614_101514_s4.hifi_reads.demux.bc2070--bc2070.consensusreadset.xml
│ │ │ ├── [4.0K] bc2073--bc2073
│ │ │ │ ├── [8.8G] m84082_250614_101514_s4.hifi_reads.demux.bc2073--bc2073.bam
│ │ │ │ ├── [ 39M] m84082_250614_101514_s4.hifi_reads.demux.bc2073--bc2073.bam.pbi
│ │ │ │ └── [2.2K] m84082_250614_101514_s4.hifi_reads.demux.bc2073--bc2073.consensusreadset.xml
│ │ │ ├── [5.9K] m84082_250614_101514_s4.hifi_reads.demux.consensusreadset.xml
│ │ │ ├── [1.6K] m84082_250614_101514_s4.hifi_reads.demux.json
│ │ │ ├── [ 156] m84082_250614_101514_s4.hifi_reads.demux.lima.counts
│ │ │ ├── [1.6G] m84082_250614_101514_s4.hifi_reads.demux.lima.report
│ │ │ └── [ 843] m84082_250614_101514_s4.hifi_reads.demux.lima.summary
│ │ └── [4.0K] fastqs
│ │   ├── [3.1G] m84082_250614_101514_s4.hifi_reads.demux.bc2069--bc2069.fastq.gz
│ │   ├── [6.2G] m84082_250614_101514_s4.hifi_reads.demux.bc2070--bc2070.fastq.gz
│ │   └── [7.9G] m84082_250614_101514_s4.hifi_reads.demux.bc2073--bc2073.fastq.gz
│ ├── [ 18G] m84082_250614_101514_s4.hifi_reads.bam
│ └── [ 83M] m84082_250614_101514_s4.hifi_reads.bam.pbi
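One way to sanity-check the FastQ files listed above would be to count the records in each file and compare the totals with the per-barcode counts reported by lima. The sketch below is a suggestion, not something that was run for this post. It assumes standard four-line FastQ records, and the function name count_fastq_reads is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: count records in gzipped FastQ files.
# Assumes standard FastQ with exactly four lines per record.

count_fastq_reads() {
  local fq lines
  for fq in "$@"; do
    # Decompress to stdout and count lines; four lines make one record.
    lines="$(gzip -cd "${fq}" | wc -l)"
    printf '%s\t%d\n' "$(basename "${fq}")" "$((lines / 4))"
  done
}
```

For example, count_fastq_reads fastqs/*.fastq.gz would print one file name and record count per line. Decompressing multi-gigabyte files takes a while, so this is best run once per pool.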