We recently received reviews back for the Tanner crab paper submission (“Characterization of the gene repertoire and environmentally driven expression patterns in Tanner crab (Chionoecetes bairdi)”), and one of the reviewers requested a more in-depth analysis. As part of addressing this, we’ve decided to identify SNPs within the Chionoecetes bairdi (Tanner crab) transcriptome used in the paper (cbai_transcriptome_v3.1). Since the process involves aligning sequencing reads to the transcriptome, the first thing that needed to be done was to generate index files for the aligner (HISAT2, in this particular case), so I ran HISAT2 on Mox.
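For reference, here is a minimal sketch of how the index-building call could be assembled, assuming the standard `hisat2-build` usage (`hisat2-build -p <threads> <reference FastA> <index basename>`). The FastA file name, the index basename, and the thread count are placeholders for illustration, not the actual values used on Mox. The helper only builds and prints the command; it does not run HISAT2.

```python
import shlex


def hisat2_build_command(fasta, index_base, threads=1):
    """Return the argument list for a basic hisat2-build call.

    fasta:      reference FastA to index (hypothetical file name in the example)
    index_base: basename for the resulting index files
    threads:    number of threads passed to -p
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return ["hisat2-build", "-p", str(threads), fasta, index_base]


if __name__ == "__main__":
    # File name and thread count are assumptions for illustration only.
    cmd = hisat2_build_command(
        "cbai_transcriptome_v3.1.fasta",
        "cbai_transcriptome_v3.1",
        threads=4,
    )
    # Print a copy-pasteable shell command instead of executing it.
    print(shlex.join(cmd))
```

The printed command can be pasted into a job script, or the list can be handed to `subprocess.run(cmd, check=True)` on a machine where HISAT2 is installed.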
Replaced the battery pack in the top APC SUA2200RM2U UPS in our computing “cluster” cabinet.
We’ve been having an issue with our computer Raven where it would become inaccessible some time after a reboot. Attempts to remote in would just indicate no route to host, or something like that. We realized it seemed like this was caused by a power saving setting, but changing the sleep setting in the Ubuntu GUI menu didn’t fix the issue. It also seemed like the sleep/hibernate issue was only a problem after the computer had been rebooted and no one had logged in yet…
Previously performed quality trimming on the Crassostrea virginica (Eastern oyster) gonad/sperm RNAseq data on 20210714. Next, I needed to identify exons and splice sites, as well as generate a genome index using HISAT2 to be used with StringTie downstream to identify potential alternative transcripts. This utilized the following NCBI genome files:
As part of identifying alternative transcripts in the Crassostrea virginica (Eastern oyster) gonad RNAseq data we have, I previously used HISAT2 to index the NCBI Crassostrea virginica (Eastern oyster) genome and identify exon/splice sites on 20210720. Then, I used this genome index to run StringTie on Mox in order to map sequencing reads to the genome/alternative isoforms.
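The two steps described above (a splice-aware HISAT2 index, then StringTie) can be sketched roughly as follows. This assumes the helper scripts that ship with HISAT2 (`hisat2_extract_splice_sites.py`, `hisat2_extract_exons.py`), the `--ss`/`--exon` options of `hisat2-build`, and StringTie's `-G`/`-o`/`-p` options. All file names are hypothetical placeholders, and the read alignment and BAM sorting that happen between the two steps are left out. Nothing is executed; the script only prints the commands.

```python
import shlex


def annotated_index_commands(gtf, fasta, index_base, threads=1):
    """Return (argv, stdout_path) pairs for building a splice-aware index.

    The two extract scripts write to stdout, so each is paired with the
    file its output should be redirected to.
    """
    splice_sites = index_base + ".ss"
    exons = index_base + ".exon"
    return [
        (["hisat2_extract_splice_sites.py", gtf], splice_sites),
        (["hisat2_extract_exons.py", gtf], exons),
        (["hisat2-build", "-p", str(threads),
          "--ss", splice_sites, "--exon", exons,
          fasta, index_base], None),
    ]


def stringtie_command(sorted_bam, gtf, out_gtf, threads=1):
    """Return the argv for assembling transcripts from a sorted BAM."""
    return ["stringtie", sorted_bam, "-p", str(threads), "-G", gtf, "-o", out_gtf]


def render(argv, stdout_path=None):
    """Format an argv (and optional stdout redirect) as one shell line."""
    line = shlex.join(argv)
    if stdout_path:
        line += " > " + shlex.quote(stdout_path)
    return line


if __name__ == "__main__":
    # All file names below are placeholders, not the actual NCBI files.
    for argv, out in annotated_index_commands("genome.gtf", "genome.fna", "cvir_index", 8):
        print(render(argv, out))
    print(render(stringtie_command("sample.sorted.bam", "genome.gtf", "sample.gtf", 8)))
```

Keeping each command as a list until the final `render` step avoids quoting problems if a path contains spaces.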
Per this GitHub Issue, I’ve compiled a summary table, with links, of all of our Panopea generosa (Pacific geoduck) RNAseq data as it exists in NCBI. This will be a “dynamic” notebook entry, whereby I will update this post continually as we acquire new data and/or change the information we’d like to have here.
As part of our Panopea generosa (Pacific geoduck) genome sequencing efforts, Steven came across a tool designed to help identify if there are any contaminating sequences in your assembly. The software, BlobToolKit, is actually a complex pipeline of separate tools ([minimap2](https://github.com/lh3/minimap2), DIAMOND BLAST, and BUSCO) which aligns sequencing reads and assigns taxonomy to the reads, as well as marking regions of the assembly with various taxonomic assignments.