- Miscellaneous 879
- Geoduck Genome Sequencing 96
- Olympia Oyster Genome Sequencing 70
- Olympia oyster reciprocal transplant 67
- PROPS 36
- Samples Submitted 25
- Computer Servicing 25
- Crassostrea gigas larvae OA (2011) bisulfite sequencing 24
- LSU C.virginica Oil Spill MBD BS Sequencing 22
- 2bRAD Library Tests for Sequencing at Genewiz 22
- Genotype-by-sequencing at BGI 22
- Goals 20
- Samples Received 17
- Protein expression profiles during sexual maturation in Geoduck 14
- Lineage-specific DNA methylation patterns in developing oysters 11
- BS-seq Libraries for Sequencing at Genewiz 11
- Sea star RNA-seq 10
- Reagent Prep 8
- MBD Enrichment for Sequencing at ZymoResearch 8
- SRA Submissions 6
- SRA Submission 4
- Myostatin Interacting Proteins 3
- Tanner Crab RNAseq 3
- Data Received 3
- Olypia Oyster Genome Sequencing 1
- Miscellneous 1
- Tutorials 1
- Sample Submission 1
- Genome Assembly 1
- Data received 1
Miscellaneous
Samples Submitted - M.magister MBD-BSseq Libraries to Univ. of Oregon GC3F
Submitted the M.magister MBD-BSseq libraries created 20201124 using the 4nM aliquots created for the MiSeq test run on 20201202 to the Univ. of Oregon GC3F sequencing core.
Transcriptome Comparisons - C.bairdi Transcriptomes Evaluations with DETONATE rsem-eval on Mox
UPDATE: I’ll lead in with the fact that this failed with an error message that I can’t figure out. This will save the reader some time. I’ve posted the problem as an Issue on the DETONATE GitHub repo; however, it’s clear that this software is no longer maintained, as the repo hasn’t been updated in >3yrs and Issues that old have gone unanswered.
Alignments - C.bairdi RNAseq Transcriptome Alignments Using Bowtie2 on Mox
I had previously attempted to compare all of our C.bairdi transcriptome assemblies using DETONATE on 20200601, but, due to hitting time limits on Mox, failed to get the analysis to complete. I realized that the limiting factor was performing FastQ alignments, so I decided to run this step independently to see if I could at least get that step resolved. DETONATE (rsem-eval) will accept BAM files as input, so I’m hoping I can power through this alignment step and then provide DETONATE (rsem-eval) with the BAM files.
FastQC-MultiQC - M.magister MBD-BSseq Pool Test MiSeq Run on Mox
Earlier today we received the M.magister (C.magister; Dungeness crab) MiSeq data from Mac.
Data Received - M.magister MBD-BSseq Pool Test MiSeq Run
After creating M.magister (C.magister; Dungeness crab) MBD-BSseq libraries (on 20201124), I gave the pooled set of samples to Mac for a test sequencing run on the MiSeq on 20201202.
Alignment - C.gigas RNAseq to GCF_000297895.1_oyster_v9 Genome Using STAR on Mox
Mac was getting some weird results when mapping some single cell RNAseq data to the C.gigas mitochondrial (mt) genome that she had, so she asked for some help mapping other C.gigas RNAseq data (GitHub Issue) to the C.gigas mt genome to see if someone else would get similar results.
Trimming - Haws Lab C.gigas Ploidy pH WGBS 10bp 5 and 3 Prime Ends Using fastp and MultiQC on Mox
Making the assumption that the 24 C.gigas ploidy pH WGBS data we received 20201205 will be analyzed using Bismark, I decided to go ahead and trim the files according to Bismark guidelines for libraries made with the ZymoResearch Pico MethylSeq Kit.
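As a rough illustration of what trimming 10bp from both the 5' and 3' ends does to a single read, here is a minimal Python sketch. It is not fastp and not the command used in the notebook; the function name `trim_ends` and the example read are hypothetical.

```python
def trim_ends(seq, qual, n5=10, n3=10):
    """Remove n5 bases from the 5' end and n3 bases from the 3' end.

    The sequence and quality strings are trimmed identically so they stay
    the same length. Reads too short to survive trimming become empty.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if len(seq) <= n5 + n3:
        return "", ""
    end = len(seq) - n3
    return seq[n5:end], qual[n5:end]


# Hypothetical 28bp read: 10bp to drop, 8bp to keep, 10bp to drop.
example_seq = "A" * 10 + "CGTACGTA" + "T" * 10
example_qual = "I" * len(example_seq)
trimmed_seq, trimmed_qual = trim_ends(example_seq, example_qual)
```

A real trimmer also handles adapters, paired reads, and minimum-length filtering, which this sketch deliberately leaves out.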
FastQC-MultiQc - C.gigas Ploidy pH WGBS Raw Sequence Data from Haws Lab on Mox
Yesterday (20201205), we received the whole genome bisulfite sequencing (WGBS) data back from ZymoResearch from the 24 C.gigas diploid/triploid samples subjected to two different pH treatments (received from the Haws’ Lab on 20200820 and submitted to ZymoResearch on 20200824). As part of our standard sequencing data receipt pipeline, I needed to generate FastQC files for each sample.
Trimming - Ronits C.gigas Ploidy WGBS 10bp 5 and 3 Prime Ends Using fastp and MultiQC on Mox
Steven asked me to trim (GitHub Issue) Ronit’s WGBS sequencing data we received on 20201110, according to Bismark guidelines for libraries made with the ZymoResearch Pico MethylSeq Kit.
Library Quantification - M.magister MBD BSseq Libraries with Qubit
After reviewing the Bioanalyzer assays for the MBD BSseq libraries, Mac indicated she’d like to have the libraries quantified using the Qubit.
Trimming - Ronits C.gigas Ploidy WGBS Using fastp and MultiQC on Mox
Steven asked me to trim (GitHub Issue) Ronit’s WGBS sequencing data we received on 20201110, according to Bismark guidelines for libraries made with the ZymoResearch Pico MethylSeq Kit.
Bioanalyzer - M.magister MBD BSseq Libraries
MBD BSseq library construction was completed yesterday (20201124). Next, I needed to evaluate the libraries using the Roberts Lab Bioanalyzer 2100 (Agilent) to assess library sizes, yields, and qualities (i.e. primer dimers).
MBD BSseq Library Prep - M.magister MBD-selected DNA Using Pico Methyl-Seq Kit
After finishing the final set of eight MBD selections on 20201103, I’m finally ready to make the BSseq libraries using the Pico Methyl-Seq Library Prep Kit (ZymoResearch) (PDF). I followed the manufacturer’s protocols with the following notes/changes (organized by each section in the protocol):
RNA Isolation and Quantification - P.generosa Hemocytes from Shelly
Shelly asked me to isolate RNA from some P.generosa hemocytes (GitHub Issue) that she had.
FastQC-MultiQc - C.gigas Ploidy WGBS Raw Sequence Data from Ronits Project on Mox
Transcriptome Assessment - Crustacean Transcripome Completeness Evaluation Using BUSCO on Mox
Grace was recently working on writing up a manuscript which did a basic comparison of our C.bairdi transcriptome (cbai_transcriptome_v3.1) (see the Genomic Resources wiki for more deets) to two other species’ transcriptome assemblies. We wanted BUSCO evaluations as part of this comparison, but the two other species did not have BUSCO scores in their respective publications. As such, I decided to generate them myself, as BUSCO runs very quickly. The job was run on Mox.
Data Wrangling - MultiQC on S.salar RNAseq from fastp and HISAT2 on Mox
In Shelly’s GitHub Issue for this S.salar project, she also requested a MultiQC report for the trimming (completed on 20201029) and the genome alignments (completed on 20201103).
RNAseq Alignments - S.salar HISAT2 BAMs to GCF_000233375.1_ICSASG_v2_genomic.gtf Transcriptome Using StringTie on Mox
This is a continuation of addressing Shelly Trigg’s request (GitHub Issue), regarding some Salmo salar RNAseq data, to trim (completed 20201029), perform genome alignment (completed on 20201103), and perform transcriptome alignment.
RNAseq Alignments - Trimmed S.salar RNAseq to GCF_000233375.1_ICSASG_v2_genomic.fa Using Hisat2 on Mox
This is a continuation of addressing Shelly Trigg’s request (GitHub Issue), regarding some Salmo salar RNAseq data, to trim (completed 20201029), perform genome alignment, and perform transcriptome alignment.
MBD Selection - M.magister Sheared Gill gDNA 16 of 24 Samples Set 3 of 3
Click here for notebook on the first eight samples processed. Click here for the second set of eight samples processed. M.magister (Dungeness crab) gill gDNA provided by Mackenzie Gavery was previously sheared on 20201026 and three samples were subjected to additional rounds of shearing on 20201027, in preparation for methyl binding domain (MBD) selection using the MethylMiner Kit (Invitrogen).
MBD Selection - M.magister Sheared Gill gDNA 8 of 24 Samples Set 2 of 3
Click here for notebook on the first eight samples processed. M.magister (Dungeness crab) gill gDNA provided by Mackenzie Gavery was previously sheared on 20201026 and three samples were subjected to additional rounds of shearing on 20201027, in preparation for methyl binding domain (MBD) selection using the MethylMiner Kit (Invitrogen).
Trimming - Shelly S.salar RNAseq Using fastp and MultiQC on Mox
In this GitHub issue, Shelly asked that I trim some Salmo salar RNAseq data she had, align it to a genome, and perform transcriptome alignment counts, using a subset of the NCBI Salmo salar RefSeq genome, GCF_000233375.1. She created a subset of this genome using only sequences designated as “chromosomes.” A link to the FastA (and a link to her notebook on creating this file) are in that GitHub issue link above. The transcriptome she has provided has not been subsetted in a similar fashion; maybe I’ll do that prior to alignment.
MBD Selection - M.magister Sheared Gill gDNA 8 of 24 Samples Set 1 of 3
DNA Shearing - M.magister gDNA Additional Shearing CH05-01_21 CH07-11 and Bioanalyzer
After shearing all of the M.magister gill gDNA on 20201026, there were still three samples that had average fragment lengths that were a bit longer than desired (~750bp, but want ~250 - 550bp):
DNA Shearing - M.magister gDNA Shearing All Samples and Bioanalyzer
I previously ran some shearing tests on 20201022 to determine how many cycles to run on the sonicator (Bioruptor 300; Diagenode) to achieve an average fragment length of ~350 - 500bp in preparation for MBD-BSseq. The determination was 70 cycles (30s ON, 30s OFF; low intensity), sonicating for 35 cycles, followed by successive rounds of 5 cycles each.
DNA Shearing - M.magister CH05-21 gDNA Full Shearing Test and Bioanalyzer
Yesterday, I did some shearing of Metacarcinus magister gill gDNA on a test sample (CH05-21) to determine how many cycles to run on the sonicator (Bioruptor 300; Diagenode) to achieve an average fragment length of ~350 - 500bp in preparation for MBD-BSseq. The determination from yesterday was 70 cycles (30s ON, 30s OFF; low intensity). That determination was made by first sonicating for 35 cycles, followed by successive rounds of 5 cycles each. I decided to repeat this, except by doing it in a single round of sonication.
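The cycle arithmetic described above can be sketched in a few lines of Python. The numbers (70 total cycles, 35 initial cycles, rounds of 5, 30s ON / 30s OFF) come from the text; the function names are hypothetical, and the run-time estimate ignores any pauses between rounds.

```python
ON_SECONDS = 30   # per cycle, from the text
OFF_SECONDS = 30  # per cycle, from the text


def extra_rounds(total_cycles, initial_cycles, cycles_per_round):
    """Number of follow-up rounds needed after the initial sonication."""
    remaining = total_cycles - initial_cycles
    if remaining < 0 or remaining % cycles_per_round != 0:
        raise ValueError("total is not reachable with whole rounds")
    return remaining // cycles_per_round


def run_time_minutes(cycles):
    """Instrument time for a given number of ON/OFF cycles, in minutes."""
    return cycles * (ON_SECONDS + OFF_SECONDS) / 60


rounds_needed = extra_rounds(70, 35, 5)  # 7 rounds of 5 cycles after the first 35
single_run_minutes = run_time_minutes(70)  # 70.0 minutes for one continuous run
```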
DNA Shearing - M.magister gDNA Shear Testing and Bioanalyzer
Steven assigned me to do some MBD-BSseq library prep (GitHub Issue) for some Dungeness crab (Metacarcinus magister) DNA samples provided by Mackenzie Gavery. The DNA was isolated from juvenile (J6/J7 developmental stages) gill tissue. One of the first steps in MBD-BSseq is to fragment DNA to a desired size (~350 - 500bp in our case). However, we haven’t worked with Metacarcinus magister DNA previously, so I need to empirically determine sonicator (Bioruptor 300; Diagenode) settings for these samples.
Read Mapping - C.bairdi 201002558-2729-Q7 and 6129-403-26-Q7 Taxa-Specific NanoPore Reads to cbai_genome_v1.01.fasta Using Minimap2 on Mox
After extracting FastQ reads using seqtk on 20201013 from the various taxa I had been interested in, the next thing needed doing was mapping reads to the cbai_genome_v1.01 “genome” assembly from 20200917. I found that Minimap2 will map long reads (e.g. NanoPore), in addition to short reads, so I decided to give that a rip.
Data Wrangling - C.bairdi NanoPore Reads Extractions With Seqtk on Mephisto
In my pursuit to identify which contigs/scaffolds of our “C.bairdi” genome assembly from 20200917 correspond to interesting taxa, based on taxonomic assignments produced by MEGAN6 on 20200928, I used MEGAN6 to extract taxa-specific reads from cbai_genome_v1.01 on 20201007 - the output is only available in FastA format. Since I want the original reads in FastQ format, I will use the FastA sequence IDs (from the FastA index file) and provide that to seqtk to extract the FastQ reads for each sample and corresponding taxa.
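The ID-based extraction described above can be sketched in plain Python. This is a stand-in to show the logic, not seqtk itself and not the notebook's actual commands; the function names and the tiny in-memory example data are hypothetical. It assumes the read ID is the first tab-delimited column of the FastA index and the first whitespace-delimited token of each FastQ header.

```python
import io


def read_fai_ids(fai_handle):
    """Collect sequence IDs from a FastA index (first tab-delimited column)."""
    return {line.split("\t", 1)[0].strip() for line in fai_handle if line.strip()}


def filter_fastq(fastq_handle, wanted_ids):
    """Yield 4-line FastQ records whose read ID is in wanted_ids."""
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip stray blank lines between records
        seq = fastq_handle.readline()
        plus = fastq_handle.readline()
        qual = fastq_handle.readline()
        read_id = header[1:].split()[0]  # drop '@', keep ID before any whitespace
        if read_id in wanted_ids:
            yield header + seq + plus + qual


# Hypothetical example data standing in for a .fai file and a FastQ file.
example_fai = io.StringIO("read_1\t4\t8\t4\t5\nread_3\t4\t22\t4\t5\n")
example_fastq = io.StringIO(
    "@read_1 runid=x\nACGT\n+\n!!!!\n"
    "@read_2\nGGCC\n+\n####\n"
    "@read_3\nTTAA\n+\n$$$$\n"
)
kept_records = list(filter_fastq(example_fastq, read_fai_ids(example_fai)))
```

Holding the ID set in memory keeps each lookup constant-time, which matters when the FastQ file is much larger than the list of IDs.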
NanoPore Reads Extractions - C.bairdi Taxonomic Reads Extractions with MEGAN6 on 201002558-2729-Q7 and 6129-403-26-Q7
After completing the taxonomic comparisons of 201002558-2729-Q7 and 6129-403-26-Q7 on 20201002, I decided to extract reads assigned to the following taxa for further exploration (primarily to identify contigs/scaffolds in our cbai_genome_v1.0.fasta (19MB)).
Comparison - C.bairdi 20102558-2729 vs. 6129-403-26 NanoPore Taxonomic Assignments Using MEGAN6
After noticing that the initial MEGAN6 taxonomic assignments for our combined C.bairdi NanoPore data from 20200917 revealed a high number of bases assigned to E.canceri and Aquifex sp., I decided to explore the taxonomic breakdown of just the individual samples to see which of the samples was contributing to these taxonomic assignments most.
Taxonomic Assignments - C.bairdi 6129-403-26-Q7 NanoPore Reads Using DIAMOND BLASTx on Mox and MEGAN6 daa2rma on emu
After noticing that the initial MEGAN6 taxonomic assignments for our combined C.bairdi NanoPore data from 20200917 revealed a high number of bases assigned to E.canceri and Aquifex sp., I decided to explore the taxonomic breakdown of just the individual samples to see which of the samples was contributing to these taxonomic assignments most.
Taxonomic Assignments - C.bairdi 20102558-2729-Q7 NanoPore Reads Using DIAMOND BLASTx on Mox and MEGAN6 daa2rma on emu
After noticing that the initial MEGAN6 taxonomic assignments for our combined C.bairdi NanoPore data from 20200917 revealed a high number of bases assigned to E.canceri and Aquifex sp., I decided to explore the taxonomic breakdown of just the individual samples to see which of the samples was contributing to these taxonomic assignments most.
Data Wrangling - C.bairdi NanoPore 6129-403-26 Quality Filtering Using NanoFilt on Mox
Last week, I ran all of our Q7-filtered C.bairdi NanoPore reads through MEGAN6 to evaluate the taxonomic breakdown (on 20200917) and noticed that there was a large quantity of bases assigned to E.canceri (a known microsporidian agent of infection in crabs) and Aquifex sp. (a genus of thermophilic bacteria), in addition to the expected Arthropoda assignments. Notably, Alveolata assignments were remarkably low.
Data Wrangling - C.bairdi NanoPore 20102558-2729 Quality Filtering Using NanoFilt on Mox
Last week, I ran all of our Q7-filtered C.bairdi NanoPore reads through MEGAN6 to evaluate the taxonomic breakdown (on 20200917) and noticed that there was a large quantity of bases assigned to E.canceri (a known microsporidian agent of infection in crabs) and Aquifex sp. (a genus of thermophilic bacteria), in addition to the expected Arthropoda assignments. Notably, Alveolata assignments were remarkably low.
Assembly Assessment - BUSCO C.bairdi Genome v1.01 on Mox
After creating a subset of the cbai_genome_v1.0 of contigs >100bp yesterday (subset named cbai_genome_v1.01), I wanted to generate BUSCO scores for cbai_genome_v1.01. This is primarily just to keep info consistent on our Genomic Resources wiki, as I don’t expect these scores to differ at all from the cbai_genome_v1.0 BUSCO scores.
Data Wrangling - Subsetting cbai_genome_v1.0 Assembly with faidx
Previously assembled cbai_genome_v1.0.fasta with our NanoPore Q7 reads on 20200917 and noticed that there were numerous sequences that were well shorter than the expected 500bp threshold that the assembler (Flye) was supposed to spit out. I created an Issue on the Flye GitHub page to find out why. The developer responded and determined it was an issue with the assembly polisher and that sequences <500bp could be safely ignored.
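Dropping short sequences from an assembly amounts to a length filter over the FastA records. The sketch below shows that idea in plain Python; it is not the faidx-based approach named in the post title, the function names are hypothetical, and the 500bp cutoff in the example is only an illustrative parameter value taken from the Flye threshold mentioned above.

```python
import io


def parse_fasta(handle):
    """Yield (name, sequence) tuples from a FastA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line.strip())
    if name is not None:
        yield name, "".join(chunks)


def filter_by_length(records, min_len):
    """Keep only records whose sequence is at least min_len bases long."""
    return [(name, seq) for name, seq in records if len(seq) >= min_len]


# Hypothetical two-contig assembly: one 600bp contig and one 40bp contig.
example_fasta = io.StringIO(
    ">contig_1\n" + "A" * 300 + "\n" + "C" * 300 + "\n" + ">contig_2\n" + "G" * 40 + "\n"
)
kept_contigs = filter_by_length(parse_fasta(example_fasta), min_len=500)
```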
Assembly Assessment - BUSCO C.bairdi Genome v1.0 on Mox
After using Flye to perform a de novo assembly of our Q7 filtered NanoPore sequencing data on 20200917, I decided to check the “completeness” of the assembly using BUSCO on Mox.
Data Wrangling - C.bairdi NanoPore Quality Filtering Using NanoFilt on Mox
I previously converted our C.bairdi NanoPore sequencing data from the raw Fast5 format to FastQ format for our three sets of data:
Taxonomic Assignments - C.bairdi NanoPore Reads Using DIAMOND BLASTx on Mox and MEGAN6 daa2rma on swoose
Earlier today I quality filtered (>=Q7) our C.bairdi NanoPore reads. One of the things I’d like to do now is to attempt to filter reads taxonomically, since the NanoPore data came from both an uninfected crab and Hematodinium-infected crab.
qPCR - Geoduck Normalizing Gene Primers 28s-v4 and EF1a-v1 Tests
On Monday (20200914), I checked a set of 28s and EF1a primer sets and determined that 28s-v4 and EF1a-v1 were probably the best of the bunch, although they all looked great. So, I needed to test these out on some individual cDNA samples to see if they might be useful as normalizing genes - should have consistent Cq values across all samples/treatments.
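One simple way to look at whether a candidate normalizing gene has consistent Cq values is to summarize the spread across samples. The sketch below is only an illustration of that idea: the Cq values are made up, the function name is hypothetical, and no acceptance cutoff is implied.

```python
from statistics import mean, stdev


def cq_summary(cq_values):
    """Return mean, sample standard deviation, and CV (%) of Cq values."""
    m = mean(cq_values)
    sd = stdev(cq_values)
    return {"mean": m, "sd": sd, "cv_percent": 100 * sd / m}


# Hypothetical Cq values for one primer set across four cDNA samples.
example_cqs = [18.2, 18.4, 18.1, 18.3]
example_summary = cq_summary(example_cqs)
```

A smaller standard deviation across samples and treatments suggests more stable expression, which is the property wanted in a normalizing gene.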
qPCR - Geoduck Normalizing Gene Primer Checks
Shelly ordered some new primers (designed by Sam Gurr) (GitHub Issue) to potentially use as normalizing genes for her geoduck reproduction gene expression project and asked that I test them out.
Data Wrangling - Visualization of C.bairdi NanoPore Sequencing Using NanoPlot on Mox
I previously converted our C.bairdi NanoPore sequencing data from the raw Fast5 format to FastQ format for our three sets of data:
Data Wrangling - NanoPore Fast5 Conversion to FastQ of C.bairdi 6129_403_26 on Mox with GPU Node
Time to start working with the NanoPore data that I generated back in March (???!!!). In order to proceed, I first need to convert the raw Fast5 files to FastQ. To do so, I’ll use the NanoPore program guppy.
Data Wrangling - NanoPore Fast5 Conversion to FastQ of C.bairdi 20102558-2729 Run-02 on Mox with GPU Node
Continuing to work with the NanoPore data that I generated back in January (???!!!). In order to proceed, I first need to convert the raw Fast5 files to FastQ. To do so, I’ll use the NanoPore program guppy. I converted the first run from this flowcell earlier today.
Data Wrangling - NanoPore Fast5 Conversion to FastQ of C.bairdi 20102558-2729 Run-01 on Mox with GPU Node
Time to start working with the NanoPore data that I generated back in January (???!!!). In order to proceed, I first need to convert the raw Fast5 files to FastQ. To do so, I’ll use the NanoPore program guppy.
DNA Quantification - Re-quant Ronits C.gigas Diploid-Triploid Ctenidia gDNA Submitted to ZymoResearch
I received notice from ZymoResearch yesterday afternoon that the DNA we sent on 20200820 for this project (Quote 3534) had insufficient DNA for sequencing for most of the samples. This was, honestly, shocking. I had even submitted well over the minimum amount of DNA required (submitted 1.75ug - only needed 1ug). So, I’m not entirely sure what happened here.
Transcriptome Annotation - Trinotate C.bairdi v2.1 on Mox
To continue annotation of our C.bairdi v2.1 transcriptome assembly, I wanted to run Trinotate.
Transcriptome Annotation - Trinotate C.bairdi v3.1 on Mox
To continue annotation of our C.bairdi v3.1 transcriptome assembly, I wanted to run Trinotate.
Transcriptome Annotation - Trinotate Hematodinium v3.1 on Mox
Transcriptome Annotation - Trinotate Hematodinium v2.1 on Mox
Transcriptome Annotation - Trinotate Hematodinium v1.7 on Mox
Transcriptome Annotation - Trinotate Hematodinium v1.6 on Mox
TransDecoder - C.bairdi Transcriptomes v2.1 and v3.1 on Mox
To continue annotation of our C.bairdi v2.1 & v3.1 transcriptome assemblies, I needed to run TransDecoder before performing the more thorough annotation with Trinotate.
qPCR - P.generosa RPL5 and TIF3s6b v2 and v3 Normalizing Gene Assessment
After testing out the RPL5 and TIF3s6b v2 and v3 primers yesterday on pooled cDNA, we determined the primers looked good, so will go forward testing them on a set of P.generosa hemolymph cDNA made by Kaitlyn on 20200212. This will evaluate whether or not these can be utilized as normalizing genes for subsequent gene expression analyses.
Sample Submitted - C.gigas Diploid-Triploid pH Treatments Ctenidia to ZymoResearch for WGBS
Submitted 1.5ug of the 24 C.gigas ctenidia gDNA isolated last week (20200821) to ZymoResearch for whole genome bisulfite sequencing (WGBS) to compare differences in diploid/triploids and responses to elevated pH:
qPCR - P.generosa RPL5-v2-v3 and TIF3s6b-v2-v3 Primer Tests
Shelly ordered some new primers as potential normalizing genes and asked me to test them out (GitHub Issue).
DNA Isolation and Quantification - C.gigas High-Low pH Triploid and Diploid Ctenidia
Isolated DNA from 24 of the Crassostrea gigas high/low pH triploid/diploid ctenidia samples that we received yesterday from the Haws Lab. Samples selected by Steven.
Samples Submitted - Ronits C.gigas Diploid and Triploid Ctenidia to ZymoResearch for WGBS
Submitted 1.75ug of gDNA from 10 Crassostrea gigas ctenidia samples from Ronit’s desiccation/temp/ploidy experiment to ZymoResearch for whole genome bisulfite sequencing (BSseq). They will sequence to ~30x coverage, using 150bp PE reads.
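The ~30x coverage and 150bp PE figures only translate into a read count once a genome size is assumed. The sketch below shows the arithmetic; the function name is hypothetical and the 600 Mbp genome size is a placeholder for illustration, not a value taken from this notebook.

```python
def read_pairs_for_coverage(coverage, genome_size_bp, read_length_bp):
    """Read pairs needed so that total sequenced bases = coverage x genome size.

    Each pair contributes two reads of read_length_bp bases. The result is
    rounded up to a whole number of pairs.
    """
    bases_needed = coverage * genome_size_bp
    bases_per_pair = 2 * read_length_bp
    return -(-bases_needed // bases_per_pair)  # integer ceiling division


ASSUMED_GENOME_SIZE_BP = 600_000_000  # placeholder value, an assumption
pairs_needed = read_pairs_for_coverage(30, ASSUMED_GENOME_SIZE_BP, 150)
```

With those placeholder inputs the estimate is 60 million read pairs per sample; a different genome size scales the result proportionally.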
Assembly Stats - C.bairdi Transcriptomes v2.1 and v3.1 Trinity Stats on Mox
Realized that transcriptomes v2.1 and v3.1 (extracted from BLASTx-annotated FastAs from 20200605) didn’t have any associated stats.
Trimming-FastQC-MultiQC - Robertos C.gigas WGBS FastQ Data with fastp FastQC and MultiQC on Mox
Steven asked me to trim Roberto’s C.gigas whole genome bisulfite sequencing (WGBS) reads (GitHub Issue) “following his methods”. The only thing specified is trimming Illumina adaptors and then trimming 10bp from the 5’ end of reads. No mention of which software was used.
TransDecoder - Hematodinium Transcriptomes v1.6, v1.7, v2.1 and v3.1 on Mox
To continue annotation of our Hematodinium v1.6, v1.7, v2.1 & v3.1 transcriptome assemblies, I needed to run TransDecoder before performing the more thorough annotation with Trinotate.
Assembly Stats - Hematodinium Transcriptomes v2.1 and v3.1 Trinity Stats on Mox
Working on dealing with our various Hematodinium sp. transcriptomes and realized that transcriptomes v2.1 and v3.1 (extracted from BLASTx-annotated FastAs from 20200605) didn’t have any associated stats.
Transcriptome Annotation - Hematodinium Transcriptomes v1.6 v1.7 v2.1 v3.1 with DIAMOND BLASTx on Mox
Needed to annotate the Hematodinium sp. transcriptomes that I’ve assembled using DIAMOND BLASTx. This will also be used for additional downstream annotation (TransDecoder, Trinotate):
Transcriptome Assessment - BUSCO Metazoa on Hematodinium v1.6 v1.7 v2.1 and v3.1 on Mox
Transcriptome Assembly - Hematodinium Transcriptomes v1.6 and v1.7 with Trinity on Mox
I’d previously assembled hemat_transcriptome_v1.0.fasta on 20200122, hemat_transcriptome_v1.5.fasta on 20200408, extracted hemat_transcriptome_v2.1.fasta from an existing FastA on 20200605, as well as extracted hemat_transcriptome_v3.1.fasta on 20200605.
qPCR - P.generosa APLP and TIF3s8-1 with cDNA
Shelly asked me to run some qPCRs (GitHub Issue), after some of the qPCR results I got from primer tests with normalizing genes and potential gene targets.
FastQ Read Alignment and Quantification - P.generosa Water Metagenomic Libraries to MetaGeneMark Assembly with Hisat2 on Mox
Continuing work on the manuscript for this data, Emma wanted the number of reads aligned to each gene. I previously created an assembly with genes/proteins using MetaGeneMark on 20190103, but the assembly process didn’t output any sort of statistics on read counts.
Primer Design and In-Silico Testing - Geoduck Reproduction Primers
Shelly asked that I re-run the primer design pipeline that Kaitlyn had previously run to design a set of reproduction-related qPCR primers. Unfortunately, Kaitlyn’s Jupyter Notebook wasn’t backed up and she accidentally deleted it, I believe, so there’s no real record of how she designed the primers. However, I do know that she was unable to run the EMBOSS primersearch tool, which will check your primers against a set of sequences for any other matches. This is useful for confirming specificity.
qPCR - Testing P.generosa Reproduction-related Primers
I ran some qPCRs on some other primers on 20200723, and Shelly has since asked me to test some additional qPCR primers that might have acceptable melt curves and be usable as normalizing genes.
SRA Submission - P.generosa Metagenomics Data
Added our P.generosa metagenomics sequencing data to NCBI sequencing read archive (SRA).
qPCR - Testing P.generosa Reproduction-related Primers
Shelly has asked me to test some qPCR primers related to geoduck reproduction.
DNA Isolation and Quantification - C.gigas Diploid (Ronit) and Triploid (Nisbet)
Isolated some gDNA from the triploid Nisbet oysters we received on 20200218 and one of Ronit’s diploid ctenidia samples (Google Sheet) using the E.Z.N.A. Mollusc DNA Kit (Omega). See the “Results” section for sample info.
Metagenomics - Data Extractions Using MEGAN6
Decided to finally take the time to methodically extract data from our metagenomics project so that I have the tables handy when I need them and I can easily share them with other people. Previously, I hadn’t done this due to limitations on looking at the data remotely. I finally downloaded all of the RMA6 files from 20191014 after being fed up with the remote desktop connection and upgrading the size of my hard drive (5 of the six RMA6 files are >40GB in size).
Transcriptome Annotation - C.bairdi Transcriptomes v2.1 and v3.1 Using DIAMOND BLASTx on Mox
Decided to annotate the two C.bairdi transcriptomes, cbai_transcriptome_v2.1 and cbai_transcriptome_v3.1, generated on 20200605 using DIAMOND BLASTx on Mox.
Transcriptome Assessment - BUSCO Metazoa on C.bairdi Transcriptome v3.1
Continuing to try to identify the best C.bairdi transcriptome, we decided to extract all non-dinoflagellate sequences from cbai_transcriptome_v2.0 (RNAseq shorthand: 2018, 2019, 2020-GW, 2020-UW) and cbai_transcriptome_v3.0 (RNAseq shorthand: 2018, 2019, 2020-UW).
Transcriptome Assessment - BUSCO Metazoa on C.bairdi Transcriptome v2.1
Continuing to try to identify the best C.bairdi transcriptome, we decided to extract all non-dinoflagellate sequences from cbai_transcriptome_v2.0 (RNAseq shorthand: 2018, 2019, 2020-GW, 2020-UW) and cbai_transcriptome_v3.0 (RNAseq shorthand: 2018, 2019, 2020-UW).
Sequence Extractions - C.bairdi Transcriptomes v2.0 and v3.0 Excluding Alveolata with MEGAN6 on Swoose
Continuing to try to identify the best C.bairdi transcriptome, we decided to extract all non-dinoflagellate sequences from cbai_transcriptome_v2.0 (RNAseq shorthand: 2018, 2019, 2020-GW, 2020-UW) and cbai_transcriptome_v3.0 (RNAseq shorthand: 2018, 2019, 2020-UW). Both of these transcriptomes were assembled without any taxonomic filter applied. DIAMOND BLASTx and conversion to MEGAN6 RMA6 files was performed yesterday (20200604).
Transcriptome Annotation - C.bairdi Transcriptomes v2.0 and v3.0 with DIAMOND BLASTx on Mox
Continuing to try to identify the best C.bairdi transcriptome, we decided to extract all non-dinoflagellate sequences from cbai_transcriptome_v2.0 (RNAseq shorthand: 2018, 2019, 2020-GW, 2020-UW) and cbai_transcriptome_v3.0 (RNAseq shorthand: 2018, 2019, 2020-UW). Both of these transcriptomes were assembled without any taxonomic filter applied.
Transcriptome Comparison - C.bairdi Transcriptomes Evaluations with DETONATE on Mox
Transcriptome Comparison - C.bairdi Transcriptomes Compared with DETONATE on Mox
We’ve produced a number of C.bairdi transcriptomes and we’re interested in doing some comparisons to try to determine which one might be “best”. I previously compared the BUSCO scores of each of these transcriptomes and now will be using the DETONATE software package to perform two different types of comparisons: compared to a reference (REF-EVAL) and determine an overall quality “score” (RSEM-EVAL). I’ll be running REF-EVAL in this notebook.
Transcriptome Annotation - Trinotate C.bairdi Transcriptome-v1.7 on Mox
After creating a de novo assembly of C.bairdi transcriptome v1.7 on 20200527, performing BLASTx annotation on 20200527, and TransDecoder for ORF identification on 20200527, I continued the annotation process by running Trinotate.
Transcriptome Comparisons - C.bairdi BUSCO Scores
Since we’ve generated a number of versions of the C.bairdi transcriptome, we’ve decided to compare them using various metrics. Here, I’ve compared the BUSCO scores generated for each transcriptome using BUSCO’s built-in plotting script. The script generates a stacked bar plot of all BUSCO short summary files that it is provided with, as well as the R code used to generate the plot.
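Separately from BUSCO's own plotting script, the scores can also be compared programmatically. The sketch below assumes the one-line summary notation that BUSCO writes in its short summary files (for example `C:95.1%[S:90.0%,D:5.1%],F:2.0%,M:2.9%,n:978`); the function name, assembly names, and percentages are hypothetical and are not results from this notebook.

```python
import re

# Assumed layout of the BUSCO one-line summary notation.
BUSCO_LINE = re.compile(
    r"C:([\d.]+)%\[S:([\d.]+)%,D:([\d.]+)%\],F:([\d.]+)%,M:([\d.]+)%,n:(\d+)"
)


def parse_busco_line(text):
    """Extract Complete/Single/Duplicated/Fragmented/Missing % and n."""
    match = BUSCO_LINE.search(text)
    if match is None:
        raise ValueError("no BUSCO summary notation found")
    c, s, d, f, m, n = match.groups()
    return {"C": float(c), "S": float(s), "D": float(d),
            "F": float(f), "M": float(m), "n": int(n)}


# Hypothetical summary lines for two made-up assemblies.
example_lines = {
    "assembly_a": "C:95.1%[S:90.0%,D:5.1%],F:2.0%,M:2.9%,n:978",
    "assembly_b": "C:97.3%[S:60.2%,D:37.1%],F:1.0%,M:1.7%,n:978",
}
example_scores = {name: parse_busco_line(line) for name, line in example_lines.items()}
most_complete = max(example_scores, key=lambda name: example_scores[name]["C"])
```

Ranking on the Complete percentage alone is a simplification; a high Duplicated fraction, as in the second made-up line, may also matter when choosing an assembly.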
TransDecoder - C.bairdi Transcriptome v1.7 on Mox
Need to run TransDecoder on Mox on the C.bairdi transcriptome v1.7 from 20200527.
Transcriptome Annotation - C.bairdi Transcriptome v1.7 Using DIAMOND BLASTx on Mox
As part of annotating cbai_transcriptome_v1.7.fasta from 20200527, I need to run DIAMOND BLASTx to use with Trinotate.
Transcriptome Assessment - BUSCO Metazoa on C.bairdi Transcriptome v1.7
I previously created a C.bairdi de novo transcriptome assembly v1.7 with Trinity from all our C.bairdi taxonomically filtered pooled RNAseq samples on 20200527 and decided to assess its “completeness” using BUSCO and the metazoa_odb9 database.
Transcriptome Assembly - C.bairdi All Pooled Arthropoda-only RNAseq Data with Trinity on Mox
For completeness’ sake, I wanted to create an additional C.bairdi transcriptome assembly that consisted of Arthropoda-only sequences from just pooled RNAseq data (since I recently generated a similar assembly without taxonomically filtered reads on 20200518). This constitutes samples we have designated: 2018, 2019, 2020-UW. A de novo assembly was run using Trinity on Mox. Since all pooled RNAseq libraries were stranded, I added this option to the Trinity command.
Transcriptome Annotation - Trinotate C.bairdi Transcriptome-v3.0 on Mox
After performing de novo assembly on all of our Tanner crab RNAseq data (no taxonomic filter applied, either) on 20200518, I continued the annotation process by running Trinotate.
Transcriptome Assembly - P.trituberculatus (Japanese blue crab) NCBI SRA BioProject PRJNA597187 Data with Trinity on Mox
After generating a number of C.bairdi (Tanner crab) transcriptomes, we decided we should compare them to help decide which one should become our “canonical” version. As part of that, the Trinity wiki offers a list of tools that one can use to check the quality of transcriptome assemblies. Some of those require a transcriptome of a related species.
SRA Library Assessment - Determine RNAseq Library Strandedness from P.trituberculatus SRA BioProject PRJNA597187
We’ve produced a number of C.bairdi transcriptomes utilizing different assembly approaches (e.g. Arthropoda reads only, stranded libraries only, mixed strandedness libraries, etc) and we want to determine which of them is “best”. Trinity has a nice list of tools to assess the quality of transcriptome assemblies, but most of the tools rely on comparison to a transcriptome of a related species.
Transcriptome Annotation - Trinotate C.bairdi Transcriptome-v1.6 on Mox
After creating a de novo assembly of C.bairdi transcriptome v1.6 on 20200518, performing BLASTx annotation on 20200519, and TransDecoder for ORF identification on 20200519, I continued the annotation process by running Trinotate.
TransDecoder - C.bairdi Transcriptome v1.6 on Mox
Need to run TransDecoder on Mox on the C.bairdi transcriptome v1.6 from 20200518.
TransDecoder - C.bairdi Transcriptome v3.0 from 20200518 on Mox
Need to run TransDecoder on Mox on the C.bairdi transcriptome v3.0 from 20200518.
Transcriptome Annotation - C.bairdi Transcriptome v1.6 Using DIAMOND BLASTx on Mox
As part of annotating cbai_transcriptome_v1.6.fasta from 20200518, I need to run DIAMOND BLASTx to use with Trinotate.
Transcriptome Assessment - BUSCO Metazoa on C.bairdi Transcriptome v1.6
I previously created a C.bairdi de novo transcriptome assembly v1.6 with Trinity from all our C.bairdi taxonomically filtered RNAseq on 20200518 and decided to assess its “completeness” using BUSCO and the metazoa_odb9 database.
Transcriptome Annotation - C.bairdi Transcriptome v3.0 Using DIAMOND BLASTx on Mox
As part of annotating cbai_transcriptome_v3.0.fasta from 20200518, I need to run DIAMOND BLASTx to use with Trinotate.
Transcriptome Assessment - BUSCO Metazoa on C.bairdi Transcriptome v3.0
I previously created a C.bairdi de novo transcriptome assembly with Trinity from all our C.bairdi pooled RNAseq (not taxonomically filtered) on 20200518 and decided to assess its “completeness” using BUSCO and the metazoa_odb9 database.
Transcriptome Assembly - C.bairdi All Arthropoda-specific RNAseq Data with Trinity on Mox
I realized I hadn’t performed taxonomic read separation from one set of RNAseq data we had. And, since I was on a transcriptome assembly kick, I figured I’d generate another C.bairdi transcriptome that included only Arthropoda-specific sequence data from all of our RNAseq.
Data Wrangling - Arthropoda and Alveolata D26 Pool RNAseq FastQ Extractions
After using MEGAN6 to extract Arthropoda and Alveolata reads from our RNAseq data on 20200114, I had then extracted taxonomic-specific reads and aggregated each into basic Read 1 and Read 2 FastQs to simplify transcriptome assembly for C.bairdi and for Hematodinium. That was fine and all, but wasn’t fully thought through.
Transcriptome Assembly - C.bairdi All Pooled RNAseq Data Without Taxonomic Filters with Trinity on Mox
Steven asked that I assemble a transcriptome with just our pooled C.bairdi RNAseq data (not taxonomically filtered; see the FastQ list file linked in the Results section below). This constitutes samples we have designated: 2018, 2019, 2020-UW. A de novo assembly was run using Trinity on Mox. Since all pooled RNAseq libraries were stranded, I added this option to the Trinity command.
Transcriptome Annotation - Trinotate C.bairdi Transcriptome v2.0 from 20200502 on Mox
After performing de novo assembly on all of our Tanner crab RNAseq data (no taxonomic filter applied, either) on 20200502 and performing BLASTx annotation on 20200508, I continued the annotation process by running Trinotate.
TransDecoder - C.bairdi Transcriptome v2.0 from 20200502 on Mox
Need to run TransDecoder on Mox on the C.bairdi transcriptome v2.0 from 20200502.
Transcriptome Annotation - C.bairdi Transcriptome v2.0 Using DIAMOND BLASTx on Mox
As part of annotating the C.bairdi v2.0 transcriptome assembly from 20200502, I need to run DIAMOND BLASTx to use with Trinotate.
Transcriptome Assessment - BUSCO Metazoa on C.bairdi v2.0 Transcriptome
I previously created a C.bairdi de novo transcriptome assembly with Trinity using all existing, unfiltered (i.e. no taxonomic selection) RNAseq data on 20200502 and decided to assess its “completeness” using BUSCO and the metazoa_odb9 database.
Transcriptome Assembly - C.bairdi All RNAseq Data Without Taxonomic Filters with Trinity on Mox
Steven asked that I assemble an unfiltered (i.e. no taxonomic selection) transcriptome with all of our C.bairdi RNAseq data (see the FastQ list file linked in the Results section below). A de novo assembly was run using Trinity on Mox. It should be noted that this assembly is a mixture of stranded/non-stranded library preps.
GO to GOslim - C.bairdi Enriched GO Terms from 20200422 DEGs
After running pairwise comparisons and identifying differentially expressed genes (DEGs) on 20200422 and finding enriched gene ontology terms, I decided to map the GO terms to Biological Process GOslims. Additionally, I decided to try another level of comparison (I’m not sure how valid it is), whereby I will count the number of GO terms assigned to each GOslim and then calculate the percentage of GO terms that get assigned to each of the GOslim categories. The idea is that it might help identify Biological Processes that are “favored” in a given set of DEGs. I decided to set up “fancy” pyramid plots to view a given set of GO-GOslims for each DEG comparison.
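The count-and-percentage step described above could be sketched roughly as follows. This is only an illustration, not the code used in the notebook entry: the function name `goslim_percentages`, the example mapping, and the GOslim labels are all hypothetical, and it assumes each GO term maps to a single GOslim.

```python
from collections import Counter


def goslim_percentages(go_to_goslim):
    """Count GO terms per GOslim and express each count as a percentage.

    go_to_goslim: dict mapping GO term ID -> GOslim category (assumed one-to-one).
    Returns a dict of GOslim -> (count, percent of all mapped GO terms).
    """
    counts = Counter(go_to_goslim.values())
    total = sum(counts.values())
    return {slim: (n, 100.0 * n / total) for slim, n in counts.items()}


# Hypothetical enriched GO terms and GOslim labels, for illustration only
example_mapping = {
    "GO:0006412": "protein metabolism",
    "GO:0006457": "protein metabolism",
    "GO:0006950": "stress response",
    "GO:0007165": "signal transduction",
}

result = goslim_percentages(example_mapping)
for slim, (n, pct) in sorted(result.items()):
    print(f"{slim}\t{n}\t{pct:.1f}%")
```

Running this once per DEG comparison would give the per-GOslim percentages that a pyramid plot could then display side by side.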
FastQC-MultiQC - Laura Spencer’s QuantSeq Data
Laura Spencer received her O.lurida QuantSeq data, so I put it through FastQC/MultiQC and put the pertinent info in the nightingales Google Sheet. I also moved the data to /owl/nightingales/O_lurida, updated the readme file and checksums file. There were 148 individual samples, so I won’t list them all here.
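For readers who want to regenerate a checksums file for a directory of FastQ files, a minimal sketch follows. It assumes MD5 checksums and a `*.fastq.gz` naming pattern, neither of which is stated in the entry, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_lines(directory, pattern="*.fastq.gz"):
    """Build 'digest  filename' lines (md5sum-style) for matching files."""
    files = sorted(Path(directory).glob(pattern))
    return [f"{md5sum(p)}  {p.name}" for p in files]
```

The returned lines can be appended to an existing checksums file; reading in chunks keeps memory use flat even for large FastQ files.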
Gene Expression - C.bairdi Pairwise DEG Comparisons with 2019 RNAseq using Trinity-Salmon-EdgeR on Mox
Per a Slack request, Steven asked me to take the Genewiz RNAseq data (received 20200318) through edgeR. Ran the analysis using the Trinity differential expression pipeline:
RNAseq Reads Extractions - C.bairdi Taxonomic Reads Extractions with MEGAN6 on swoose
Transcript Abundance - C.bairdi Alignment-free with Salmon on Mox for Grace
Per this GitHub Issue, Grace and Steven asked if I could help by generating a transcript abundance file for Grace to use with EdgeR. To do so, I used Salmon for alignment-free transcript abundance estimates due to its speed and its incorporation into Trinity with the following files:
SRA Submission - C.bairdi RNAseq Data
Since we received the last of our RNAseq data for this project on 20200413, I submitted all of it to the NCBI Sequencing Read Archive (SRA). Data was released today and all accession numbers can be found in the table below:
Trimming/FastQC/MultiQC - C.bairdi RNAseq FastQ with fastp on Mox
After receiving our RNAseq data from Genewiz earlier today, needed to trim and check trimmed reads with FastQC.
Taxonomic Assignments - C.bairdi RNAseq Using DIAMOND BLASTx on Mox and MEGAN6 Meganizer on swoose
After receiving/trimming the latest round of C.bairdi RNAseq data on 20200413, need to get the data ready to perform taxonomic selection of sequencing reads. To do this, I first need to run DIAMOND BLASTx, then “meganize” the output files in preparation for loading into MEGAN6, which will allow for taxonomic-specific read separation.
FastQC-MultiQC - C.bairdi Raw RNAseq from NWGSC
Yesterday, we received the last of the RNAseq data for the C.bairdi crab project from NWGSC. FastQC, followed by MultiQC was run on the raw FastQ reads on my computer (swoose).
Data Wrangling - Arthropoda and Alveolata Day and Treatment Taxonomic RNAseq FastQ Extractions
After using MEGAN6 to extract Arthropoda and Alveolata reads from our RNAseq data on 20200330, I had then extracted taxonomic-specific reads and aggregated each into basic Read 1 and Read 2 FastQs to simplify transcriptome assembly for C.bairdi and for Hematodinium. That was fine and all, but wasn’t fully thought through.
Transcriptome Annotation - Trinotate C.bairdi MEGAN6 Taxonomic-specific Trinity Assembly on Mox
After performing de novo assembly on our Tanner crab MEGAN6 taxonomic-specific RNAseq data on 20200330 and performing BLASTx annotation on 20200408, I continued the annotation process by running Trinotate.
Transcriptome Annotation - C.bairdi MEGAN Trinity Assembly Using DIAMOND BLASTx on Mox
As part of annotating the most recent transcriptome assembly from the MEGAN6 Arthropoda taxonomic-specific reads, I need to run DIAMOND BLASTx to use with Trinotate.
Transcriptome Annotation - Trinotate Hematodinium MEGAN6 Taxonomic-specific Trinity Assembly on Mox
After performing de novo assembly on our Hematodinium MEGAN6 taxonomic-specific RNAseq data on 20200330 and performing BLASTx annotation on 20200331, I continued the annotation process by running Trinotate.
Transdecoder - Hematodinium MEGAN6 Taxonomic-Specific Reads Assembly from 20200330
Ran Trinity to de novo assemble the Alveolata MEGAN6 taxonomic-specific RNAseq data on 20200330 and now will begin annotating the transcriptome using TransDecoder on Mox.
Transdecoder - C.bairdi MEGAN6 Taxonomic-Specific Reads Assembly from 20200330
Ran Trinity to de novo assemble the Arthropoda MEGAN6 taxonomic-specific RNAseq data on 20200330 and now will begin annotating the transcriptome using TransDecoder on Mox.
Transcriptome Assessment - BUSCO Metazoa on C.bairdi MEGAN Transcriptome
I previously created a C.bairdi de novo transcriptome assembly with Trinity from the MEGAN6 taxonomic-specific reads for Arthropoda on 20200330 and decided to assess its “completeness” using BUSCO and the metazoa_odb9 database.
Transcriptome Annotation - Hematodinium MEGAN Trinity Assembly Using DIAMOND BLASTx on Mox
As part of annotating the most recent transcriptome assembly from the MEGAN6 Hematodinium taxonomic-specific reads, I need to run DIAMOND BLASTx to use with Trinotate.
Transcriptome Assessment - BUSCO Metazoa on Hematodinium MEGAN Transcriptome
I previously created a Hematodinium de novo transcriptome assembly with Trinity from the MEGAN6 taxonomic-specific reads for Alveolata on 20200331 and decided to assess its “completeness” using BUSCO and the metazoa_odb9 database.
Transcriptome Assembly - Hematodinium with MEGAN6 Taxonomy-specific Reads with Trinity on Mox
Ran a de novo assembly using the extracted reads classified under Alveolata from:
Transcriptome Assembly - C.bairdi with MEGAN6 Taxonomy-specific Reads with Trinity on Mox
Ran a de novo assembly using the extracted reads classified under Arthropoda from:
RNAseq Reads Extractions - C.bairdi Taxonomic Reads Extractions with MEGAN6 on swoose
Transcriptome Annotation - C.bairdi Using DIAMOND BLASTx on Mox and MEGAN6 Meganizer on swoose
After receiving/trimming the latest round of C.bairdi RNAseq data on 20200318, need to get the data ready to perform taxonomic selection of sequencing reads. To do this, I first need to run DIAMOND BLASTx, then “meganize” the output files in preparation for loading into MEGAN6, which will allow for taxonomic-specific read separation.
Trimming/FastQC/MultiQC - C.bairdi RNAseq FastQ with fastp on Mox
After receiving our RNAseq data from Genewiz earlier today, needed to run FastQC, trim, and check trimmed reads with FastQC.
DNA Isolation and Quantification - C.bairdi Hemocyte Pellets in RNAlater
Isolated DNA from 22 samples (see Qubit spreadsheet in “Results” below for sample IDs) using the Quick DNA/RNA Microprep Kit (ZymoResearch; PDF) according to the manufacturer’s protocol for liquids/cells in RNAlater.
NanoPore Sequencing - C.bairdi gDNA 6129_403_26
After getting high quality gDNA from Hematodinium-infected C.bairdi hemolymph on 20200210, we decided to run some of the sample on the NanoPore MinION, since the flowcells have a very short shelf life. Additionally, the results from this will also help inform us on whether this sample might be worth submitting for PacBio sequencing. And, of course, this provides us with additional sequencing data to complement our previous NanoPore runs from 20200109.
qPCR - C.bairdi RNA Check for Residual gDNA
Previously checked existing crab RNA for residual gDNA on 20200226 and identified samples with yields that were likely too low, as well as samples with residual gDNA. For those samples, it was faster/easier to just isolate more RNA and perform the in-column DNase treatment in the ZymoResearch Quick DNA/RNA Microprep Plus Kit; this keeps samples concentrated. So, I isolated more RNA on 20200306 and now need to check for residual gDNA.
Trimming/MultiQC - Methcompare Bisulfite FastQs with fastp on Mox
Steven asked me to trim a set of FastQ files, provided by Hollie Putnam, in preparation for methylation analysis using Bismark. The analysis is part of a coral project comparing DNA methylation profiles of different species, as well as comparing different sample prep protocols. There’s a dedicated GitHub repo here:
RNA Isolation and Quantification - C.bairdi RNA from Hemolymph Pellets in RNAlater
Based on qPCR results testing for residual gDNA from 20200225, a set of 24 samples was identified that required DNase treatment and/or additional RNA. I opted to just isolate more RNA from all samples, since the kit includes a DNase step and avoids diluting the existing RNA using the Turbo DNA-free Kit that we usually use. Isolated RNA using the Quick DNA/RNA Microprep Kit (ZymoResearch; PDF) according to the manufacturer’s protocol for liquids/cells in RNAlater.
Data Wrangling - Create Canonical Olurida_v081 Genes FastA
I finally had some time to tackle this GitHub Issue and create a canonical genes FastA file using the MAKER IDs, instead of the original contig IDs from our Olympia oyster genome assembly - https://owl.fish.washington.edu/halfshell/genomic-databank/Olurida_v081.fa (FastA; 1.1GB).
qPCR - C.bairdi RNA Check for Residual gDNA
After deciding on a primer set to use for gDNA detection on 20200225, went ahead and ran a qPCR on most of the RNA samples described in Grace’s Google Sheet. Some samples were not run, as they had not yet been located at the time I began the qPCR.
qPCR - C.bairdi Primer Tests on gDNA
We received the primers I ordered on 20200220 and now need to test them to see if they detect gDNA. If yes, then they’re good candidates to assess the presence of residual gDNA in our RNA samples before we proceed with reverse transcription.
Primer Design - C.bairdi Primers for Checking RNA for Residual gDNA
Getting ready to run some qPCRs and first we need to confirm that our RNA is actually DNA-free. Before we can do that, we need some primers to use, so I decided to semi-arbitrarily select three different gene targets from our MEGAN6 taxonomic-specific Trinity assembly from 20200122.
DNA Isolation & Quantification - C.bairdi RNA from Samples 6212_132_9 6212_334_12 6212_485_26
Isolated DNA from three samples (see Qubit spreadsheet in “Results” below for sample IDs) using the Quick DNA/RNA Microprep Kit (ZymoResearch; PDF) according to the manufacturer’s protocol for liquids/cells in RNAlater.
RNA Isolation & Quantification - C.bairdi RNA from Samples 6212_132_9 6212_334_12 6212_485_26
We are supposed to get RNA sent out for sequencing today, but it turns out that a few of the designated samples have insufficient RNA in them. So, I’m going to attempt to isolate enough RNA from the following samples in order to have enough RNA to send to Genewiz today:
RNA Isolation & Quantification - C.bairdi RNA from Sample 6129_403_26
Since I was isolating gDNA from C.bairdi 6129_403_26 hemolymph, I figured I might as well co-isolate RNA since I was using the Quick DNA/RNA Microprep Plus Kit (ZymoResearch).
DNA Isolation & Quantification - Additional C.bairdi gDNA from Sample 6129_403_26
Earlier today I isolated gDNA from C.bairdi 6129_403_26 hemolymph pellets and recovered decently intact gDNA that could be used for sequencing. However, I still need more gDNA, so will isolate that (and co-isolate RNA, since I’m going through the procedure anyway) from the rest of the sample using the Quick DNA/RNA Microprep Plus Kit (ZymoResearch).
DNA Isolation, Quantification, and Gel - C.bairdi gDNA Sample 6129_403_26
In order to do some genome sequencing on C.bairdi and Hematodinium, we need high molecular weight gDNA. I attempted this twice before, using two different methods (Quick DNA/RNA Microprep Kit (ZymoResearch) on 20200122 and the E.Z.N.A Mollusc DNA Kit (Omega) on 20200108) using ~10yr old ethanol-preserved tissue provided by Pam Jensen. Both methods yielded highly degraded gDNA. So, I’m now attempting to get higher quality gDNA from the RNAlater-preserved hemolymph pellets from this experiment.
Transcriptome Assessment - BUSCO Metazoa on Hematodinium MEGAN Transcriptome
I previously created a Hematodinium de novo transcriptome assembly with Trinity from the MEGAN6 taxonomic-specific reads for Alveolata on 20200122 and decided to assess its “completeness” using BUSCO and the metazoa_odb9 database.
Transcriptome Assessment - BUSCO Metazoa on C.bairdi MEGAN Transcriptome
I previously created a C.bairdi de novo transcriptome assembly with Trinity from the MEGAN6 taxonomic-specific reads for Arthropoda on 20200122 and decided to assess its “completeness” using BUSCO and the metazoa_odb9 database.
Gene Expression - Hematodinium MEGAN6 with Trinity and EdgeR
After completing annotation of the Hematodinium MEGAN6 taxonomic-specific Trinity assembly using Trinotate on 20200126, I performed differential gene expression analysis and gene ontology (GO) term enrichment analysis using Trinity’s scripts to run EdgeR and GOseq, respectively. The comparison listed below is the only comparison possible, as there were no reads present in the uninfected Hematodinium extractions.
Gene Expression - C.bairdi MEGAN6 with Trinity and EdgeR
After completing annotation of the C.bairdi MEGAN6 taxonomic-specific Trinity assembly using Trinotate on 20200126, I performed differential gene expression analysis and gene ontology (GO) term enrichment analysis using Trinity’s scripts to run EdgeR and GOseq, respectively, across all of the various treatment comparisons. The comparisons are listed below and link to each individual SBATCH script (GitHub) used to run these on Mox.
Data Wrangling - Arthropoda and Alveolata Day and Treatment Taxonomic RNAseq FastQ Extractions
After using MEGAN6 to extract Arthropoda and Alveolata reads from our RNAseq data on 20200114, I had then extracted taxonomic-specific reads and aggregated each into basic Read 1 and Read 2 FastQs to simplify transcriptome assembly for C.bairdi and for Hematodinium. That was fine and all, but wasn’t fully thought through.
DNA Isolation and Quantification - C.bairdi Hemocyte Pellets in RNAlater
Isolated DNA from 56 samples (see Qubit spreadsheet in “Results” below for sample IDs) using the Quick DNA/RNA Microprep Kit (ZymoResearch; PDF) according to the manufacturer’s protocol for liquids/cells in RNAlater.
Transcriptome Annotation - Trinotate Hematodinium MEGAN6 Taxonomic-specific Trinity Assembly on Mox
After performing de novo assembly on our Hematodinium MEGAN6 taxonomic-specific RNAseq data on 20200122 and performing BLASTx annotation on 20200123, I continued the annotation process by running Trinotate.
Transcriptome Annotation - Trinotate C.bairdi MEGAN6 Taxonomic-specific Trinity Assembly on Mox
After performing de novo assembly on our Tanner crab MEGAN6 taxonomic-specific RNAseq data on 20200122 and performing BLASTx annotation on 20200123, I continued the annotation process by running Trinotate.
RNA Isolation and Quantification - C.bairdi Hemocyte Pellets in RNAlater
Isolated RNA from the following hemolymph pellet samples:
RNA Isolation and Quantification - C.bairdi Hemocyte Pellets in RNAlater
Isolated RNA from the following hemolymph pellet samples:
Transdecoder - Hematodinium MEGAN6 Taxonomic-Specific Reads Assembly from 20200122
Ran Trinity to de novo assemble the Hematodinium MEGAN6 taxonomic-specific RNAseq data on 20200122 and now will begin annotating the transcriptome using TransDecoder on Mox.
Transdecoder - C.bairdi MEGAN6 Taxonomic-Specific Reads Assembly from 20200122
Ran Trinity to de novo assemble the C.bairdi MEGAN6 taxonomic-specific RNAseq data on 20200122 and now will begin annotating the transcriptome using TransDecoder on Mox.
Transcriptome Annotation - Hematodinium MEGAN Trinity Assembly Using DIAMOND BLASTx on Mox
As part of annotating the transcriptome assembly from the MEGAN6 Hematodinium taxonomic-specific reads, I need to run DIAMOND BLASTx to use with Trinotate.
Transcriptome Annotation - C.bairdi MEGAN Trinity Assembly Using DIAMOND BLASTx on Mox
As part of annotating the transcriptome assembly from the MEGAN6 C.bairdi taxonomic-specific reads, I need to run DIAMOND BLASTx to use with Trinotate.
RNA Isolation and Quantification - C.bairdi Hemocyte Pellets in RNAlater Troubleshooting
After the failure to obtain RNA from any C.bairdi hemocytes pellets (out of 24 samples processed) on 20200117, I decided to isolate RNA from just a subset of that group to determine if I screwed something up last time or something. Also, I am testing two different preparations of the kit-supplied DNase I: one Kaitlyn prepped and a fresh preparation that I made. Admittedly, I’m not doing the “proper” testing by trying the different DNase preps on the same exact sample, but it’ll do. I just want to see if I get some RNA from these samples this time…
DNA Quality Assessment - Agarose Gel for C.bairdi 20102558-2729 gDNA from 20200122
Earlier today, I isolated gDNA from C.bairdi 20102558-2729 ethanol-preserved muscle tissue using the Quick DNA/RNA MicroPrep Plus Kit (ZymoResearch) and prepared the tissue in three different ways to see how they would compare:
Data Wrangling - Arthropoda and Alveolata Taxonomic RNAseq FastQ Extractions
After using MEGAN6 to extract Arthropoda and Alveolata reads from our RNAseq data on 20200114 (for reference, these include RNAseq data using a newly established “shorthand”: 2018, 2019), I realized that the FastA headers were incomplete and did not distinguish between paired reads. Here’s an example:
DNA Isolation - C.bairdi 20102558-2729 EtOH-preserved Tissue via Three Variations Using Quick DNA-RNA MicroPrep Kit
Previously, I isolated gDNA from a C.bairdi EtOH-preserved muscle sample (20102558-2729) on 20200108 using the E.Z.N.A. Mollusc DNA Kit (Omega). Although the yields were excellent, the DNA looked completely degraded on a gel and running that DNA on a MinION flowcell yielded relatively short reads (which wasn’t terribly surprising).
Transcriptome Assembly - Hematodinium with MEGAN6 Taxonomy-specific Reads with Trinity on Mox
Ran a de novo assembly using the extracted reads classified under Alveolata from 20200122. The assembly was performed with Trinity on Mox.
Transcriptome Assembly - C.bairdi with MEGAN6 Taxonomy-specific Reads with Trinity on Mox
Ran a de novo assembly using the extracted reads classified under Arthropoda from 20200122 (for reference, these include RNAseq data using a newly established “shorthand”: 2018, 2019). The assembly was performed with Trinity on Mox.
DNA Isolation and Quantification - C.bairdi Hemolymph Pellets in RNAlater
Isolated DNA from the following 23 samples:
RNA Isolation and Quantification - C.bairdi Hemolymph Pellets in RNAlater
TL;DR - Recovered absolutely no RNA from any sample! However, I did recover DNA from each sample.
RNAseq Reads Extractions - C.bairdi Taxonomic Reads Extractions with MEGAN6 on swoose
I previously ran BLASTx and “meganized” the output DAA files on 20200103 (for reference, these include RNAseq data using a newly established “shorthand”: 2018, 2019) and now need to use MEGAN6 to bin the results into the proper taxonomies. This is accomplished using the MEGAN6 graphical user interface (GUI). This is how the process goes:
Lab Maintenance - Cluster UPS Battery Replacement
Replaced the batteries on one of the APC uninterruptible power supplies (UPS) on our local server cabinet.
NanoPore Sequencing - C.bairdi gDNA Sample 20102558-2729
I performed the initial Lambda sequencing test on 20200107 and everything went smoothly, so I’m ready to give the NanoPore (ONT) MinION a run with an actual sample!
DNA Quality Assessment - Agarose Gel and NanoDrop on C.bairdi gDNA
I isolated C.bairdi gDNA yesterday (20200108) and now want to get an idea if it’s any good (i.e. no contaminants, high molecular weight).
DNA Isolation and Quantification - C.bairdi gDNA from EtOH Preserved Tissue
I isolated gDNA from ethanol-preserved C.bairdi muscle tissue from sample 20102558-2729 (SPNO-ReferenceNO). This sample was chosen as it had 0 in the SMEAR_result and BCS_PCR_results columns, indicating it should be free of Hematodinium. See the sample spreadsheet linked below for more info.
NanoPore Sequencing - Initial NanoPore MinION Lambda Sequencing Test
We recently acquired a NanoPore MinION sequencer, FLO-MIN106 flow cell and the Rapid Sequencing Kit (SQK-RAD004). The NanoPore website provides a pretty thorough and user-friendly walk-through of how to begin using the system for the first time. With that said, I believe the user needs to have a registered account with NanoPore and needs to have purchased some products to have full access to the protocols they provide.
Transcriptome Annotation - C.bairdi Using DIAMOND BLASTx on Mox and MEGAN6 Meganizer
Although I previously annotated our C.bairdi transcriptome from 20191218, I realized that the assembly and annotations combined infected/uninfected samples, possibly making separating crab/Hematodinium sequences a bit more difficult.
Transcriptome Annotation - C.bairdi Trinity Assembly Trinotate on Mox
After performing de novo assembly on our Tanner crab RNAseq data on 20191218 and performing BLASTx annotation on 20191224, I continued the annotation process by running Trinotate.
Transcriptome Annotation - C.bairdi Trinity Assembly BLASTx on Mox
In preparation for complete transcriptome annotation of the C.bairdi de novo assembly from 20191218, I needed to run BLASTx. The assembly was BLASTed against the SwissProt database that comes with Trinotate. Initial BLAST output format selected was format 11 (i.e. ASN format), as this allows for simple conversion between different formats later on, if desired.
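As an illustration of the later format conversion mentioned above, BLAST+ ships a `blast_formatter` tool that reads a format 11 archive and writes another output format. The sketch below only builds the command line; the file names are hypothetical, and the choice of tabular format 6 as the target is an assumption rather than something stated in the entry.

```python
import shlex


def blast_formatter_cmd(archive, out_path, outfmt="6"):
    """Build a blast_formatter command converting a format 11 archive.

    archive:  path to the BLAST archive produced with -outfmt 11
    out_path: path for the converted output
    outfmt:   target BLAST output format (6 = tabular, an assumed choice)
    """
    return [
        "blast_formatter",
        "-archive", str(archive),
        "-outfmt", str(outfmt),
        "-out", str(out_path),
    ]


# Hypothetical file names, for illustration only
cmd = blast_formatter_cmd("blastx.asn", "blastx.outfmt6")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where BLAST+ is installed.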
Transdecoder - C.bairdi De Novo Transcriptome from 20191218 on Mox
Ran Trinity to de novo assemble the C.bairdi RNAseq data we had on 20191218 and now will begin annotating the transcriptome using TransDecoder on Mox.
Transcriptome Assembly - C.bairdi Trimmed RNAseq Using Trinity on Mox
Earlier today, I trimmed our existing C.bairdi RNAseq data, as part of generating a transcriptome (per this GitHub issue). After trimming, I performed a de novo assembly using Trinity (v2.9.0) with the stranded library option (--SS_lib_type RF) on Mox.
Trimming/FastQC/MultiQC - C.bairdi RNAseq FastQ with fastp on Mox
Grace/Steven asked me to generate a de novo transcriptome assembly of our current C.bairdi RNAseq data in this GitHub issue. As part of that, I needed to quality trim the data first. Although I could automate this as part of the transcriptome assembly (Trinity has Trimmomatic built-in), I would be unable to view the post-trimming results until after the assembly was completed. So, I opted to do the trimming step separately, to evaluate the data prior to assembly.
Data Wrangling - Olurida_v081 UTR GFFs and Intergenic, Intron BED files
After a meeting last week, we realized we needed to update the paper-oly-mbdbs-gen GitHub repo with the most current versions of feature files we had.
PCR - Crassostrea gigas and sikamea Mantle gDNA from Marinelli Shellfish Company
I ran this PCR a couple of times before and, embarrassingly, I had ordered/used the wrong primers.
qPCR - Geoduck hemolymph and hemocyte cDNA with vitellogenin primers
Previously isolated RNA on 20191125 and made cDNA on 20191126 from some geoduck hemolymph and hemocyte samples that Shelly asked me to run qPCRs on.
Reverse Transcription - P.generosa DNased Hemolymph and Hemocyte RNA from 20191125
Performed reverse transcription on the DNased hemolymph and hemocyte RNA from yesterday.
RNA Isolation and Quantification - Geoduck hemolymph and hemocyte samples
Shelly asked me to isolate RNA and run some qPCRs on the following samples:
PCR - Crassostrea gigas and sikamea Mantle gDNA from Marinelli Shellfish Company - No Multiplex
UPDATE 20191125
PCR - Crassostrea gigas and sikamea Mantle gDNA from Marinelli Shellfish Company
UPDATE 20191125
PCR - Crassostrea gigas and sikamea Mantle gDNA from Marinelli Shellfish Company
UPDATE 20191125
DNA Isolation and Quantification - Crassostrea gigas and Crassostrea sikamea Mantle Tissue from Marinelli Shellfish Company
Isolated DNA from the C.gigas and C.sikamea samples we received from Marinelli Shellfish Company on 20191030 using DNAzol.
Samples Received - Marinelli Shellfish Company C.gigas and C.sikamea Oysters
Steven was recently contacted by Marinelli Shellfish Company to see if we could help them determine if some oysters they had were Crassostrea gigas (Pacific oyster) or Crassostrea sikamea (Kumamoto). Steven knows of a paper with primer sequences to use with qPCR for this specific determination.
FastQC-MultiQC - C.bairdi RNAseq Day 12 26 Infected Uninfected
After receiving the rest of the crab data and concatenating it all together, I ran FastQC and MultiQC on the FastQ files.
Data Received - C.bairdi RNAseq Day9-12-26 Infected-Uninfected
Previously, we “received” this data, but it turns out it was incomplete (see 20191003).
Lab Maintenance - Cluster UPS Battery Replacement
Replaced the batteries on one of the APC uninterruptible power supplies (UPS) on our local server cabinet.
Metagenomics Annotation - P.generosa Water Samples with MEGAN6
After running DIAMOND BLASTx and MEGANIZER on these samples on 20190925 to assess taxonomy info, I began the analyses/visualization of this data with MEGAN6.
Data Received - C.bairdi RNAseq Day9-12-26 Infected-Uninfected
UPDATE (20191024): This post details receipt of incomplete data. Additional sequencing was performed and that additional data was received 20191024. This notebook entry on 20191024 contains details on FastQ concatenation of the data below and the data received on 20191024.
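For context on the FastQ concatenation mentioned above: gzip files can be joined byte-for-byte, and the result is still a valid multi-member gzip file. A minimal sketch, assuming gzipped FastQ inputs (the function name and file names are hypothetical and not taken from the linked entry):

```python
import shutil


def concatenate_fastq_gz(parts, combined):
    """Concatenate gzipped FastQ files by appending their raw bytes.

    parts:    iterable of input .fastq.gz paths, in the desired order
    combined: output path; the result is a multi-member gzip file
    """
    with open(combined, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
```

For paired-end data, Read 1 and Read 2 files need to be concatenated in the same order so that read pairs stay in sync.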
Metagenomics Annotation - P.generosa Water Samples Using DIAMOND BLASTx on Mox
Trimming/FastQC/MultiQC - P.generosa EPI FastQs with FASTP on Mox
Steven noticed that the M-Bias plots generated by Bismark from these files were a little wonky and asked that I try trimming them a bit more. The files were originally quality/adaptor trimmed with TrimGalore! on 20180516.
Genome Comparison - Pgenerosa_v074 vs Pgenerosa_v070 with MUMmer Promer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer 3.23 (specifically, promer for protein level comparisons). This software is specifically designed to do this type of comparison.
Genome Comparison - Pgenerosa_v074 vs S.glomerata NCBI with MUMmer Promer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer 3.23 (specifically, promer for protein level comparisons). This software is specifically designed to do this type of comparison.
Genome Comparison - Pgenerosa_v074 vs M.yessoensis NCBI with MUMmer Promer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer 3.23 (specifically, promer for protein level comparisons). This software is specifically designed to do this type of comparison.
Genome Comparison - Pgenerosa_v074 vs H.sapiens NCBI with MUMmer Promer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer 3.23 (specifically, promer for protein level comparisons). This software is specifically designed to do this type of comparison.
Genome Comparison - Pgenerosa_v074 vs C.virginica NCBI with MUMmer Promer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer 3.23 (specifically, promer for protein level comparisons). This software is specifically designed to do this type of comparison.
Genome Comparison - Pgenerosa_v074 vs C.gigas NCBI with MUMmer Promer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer 3.23 (specifically, promer for protein level comparisons). This software is specifically designed to do this type of comparison.
Genome Comparison - Pgenerosa_v074 vs Pgenerosa_v070 with MUMmer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer (v4) (specifically, nucmer for nucleotide comparisons). This software is specifically designed to do this type of comparison.
Genome Comparison - Pgenerosa_v074 vs Pgenerosa_v074 with MUMmer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer (v4) (specifically, nucmer for nucleotide comparisons). This software is specifically designed to do this type of comparison.
Genome Comparison - Pgenerosa_v074 vs S.glomerata NCBI with MUMmer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer (v4) (specifically, nucmer for nucleotide comparisons). This software is specifically designed to do this type of comparison.
Genome Comparison - Pgenerosa_v074 vs M.yessoensis NCBI with MUMmer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer (v4) (specifically, nucmer for nucleotide comparisons). This software is specifically designed to do this type of comparison.
Genome Comparison - Pgenerosa_v074 vs H.sapiens NCBI with MUMmer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer (v4) (specifically, nucmer for nucleotide comparisons). This software is specifically designed to do this type of comparison.
Genome Comparison - Pgenerosa_v074 vs C.gigas NCBI with MUMmer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer (v4) (specifically, nucmer for nucleotide comparisons). This software is specifically designed to do this type of comparison.
Genome Comparison - Pgenerosa_v074 vs C.virginica NCBI with MUMmer on Mox
In continuing to further improve our geoduck genome annotation, I’m attempting to figure out why Scaffold 1 of our assembly doesn’t have any annotations. As part of that I’ve decided to perform a series of genome comparisons and see how they match up, with an emphasis on Scaffold 1, using MUMmer (v4) (specifically, nucmer for nucleotide comparisons). This software is specifically designed to do this type of comparison.
Data Wrangling - FastA Splitting With faSplit
Steven posted an issue on GitHub regarding splitting a FastA file into multiple sequences. Specifically, he wanted a single, large FastA sequence (~89Mbp) split into smaller FastAs for BLASTing.
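The idea of splitting one large FastA sequence into smaller FastA records can be sketched in plain Python. This is only an illustration of the concept, not how faSplit itself works; the function names, the `_partNNN` header suffix, and the chunk size are all hypothetical choices made for the example.

```python
from typing import Iterator, Tuple


def split_sequence(name: str, sequence: str, chunk_size: int) -> Iterator[Tuple[str, str]]:
    """Yield (header, subsequence) pairs that cover the sequence in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for index, start in enumerate(range(0, len(sequence), chunk_size)):
        # Hypothetical naming scheme: original name plus a zero-padded part number.
        yield f"{name}_part{index:03d}", sequence[start:start + chunk_size]


def format_fasta(header: str, sequence: str, width: int = 60) -> str:
    """Format one record as FastA text with wrapped sequence lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{header}\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A 20 bp toy sequence stands in for the ~89Mbp sequence.
    toy = "ACGT" * 5
    for header, chunk in split_sequence("scaffold", toy, chunk_size=8):
        print(format_fasta(header, chunk), end="")
```

For a sequence of that size, each chunk would normally be written to its own file rather than printed, so that the pieces can be BLASTed separately.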
Data Summary - P.generosa Transcriptome Assemblies Stats
In our continuing quest to wrangle the geoduck transcriptome assemblies we have, I was tasked with compiling assembly stats for our various assemblies. The table below provides an overview of some stats for each of our assemblies. Links within the table go to the notebook entries for the various methods from which the data was gathered. In general:
Transcriptome Compression - P.generosa Transcriptome Assemblies Using CD-Hit-est on Mox
In continued attempts to get a grasp on the geoduck transcriptome size, I decided to “compress” our various assemblies by clustering similar transcripts in each assembly into a single “representative” transcript, using CD-Hit-est. The settings used to run it were taken from the Trinity FAQ regarding “too many transcripts”.
Transcriptome Annotation - Geoduck Larvae Day5 EPI99 with Transdecoder on Mox
Used Transdecoder to identify open reading frames (ORFs) for use in annotating Pgenerosa_v074 genome assembly. Relies on BLASTp, Pfam, and HMM scanning to ID ORFs.
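To make "identifying ORFs" concrete, here is a deliberately naive sketch that scans the three forward reading frames for ATG-to-stop stretches. It is not Transdecoder's method (no scoring, no BLASTp or Pfam evidence, no reverse strand); the function name and the `min_codons` parameter are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence: str, min_codons: int = 1) -> List[Tuple[int, int, str]]:
    """Return (start, end, orf) for ATG...stop ORFs in the three forward frames.

    Coordinates are 0-based, end-exclusive, and the ORF includes its stop codon.
    min_codons counts coding codons, excluding the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                end = pos + 3
                if (end - start) // 3 - 1 >= min_codons:
                    orfs.append((start, end, sequence[start:end]))
                start = None  # resume searching after the stop codon
    return orfs


if __name__ == "__main__":
    for start, end, orf in find_orfs("CCATGAAATAGGG"):
        print(start, end, orf)
```

Real ORF callers go further by weighing coding potential and homology evidence, which is where the BLASTp, Pfam, and HMM scanning steps come in.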