For completeness' sake, I wanted to create an additional C.bairdi transcriptome assembly consisting of Arthropoda-only sequences from just the pooled RNAseq data (since I recently generated a similar assembly without taxonomically filtered reads on 20200518). This comprises the samples we have designated 2018, 2019, and 2020-UW. A de novo assembly was run using Trinity on Mox. Since all pooled RNAseq libraries were stranded, I added this option to the Trinity command.
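As a rough illustration of what adding the strandedness option to a Trinity command looks like, here is a minimal Python sketch that only builds the argument list and does not run anything. The helper name `trinity_command`, the FastQ file names, the resource values, and the `RF` library type are all assumptions for the example; they are not the settings used for the actual assembly on Mox.

```python
def trinity_command(left, right, ss_lib_type="RF", cpu=8,
                    max_memory="100G", output="trinity_out_dir"):
    """Build (but do not run) a paired-end Trinity command.

    Hypothetical helper: every default here is a placeholder,
    not a value taken from the real assembly job.
    """
    if ss_lib_type not in (None, "RF", "FR"):
        raise ValueError("paired-end --SS_lib_type must be RF or FR")
    cmd = [
        "Trinity",
        "--seqType", "fq",
        "--max_memory", max_memory,
        "--CPU", str(cpu),
        "--left", ",".join(left),    # comma-separated R1 files
        "--right", ",".join(right),  # comma-separated R2 files
        "--output", output,
    ]
    # Only stranded libraries get the strand-specific option.
    if ss_lib_type is not None:
        cmd += ["--SS_lib_type", ss_lib_type]
    return cmd


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    print(" ".join(trinity_command(["a_R1.fq.gz", "b_R1.fq.gz"],
                                   ["a_R2.fq.gz", "b_R2.fq.gz"])))
```

Passing `ss_lib_type=None` leaves the option off, which is how a mixed or unstranded set of libraries would be handled in this sketch.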
After performing de novo assembly on all of our Tanner crab RNAseq data (no taxonomic filter applied, either) on 20200518, I continued the annotation process by running Trinotate.
After generating a number of C.bairdi (Tanner crab) transcriptomes, we decided we should compare them to help decide which one should become our “canonical” version. As part of that, the Trinity wiki offers a list of tools that one can use to check the quality of transcriptome assemblies. Some of those require a transcriptome of a related species.
We’ve produced a number of C.bairdi transcriptomes utilizing different assembly approaches (e.g. Arthropoda reads only, stranded libraries only, mixed strandedness libraries, etc.) and we want to determine which of them is “best”. Trinity has a nice list of tools to assess the quality of transcriptome assemblies, but most of the tools rely on comparison to a transcriptome of a related species.
I was looking for some crab transcriptomic data today and, unable to find any previously assembled transcriptomes, turned to the good ol’ NCBI SRA. To simplify retrieval and conversion of SRA data, I need to use the SRA Toolkit software suite. Since I haven’t used it in many years, I figured I might as well put together a brief guide/tutorial so I can refer back to it in the future.
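For reference, the basic retrieve-then-convert pattern with the SRA Toolkit can be sketched as follows. This is a minimal Python sketch that only assembles the `prefetch` and `fasterq-dump` command lines; the helper name `sra_commands`, the accession `SRR0000000`, the output directory, and the thread count are placeholders I made up for illustration, not the runs actually downloaded.

```python
import shlex


def sra_commands(accession, outdir="fastq", threads=4):
    """Build (but do not run) SRA Toolkit commands for one run accession.

    Hypothetical helper: accession, outdir, and threads are placeholders.
    """
    # Step 1: download the .sra file for the run.
    prefetch = ["prefetch", accession]
    # Step 2: convert it to FastQ, writing paired reads to separate files.
    dump = [
        "fasterq-dump",
        "--split-files",
        "--threads", str(threads),
        "--outdir", outdir,
        accession,
    ]
    return [prefetch, dump]


if __name__ == "__main__":
    # "SRR0000000" is a made-up accession used only as an example.
    for cmd in sra_commands("SRR0000000"):
        print(shlex.join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` on a machine where the SRA Toolkit is installed and on the `PATH`.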
After creating a de novo assembly of C.bairdi transcriptome v1.6 on 20200518, performing BLASTx annotation on 20200519, and running TransDecoder for ORF identification on 20200519, I continued the annotation process by running Trinotate.
I previously created a C.bairdi de novo transcriptome assembly v1.6 with Trinity from all our C.bairdi taxonomically filtered RNAseq on 20200518 and decided to assess its “completeness” using BUSCO and the