Data Wrangling - Pgenerosa_v074.a3 Annotation Genome Feature Sequence Lengths

The GenSAS Pgenerosa_v074 annotation from 20190710 (referred to here as Panopea-generosa-vv0.74.a3) recently completed after running for nearly a month.

In preparation for a paper we’re writing, we needed summary stats (feature sequence lengths) for Panopea-generosa-vv0.74.a3. This info will be compiled into a table for the manuscript. See our Genomic Resources wiki for more info on GFFs:

Calculations were performed using Python in a Jupyter Notebook.
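
For readers who want to reproduce this kind of summary without opening the notebook, here is a minimal sketch of the calculation. It is not the notebook's code. It assumes standard tab-delimited GFF3 with 1-based, inclusive start/end coordinates in columns 4 and 5, so feature length is taken as end - start + 1 (the notebook may have computed length differently). The function name `feature_length_stats` and the example records are hypothetical.

```python
import io
import statistics


def feature_length_stats(handle):
    """Return mean/min/median/max feature length for a GFF3 file handle.

    Assumption: length = end - start + 1 (GFF3 coordinates are 1-based
    and inclusive). Hypothetical helper, not taken from the notebook.
    """
    lengths = []
    for line in handle:
        # An embedded FASTA section, if present, ends the feature records.
        if line.startswith("##FASTA"):
            break
        # Skip directives, comments, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        start, end = int(fields[3]), int(fields[4])
        lengths.append(end - start + 1)
    return {
        "mean": statistics.mean(lengths),
        "min": min(lengths),
        "median": statistics.median(lengths),
        "max": max(lengths),
    }


# Made-up records for illustration only; not real annotation data.
example = io.StringIO(
    "##gff-version 3\n"
    "scaffold_1\tGenSAS\texon\t1\t100\t.\t+\t.\tID=exon1\n"
    "scaffold_1\tGenSAS\texon\t201\t203\t.\t+\t.\tID=exon2\n"
    "scaffold_1\tGenSAS\texon\t301\t457\t.\t-\t.\tID=exon3\n"
)
stats = feature_length_stats(example)
for name, value in stats.items():
    print(f"{name:<10}{value:>14.6f}")
```

To run it on a real file, pass an open handle instead of the in-memory example, e.g. `with open("Panopea-generosa-vv0.74.a3.exon.gff3") as handle: feature_length_stats(handle)`.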

Jupyter Notebook (GitHub):


RESULTS

I’ve copied/pasted the summary data for each of the analyzed GFFs below for quick reference. These will be compiled into a table for people to use for the manuscript.

Panopea-generosa-vv0.74.a3.exon.gff3
-------------------------
mean        255.932825
min           3.000000
median      157.000000
max       13359.000000




Panopea-generosa-vv0.74.a3.CDS.gff3
-------------------------
mean        255.932825
min           3.000000
median      157.000000
max       13359.000000




Panopea-generosa-vv0.74.a3.mRNA.gff3
-------------------------
mean       13318.053183
min          201.000000
median      2346.000000
max       345225.000000




Panopea-generosa-vv0.74.a3.gene.gff3
-------------------------
mean       13318.053183
min          201.000000
median      2346.000000
max       345225.000000