Data Management - O. lurida genotyping-by-sequencing (GBS) data from BGI

We received a hard drive from BGI on 20160223 (while I was out on paternity leave) containing the Ostrea lurida GBS data.

Briefly, three sets (i.e. populations) of Olympia oyster tissue were collected from oysters raised in Oyster Bay and sent to BGI for DNA extraction and GBS. A total of 32 individuals from each of the following three populations were sequenced (a grand total of 96 samples):

  • 1HL - (Hood Canal, Long Spit)

  • 1NF - (North Sound, Fidalgo Bay)

  • 1SN - (South Sound, Oyster Bay)

An overview of this project can be viewed on our GitHub Olympia oyster wiki.

Data was copied from the HDD to the following location on Owl (our server):

The data was generated from paired-end Illumina sequencing, so there are two FASTQ files for each individual.

The files were analyzed to generate MD5 checksums, count reads, and create a readme file (Markdown format). This was performed in a Jupyter/iPython notebook (see below).
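As a rough illustration of those three steps, here is a minimal Python sketch. It is not the code from the notebook: the function names, the `*.fq.gz` file pattern, the `readme.md` file name, and the table layout are all assumptions made for the example. Read counts assume standard four-line FASTQ records.

```python
import gzip
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_fastq_reads(path):
    """Count reads in a FASTQ file, assuming four lines per record."""
    # Transparently handle gzipped and plain-text FASTQ files.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def write_readme(data_dir, pattern="*.fq.gz", readme_name="readme.md"):
    """Write a Markdown table of file name, MD5 checksum, and read count.

    The pattern and readme name are hypothetical defaults, not the
    names actually used for the BGI data.
    """
    data_dir = Path(data_dir)
    lines = ["| File | MD5 | Reads |", "|---|---|---|"]
    for fastq in sorted(data_dir.glob(pattern)):
        lines.append(
            f"| {fastq.name} | {md5sum(fastq)} | {count_fastq_reads(fastq)} |"
        )
    readme = data_dir / readme_name
    readme.write_text("\n".join(lines) + "\n")
    return readme
```

Calling `write_readme()` on the data directory would produce one table row per FASTQ file. With paired-end data, the two files for an individual should normally report the same read count, which makes the table a quick sanity check.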

IMPORTANT NOTE: The directory where this data is housed was renamed AFTER the Jupyter notebook was run. As such, the directory listed above will not be seen in the Jupyter notebook.

Jupyter notebook file: 20160314_Olurida_GBS_data_management.ipynb

Notebook Viewer: 20160314_Olurida_GBS_data_management.ipynb