In preparation for some a bisulfite analysis requested by Steven, I needed to make bisulfite genome with, and for, Bismark to use and wanted to use the same one Yaamini has been working with. Downloaded the NCBI version of the C.virginica genome.
NCBI genome here:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/002/022/765/GCF_002022765.2_C_virginica-3.0/GCF_002022765.2_C_virginica-3.0_genomic.fna.gz
SBATCH script contains all details regarding which genome was used as input and which software versions were used to produce the subsequent bisulfite genome:
#!/bin/bash
## Job Name
#SBATCH --job-name=bismark
## Allocation Definition
#SBATCH --account=coenv
#SBATCH --partition=coenv
## Resources
## Nodes
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=10-00:00:00
## Memory per node
#SBATCH --mem=120G
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/scrubbed/samwhite/outputs/20190312_cvir_genome_prep_bismark
# Load Python Mox module for Python module availability
module load intel-python3_2017
# Document programs in PATH (primarily for program version ID)
date >> system_path.log
echo "" >> system_path.log
echo "System PATH for $SLURM_JOB_ID" >> system_path.log
echo "" >> system_path.log
printf "%0.s-" {1..10} >> system_path.log
echo ${PATH} | tr : \\n >> system_path.log
# Directories and programs
## Programs
bismark_dir="/gscratch/srlab/programs/Bismark-0.19.0"
bowtie2_dir="/gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64"
samtools="/gscratch/srlab/programs/samtools-1.9/samtools"
## Files
genome="/gscratch/srlab/sam/data/C_virginica/genomes/GCF_002022765.2_C_virginica-3.0/"
# Prepare bisulfite genomes
${bismark_dir}/bismark_genome_preparation \
--path_to_bowtie ${bowtie2_dir} \
--verbose ${genome}
RESULTS
Output folder rsync
ed to the Owl genomic databank directory, and gzipped:
This contains all the necessary files for use with Bismark, as well as a readme
file and a copy of the SBATCH script.
Will add to our Genomic Resources wiki