This all takes a surprisingly long time to set up.
Setup AWS Identity and Access Management (IAM): https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html?icmpid=docs_iam_console
Install AWS command line interface: https://aws.amazon.com/cli/
Copy files to S3 bucket:
<code>aws s3 cp /Volumes/web/nightingales/O_lurida/20160223_gbs/160123_I132_FCH3YHMBBXX_L4_OYSzenG1AAD96FAAPEI-109_1.fq.gz s3://Samb</code>
<code>aws s3 cp /Volumes/web/nightingales/O_lurida/20160223_gbs/160123_I132_FCH3YHMBBXX_L4_OYSzenG1AAD96FAAPEI-109_2.fq.gz s3://Samb</code>
Launch EC2 instance c4.2xlarge (Ubuntu 14.04 LTS, 8 vCPUs, 16 GiB RAM). Configured to have SSH open (TCP, port 22) and also to be able to access Jupyter Notebook via tunnel (TCP, port 8888). Set with “My IP” to limit access to these ports.
Create new key pair. Have to change permissions:
<code>chmod 400 bioinformatics.pem</code>
Connect to instance
For Amazon AMI:
<code>ssh -i "bioinformatics.pem" ec2-user@ip.address.of.instance</code>
For Amazon Ubuntu Server:
<code>ssh -i "bioinformatics.pem" ubuntu@ip.address.of.instance
</code>
Update/Upgrade default Ubuntu packages at after initial launch:
<code>sudo apt-get update</code>
<code>sudo apt-get upgrade</code>
Set up Docker
Install Docker for Ubuntu 14.04 and copy our bioinformatics Dockerfile to the /home directory of the EC2 instance:
<code>ssh -i "bioinformatics.pem" /Users/Sam/GitRepos/LabDocs/code/dockerfiles/Dockerfile.bio ubuntu@ip.address.of.instance:</code>
Access data stored in Amazon S3 bucket(s)
Mounting S3 storage as volume in EC2 instance requires https://github.com/s3fs-fuse/s3fs-fuse
Mount bucket:
<code>sudo s3fs Samb /mnt/s3bucket/ -o passwd_file=/home/ubuntu/s3fs_creds</code>
Error:
<code>s3fs: BUCKET Samb, name not compatible with virtual-hosted style.</code>
Turns out, the error is due to the bucket name having an uppercase letter.
Made new bucket in S3 (via web interface) and copied data files to the new bucket. Will try mounting again once the files are copied over (this will take awhile; the two files total 36GB)..