Instructions and supporting data for the QIIME/IPython/StarCluster demo at the 2012 NIH Cloud Computing the Microbiome workshop.
The analysis made use of the IPython Notebook, QIIME, StarCluster, PyCogent, and PrimerProspector. All of these tools are pre-installed in the public Amazon EC2 machine image ami-2faa7346, which was used in this study.
Supporting Files
The IPython notebooks supporting this study are available here:
* Note that the Timing notebook is provided only as a reference for the paper. It will not be directly reproducible on re-runs of the above notebooks because it relies on the semi-manual creation of the tasks.log file. The tasks.log file used to generate the original timing data is available in pynb_files.zip.
The Greengenes reference OTU collection used in this study is available for download here.
The IPython notebook files (.ipynb) are available for download here.
The tree metadata mapping file used in generating the coloring categories in the 3D PCoA plot is available here.
A manuscript on this analysis is currently in review. When this is published we will make the Open Access article available from this page.
Reproducing the analysis
Four m2.4xlarge instances were booted using StarCluster to create a 32-core cluster with approximately 280GB of RAM (70GB per 8-core instance). This cluster was used for the full analysis, which is more complete than the one done during the workshop (the workshop analysis was optimized to run quickly).
To reproduce the analyses presented in this paper, you should install StarCluster locally and configure it according to the instructions on the StarCluster website. You can then add the following to your ~/.starcluster/config file:
[plugin ipcluster]
setup_class = starcluster.plugins.ipcluster.IPCluster
enable_notebook = true
# If you leave notebook_passwd out, a random password
# will be generated instead.
notebook_passwd = YOUR-PASSWORD
[cluster qiime]
node_image_id = ami-2faa7346
keyname = YOUR-KEY
cluster_size = 4
node_instance_type = m2.4xlarge
plugins = ipcluster
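Before booting, it can be worth sanity-checking the settings above. The following is a minimal sketch, not part of StarCluster itself: it parses the same configuration text with Python's standard configparser module, checks that every plugin named in the cluster template has a matching [plugin ...] section, and recomputes the approximate cluster totals. The function name summarize_cluster is hypothetical, and the per-instance figures (8 cores, roughly 70GB of RAM) are simply the ones quoted above for m2.4xlarge.

```python
import configparser

# Configuration text copied from above. YOUR-PASSWORD and YOUR-KEY are
# placeholders, exactly as in the original snippet.
CONFIG_TEXT = """
[plugin ipcluster]
setup_class = starcluster.plugins.ipcluster.IPCluster
enable_notebook = true
notebook_passwd = YOUR-PASSWORD

[cluster qiime]
node_image_id = ami-2faa7346
keyname = YOUR-KEY
cluster_size = 4
node_instance_type = m2.4xlarge
plugins = ipcluster
"""

# Approximate per-instance figures as quoted in the text (assumption:
# these apply only to the m2.4xlarge instance type used here).
CORES_PER_INSTANCE = 8
RAM_GB_PER_INSTANCE = 70


def summarize_cluster(config_text, template="qiime"):
    """Hypothetical helper: summarize one [cluster ...] template."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)

    section = parser["cluster " + template]
    size = section.getint("cluster_size")

    # Each plugin listed in the template should have its own section.
    plugins = [name.strip() for name in section["plugins"].split(",")]
    missing = [p for p in plugins if not parser.has_section("plugin " + p)]

    return {
        "instance_type": section["node_instance_type"],
        "cluster_size": size,
        "total_cores": size * CORES_PER_INSTANCE,
        "total_ram_gb": size * RAM_GB_PER_INSTANCE,
        "missing_plugin_sections": missing,
    }


summary = summarize_cluster(CONFIG_TEXT)
print(summary)
```

With cluster_size = 4 this gives 32 cores and about 280GB of RAM, matching the cluster described above. Changing cluster_size in the config scales both totals linearly.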
You can then boot this cluster by running:
starcluster start -c qiime myqiime
You will be presented with the URL of your IPython notebook. You can upload our .ipynb files to re-run the analysis directly on your own hardware, or tweak it to perform your own analysis.
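Because the cluster runs on paid EC2 instances, you will probably want to shut it down when you are finished. The sketch below assembles the relevant StarCluster command lines from Python and, by default, only prints them (a dry run). The start command is the one given above; sshmaster and terminate are StarCluster subcommands as we understand them, so check starcluster --help for your installed version. The helper names starcluster_command and run are hypothetical.

```python
import shlex
import subprocess

CLUSTER_TEMPLATE = "qiime"  # the [cluster qiime] template from the config
CLUSTER_TAG = "myqiime"     # the cluster name used in the start command


def starcluster_command(action, tag=CLUSTER_TAG, template=None):
    """Hypothetical helper: build a starcluster command as a list."""
    cmd = ["starcluster", action]
    if template is not None:
        cmd += ["-c", template]
    cmd.append(tag)
    return cmd


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print("$ " + " ".join(shlex.quote(part) for part in cmd))
    if not dry_run:
        subprocess.check_call(cmd)


if __name__ == "__main__":
    run(starcluster_command("start", template=CLUSTER_TEMPLATE))  # boot
    run(starcluster_command("sshmaster"))                         # log in to the master node
    run(starcluster_command("terminate"))                         # shut down and stop billing
```

Passing dry_run=False to run executes the command for real, which requires a working local StarCluster installation and valid AWS credentials.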
Have fun!