Table of contents

Running Shasta using Docker

Docker is a tool that allows users to run software on a variety of operating systems without installing dependencies or compiling the software. It uses operating-system-level virtualization to run pre-compiled images in self-contained packages called containers. Running Shasta via this method has the advantage of portability, but building the executable from source on the host machine is still the most performant method of running Shasta.

Getting Started

To run Shasta via the Docker image, the Docker software must be installed on the host OS. Hardware and data requirements for running Shasta still apply. The command to run the image mounts the current working directory at /data inside the container. The read data (input.fasta in this example) must be present in the current directory, and all output will be written there. The image name and version tag must be specified, followed by any arguments that would be passed to the Shasta executable:
docker run -v `pwd`:/data kishwars/shasta:latest --input input.fasta
When Shasta is run via this method, the Docker image is set up to save the command, all output logged to the console, and the time and maximum memory usage to a file called shasta.log. This log will be present in the current working directory, alongside all output from Shasta. The Docker image will be automatically downloaded the first time it is used; subsequent runs will use a cached version.
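Because only the current directory is mounted into the container, a missing input file is easy to overlook until the run fails. The following is a minimal sketch of a wrapper that checks for the file first. The function name run_shasta and the DOCKER override variable are assumptions made for illustration and are not part of Shasta or Docker.

```shell
# Hypothetical wrapper, not part of Shasta: checks the input, then runs the image.
# Set DOCKER=echo to preview the command instead of running it.
run_shasta() {
    input="$1"
    # Only the current directory is mounted at /data, so the reads must be here.
    if [ ! -f "./$input" ]; then
        echo "error: $input not found in $(pwd)" >&2
        return 1
    fi
    "${DOCKER:-docker}" run -v "$(pwd)":/data kishwars/shasta:latest --input "$input"
}
```

Calling run_shasta input.fasta from the directory that holds the reads runs the same command as above. After a real run, shasta.log should appear in that directory.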

Memory Mode

To run Shasta in the filesystem memory mode (for better performance), the Docker container needs to set up underlying infrastructure in the OS. By default, a Docker container is not allowed to take the actions necessary for this, so it must be given elevated privileges. To do this, set the --privileged flag when invoking Docker:
docker run --privileged -v `pwd`:/data kishwars/shasta:latest --input input.fasta --memoryMode filesystem --memoryBacking 2M
When running Shasta in this mode, the cleanupBinaryData command is not required, and the binary data is not available for use with the HTTP server or the Python API.
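Since --privileged grants the container elevated privileges, it may be preferable to pass it only when the filesystem memory mode is actually requested. The sketch below prints the appropriate command line for each case without running it. The helper name shasta_cmd is hypothetical, and the printed line is a preview only (paths containing spaces would need quoting before use).

```shell
# Hypothetical helper, not part of Shasta: prints the docker command for a memory mode.
# Usage: shasta_cmd <input file> [filesystem]
shasta_cmd() {
    input="$1"
    mode="${2:-}"
    if [ "$mode" = "filesystem" ]; then
        # Filesystem memory mode needs elevated privileges inside the container.
        echo "docker run --privileged -v $(pwd):/data kishwars/shasta:latest --input $input --memoryMode filesystem --memoryBacking 2M"
    else
        # Default mode: no elevated privileges are requested.
        echo "docker run -v $(pwd):/data kishwars/shasta:latest --input $input"
    fi
}
```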

Compiling Docker Image

The Shasta image is currently published to the kishwars/shasta repository on Docker Hub and does not need to be compiled locally. For advanced users, an image can be compiled using the Makefile in the docker/ folder in the Shasta repository.
cd docker/
make
By default, the build script finds the current git commit in the repository where it is run, creates an image with the code from that commit (the commit must be pushed to GitHub), then tags the image with the current Shasta version and the commit hash. To create an image from a specific git commit, run the make command with this option:
make -e git_commit=<your_git_commit>
In addition to git_commit, the defaults image_name=kishwars/shasta, shasta_version=0.1.0, and tag=${version}--${git_commit} can be overridden in the same way.
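To make the naming scheme concrete, the sketch below composes the image reference these defaults would produce. It assumes the Makefile joins the values as image_name:tag, which is the usual Docker convention but is not confirmed here. The helper image_ref is hypothetical and is not part of the Shasta build.

```shell
# Hypothetical helper, not part of the Shasta build: prints the image reference,
# assuming the form <image_name>:<shasta_version>--<git_commit>.
# Usage: image_ref <git_commit> [image_name] [shasta_version]
image_ref() {
    git_commit="$1"
    image_name="${2:-kishwars/shasta}"   # default taken from the text above
    shasta_version="${3:-0.1.0}"         # default taken from the text above
    echo "${image_name}:${shasta_version}--${git_commit}"
}
```

Inside a checkout of the repository, image_ref "$(git rev-parse HEAD)" would show the reference expected for the current commit under that assumption.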