Running Shasta using Docker
Docker is a tool that allows users to run software on a variety
of operating systems without installing dependencies or compiling the software.
It uses operating-system-level virtualization to run pre-compiled images in
self-contained packages called containers.
Running Shasta via this method has the advantage of portability, but building
the executable from source on the host machine is still the most performant method of running Shasta.
Getting Started
To run Shasta via the Docker image, the Docker software must be installed
on the host OS. Hardware and data requirements for running Shasta still apply.
The command to run the image includes configuration to mount the current working
directory to the /data directory inside the container. The read data (input.fasta in this example)
must be present in the current directory, and all output will be written there after execution.
The image name and version tag must be specified, followed by
any arguments that would be passed to the Shasta executable:
docker run -v `pwd`:/data kishwars/shasta:latest --input input.fasta
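As a minimal sketch of the invocation above, the hypothetical helper below (shasta_docker_cmd is not part of Shasta or the image) checks that the reads file is reachable from the current directory and prints the corresponding docker command instead of running it. It rejects absolute paths on the assumption that host paths outside the mounted directory are not visible inside the container.

```shell
# Hypothetical helper, not part of Shasta: print the docker command for a reads file.
shasta_docker_cmd() {
    sdc_input="${1:-}"
    # Assumption: only the current directory is mounted (at /data),
    # so an absolute host path would not resolve inside the container.
    case "$sdc_input" in
        /*) echo "error: use a path relative to $(pwd)" >&2; return 1 ;;
    esac
    if [ ! -f "$sdc_input" ]; then
        echo "error: '$sdc_input' not found in $(pwd)" >&2
        return 1
    fi
    # Quote the mount source so working directories containing spaces still work.
    printf 'docker run -v "%s":/data kishwars/shasta:latest --input %s\n' \
        "$(pwd)" "$sdc_input"
}
```

Calling shasta_docker_cmd input.fasta prints the command so it can be reviewed before it is run.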
When Shasta is run via this method, the Docker image is set up to save the command, all output logged to the console,
and the time and maximum memory usage to a file called shasta.log. This log will be present in the
current working directory, alongside all output from Shasta. The Docker image will be automatically downloaded
the first time it is used; subsequent runs will use a cached version.
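Since the log lands next to the assembly output, a small check after a run can confirm it was written. The sketch below uses a hypothetical helper name (show_shasta_log) and makes no assumption about the log's internal format; it only prints the last lines of the file.

```shell
# Hypothetical helper: show the tail of shasta.log from a run directory.
show_shasta_log() {
    ssl_dir="${1:-.}"          # run directory, defaults to the current one
    ssl_log="$ssl_dir/shasta.log"
    if [ ! -f "$ssl_log" ]; then
        echo "error: $ssl_log not found; did the run complete?" >&2
        return 1
    fi
    # Assumption (not a documented layout): timing and memory usage are
    # written at the end of a run, so they should appear near the bottom.
    tail -n 20 "$ssl_log"
}
```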
Memory Mode
To run Shasta in the filesystem memory mode (for better performance),
the Docker container needs to set up underlying infrastructure in the OS.
By default, a Docker container is not allowed to take the actions necessary for this,
so it must be given elevated privileges.
To do this, the --privileged flag must be set when invoking Docker:
docker run --privileged -v `pwd`:/data kishwars/shasta:latest --input input.fasta --memoryMode filesystem --memoryBacking 2M
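The difference between the plain and the privileged invocation can be captured in a small wrapper. This is a sketch under stated assumptions: shasta_docker_mode_cmd is a hypothetical name, and the function only prints the command, adding --privileged and the memory options shown above when filesystem mode is requested.

```shell
# Hypothetical helper: print the docker command, with or without filesystem memory mode.
shasta_docker_mode_cmd() {
    sdm_input="${1:-}"
    sdm_mode="${2:-default}"   # "filesystem" selects the privileged variant
    sdm_image="kishwars/shasta:latest"
    if [ "$sdm_mode" = "filesystem" ]; then
        # Elevated privileges are requested only when they are actually needed.
        printf 'docker run --privileged -v "%s":/data %s --input %s --memoryMode filesystem --memoryBacking 2M\n' \
            "$(pwd)" "$sdm_image" "$sdm_input"
    else
        printf 'docker run -v "%s":/data %s --input %s\n' \
            "$(pwd)" "$sdm_image" "$sdm_input"
    fi
}
```

Keeping --privileged out of the default path limits elevated access to the runs that need it.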
When running Shasta in this mode, the cleanupBinaryData command is not required,
and the binary data is not available for use with the HTTP server or the Python API.
Compiling Docker Image
The Shasta image is currently published to the kishwars/shasta repository on
Docker Hub and does not need to be compiled locally.
For advanced users, an image can be compiled using the Makefile in the
docker/ folder in the Shasta repository.
cd docker/
make
By default, the build script will find the current git commit in the repository
where it is being run, create an image with the code from that commit
(the commit must be pushed to GitHub), then tag the image with the
current Shasta version and the commit hash.
To create an image from a specific git commit,
the make command can be run with this option:
make -e git_commit=<your_git_commit>
In addition to git_commit, the specifications for image_name=kishwars/shasta,
shasta_version=0.1.0, and tag=${version}--${git_commit}
can be changed in the same way.
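To make the override mechanism concrete, the sketch below prints a make invocation that pins the commit and, optionally, the image name. The helper name shasta_make_cmd and the example values are hypothetical; only the git_commit and image_name variables come from the text above.

```shell
# Hypothetical helper: print the make command for a given commit and image name.
shasta_make_cmd() {
    smc_commit="${1:-}"
    smc_image="${2:-kishwars/shasta}"   # default image name as listed above
    if [ -z "$smc_commit" ]; then
        echo "usage: shasta_make_cmd <git_commit> [image_name]" >&2
        return 1
    fi
    printf 'make -e git_commit=%s image_name=%s\n' "$smc_commit" "$smc_image"
}
```

Inside the docker/ folder, shasta_make_cmd "$(git rev-parse HEAD)" myuser/shasta would print the command for the currently checked-out commit, where myuser/shasta is a placeholder repository name.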