An Introduction to Docker
A collection of resources for researchers interested in using Docker.
Writing research software in Python presents numerous challenges to reproducibility. Which version of Python is being used? What about the versions of PyTorch, scikit-learn or NumPy? Should we use Conda, venv or Poetry to manage dependencies and environments? How can we control randomness? Is the right version of the CUDA Toolkit installed? In principle, given the same data, algorithms and methodology, we should be able to reproduce the results of any given experiment to within an acceptable degree of error. In practice, each of these questions is a potential obstacle to reproducing experiments in machine learning. This workshop explores how Docker can help address almost all of them. Furthermore, combining Docker, Git and GitHub can be a powerful workflow, helping to minimise your tech stack and declutter your Python development experience.
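To make two of these questions concrete, the sketch below records which versions are present in the current environment and shows that a seeded random number generator gives repeatable draws. It is an illustration only and not part of the workshop material: the function names `describe_environment` and `seeded_sample` are hypothetical, the package list is just an example, and only the Python standard library is used.

```python
import platform
import random
from importlib import metadata


def describe_environment(packages):
    """Return the Python version and the installed version of each named package."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            report[name] = None
    return report


def seeded_sample(seed, n=5):
    """Draw n pseudo-random floats from a generator seeded with `seed`."""
    rng = random.Random(seed)  # a local generator, so global state is untouched
    return [rng.random() for _ in range(n)]


if __name__ == "__main__":
    # Example package names; substitute whatever your project depends on.
    print(describe_environment(["numpy", "torch", "scikit-learn"]))
    # The same seed should give the same draws on every run.
    print(seeded_sample(0) == seeded_sample(0))
```

Running this on two machines will often print different version reports, and that difference is the gap a shared Docker image is meant to close. Note that seeding Python's own generator does not seed NumPy or PyTorch, which keep separate random state.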
These resources have been developed from a workshop run by the Accelerate Science team in Autumn 2024.
Future workshop dates will be shared on our events page.