Level Up Your Machine Learning Code from Notebook to ProductionAarni Koskela
Developing a machine learning model for a new project starts with certain common groundwork and exploration, to understand your data and figure out the approaches to try. A popular choice for this groundwork is Jupyter , an environment where you write Python code interactively. In Jupyter notebook's cells you can evaluate and revise and it is an attractive, visual choice (and many times the right choice) – for this step of data science work. Since Jupyter kernels, the processes backing a notebook’s execution, retain their internal state while the code is being edited and revised, they’re a highly interactive, fast-feedback environment.
Downsides of Jupyter Notebooks in Machine Learning
However, while convenient, Jupyter notebooks can be hard to reason about exactly because of this retention of state, since the state of your environment may have changed in a non-linear fashion (or worse yet, left in an inconsistent state) after re-evaluation of an earlier cell. It's entirely possible to have a saved notebook that can't be successfully evaluated after relaunching the kernel. Since production-grade code is supposed to be easily tested and reviewed, as we’ve learned as an industry, this isn’t desirable at all.
It can also be difficult to keep track of the exact versions of dependencies you've used during machine learning development. For instance, a model that worked fine with a certain version of TensorFlow might not run at all with a newer one down the line, and it's tedious to try and figure out what exactly was being run at the time of exploration.
Jupyter Notebook in Production. Or not..
Let's assume you've played nice and been fastidious enough to not run into these problems – all your dependencies are locked down and you've found out that you can actually run your notebook non-interactively with
jupyter nbconvert --to notebook --execute notebook.ipynb and maybe pipe the output into a file, for tracking results. You'll inevitably want to run your training code using different parameters (say, learning rates, network structures, etc.);
jupyter nbconvert --execute isn't really conducive for that, and editing the notebook, or maybe a separate configuration file, to change constants is just silly, too.
These are some of the reasons why we advocate for using regular, linear Python scripts instead of Jupyter/IPython notebooks when going from initial exploration work to something resembling production. Another thing is that you get to develop using your favorite editor/IDE, be it vim or Emacs or VSCode or PyCharm (which, by the way, has an excellent Scientific Mode – and support for notebooks too ), instead of being confined to a browser. Switching from notebooks to regular code also lets you refactor your solution to a more modular, more easily testable and reviewable package.
Of course there are drawbacks; development is less interactive, and since there is no persistent state, everything needs to be evaluated or loaded from scratch at every invocation of your script. On the other hand this property makes you think about preprocessing your data to a faster-to-load format earlier. When you extract the preprocessing code from other code, it becomes easier to maintain and more reproducible, as well.
EDIT: @joelgrus apparently talked about this very same thing at JupyterCon last week! The slides are hilarious, I suggest checking them out.
Summary of Notebooks vs. Scripts
However, if your machine learning project is in a stage where you are not ready to fully move on from the expiration phase but want to have version control for machine learning experiments, we have a solution for you. We at Valohai build a Jupyter Notebook Extension that keeps a full audit trail of your notebook experiments and allows data scientists to run experiments asynchronously . You might want to check it out!
*image of stairs is derived from https://unsplash.com/photos/pKvmGR4qHrg