Valohai is a deep learning platform that helps you execute on-demand experiments in the cloud with full version control. Jupyter Notebook is a popular IDE for data scientists, and it is especially well suited to early data exploration and prototyping.
Data scientists tend to waste a lot of time just waiting for training results, when a better way is to work asynchronously and iterate on different approaches in parallel. To read more about asynchronous workflows in deep learning, see the Asynchronous Workflow in Data Science blog post.
We at Valohai are developing an extension for Jupyter Notebook that is designed to provide full version control for deep learning experiments and a smooth asynchronous workflow without additional technical hassle.
This tutorial describes what the Valohai + Jupyter Notebook combination can do, and is divided into four steps:
Starting a new experiment,
Monitoring experiments,
Checking experiment results, and
Version controlling your training runs.
Let’s take a look at how the extension works.
1. Starting a new experiment
Starting a new cloud experiment
When we create a new experiment, the notebook is uploaded to the cloud, where a new server instance starts up. Once the instance has downloaded the training data, it executes all the cells in the notebook from top to bottom. A small rectangle appears in the top-right corner to indicate that an experiment is running in the cloud.
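To make "executes all the cells from top to bottom" concrete, here is a minimal sketch of that idea in plain Python. It is not the Valohai extension's implementation: the function name `run_notebook_cells` and the toy notebook are hypothetical, and the sketch assumes the standard nbformat 4 JSON layout (a `cells` list whose entries have `cell_type` and `source`). A real notebook with IPython magics or rich outputs would need a proper kernel rather than `exec`.

```python
import json


def run_notebook_cells(notebook_json: str) -> dict:
    """Run every code cell of a notebook in order, sharing one namespace.

    Hypothetical helper for illustration only; assumes nbformat 4 JSON.
    """
    notebook = json.loads(notebook_json)
    namespace = {}
    for cell in notebook.get("cells", []):
        # Markdown and raw cells are skipped; only code cells are executed.
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        exec(source, namespace)
    return namespace


# A toy notebook (made-up content) with one markdown cell and two code cells.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "source": ["# Toy training run"]},
        {"cell_type": "code", "source": ["learning_rate = 0.1\n", "epochs = 3"]},
        {
            "cell_type": "code",
            "source": "loss = 1.0\nfor _ in range(epochs):\n    loss *= (1 - learning_rate)",
        },
    ],
})

result = run_notebook_cells(example)
print(result["epochs"], round(result["loss"], 3))
```

Because later cells depend on variables defined in earlier ones, running the cells in order with a shared namespace mirrors what happens when you choose "Run All" in a local notebook.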
2. Monitoring experiments
Monitoring experiment status and logs
Here we have three experiments running. We can monitor the status and log output by hovering over the experiment gizmo. Blue means that the experiment is currently running, and green means it has finished. If an exception occurs or the user manually stops the experiment, the gizmo turns red and the cloud server is shut down immediately.
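The color rules above can be summarized as a small lookup. The sketch below is only an illustration of that mapping: the `ExperimentStatus` names and the `gizmo_color` function are hypothetical and are not taken from the extension's code.

```python
from enum import Enum


class ExperimentStatus(Enum):
    """Hypothetical status names, chosen to match the description above."""
    RUNNING = "running"
    FINISHED = "finished"
    FAILED = "failed"    # an exception occurred
    STOPPED = "stopped"  # the user stopped the experiment manually


# Status-to-color mapping as described in the text.
GIZMO_COLORS = {
    ExperimentStatus.RUNNING: "blue",
    ExperimentStatus.FINISHED: "green",
    ExperimentStatus.FAILED: "red",
    ExperimentStatus.STOPPED: "red",
}


def gizmo_color(status: ExperimentStatus) -> str:
    """Return the gizmo color for a given experiment status."""
    return GIZMO_COLORS[status]


for status in ExperimentStatus:
    print(status.value, "->", gizmo_color(status))
```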
3. Experiment results
Downloading finished notebook from the cloud
Once the experiment is finished, we can click the Notebook button and download the resulting notebook into another browser tab. If we like the results, we can continue working on the notebook right away, perhaps make a few changes, and start another experiment in the cloud!
4. Version control
Versioned experiments in Valohai machine learning platform
The experiments are version controlled and fully reproducible in the Valohai machine learning platform. Training data, notebook, Docker image, hyperparameters, cloud server type, cost, username, and notes are all stored to keep your work safe and reproducible for years to come.
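One way to picture what such a versioned experiment contains is as a single record with one field per item listed above. The sketch below is purely illustrative: the `ExperimentRecord` class, its field names, and every value in the example are made up for this tutorial and do not reflect Valohai's actual data model or API.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class ExperimentRecord:
    """Hypothetical record holding the items listed in the text."""
    training_data: str
    notebook: str
    docker_image: str
    hyperparameters: dict = field(default_factory=dict)
    server_type: str = ""
    cost: float = 0.0
    username: str = ""
    notes: str = ""


# All values below are placeholders, not real Valohai identifiers.
record = ExperimentRecord(
    training_data="example-dataset-v1",
    notebook="example.ipynb",
    docker_image="example/image:latest",
    hyperparameters={"learning_rate": 0.1, "epochs": 3},
    server_type="example-gpu-instance",
    cost=1.25,
    username="example-user",
    notes="Baseline run",
)

# Serializing the record shows that everything needed to rerun the
# experiment can be stored and restored as plain data.
serialized = json.dumps(asdict(record), sort_keys=True)
restored = ExperimentRecord(**json.loads(serialized))
print(restored == record)
```

Keeping the record immutable (`frozen=True`) reflects the idea that a finished experiment is a fixed snapshot: a change to any field belongs to a new experiment rather than an edit of an old one.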