

Machine Orchestration, Version Control and Pipeline Management for Deep Learning

Curious about what we've been working on lately?
Check out the latest patch notes (January 21, 2019) or browse all patch notes.

Zero Setup Infrastructure

Train your models in the cloud or on your own server farm with the click of a button, a call to an API or a one-liner on the command line. By using just the right number of processing units, you get results faster and save both time and money.

Scale fast

Valohai supports massive-scale concurrency on top of AWS, Microsoft Azure, Google Cloud Platform and on-premises hardware (e.g. OpenStack). Just click a button and launch your code within a Docker container running on your hardware of choice.

Automatic Version Control

Fulfil regulatory compliance requirements without added work. Valohai automatically keeps track of all your experiments, so you can always answer the question of how a model was trained, from data to parameters and statistics to algorithm.

Pipeline Management

Don’t worry about environments, configurations or shutting down servers when your training is done. Concentrate on trials & mastering your models!


We believe that version control is the only way to achieve reproducibility, regulatory compliance, an audit trail and quick results.

Valohai execution details

Select a deployed model and trace back to its hyperparameters, training data, script version, associated cost, sibling models and the team members involved in training it. Do it today or 10 years from now.


Valohai integrates with any runtime you have and runs any machine learning code you write. Unlike other deep learning tools, Valohai doesn’t tie you down to one vendor (not even itself, as even the configuration format is open source).

Run your TensorFlow, Keras, CNTK, Caffe, Darknet, DL4J, PyTorch, MXNet or anything from bash scripts to C-code in your Docker wrapper of choice. Store your training data and labels in an Azure Blob, AWS S3 bucket or your own FTP server. Access your code in any public or private Git repository and run it on your cloud or on-premises hardware of choice.
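In practice, a step in the open-source valohai.yaml configuration might look roughly like this (the image, script and bucket names are illustrative placeholders, not prescribed values):

```yaml
# Sketch of a single valohai.yaml step; names and paths are placeholders.
- step:
    name: train-model
    image: tensorflow/tensorflow:1.13.1   # any Docker image you choose
    command: python train.py
    inputs:
      - name: training-data
        default: s3://my-bucket/train.csv # fetched into the container at run time
```

The same step definition runs unchanged whether the hardware behind it is a cloud GPU instance or an on-premises machine.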



Everything in Valohai is built API-first, meaning that you can easily integrate your ML pipeline into your existing software pipeline, e.g. through Jenkins or any other Continuous Integration platform.
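For example, a CI job could launch a training run through the REST API. The endpoint path, field names and identifiers below are assumptions for illustration; refer to the API documentation for the real schema:

```python
import json

# Hypothetical sketch of launching a Valohai execution from a CI job.
# The endpoint URL, project ID and token are placeholders, and the payload
# fields are assumptions -- check the API docs for the real request schema.
API_URL = "https://app.valohai.com/api/v0/executions/"

def build_execution_request(project_id: str, commit: str, step: str, token: str):
    """Return the HTTP headers and JSON body for an execution-launch request."""
    headers = {
        "Authorization": f"Token {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"project": project_id, "commit": commit, "step": step})
    return headers, body

# A CI script would POST `body` to API_URL, e.g. with `requests.post`.
headers, body = build_execution_request("my-project-id", "master", "train-model", "MY_TOKEN")
```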


When building deep learning models at scale, you want to use industry best practices from leaders such as Uber, Netflix, Airbnb and Facebook.

Valohai brings to your fingertips the same tools these powerhouses use to manage their internal machine learning pipelines (Uber Michelangelo, Airbnb Bighead, Facebook ML, etc.).

Valohai’s streamlined machine learning pipeline ensures that steps integrate together, regardless of who wrote each step or which language or framework it uses. Generate images with Unity, transform them with custom C code, train with TensorFlow in Python and deploy to a Kubernetes cluster. Everything works!
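A multi-step pipeline of this kind could be sketched in valohai.yaml along these lines (the node, edge and step names are illustrative; consult the current configuration reference for the exact syntax):

```yaml
# Illustrative pipeline sketch: each node runs a previously defined step,
# and edges wire one node's outputs into the next node's inputs.
- pipeline:
    name: train-and-deploy
    nodes:
      - name: preprocess
        type: execution
        step: preprocess-data
      - name: train
        type: execution
        step: train-model
    edges:
      - [preprocess.output.*, train.input.training-data]
```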


Get visual feedback on everything from a single model’s performance to the convergence of several parallel hyperparameter sweeps. See how your parameter sweeps are progressing and compare competing models by accuracy, depth or any custom parameter. Instead of manually launching models and keeping track of CSV files, you’ll see everything in real time as your training runs progress. You can also output custom metrics to stdout and see them graphically in the Valohai web interface.
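A minimal sketch of emitting custom metrics from a training script, assuming metrics are printed as one JSON object per line on stdout (the metric names here are arbitrary examples):

```python
import json

# Sketch: JSON objects printed to stdout become custom metrics that the
# web interface can graph live. Metric names and values are placeholders.
def log_metrics(epoch, accuracy, loss):
    print(json.dumps({"epoch": epoch, "accuracy": accuracy, "loss": loss}))

# placeholder values standing in for real per-epoch training results
for epoch in range(3):
    log_metrics(epoch, accuracy=0.90 + 0.01 * epoch, loss=0.5 / (epoch + 1))
```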


Valohai lets you scale up vertically and horizontally to do distributed learning and parallel hyperparameter sweeps at the speed of light (in an ethernet cable). Run your model in parallel on a hundred GPUs, or tell Valohai to sweep through different hyperparameter combinations in parallel on tens of TPUs to find the best model for your data. Valohai is built for big data and immense models, scaling with you as you grow from data exploration to production.
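Conceptually, a grid sweep is just the cross product of the parameter values, with one execution launched per combination. A local sketch of that enumeration (the parameter names and values are illustrative):

```python
from itertools import product

# Sketch of a hyperparameter grid: a sweep launches one execution per
# combination in parallel; here we only enumerate the combinations locally.
grid = {
    "learning_rate": [0.01, 0.001, 0.0001],
    "batch_size": [32, 64],
}

def expand_grid(grid):
    """Yield one {parameter: value} dict per point in the sweep."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

combos = list(expand_grid(grid))  # 3 learning rates x 2 batch sizes = 6 runs
```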


Give your data science team full transparency into how models have been trained throughout history and what every team member is working on today.

Share projects

Assign team members to projects, tag each other in notes and collaborate more efficiently.

Share results

Automatically share all experiments within a project with the rest of the team. View models in real time as they converge and spin off new training runs.

Share models

Trace back from a deployed model to its training data, hyperparameters, code, environment (software & hardware) and more.

Protect your data

Protect your business and your customers' data by hosting all of your assets in your own cloud environment or on-premises data storage. Valohai supports storage backends ranging from Amazon S3 and Azure Blob Storage to an HTTP endpoint or a directory on your intranet.