Continuous Integration in Automotive Machine Learning Development
Joanna Purosto / November 04, 2019
What is continuous integration?
Continuous Integration (CI) in software development is the practice of testing every change so that a change in one place doesn’t break something else. Continuous Delivery (CD), on the other hand, is an extension of CI in which every change in the code is also deployed. Both have long been core practices of Extreme Programming, i.e. rapid small-batch development, which in turn has been a main contributor to advances in rapid software development.
Machine learning (ML) differs substantially from traditional software development. In ML, we no longer have static code that is tested in a silo; instead, we build our models out of code and data. And we usually build hundreds of models before choosing the best one (a practice known as a hyperparameter sweep).
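To make the idea of a hyperparameter sweep concrete, here is a minimal sketch in Python. It is only an illustration: the function `train_and_score`, the toy scoring formula, and the parameter names `learning_rate` and `depth` are hypothetical stand-ins for a real training run, not part of any particular library or platform.

```python
import itertools


def train_and_score(learning_rate, depth, data):
    """Hypothetical stand-in for a real training run.

    Returns a toy validation score that peaks at learning_rate=0.1, depth=4.
    """
    return -((learning_rate - 0.1) ** 2) - 0.01 * (depth - 4) ** 2 + 0.001 * len(data)


def sweep(grid, data):
    """Train one model per parameter combination; return (best run, all runs)."""
    names = sorted(grid)
    runs = []
    # itertools.product enumerates every combination in the grid.
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(data=data, **params)
        runs.append({"params": params, "score": score})
    best = max(runs, key=lambda run: run["score"])
    return best, runs


# Assumed example grid: 3 x 3 = 9 models built from the same code and data.
grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 8]}
best, runs = sweep(grid, data=[1, 2, 3])
print(len(runs), best["params"])
```

Even this toy grid produces nine models from a single piece of code and a single dataset, which is why keeping track of each run's parameters and results quickly becomes a tooling problem.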
The basic needs and concepts are, however, the same; it is the practical workflows that require a different set of tools. What you want those tools to ensure is rapid development, combining what CI/CD, DevOps, and version control tools provide in software development.
The Valohai ML platform is a CI/CD platform for machine learning. Read an example of how to train a self-driving model on Valohai.
CI in Automotive Machine Learning Development
Although these best practices are self-evident in software development, they have not yet found their way into machine learning development. The use of new machine learning techniques often starts as proof-of-concept (PoC) projects, and companies end up in a situation where they have multiple PoCs on different fronts using different tools and working methods.
One of your teams might be building machine vision models to detect pedestrians with Convolutional Neural Networks using TensorFlow and sharing the results in their Slack channel. Another team might be building a predictive maintenance model to estimate the wear and tear of vehicle parts. This team uses Dropbox to share their results, and different versions of the models are stored locally on each data scientist’s computer. From a manager’s point of view, it is tough to keep track of the state of each project.
For the production phase, automotive companies need to be able to bring every workflow into a single company-wide infrastructure. The vehicle must be able to access every model with ease, and data scientists need to be able to update the models behind the vehicle on the go.
Without proper tooling, the road for a machine learning model from PoC to the vehicle requires heavy lifting from the IT department. The initial setup is not nearly enough, and the whole machine learning infrastructure needs constant updates as technologies and working methodologies evolve. For example, prototyping in Jupyter notebooks and working in shared Git projects with terabytes of data in the cloud both require hardware, CI, and version control, but they are completely different in terms of tooling.
These challenges are in no way unique to the automotive industry, but given the need for rapid advancement while people’s lives are at stake, they are of the utmost importance in automotive.
What is preventing the automotive industry from taking a leap in machine learning adoption?
A study conducted by Capgemini in early 2019 found that, despite the recent advances in machine learning, only 10% of the major automotive players use AI at scale.
Many companies in the automotive sector, large and small, are experimenting with new ways to approach machine learning. But problems arise when the AI engines are put into production. The major reason for slow adoption is the infrastructure around machine learning development. Because of the automotive industry’s need for rapid development and auditability, things like machine orchestration and version control need to be automated. This is where machine learning tools, like the Valohai platform, come into play.
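As a rough illustration of what automated version control for auditability can mean in practice, the Python sketch below derives a reproducible fingerprint for a training run from its code version, its data, and its parameters. This is a minimal sketch under stated assumptions, not how Valohai or any specific platform implements it; the function name `run_fingerprint`, the record fields, and the example values are all hypothetical.

```python
import hashlib
import json


def run_fingerprint(code_version, data_bytes, params):
    """Return (fingerprint, record) identifying one training run.

    Hypothetical helper: the record layout is an assumption for illustration.
    """
    # Hash the raw training data so any change to it changes the fingerprint.
    data_hash = hashlib.sha256(data_bytes).hexdigest()
    record = {
        "code_version": code_version,  # e.g. a Git commit hash
        "data_sha256": data_hash,
        "params": params,
    }
    # sort_keys makes the serialization independent of dict insertion order.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest(), record


# Assumed example values, invented purely for illustration.
fingerprint, record = run_fingerprint(
    code_version="abc1234",
    data_bytes=b"frame_001,pedestrian\nframe_002,none\n",
    params={"learning_rate": 0.1, "epochs": 5},
)
print(fingerprint)
```

Storing such a record alongside every trained model lets an auditor later confirm exactly which code, data, and parameters produced it, without relying on copies kept on individual computers.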
Read the Automotive Machine Learning in Production eBook to learn how the Valohai Machine Learning platform helps your team build models 10X faster.