MLOps for Earth Observation
From mapping land use to monitoring global supply chains, opportunities to apply machine learning are plentiful, but continuously developing and serving machine learning models remains a barrier for many. MLOps seeks to address the operationalization of ML with a set of tools and best practices.
Earth observation poses unique MLOps challenges.
Here are a few we’ve identified with our partners.
Data sets are GIGANTIC.
When dealing with orbital data, your local machine will grind to a halt very quickly. To train machine learning models with satellite imagery you’ll need access to powerful cloud compute, and preferably you’ll want to have the data as close to the computation as possible.
ML Pipelines get complicated.
Creating a training data set often involves many preprocessing steps, such as building cloud-free composite imagery. You’ll want to build a pipeline that automates every step, no matter what programming languages are used.
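To make the compositing step concrete, here is a minimal sketch of one common approach: taking the per-pixel median over the observations that are not flagged as cloudy. The function name `cloud_free_composite`, the nested-list image layout and the boolean cloud masks are assumptions made for illustration; a production pipeline would typically work on raster arrays and use its own cloud-detection step.

```python
from statistics import median
from typing import List, Optional

Image = List[List[float]]   # one band, indexed as image[row][col]
Mask = List[List[bool]]     # True where the pixel is flagged as cloudy


def cloud_free_composite(
    images: List[Image], cloud_masks: List[Mask]
) -> List[List[Optional[float]]]:
    """Per-pixel median over all observations that are not cloudy.

    Pixels that are cloudy in every scene are returned as None.
    Hypothetical helper for illustration only.
    """
    if not images or len(images) != len(cloud_masks):
        raise ValueError("need one cloud mask per image, and at least one image")

    rows, cols = len(images[0]), len(images[0][0])
    composite: List[List[Optional[float]]] = []
    for r in range(rows):
        out_row: List[Optional[float]] = []
        for c in range(cols):
            # Keep only the clear-sky observations of this pixel.
            clear = [
                img[r][c]
                for img, mask in zip(images, cloud_masks)
                if not mask[r][c]
            ]
            out_row.append(median(clear) if clear else None)
        composite.append(out_row)
    return composite
```

A step like this would usually be one stage of the pipeline, with its output stored and passed to the training step rather than recomputed by hand.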
Evaluation requires rigor.
In Earth observation, a key question is whether your model works in general or only on your specific training data. You’ll often want to evaluate your model not just against your training data but also against benchmark data sets that are, for example, more geographically diverse.
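As a rough sketch of what this kind of evaluation can look like, the snippet below scores the same model on several named data sets and reports how far each one falls behind a reference set. The names `evaluate_across_datasets` and `generalization_gap`, the data set labels and the toy threshold classifier are all hypothetical, and plain accuracy stands in for whatever metric suits your task.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]          # (features, label)
Predictor = Callable[[Sequence[float]], int]  # features -> predicted label


def accuracy(predict: Predictor, samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if not samples:
        raise ValueError("cannot evaluate on an empty data set")
    correct = sum(1 for features, label in samples if predict(features) == label)
    return correct / len(samples)


def evaluate_across_datasets(
    predict: Predictor, datasets: Dict[str, List[Sample]]
) -> Dict[str, float]:
    """Score the same model on every named data set."""
    return {name: accuracy(predict, samples) for name, samples in datasets.items()}


def generalization_gap(scores: Dict[str, float], reference: str) -> Dict[str, float]:
    """How much worse each data set scores than the reference data set."""
    return {
        name: scores[reference] - score
        for name, score in scores.items()
        if name != reference
    }


# Toy model and toy data, purely for illustration.
def toy_predict(features: Sequence[float]) -> int:
    return int(features[0] > 0.5)


datasets = {
    "training_region_holdout": [([0.9], 1), ([0.1], 0), ([0.8], 1), ([0.2], 0)],
    "geodiverse_benchmark": [([0.6], 0), ([0.4], 1), ([0.9], 1), ([0.1], 0)],
}

scores = evaluate_across_datasets(toy_predict, datasets)
gaps = generalization_gap(scores, reference="training_region_holdout")
```

A large gap between the holdout from the training region and the more diverse benchmark suggests the model has learned region-specific features rather than the task itself.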
Versioning is key.
Data is always changing, so versioning is critical to understanding why a model stops performing as expected. Whether the cause is outdated labels or a change in source quality, you’ll want to be able to pinpoint when and how a model was trained.
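One simple way to picture this is a training record that hashes every input and combines the hashes with the code version and parameters into a single fingerprint. The sketch below is an illustration under assumptions, not a description of how any particular platform stores its versions; the function `training_record` and its field names are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, Mapping


def sha256_hex(payload: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(payload).hexdigest()


def training_record(
    code_version: str,
    data_blobs: Mapping[str, bytes],
    params: Mapping[str, Any],
) -> Dict[str, Any]:
    """Describe a training run by its code version, data hashes and parameters.

    The fingerprint changes whenever any of the three inputs changes, so two
    runs with the same fingerprint were trained from the same ingredients.
    """
    data_hashes = {name: sha256_hex(blob) for name, blob in data_blobs.items()}
    record: Dict[str, Any] = {
        "code_version": code_version,
        "data": data_hashes,
        "params": dict(params),
    }
    # Serialize with sorted keys so the fingerprint does not depend on
    # the order in which files or parameters were supplied.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    record["fingerprint"] = sha256_hex(canonical.encode("utf-8"))
    return record
```

Comparing the records of a well-performing run and a degraded one shows which ingredient changed, for example a label file whose hash no longer matches.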
The MLOps platform for deep learning and unstructured data sets.
Valohai is an MLOps platform for teams that work with the most demanding machine learning problems. The platform makes it easy to build pipelines that automate everything from data extraction to model deployment.
Valohai manages cloud infrastructure so a data scientist can run any kind of cloud instance with a single click. Our platform can also be installed on any private cloud or on-premise setup.
Valohai is technology agnostic, so data science teams can integrate preprocessing, training, evaluation and deployment steps into one cohesive pipeline rather than running them separately.
Valohai offers powerful capabilities to build pipelines that cover many training runs with different data sets. Pipelines can also be dynamic, for example to evaluate a new model against previously trained models.
Valohai automatically versions every model, along with the code, data and parameters used to train it, to ensure full reproducibility.
Improving smart-forestry through machine learning
OBJECT DETECTION, AFRICA