Using DVC to version control your ML experiment data

In this blog post, we explore how to use DVC for data version control and how to automate that versioning, with or without DVC, inside the Valohai platform.
Machine Learning in the cloud vs on-premises

It's a running joke among developers that the cloud is just a word for somebody else's computer. But the fact remains that by leveraging the cloud you can reap benefits that you couldn't achieve with your on-premises server farm.
Three ways to categorize machine learning platforms

Machine learning (ML) platforms take many forms and usually solve only one or a few parts of the ML problem space. So how do you make sense of the different platforms that all call themselves ML platforms?
Machine learning is a zero-sum game

Only the companies that invest in machine learning today will exist 10 years from now. The ones that watch from the sidelines will be eaten by their competition.
Building a data catalog for machine learning

They say data is the new gold. But without a data catalog, your data is just scattered around like random nuggets of gold in a desert full of rocks, pebbles and sand. Data catalogs help you keep track of the data you have but also, in the case of machine learning models, what data has affected which model. Data brings meaning to machine learning because unlike software, machine learning models are 90% data and 10% code.
Automatic Data Provenance for Your ML Pipeline

We all understand the importance of reproducibility of machine learning experiments. And we all understand that the basis for reproducibility is tracking every experiment, either manually in a spreadsheet or automatically through a platform such as Valohai. When you can't track what you've done, it's impossible to remember what you did last week, not to mention last year. This complexity is further multiplied with every new team member that joins your company.
Patenting Artificial Intelligence – What's It Really About?

Software patents raised a lot of hackles twenty years ago, mainly because governments are slow to react to change while software evolves rapidly, so patents live on for too long compared with hardware. In this blog post, let's take a look at how AI patents are similar to and different from software patents, and what challenges arise in AI patenting.
Machine Learning Infrastructure Lessons from Netflix

Ville Tuulos, machine learning infrastructure architect, was the first to publicly dissect Netflix's machine learning infrastructure at QCon in November 2018 in San Francisco. If you haven't seen the talk yet, read the summary of his talk here! All the pictures used here are from Ville's presentation.
Multi-Cloud Data & Infrastructure Solution for Machine Learning

SwiftStack and Valohai, in joint partnership, announce the world's first peta-scale ML solution that covers everything from computation to data management in a multi-cloud environment. The solution provides a global namespace removing silos and enabling universal access to all your data in all your machine learning use-cases. It has built-in support for Azure, Google Cloud, AWS and SwiftStack.
Michelangelo – Machine Learning Infrastructure at Uber

When we founded Valohai two years ago, we were lucky to make friends with team leads for Uber's Michelangelo machine learning platform. Michelangelo has been an inspiration in building Valohai for the other 99.999…% of companies that aren't Uber but still need to speed up their machine learning through automation.
Kubeflow as Your Machine Learning Infrastructure

By now you've surely heard about Kubeflow, the machine learning platform based out of Google. Kubeflow basically connects TensorFlow's ML model building with Kubernetes' scalable infrastructure (thus the name Kube and Flow) so that you can concentrate on building your predictive model logic, without having to worry about the underlying infrastructure. At least in theory.
Speeding up Deep Learning with PowerAI

Lately we've been playing around with IBM PowerAI to ensure our customers can leverage it in large-scale on-premises training. PowerAI is IBM's solution for deep learning, consisting of software and hardware to help you quickly train deep learning models. Today we're happy to announce that Valohai fully supports PowerAI and our customers can start using it!