Using DVC to version control your ML experiment data
In this blog post we will explore how you can use DVC for your data version control and how you can automate your data version control with and without DVC inside the Valohai platform.
Machine Learning in the cloud vs on-premises
It’s a running joke among developers that the cloud is just a word for somebody else’s computer. But the fact remains that by leveraging the cloud you can reap benefits that you couldn’t achieve with your on-premises server farm.
Three ways to categorize machine learning platforms
Machine learning (ML) platforms take many forms and usually solve only one or a few parts of the ML problem space. So how do you make sense of the different platforms that all call themselves ML platforms?
Machine learning is a zero-sum game
Only the companies that invest in machine learning today will exist 10 years from now. The ones that look on from the sidelines will be eaten by their competition.
Building a data catalog for machine learning
They say data is the new gold. But without a data catalog, your data is just scattered around like random nuggets of gold in a desert full of rocks, pebbles and sand. Data catalogs help you keep track not only of the data you have but also, in the case of machine learning models, of which data has affected which model. Data brings meaning to machine learning because unlike software, machine learning models are 90% data and 10% code.
Automatic Data Provenance for Your ML Pipeline
We all understand the importance of reproducibility of machine learning experiments. And we all understand that the basis for reproducibility is tracking every experiment, either manually in a spreadsheet or automatically through a platform such as Valohai. When you can’t track what you’ve done, it’s impossible to remember what you did last week, not to mention last year. This complexity is further multiplied with every new team member who joins your company.
Patenting Artificial Intelligence – What's It Really About?
Software patents raised a lot of hairs twenty years ago, mainly because while governments are slow to react to change, software evolves rapidly, and patents thus live on for too long in comparison to hardware. In this blog post, let’s take a look at how AI patents are similar to and different from software patents and what challenges can be seen in AI patenting.
Machine Learning Infrastructure Lessons from Netflix
Ville Tuulos, machine learning infrastructure architect, was the first to publicly dissect Netflix’s Machine Learning infrastructure at QCon in November 2018 in San Francisco. If you haven’t seen the talk yet, read the summary of his talk here! All the pictures used here are from Ville's presentation.
Multi-Cloud Data & Infrastructure Solution for Machine Learning
SwiftStack and Valohai, in joint partnership, announce the world’s first peta-scale ML solution that covers everything from computation to data management in a multi-cloud environment. The solution provides a global namespace removing silos and enabling universal access to all your data in all your machine learning use-cases. It has built-in support for Azure, Google Cloud, AWS and SwiftStack.
Michelangelo – Machine Learning Infrastructure at Uber
When we founded Valohai two years ago, we were lucky to make friends with team leads for Uber’s Michelangelo machine learning platform. Michelangelo has been an inspiration in building Valohai for the other 99.999…% of companies that aren’t Uber but still need to speed up their machine learning through automation.
Kubeflow as Your Machine Learning Infrastructure
By now you’ve surely heard about Kubeflow, the machine learning platform based out of Google. Kubeflow basically connects TensorFlow’s ML model building with Kubernetes’ scalable infrastructure (thus the name Kube and Flow) so that you can concentrate on building your predictive model logic, without having to worry about the underlying infrastructure. At least in theory.
Speeding up Deep Learning with PowerAI
Lately we’ve been playing around with IBM PowerAI in order to ensure our customers can leverage it in large-scale on-premises training. PowerAI itself is IBM’s solution for deep learning, consisting of software and hardware to help you quickly train deep learning models. Today we’re happy to announce that Valohai fully supports PowerAI and our customers can start using it!