What every data scientist should know about the command line
Juha Kiili
Almost any programming language in the world is more powerful than the command line. Why would you even bother doing anything on it? Don't be fooled: the modern command line is rocking like never before!
Experimentation at Scale: a Q&A with Serg Masís from Syngenta
Juha Kiili
Syngenta is a leading provider of agricultural science and technology focused on seed and crop protection products aiming to improve global food security by enabling millions of farmers to make better use of available resources.
Docker for Data Science: What every data scientist should know about Docker
Juha Kiili
Docker isolates the software from all other things on the same system. A program running inside a "spacesuit" generally has no idea it is wearing one and is unaffected by anything happening outside.
What every data scientist should know about Python dependencies
Juha Kiili
Dependency management is the act of managing all the external pieces that your project relies on. It has the risk profile of a sewage system. When it works, you don't even know it's there, but when it fails, it becomes very painful and almost impossible to ignore.
Git for Data Science: What every data scientist should know about Git
Juha Kiili
Git is a tool most software developers have used daily for a decade, and with data scientists becoming an integral part of R&D teams, Git is becoming an everyday tool for them as well. We've listed a few helpful tips on using Git for your ML work and avoiding the common pitfalls.
Product Update: Human Validation and Confusion Matrices
Juha Kiili
We’ve recently introduced two features that make building trusted and validated models easier: human validation steps and confusion matrices.
Product Update: Spark as a First-Class Citizen
Juha Kiili
Support for Spark has been one of the most requested features as Spark has become almost ubiquitous for data scientists and engineers working with structured data. We’ve heard the calls and Valohai now supports Spark natively.
Building a YOLOv3 pipeline with Valohai and Superb AI
Juha Kiili
This article shows an example of a pipeline that integrates Valohai and Superb AI to train a computer vision model using pre-trained weights and transfer learning. For the model, we are using YOLOv3, which is built for real-time object detection.
Product Update: Kubernetes, Spot Instances & Python Utility Library
Juha Kiili
It's time for an update on what's been happening under the hood of the Valohai platform. We'd like to highlight three major features we've added in the past two months: support for Kubernetes, support for Spot instances, and the Valohai Python utility library.
Superb Meets Valohai: An End-to-End Solution for Developing Computer Vision Applications
Juha Kiili
Computer vision is one of the most disruptive technologies of the recent decade. Developing computer vision systems requires massive upfront investments. Or it used to, before Superb met Valohai.
How We Trained 277M Models for the Black-Box Optimization Challenge
Juha Kiili
The Valohai MLOps platform provided the infrastructure for the Black-Box Optimization Challenge at the NeurIPS 2020 conference. The competition was organized together with Twitter, Facebook, SigOpt, ChaLearn, and 4paradigm.
Updates for Valohai Powered Notebooks
Juha Kiili
Valohai is the enterprise-grade machine learning platform for data scientists who build custom models by hand. In addition to supporting code written in classic IDEs like PyCharm or VSCode, we also have native support for data scientists who prefer Jupyter notebooks.
Self-Driving with Valohai
Juha Kiili
One of the hottest areas of application for deep learning is undoubtedly self-driving cars. We’ll go through the problem space, discuss its intricacies and build a self-driving solution utilizing the Unity game engine, training a neural network on top of the Valohai platform. Regardless of the technologies used, you’ll get an understanding of the basics as well as the code to tweak for yourself.
Valohai's Jupyter Notebook Extension
Juha Kiili
Valohai is a deep learning platform that helps you execute on-demand experiments in the cloud with full version control. Jupyter Notebook is a popular IDE for the data scientist. It is especially suited for early data exploration and prototyping.
Asynchronous Workflows in Data Science
Juha Kiili
Pointlessly staring at live logs and waiting for a miracle to happen is a huge time sink for data scientists everywhere. Instead, one should strive for an asynchronous workflow. In this article, we define asynchronous workflows, identify some of the obstacles, and finally point you to the next article, which shows a real-life example in action in Jupyter Notebooks.
From Zero to Hero with Valohai CLI, Part 2
Juha Kiili
Valohai executions can be triggered directly from the CLI and let you roll up your sleeves and fine-tune your options a bit more hands-on than our web-based UI. In part one, I showed you how to install and get started with Valohai’s command-line interface (CLI). Now, it’s time to take a deeper dive and power up with features that’ll take your daily productivity to new heights.
From Zero to Hero with Valohai CLI, Part 1
Juha Kiili
As new Valohai users get acquainted with the platform, many fall in love with our web-based UI - and for good reason. It’s responsive, intuitive and gets the job done with just a few clicks. But don’t be fooled into thinking that’s the end of the interface conversation. We know it takes different [key]strokes for different folks, so Valohai also includes a command-line interface (CLI) and a REST API.
TensorBoard + Valohai Tutorial
Juha Kiili
One of the core design paradigms of Valohai is technology agnosticism. Building on top of the file system and, in our case, Docker means that we support running very different kinds of applications, scripts, languages and frameworks on top of Valohai. Because of these common abstractions, most systems are Valohai-ready. The same is true for TensorBoard.
Automatic Version Control Meets Jupyter Notebooks
Juha Kiili
Running a local notebook is great for early data exploration and model tinkering; there’s no doubt about it. But eventually you’ll outgrow it and want to scale up and train the model in the cloud with easy parallel executions, full version control and robust deployment, letting you reproduce your experiments and share them with team members at any time.
Reinforcement Learning Tutorial Part 3: Basic Deep Q-Learning
Juha Kiili
In this third part, we will move our Q-learning approach from a Q-table to a deep neural net.
Reinforcement Learning Tutorial Part 2: Cloud Q-learning
Juha Kiili
This second part takes these examples, turns them into Python code and trains them in the cloud, using the Valohai deep learning management platform.
Reinforcement Learning Tutorial Part 1: Q-Learning
Juha Kiili
This is the first part of a tutorial series about reinforcement learning. We will start with some theory and then move on to more practical things in the next part. During this series, you will learn not only how to train your model, but also what the best workflow is for training it in the cloud with full version control using the Valohai deep learning management platform.
PocketFlow with Valohai
Juha Kiili
PocketFlow is an open-source framework from Tencent for automatically compressing and optimizing deep learning models. Edge devices such as mobile phones or IoT devices can be especially limited in computing resources, so sacrificing a bit of model performance for a much smaller memory footprint and lower computational requirements is a smart tradeoff.