Stop paying for the compute resources that you’re not using anymore

by Alexander Rozhkov | on July 01, 2024

What are the main cost drivers behind most machine-learning initiatives? We’d all be happy if we could name just a few. But sadly, the list goes on and on. And while some cost drivers steal the spotlight as the usual suspects, underutilized compute resources often fly under the radar.

What’s worse, a machine’s peak usage can shift over time as inputs and other variables change. So, even if you carefully match the size of your ML workloads to the maximum capacity of your machines, resource underutilization can go unnoticed until it’s too late (for example, when your CFO confronts you about those large cloud bills).

If you ever want to break the ice with a machine learning engineer, ask them about the most expensive computation they have accidentally left running.

At this point, it’s tempting to ask: What if there were a way to identify underutilized machines? After all, you could then allocate them to other tasks and save tons of cash as a result.

Well, that’s exactly where we put our time and effort when developing one of our newest features in the Valohai MLOps platform.

*drum roll*

We’re excited to announce our new notification system. Its main goal is to alert all Valohai users when their ML workloads underutilize compute resources.

Here’s how it works:

  • The Valohai MLOps platform monitors the CPU, GPU, and memory usage of your machines at all times.
  • If any of your machines operates below 50% capacity, you'll receive an alert identifying exactly which machines are affected.
  • You can configure future alerts to your liking (e.g. in-app or via email or Slack).
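
To make the idea concrete, here is a minimal Python sketch of the kind of check described above. It is not Valohai's implementation: the `UsageSample` type, the function names, the machine names, and the sample numbers are all hypothetical, and it assumes a machine counts as underutilized only when its peak CPU, GPU, and memory usage all stay below the 50% threshold.

```python
from dataclasses import dataclass

THRESHOLD = 0.5  # the 50% figure mentioned in the post


@dataclass
class UsageSample:
    """One hypothetical utilization reading, as fractions from 0.0 to 1.0."""
    cpu: float
    gpu: float
    memory: float


def peak_usage(samples):
    """Return the highest observed value for each resource."""
    return {
        "cpu": max(s.cpu for s in samples),
        "gpu": max(s.gpu for s in samples),
        "memory": max(s.memory for s in samples),
    }


def underutilized_machines(usage_by_machine, threshold=THRESHOLD):
    """Map each flagged machine to its peaks.

    Assumption: a machine is flagged only if every resource peaks
    below the threshold. The real alerting rule may differ.
    """
    flagged = {}
    for machine, samples in usage_by_machine.items():
        if not samples:
            continue  # no readings yet, nothing to judge
        peaks = peak_usage(samples)
        if all(value < threshold for value in peaks.values()):
            flagged[machine] = peaks
    return flagged


# Made-up readings for two hypothetical machines.
usage = {
    "gpu-worker-1": [UsageSample(0.20, 0.35, 0.40), UsageSample(0.25, 0.30, 0.45)],
    "gpu-worker-2": [UsageSample(0.60, 0.95, 0.70), UsageSample(0.55, 0.90, 0.65)],
}

for machine, peaks in underutilized_machines(usage).items():
    print(f"ALERT: {machine} is underutilized, peaks: {peaks}")
```

In this made-up data, only `gpu-worker-1` is flagged, since none of its readings ever reaches half of the machine's capacity. Judging by peaks rather than averages is a deliberate choice in the sketch: a machine that briefly needs its full capacity is sized correctly even if it idles most of the time.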

The view of underutilization alerts, peak use per machine, and other details in the Valohai MLOps platform.

The key benefit of this feature is that it allows you to detect underutilized compute resources and allocate them to other tasks. We designed this feature to help you avoid hidden and unnecessary costs and optimize your machine-learning operations.

This notification system is just one of many upcoming features that reflect our commitment to helping ML Pioneers maximize resource utilization and optimize operational efficiency. We’re looking forward to announcing these new features over the coming weeks.

As of today, this notification system is available to all Valohai users in the latest release. If you have a self-hosted installation that is air-gapped inside your network, please contact our Customer Success team.

If you’re not a Valohai user yet, you can get started by booking a meeting with us below.

Start your Valohai trial: Try out the MLOps platform for 14 days