TensorBoard + Valohai Tutorial


Juha Kiili

One of Valohai's core design principles is technology agnosticism. Because Valohai builds on two common abstractions, the file system and Docker, it can run very different kinds of applications, scripts, languages and frameworks. Most tools are therefore Valohai-ready as they are, and TensorBoard is no exception.


What is TensorBoard?

TensorBoard is a debugging and visualization tool for TensorFlow models. Any TensorFlow model can be made to serialize its graph and execution summaries into files called “event files” or “event log files”. Opening these files in TensorBoard gives you insights and debugging information about the model and its training run.

[Figure: Training insights in TensorBoard]

TensorBoard is actually a tiny web server that serves the visualizations to your browser. It is installed together with TensorFlow, so you most likely already have it. If you don't, follow the TensorFlow installation instructions.

To start TensorBoard, you need to pass it the path to your event log files. It also scans the subdirectories of that path for log files. A best practice is to store the logs of each execution in a separate subfolder.

For example, the command below starts TensorBoard on /tmp/logs. Once it is running, open the URL that it prints out:

tensorboard --logdir=/tmp/logs

Local example of visualizing TensorFlow model with TensorBoard

A TensorFlow model doesn’t automatically output event files. It needs to be slightly modified to do that. Here’s a simple example writing out a graph and a summary for a single variable for 10 iterations.
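A minimal sketch of such a script, assuming TensorFlow 2.x and its tf.summary API. The logs/local directory, the double function and the "value" tag are illustrative choices, not names taken from the original example.

```python
import os

import tensorflow as tf

# Assumption: each run writes to its own subfolder under logs/.
LOG_DIR = os.path.join("logs", "local")
writer = tf.summary.create_file_writer(LOG_DIR)


@tf.function
def double(x):
    # A trivial traced function, so that there is a graph to export.
    return x * 2.0


with writer.as_default():
    # Record the graph of the traced function.
    tf.summary.trace_on(graph=True)
    double(tf.constant(1.0))
    tf.summary.trace_export(name="double_graph", step=0)

    # Write one scalar summary per iteration, for 10 iterations.
    for step in range(10):
        value = double(tf.constant(float(step)))
        tf.summary.scalar("value", value, step=step)

# Make sure everything is on disk before the script exits.
writer.flush()
```

Saved as, say, train.py, the script leaves an event file under logs/local that TensorBoard can pick up.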

Let’s run this with Python and then open the result in TensorBoard:


tensorboard --logdir=logs

Open the URL provided by the TensorBoard command in your browser and you’ll see this:

[Figure: TensorBoard Tutorial training logs]

Valohai CLI

Valohai is a deep learning management platform that automatically stores every detail of your experiments, such as hyperparameters, code and models. Combining Valohai’s automatic version control with TensorBoard’s powerful visualization will make you a machine learning superstar.

Before we can integrate TensorBoard with Valohai, you need to install the Valohai command-line tools (CLI), log in from the terminal and create a new Valohai project.

pip install valohai-cli

vh login

vh project create --name=tensorboard-example --link

If you don't have a Valohai account yet, create one on the Valohai website before logging in.

TensorBoard with Valohai

Time to take our original example code and add some boilerplate to integrate it with Valohai.

In a classic Valohai execution, all files in /valohai/outputs are stored once the execution has finished. Here we want something more interactive, since one often wants to use TensorBoard while the execution is still running. The trick is that as soon as a file is marked read-only, Valohai assumes it will not change anymore and uploads it immediately instead of waiting for the execution to finish.
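The read-only trick can be sketched with the Python standard library alone. The helper name make_read_only is hypothetical, and the sketch assumes the file has already been fully written and closed before it is marked.

```python
import os
import stat


def make_read_only(path):
    """Remove every write permission bit from the file at `path`."""
    current_mode = os.stat(path).st_mode
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    os.chmod(path, current_mode & ~write_bits)


if __name__ == "__main__":
    import tempfile

    # Demonstrate on a throwaway file instead of a real event file.
    fd, demo_path = tempfile.mkstemp()
    os.close(fd)
    make_read_only(demo_path)
    print(oct(stat.S_IMODE(os.stat(demo_path).st_mode)))
    # Restore write access so the demo file can be removed on any platform.
    os.chmod(demo_path, stat.S_IRUSR | stat.S_IWUSR)
    os.remove(demo_path)
```

In a training script, such a helper would be called on an output file under /valohai/outputs once nothing more will be written to it.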

Another requirement for a real-time TensorBoard with Valohai is to run the execution with the --sync flag. With it, the CLI keeps checking for new outputs and downloads them to the local machine as they appear. Note that you can also sync with any running execution using the vh execution outputs command.

Lastly, it is important to get logs of each execution into a separate subfolder, regardless of whether they are executed locally or in the cloud. This example code will cover both cases nicely.
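One way to pick the log folder is sketched below. The helper choose_log_dir and the local- prefix are hypothetical, and detecting Valohai by checking whether /valohai/outputs exists is an assumption made for illustration. On Valohai the logs go straight to the outputs directory, because the synced files of each execution land in their own local folder anyway. A local run gets a timestamped subfolder instead.

```python
import os
import time


def choose_log_dir(valohai_outputs="/valohai/outputs", local_root="logs"):
    """Return a log directory that is unique to the current execution."""
    # Assumption: the outputs directory only exists inside a Valohai execution.
    if os.path.isdir(valohai_outputs):
        return valohai_outputs
    # Local run: a timestamped subfolder keeps separate runs apart.
    return os.path.join(local_root, time.strftime("local-%Y%m%d-%H%M%S"))


if __name__ == "__main__":
    log_dir = choose_log_dir()
    os.makedirs(log_dir, exist_ok=True)
    print("Writing event files to", log_dir)
```

The returned path would then be passed to whatever writes the event files.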

Now let’s first run our example locally, then on Valohai and finally start TensorBoard to see the logs of both:


vh execution run -a train --sync=logs/{counter}

tensorboard --logdir=logs

Our end result will look like this:

[Figure: End result of Valohai + TensorBoard tutorial training]

That’s it! All the code in this example is also available in the GitHub repository, which you can link straight to your Valohai project for easy testing.
