Tracking Experiments: Vertex AI Experiments and Kubeflow

From messy notebooks to organized experiments. Learn how to use Vertex AI Experiments to log parameters and metrics, and how Kubeflow Pipelines can automate your experimentation process.

From Ad-Hoc to Organized

Running experiments in a notebook is fast, but it's not reproducible. How do you remember what you tried last week? How do you compare the results of 50 different runs? This is where Vertex AI Experiments and Kubeflow Pipelines come in.


1. Vertex AI Experiments: Your Lab Notebook

Vertex AI Experiments provides a centralized place to track your experiments. You can log:

  • Parameters: The inputs to your experiment (e.g., learning rate, dropout rate).
  • Metrics: The outputs of your experiment (e.g., accuracy, loss).
  • Artifacts: The "things" created by your experiment (e.g., datasets, models).

This allows you to easily compare different runs and see what's working.

Example: Logging with the Vertex AI SDK

from google.cloud import aiplatform

# Initialize the SDK and set the active experiment
# ("my-project" and "us-central1" are placeholders)
aiplatform.init(
    project="my-project",
    location="us-central1",
    experiment="my-experiment",
)

# Start a new run within the experiment
aiplatform.start_run("my-run")

# Log parameters: the inputs to this run
aiplatform.log_params({"learning_rate": 0.01, "dropout": 0.2})

# ... train your model ...

# Log metrics: the outputs of this run
aiplatform.log_metrics({"accuracy": 0.95, "loss": 0.1})

# Artifacts such as trained models are tracked through Vertex ML Metadata;
# for supported frameworks the SDK provides helpers such as aiplatform.log_model

# Close the run so it is marked complete
aiplatform.end_run()
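
Example: Comparing Runs

Once runs are logged, you can pull them back for side-by-side comparison. A minimal sketch, assuming the experiment above: aiplatform.get_experiment_df returns one row per run, with columns following the SDK's param./metric. prefix convention.

# Fetch every run in the experiment as a pandas DataFrame
df = aiplatform.get_experiment_df("my-experiment")

# Rank runs by accuracy to see which hyperparameters worked best
print(df.sort_values("metric.accuracy", ascending=False).head())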

2. Kubeflow Pipelines (KFP): Automating Experiments

Manually running experiments is tedious. Kubeflow Pipelines lets you define your ML workflow as a graph of components and run it as a pipeline, which makes it a natural fit for automating experiments.

KFP for Experimentation

You can create a pipeline that takes a set of hyperparameters as input, trains a model, and then logs the results to Vertex AI Experiments. This allows you to run many experiments in parallel and automatically track the results.

Example: KFP Component

from kfp import dsl  # KFP v2 SDK (the older kfp.v2 namespace is deprecated)

@dsl.component
def train_model(
    learning_rate: float,
    dropout: float,
    metrics: dsl.Output[dsl.Metrics],
):
    # ... train your model using learning_rate and dropout ...

    # Log metrics to the component's output Metrics artifact;
    # Vertex AI Pipelines surfaces these in the run UI
    metrics.log_metric("accuracy", 0.95)

You can then create a pipeline that calls this component with different hyperparameters.
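
Example: Hyperparameter Sweep Pipeline

As a sketch of what that can look like (assuming the train_model component above; the pipeline name and output path are placeholders): a plain Python loop inside the pipeline function fans out into independent tasks, which Vertex AI Pipelines executes in parallel.

from kfp import compiler, dsl

@dsl.pipeline(name="hparam-sweep")
def sweep_pipeline():
    # Static fan-out: one training task per learning rate; the tasks
    # have no dependencies on each other, so they run in parallel
    for lr in [0.001, 0.01, 0.1]:
        train_model(learning_rate=lr, dropout=0.2)

# Compile to a pipeline spec that Vertex AI Pipelines can execute
compiler.Compiler().compile(
    pipeline_func=sweep_pipeline,
    package_path="sweep_pipeline.yaml",
)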


3. When to Use What

  • Vertex AI Experiments: Use for all your experiments, whether you're running them manually in a notebook or automatically with a pipeline.
  • Kubeflow Pipelines: Use when you want to automate your experiments and run many of them in parallel (see the launch sketch below).
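
Example: Launching the Pipeline on Vertex AI

To launch the compiled sweep on Vertex AI Pipelines and tie its runs back to an experiment, something like the following works. This is a sketch: "my-project", "us-central1", and the template path are placeholders, and passing experiment to submit requires a reasonably recent google-cloud-aiplatform release.

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders

# Submit the compiled pipeline; associating it with an experiment
# makes each pipeline run show up alongside your notebook runs
job = aiplatform.PipelineJob(
    display_name="hparam-sweep",
    template_path="sweep_pipeline.yaml",
)
job.submit(experiment="my-experiment")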

Knowledge Check

You are running a series of experiments to find the best hyperparameters for your model. You want to be able to easily compare the results of different runs and see which hyperparameters produced the best results. What should you use?
