The "Why" and "How" of Your Model

In a production ML system, it's not enough to just have a model. You also need to know where it came from. What data was it trained on? What code was used to train it? What were its evaluation metrics?

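Vertex AI ML Metadata helps you answer these questions by recording the metadata and lineage of your ML workflows. Runs in Vertex AI Pipelines are tracked automatically; other workflows can log their metadata through the SDK or API.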

1. Lineage Tracking

Lineage tracking records the relationships between the artifacts (such as datasets, models, and metrics) and the executions (workflow steps such as training or evaluation) in your ML workflow. This lets you trace the full history of a model, from the raw data to the final deployed model.

For example, a lineage graph might show that:

A specific dataset was used to train a specific model.
A specific model was used to generate a specific set of predictions.
A specific set of predictions was used to generate a specific set of evaluation metrics.
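
The relationships above can be modelled as a small graph. The following is a minimal sketch in plain Python, not the Vertex AI API: the `Event` record, the `upstream` helper, and all artifact and execution names are hypothetical, chosen only to mirror the example. Each event links one artifact to one execution as either an input or an output, and walking those links backwards recovers an artifact's history.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical record linking an artifact to an execution."""
    artifact: str
    execution: str
    kind: str  # "INPUT" or "OUTPUT"


# Illustrative lineage for the example above (all names are made up).
EVENTS = [
    Event("dataset", "train", "INPUT"),
    Event("model", "train", "OUTPUT"),
    Event("model", "batch_predict", "INPUT"),
    Event("predictions", "batch_predict", "OUTPUT"),
    Event("predictions", "evaluate", "INPUT"),
    Event("metrics", "evaluate", "OUTPUT"),
]


def upstream(artifact, events):
    """Return every artifact the given artifact was derived from, nearest first."""
    ancestors = []
    frontier = [artifact]
    while frontier:
        current = frontier.pop(0)
        # Executions that produced the current artifact.
        producers = {
            e.execution
            for e in events
            if e.artifact == current and e.kind == "OUTPUT"
        }
        # The inputs of those executions are the direct ancestors.
        for e in events:
            if (
                e.kind == "INPUT"
                and e.execution in producers
                and e.artifact not in ancestors
            ):
                ancestors.append(e.artifact)
                frontier.append(e.artifact)
    return ancestors


print(upstream("metrics", EVENTS))  # ['predictions', 'model', 'dataset']
```

Storing flat events rather than direct artifact-to-artifact edges keeps the execution in the picture, so the same records can also answer which step produced a given artifact.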

2. Querying the Metadata

You can use the ML Metadata API to query the metadata and lineage of your ML workflows. This allows you to:

Find all the models that were trained on a specific dataset.
Find the dataset that was used to train a specific model.
Find the pipeline run that produced a specific model.
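
As a rough illustration of these three queries, the sketch below runs them against an in-memory list of lineage records rather than the real ML Metadata API. The `TrainingRecord` layout, the helper function names, and all run, dataset, and model names are assumptions made for the example.

```python
from typing import NamedTuple


class TrainingRecord(NamedTuple):
    """Hypothetical lineage record describing one training execution."""
    pipeline_run: str
    dataset: str
    model: str


# Made-up records standing in for what a metadata store would hold.
RECORDS = [
    TrainingRecord("run-1", "sales-2023", "model-a"),
    TrainingRecord("run-2", "sales-2023", "model-b"),
    TrainingRecord("run-3", "sales-2024", "model-c"),
]


def models_trained_on(dataset, records):
    """Find all the models that were trained on a specific dataset."""
    return [r.model for r in records if r.dataset == dataset]


def dataset_for(model, records):
    """Find the dataset used to train a specific model (None if unknown)."""
    return next((r.dataset for r in records if r.model == model), None)


def run_for(model, records):
    """Find the pipeline run that produced a specific model (None if unknown)."""
    return next((r.pipeline_run for r in records if r.model == model), None)


print(models_trained_on("sales-2023", RECORDS))  # ['model-a', 'model-b']
print(dataset_for("model-c", RECORDS))           # sales-2024
print(run_for("model-b", RECORDS))               # run-2
```

In the managed service, the same questions are answered by querying the artifacts, executions, and contexts held in a metadata store instead of a local list.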

Example: Querying Lineage

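from google.cloud import aiplatform_v1

# ML Metadata is a regional service, so point the client at the region's endpoint
client = aiplatform_v1.MetadataServiceClient(
    client_options={"api_endpoint": "us-central1-aiplatform.googleapis.com"}
)

# Resource name of the model's artifact in the metadata store
# (the project, store, and artifact IDs here are placeholders)
model_artifact = "projects/123/locations/us-central1/metadataStores/default/artifacts/456"

# Get the artifacts, executions, and events connected to this artifact
lineage = client.query_artifact_lineage_subgraph(artifact=model_artifact)

# Print the artifacts that executions in the subgraph consumed as inputs
for event in lineage.events:
    if event.type_ == aiplatform_v1.Event.Type.INPUT:
        print(event.artifact)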

3. Benefits of Lineage Tracking

Reproducibility: The lineage graph records which data, parameters, and steps produced an artifact, such as a model or a set of predictions, which is the information you need to rebuild it.
Debugging: When a model misbehaves, you can follow the lineage graph back to the dataset, pipeline run, or code version that produced it and look for the cause there.
Governance: You can use the lineage graph to audit which data and models contributed to a given deployment or prediction, and use that record as evidence for your governance requirements.

Establishing Metadata Tracking and Lineage