Establishing Metadata Tracking and Lineage

How to establish metadata tracking and lineage for your ML workflows using Vertex AI ML Metadata.

The "Why" and "How" of Your Model

In a production ML system, it's not enough to just have a model. You also need to know where it came from. What data was it trained on? What code was used to train it? What were its evaluation metrics?

Vertex AI ML Metadata helps you answer these questions by recording the metadata and lineage of your ML workflows. Runs on Vertex AI Pipelines are tracked automatically; other workflows can log their metadata through the SDK.


1. Lineage Tracking

Lineage tracking is the process of recording the relationships between the different artifacts and executions in your ML workflow. This allows you to see the entire history of your model, from the raw data to the final deployed model.

For example, a lineage graph might show that:

  • A specific dataset was used to train a specific model.
  • A specific model was used to generate a specific set of predictions.
  • A specific set of predictions was used to generate a specific set of evaluation metrics.
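
One way to picture such a graph is as a list of events, each linking an execution to an artifact it read or wrote. The sketch below is a library-free illustration of that idea; the Event class and the artifact and execution names are hypothetical and are not part of the Vertex AI SDK.

```python
from dataclasses import dataclass

# Hypothetical, simplified stand-in for a lineage event: an execution either
# reads an artifact (INPUT) or writes one (OUTPUT).
@dataclass(frozen=True)
class Event:
    execution: str
    artifact: str
    kind: str  # "INPUT" or "OUTPUT"

# Illustrative names only; they mirror the three relationships listed above.
EVENTS = [
    Event("train", "dataset", "INPUT"),
    Event("train", "model", "OUTPUT"),
    Event("batch_predict", "model", "INPUT"),
    Event("batch_predict", "predictions", "OUTPUT"),
    Event("evaluate", "predictions", "INPUT"),
    Event("evaluate", "metrics", "OUTPUT"),
]

def edges(events):
    """Return (input_artifact, execution, output_artifact) triples."""
    triples = []
    for src in events:
        if src.kind != "INPUT":
            continue
        for dst in events:
            if dst.kind == "OUTPUT" and dst.execution == src.execution:
                triples.append((src.artifact, src.execution, dst.artifact))
    return triples

for source, execution, target in edges(EVENTS):
    print(f"{source} --[{execution}]--> {target}")
```

Storing events rather than direct artifact-to-artifact edges keeps the execution (the training job, the prediction job) visible as part of the history, which is what lets you later ask how an artifact was produced and not only what it was produced from.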

2. Querying the Metadata

You can use the ML Metadata API to query the metadata and lineage of your ML workflows. This allows you to:

  • Find all the models that were trained on a specific dataset.
  • Find the dataset that was used to train a specific model.
  • Find the pipeline run that produced a specific model.

Example: Querying Lineage

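from google.cloud import aiplatform_v1

# Lineage is queried on a metadata artifact, not on the Model resource itself.
# The store and artifact IDs below are placeholders for the artifact that
# represents the model in your metadata store.
MODEL_ARTIFACT = (
    "projects/123/locations/us-central1/"
    "metadataStores/default/artifacts/456"
)

# The client must point at the regional endpoint that hosts the metadata store
client = aiplatform_v1.MetadataServiceClient(
    client_options={"api_endpoint": "us-central1-aiplatform.googleapis.com"}
)

# Fetch the artifacts, executions, and events connected to the model artifact
subgraph = client.query_artifact_lineage_subgraph(artifact=MODEL_ARTIFACT)

# Print the URI of every artifact that was an input to an execution
uris = {artifact.name: artifact.uri for artifact in subgraph.artifacts}
for event in subgraph.events:
    if event.type_ == aiplatform_v1.Event.Type.INPUT:
        print(uris.get(event.artifact))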
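
The three questions listed at the start of this section reduce to filtering events by type and joining on the execution. The following sketch shows that logic on hand-written data. The tuples, names, and helper functions are hypothetical and do not call the ML Metadata API, and pipeline runs are treated as plain executions to keep the example small.

```python
# Hypothetical lineage events as (execution, artifact, type) tuples.
EVENTS = [
    ("pipeline-run-1", "dataset-a", "INPUT"),
    ("pipeline-run-1", "model-1", "OUTPUT"),
    ("pipeline-run-2", "dataset-a", "INPUT"),
    ("pipeline-run-2", "model-2", "OUTPUT"),
    ("pipeline-run-3", "dataset-b", "INPUT"),
    ("pipeline-run-3", "model-3", "OUTPUT"),
]

def producers(artifact):
    """Executions that wrote the given artifact."""
    return [e for e, a, t in EVENTS if a == artifact and t == "OUTPUT"]

def inputs_of(artifact):
    """Artifacts read by the executions that produced the given artifact."""
    runs = set(producers(artifact))
    return [a for e, a, t in EVENTS if e in runs and t == "INPUT"]

def outputs_from(artifact):
    """Artifacts written by the executions that read the given artifact."""
    runs = {e for e, a, t in EVENTS if a == artifact and t == "INPUT"}
    return [a for e, a, t in EVENTS if e in runs and t == "OUTPUT"]

print(outputs_from("dataset-a"))  # models trained on dataset-a
print(inputs_of("model-3"))       # dataset used to train model-3
print(producers("model-1"))       # pipeline run that produced model-1
```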

3. Benefits of Lineage Tracking

  • Reproducibility: The lineage graph records which inputs and executions produced each artifact, which is the information you need to rebuild a model or a set of predictions.
  • Debugging: If a model misbehaves, you can trace it back through the graph to the data or code it was built from.
  • Governance: The graph gives you an audit trail for showing that your system meets your governance requirements.
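
The debugging case amounts to walking the graph upstream from the suspect artifact until every ancestor has been visited. Here is a minimal sketch of that traversal, assuming the lineage has already been flattened into a parent mapping; the PARENTS dictionary and artifact names are hypothetical.

```python
from collections import deque

# Hypothetical mapping from each artifact to the artifacts it was derived from.
PARENTS = {
    "deployed-model": ["trained-model"],
    "trained-model": ["clean-dataset", "training-code"],
    "clean-dataset": ["raw-dataset"],
    "raw-dataset": [],
    "training-code": [],
}

def ancestors(artifact, parents):
    """Breadth-first walk returning every upstream artifact, nearest first."""
    seen = []
    queue = deque(parents.get(artifact, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited through another path
        seen.append(current)
        queue.extend(parents.get(current, []))
    return seen

print(ancestors("deployed-model", PARENTS))
```

Because the walk is breadth-first, the closest ancestors come first, so the most likely culprits (the training data and training code) show up before more distant sources such as the raw dataset.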

Knowledge Check


You are investigating a model failure. You suspect that the model was trained on a corrupted dataset. How can you use ML Metadata to confirm your suspicion?
