Module 11 Lesson 2: Python/ML Pipeline

Model management at scale. Explore how a data science team uses GitLab CI/CD to build, test, and deploy a Python microservice with GPU support and automated model versioning.

Case Study: Python/ML Microservice

In this case study, we look at "Predictly," an AI startup building on Python, PyTorch, and FastAPI.

1. The Requirements

  • Large Dependencies: The Docker image is roughly 4 GB (due to CUDA and PyTorch).
  • Validation: The model's accuracy must be tested before every deployment.
  • Deployment: To a Kubernetes cluster with NVIDIA GPUs.

2. The Blueprint (.gitlab-ci.yml)

stages:
  - build
  - validate
  - push

# Use Kaniko for faster, cached image builds (Module 7)
build-image:
  stage: build
  image: { name: gcr.io/kaniko-project/executor:debug, entrypoint: [""] } # :debug tag includes a shell, which GitLab jobs need
  script:
    - /kaniko/executor --context $CI_PROJECT_DIR --destination $CI_REGISTRY_IMAGE/model:$CI_COMMIT_SHA --cache=true

# Model Accuracy Gate (Quality Gate - Module 5)
check-accuracy:
  stage: validate
  image: $CI_REGISTRY_IMAGE/model:$CI_COMMIT_SHA
  tags: [gpu-runner] # Must run on a server with an NVIDIA card
  script:
    - python validate_model.py --threshold 0.95
  # If accuracy is 94% (below the 0.95 threshold), the script exits
  # non-zero, the job fails, and nothing is pushed to prod!
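
The gate script itself can be sketched as follows. This is a minimal illustration, not Predictly's actual code: the `evaluate()` helper and `passes_gate()` function are stand-ins for real PyTorch inference over a held-out test set.

```python
import argparse

def passes_gate(accuracy: float, threshold: float) -> bool:
    """True if the model clears the accuracy quality gate."""
    return accuracy >= threshold

def evaluate() -> float:
    """Placeholder: real code would load the trained PyTorch model
    and run inference over a held-out test set to measure accuracy."""
    return 0.97  # stubbed value for illustration

parser = argparse.ArgumentParser(description="Model accuracy quality gate")
parser.add_argument("--threshold", type=float, default=0.95)
args, _ = parser.parse_known_args()

accuracy = evaluate()
print(f"accuracy={accuracy:.4f} threshold={args.threshold:.4f}")

if not passes_gate(accuracy, args.threshold):
    # A non-zero exit code fails the CI job and blocks the push stage.
    raise SystemExit("FAIL: accuracy below threshold")
print("PASS")
```

The key design point is the exit code: GitLab only needs the script to exit non-zero for the pipeline to stop, so any metric (accuracy, F1, latency) can become a quality gate this way.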

3. Analysis

  • Kaniko Caching: Since the AI libraries are huge, a standard docker build would take about 10 minutes. Kaniko's remote layer cache (Module 7, Lesson 5) cuts this to under 2 minutes.
  • Accuracy Gate: This is a "Continuous Testing" (CT) pattern. We don't just check that the code runs; we check that the model's logic is still correct.
  • Specific Runners: By tagging the job gpu-runner, the team saves money: standard jobs (like linting) run on cheap CPU runners, while the expensive GPU servers are reserved for model validation.
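
In pipeline terms, the runner split described above might look like this (the lint job and the cpu-runner tag are illustrative, not part of Predictly's actual config):

```yaml
lint:
  stage: validate
  image: python:3.11-slim
  tags: [cpu-runner]      # cheap shared runner; static checks need no GPU
  script:
    - pip install ruff
    - ruff check .

check-accuracy:
  stage: validate
  image: $CI_REGISTRY_IMAGE/model:$CI_COMMIT_SHA
  tags: [gpu-runner]      # expensive GPU runner, reserved for model validation
  script:
    - python validate_model.py --threshold 0.95
```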

Exercise: The AI Auditor

  1. Every time the model is updated, the team wants to save the "Training Graph" (.png). Which keyword should they use? (Artifacts or Cache?)
  2. If the validate_model.py script needs 10GB of RAM but the runner only has 2GB, what will happen? (Review Module 13).
  3. Why is it better to build the image BEFORE running the accuracy test? (Hint: The image contains the environment).
  4. Search: What is DVC (Data Version Control) and how can you use it inside a GitLab pipeline?

Summary

This case study shows how CI/CD solves the "Dependency Hell" of Data Science. By automating the build and validation on specialized hardware, Predictly ensures that their models are always high-quality and production-ready.

Next Lesson: Retrofitting the past: Case Study: Legacy PHP Conversion.
