
The Capstone Project - Part 2: Implementation and Automation
Build the machine. Implement the Helm charts, GitOps workflows, and CI/CD pipelines needed to deploy the OmniVision platform across a global, multi-cluster environment.
The Capstone Project: Part 2 - Building the Automation Engine
In Part 1, we designed the OmniVision AI architecture on paper. We have our EKS and GKE clusters, our Zero-Trust security model, and our scaling strategy. Now, it is time to build.
In a modern enterprise, we never "manually" deploy anything. Everything must be declarative. If a cluster in Oregon is destroyed, we should be able to recreate it perfectly by simply pointing our automation at a Git repository.
In this second part of the Capstone, we will master the Implementation phase. We will build a unified Helm Chart for OmniVision, structure our GitOps Repository, and create a GitHub Actions Pipeline that builds, scans, and deploys our AI agents to the world.
1. The Unified Helm Chart: One Blueprint, Many Clouds
We will create a single chart named omnivision-platform.
Instead of hardcoding AWS-specific or GCP-specific values, we use Toggles.
values.yaml structure:

    global:
      cloudProvider: "aws"   # or "gcp"
      imageRegistry: "<account>.dkr.ecr.us-east-1.amazonaws.com"
    aiWorker:
      image: "omnivision-worker"
      tag: "latest"          # Overwritten by CI
      resources:
        limits:
          nvidia.com/gpu: 1

Helm does not render template directives placed inside values.yaml, so the cloud-specific affinity lives in the worker's pod spec inside the chart's templates (excerpt):

    # Use cloud-specific affinity
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                {{- if eq .Values.global.cloudProvider "aws" }}
                - key: "eks.amazonaws.com/nodegroup"
                  operator: In
                  values: ["gpu-workers"]
                {{- else }}
                - key: "cloud.google.com/gke-accelerator"
                  operator: In
                  values: ["nvidia-tesla-t4"]
                {{- end }}
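To make the toggle concrete, here is a minimal sketch of what the GCP cluster's override file might contain. The registry host and the `<project-id>` placeholder are assumptions for illustration; only `global.cloudProvider` and `global.imageRegistry` come from the values structure above.

```yaml
# Hypothetical override file for the GCP cluster.
# Only the keys that differ from the chart defaults are listed;
# Helm merges them over values.yaml at render time.
global:
  cloudProvider: "gcp"                    # flips the affinity branch in the template
  imageRegistry: "gcr.io/<project-id>"    # assumed registry path, placeholder project
```

Rendering the chart locally with `helm template` and passing this file via `-f` is a quick way to check that the GKE accelerator selector appears in the output instead of the EKS node group selector.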
2. The GitOps Repository Structure
We follow the Folder-per-Environment pattern for ArgoCD (Module 11.4).
    omnivision-gitops/
      infrastructure/
        cluster-aws/
          argo-app.yaml
          values-overrides.yaml   # AWS specific IDs
        cluster-gcp/
          argo-app.yaml
          values-overrides.yaml   # GCP specific IDs
      apps/
        omnivision-ui/
        omnivision-worker/
Pushing a change to values-overrides.yaml in the cluster-aws folder causes ArgoCD to sync only the Oregon cluster (as soon as the Git webhook fires, or on its next polling cycle) while leaving the Belgium cluster untouched, which allows for Regional Testing.
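As an illustration of how one of the argo-app.yaml files could tie a cluster to its override file, here is a hedged sketch of an ArgoCD Application. The repository URL, the registered cluster name, the target namespace, and the assumption that the chart sits under apps/omnivision-worker are all hypothetical; adjust them to your own layout.

```yaml
# infrastructure/cluster-aws/argo-app.yaml (illustrative sketch, not the course's file)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: omnivision-worker-aws
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/omnivision-gitops.git  # hypothetical URL
    targetRevision: main
    path: apps/omnivision-worker        # assumed location of the Helm chart
    helm:
      valueFiles:
        # Path is relative to spec.source.path and must stay inside the repo.
        - ../../infrastructure/cluster-aws/values-overrides.yaml
  destination:
    name: cluster-aws                   # assumed name of the cluster registered in ArgoCD
    namespace: omnivision               # assumed target namespace
  syncPolicy:
    automated:
      prune: true                       # delete resources removed from Git
      selfHeal: true                    # revert manual drift in the cluster
    syncOptions:
      - CreateNamespace=true
```

The cluster-gcp folder would carry a near-identical file that differs only in `destination` and in the override path, which is what keeps a change in one folder from touching the other cluster.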
3. The CI/CD Pipeline: The "Secure Conveyor Belt"
Our GitHub Actions pipeline must be a fortress.
The Stages:
- Build: Create the Docker image using Buildx for cache optimization.
- Scan: Run Trivy (Module 10.4). Fail the build if any HIGH or CRITICAL vulnerabilities are found.
- Sign: Push the image, then use Sigstore Cosign (Module 10.4) to digitally sign it in the registry.
- Update Git: The pipeline then automatically creates a Pull Request in the omnivision-gitops repo to update the image.tag to the new version.
    sequenceDiagram
        participant Dev as Developer
        participant CI as GitHub Actions
        participant ECR as ECR / GCR
        participant Git as GitOps Repo
        participant Argo as ArgoCD
        Dev->>CI: Push Code
        CI->>CI: Build & Scan (Trivy)
        CI->>ECR: Push Signed Image
        CI->>Git: Update image.tag=v1.2
        Git->>Argo: Webhook Trigger
        Argo->>Argo: Diff & Sync
        Argo->>ECR: Pull Signed Image
        Argo-->>Dev: Deployment Successful
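The stages above could be wired together roughly as follows. This is a sketch under several assumptions, not a drop-in workflow: the `IMAGE_REGISTRY` repository variable, the `GITOPS_TOKEN` secret, the `example-org` organization, and the use of the commit SHA as the image tag are all hypothetical, and registry login is left out because it depends on your cloud credentials. Signing here uses Cosign's keyless mode, which is one option among several.

```yaml
# .github/workflows/build-scan-sign.yaml (illustrative sketch)
name: build-scan-sign
on:
  push:
    branches: [main]

permissions:
  contents: read
  id-token: write            # lets Cosign request an OIDC token for keyless signing

env:
  IMAGE: ${{ vars.IMAGE_REGISTRY }}/omnivision-worker   # assumed repository variable

jobs:
  image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      # Registry login is omitted; add the login step for ECR or GCR here.

      - name: Build (load locally so the image can be scanned before it is pushed)
        uses: docker/build-push-action@v6
        with:
          context: .
          load: true
          tags: ${{ env.IMAGE }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

      - name: Scan with Trivy (non-zero exit fails the job)
        uses: aquasecurity/trivy-action@master   # pin to a release or commit in practice
        with:
          image-ref: ${{ env.IMAGE }}:${{ github.sha }}
          severity: HIGH,CRITICAL
          exit-code: "1"

      - name: Push (reuses the cached layers from the build step)
        id: push
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ env.IMAGE }}:${{ github.sha }}
          cache-from: type=gha

      - uses: sigstore/cosign-installer@v3
      - name: Sign the pushed image by digest
        run: cosign sign --yes "${IMAGE}@${{ steps.push.outputs.digest }}"

      - name: Check out the GitOps repo
        uses: actions/checkout@v4
        with:
          repository: example-org/omnivision-gitops   # hypothetical repository
          token: ${{ secrets.GITOPS_TOKEN }}          # assumed secret with write access
          path: gitops

      - name: Bump the image tag in the AWS override file
        env:
          TAG: ${{ github.sha }}
        run: yq -i '.aiWorker.tag = strenv(TAG)' gitops/infrastructure/cluster-aws/values-overrides.yaml

      - name: Open a pull request against the GitOps repo
        uses: peter-evans/create-pull-request@v7
        with:
          path: gitops
          token: ${{ secrets.GITOPS_TOKEN }}
          branch: bump/omnivision-worker-${{ github.sha }}
          title: "Deploy omnivision-worker ${{ github.sha }}"
          commit-message: "Bump omnivision-worker to ${{ github.sha }}"
```

Signing by digest rather than by tag ties the signature to the exact bytes that were scanned, and opening a pull request instead of committing directly keeps a human or policy check between the pipeline and the clusters.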
4. Implementing the Sidecar Injection
For deep observability, we need to ensure every worker pod has a FluentBit agent to ship logs to Loki (Module 9.3).
Instead of developers adding this to their Helm chart, we use a Mutating Admission Webhook (Module 12.4). We install the FluentBit Operator.
- We label our application namespaces logging=enabled.
- The Operator sees the label and automatically injects the sidecar into every pod.
- This ensures 100% Log Coverage regardless of who deployed the app.
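To show the mechanism behind label-gated injection, here is a generic sketch using the standard Kubernetes admission API. It is not the operator's actual configuration: the webhook name, the backing service, the `logging` namespace, and the `omnivision` namespace are hypothetical, and a real installation would normally create the webhook object for you.

```yaml
# Opt a namespace in to sidecar injection by labelling it.
apiVersion: v1
kind: Namespace
metadata:
  name: omnivision                 # assumed application namespace
  labels:
    logging: enabled
---
# Generic shape of a mutating webhook that only fires for labelled namespaces.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: log-sidecar-injector       # hypothetical name
webhooks:
  - name: log-sidecar-injector.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Ignore          # pods still start if the injector is down
    clientConfig:
      service:
        name: log-sidecar-injector # hypothetical service that patches the pod spec
        namespace: logging
        path: /mutate
      # caBundle omitted; it is usually injected by a certificate manager.
    rules:
      - operations: ["CREATE"]
        apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
    namespaceSelector:
      matchLabels:
        logging: enabled           # only namespaces carrying this label are mutated
```

Note the trade-off in `failurePolicy`: `Ignore` favours availability, while `Fail` would block pod creation when the injector is unreachable, which is the stricter choice if log coverage must never be skipped.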
5. Practical Example: The "Canary" Rollout Manifest
Combining what we learned in Module 11.5, our Argo Rollout will handle the promotion between versions.
    apiVersion: argoproj.io/v1alpha1
    kind: Rollout
    metadata:
      name: omnivision-worker
    spec:
      # replicas, selector, and pod template omitted for brevity
      strategy:
        canary:
          analysis:
            templates:
              - templateName: gpu-health-check
          steps:
            - setWeight: 5
            - pause: { duration: 10m }
            - setWeight: 50
            - pause: { duration: 30m }
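The analysis runs in the background for the whole rollout. If the GPU Health Check (which checks for CUDA errors in Prometheus) fails at any point, including the initial 5% phase, Argo Rollouts aborts the rollout and automatically shifts traffic back to the stable version.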
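The Rollout refers to a template named gpu-health-check that is not shown. One possible shape for it is sketched below; the Prometheus address and the metric name `omnivision_cuda_errors_total` are hypothetical and stand in for whatever your GPU exporter actually exposes.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: gpu-health-check
spec:
  metrics:
    - name: cuda-errors
      interval: 1m                      # re-evaluate every minute while the rollout runs
      # The measurement passes only when no CUDA errors were counted in the window.
      successCondition: result[0] == 0
      provider:
        prometheus:
          # Assumed in-cluster Prometheus address.
          address: http://prometheus.monitoring.svc:9090
          # Hypothetical counter; "or vector(0)" returns 0 when the series is absent
          # so the condition always has a value to compare.
          query: |
            sum(increase(omnivision_cuda_errors_total[5m])) or vector(0)
```

With the default failure limit of zero, a single failed measurement is enough to mark the analysis as failed, so consider raising `failureLimit` if the metric is noisy.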
6. Next Steps
You have built the automation. The "Machine" is now capable of deploying code globally with a single Git commit. In the Final Part of the Capstone Project, we will focus on Operational Excellence: Setting up the Grafana dashboards, performing a Disaster Recovery drill, and the final course graduation.
Your Thinking Exercise:
If your CI/CD pipeline correctly signs an image but a hacker manages to push a malicious, unsigned image with the same name to your ECR registry, will your cluster run it? (Hint: Re-read Part 1 on Admission Controllers).