Development Environment: Scaling & Compute

Choosing the right hardware for development: when to use a local GPU versus a remote cluster, and how to define custom containers.

Laptop vs Cloud

Step 2 of Prototyping: "Is my computer big enough?"


1. Vertical Scaling (The Notebook)

If your dataframe fits in RAM (e.g., 10GB), process it in the notebook. If it has outgrown the current VM but still fits on a single larger machine (e.g., 100GB), you can:

  1. Resize the VM: In User-Managed Notebooks, you can stop the VM, change the machine type from n1-standard-4 to m1-ultramem-40, and restart it.
  2. Attach GPU: Add a T4 or V100 when the bottleneck is compute rather than memory. These GPUs attach to N1 machine types.
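
The sizing decision above can be sketched as a small helper. This is only an illustration: the function name, the 2x memory headroom factor, and the 961 GB single-VM ceiling (roughly what an m1-ultramem-40 offers) are assumptions, not values from any Workbench API.

```python
def choose_compute(dataset_gb, ram_gb, headroom=2.0, max_vm_ram_gb=961.0):
    """Suggest where to run a job, given dataset size and current VM RAM.

    headroom is an assumed multiplier: in-memory processing usually needs
    more than the raw data size because of intermediate copies.
    max_vm_ram_gb is an assumed ceiling for the largest single VM you can use.
    """
    if dataset_gb <= 0 or ram_gb <= 0:
        raise ValueError("sizes must be positive")
    needed_gb = dataset_gb * headroom
    if needed_gb <= ram_gb:
        return "notebook"   # fits on the current VM
    if needed_gb <= max_vm_ram_gb:
        return "resize"     # vertical scaling: pick a bigger machine type
    return "offload"        # too big for one VM: run it elsewhere


for size_gb in (10, 100, 1000):
    print(size_gb, "GB ->", choose_compute(size_gb, ram_gb=32))
```

Tune the headroom to your workload; joins and wide group-bys can need considerably more than twice the raw size.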

2. Horizontal Scaling (The Executor)

The Executor feature in Workbench lets you offload a notebook run to separate, larger compute.

  • You confirm your code works on a sample (100 rows).
  • You click "Execute on Cluster."
  • Managed Notebooks spins up a Dataproc cluster or a Training Job, runs the notebook on the full dataset, and saves the output.
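
One way to make the same notebook code serve both the 100-row check and the full run is to put the row limit behind a single parameter. The sketch below uses only the standard library; the function names, the `amount` column, and the in-memory demo data are hypothetical stand-ins for your own loader and dataset.

```python
import csv
import io
from itertools import islice


def load_rows(lines, sample_rows=None):
    """Read CSV rows from an iterable of lines.

    With sample_rows set, reading stops after that many rows;
    with sample_rows=None, the whole input is read.
    """
    reader = csv.DictReader(lines)
    return list(islice(reader, sample_rows))


def total_amount(rows):
    """Toy 'heavy job': sum one numeric column."""
    return sum(float(row["amount"]) for row in rows)


# Hypothetical demo data: 1000 rows where amount equals the row id.
demo_csv = "id,amount\n" + "".join(f"{i},{i}\n" for i in range(1000))

sample_total = total_amount(load_rows(io.StringIO(demo_csv), sample_rows=100))
full_total = total_amount(load_rows(io.StringIO(demo_csv)))
print(sample_total, full_total)
```

During development you would keep `sample_rows=100`; for the offloaded run you would set it to `None` so the identical code path processes everything.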

Why? It keeps your development environment cheap (small VM) while allowing bursts of high power for heavy jobs.
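
The cost argument is simple arithmetic. In the sketch below, every hourly rate and hour count is a made-up placeholder chosen only to show the shape of the comparison; substitute the current prices for your region and machine types.

```python
def monthly_cost(dev_rate, dev_hours, burst_rate=0.0, burst_hours=0.0):
    """Cost of a development VM plus optional short bursts on bigger compute."""
    return dev_rate * dev_hours + burst_rate * burst_hours


# Placeholder numbers, NOT real prices.
SMALL_VM_RATE = 0.20   # assumed $/hour for a small dev VM
BIG_VM_RATE = 6.00     # assumed $/hour for a large-memory machine
DEV_HOURS = 160        # assumed hours of interactive work per month
BURST_HOURS = 10       # assumed hours of heavy jobs per month

small_plus_burst = monthly_cost(SMALL_VM_RATE, DEV_HOURS, BIG_VM_RATE, BURST_HOURS)
big_always_on = monthly_cost(BIG_VM_RATE, DEV_HOURS)
print(f"small VM + bursts: {small_plus_burst:.2f}")
print(f"big VM always on:  {big_always_on:.2f}")
```

Under these assumed numbers, the large machine is only paid for during the hours the heavy job actually runs, which is where the saving comes from.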


3. Custom Containers

Sometimes pip install isn't enough: your code needs system libraries (e.g., libsound, ffmpeg) that have to be installed at the OS level, inside the image.

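  1. Dockerfile: Write a Dockerfile that extends a Deep Learning Containers base image and installs the system packages you need.
  2. Build: Run gcloud builds submit with the --tag flag to build the image with Cloud Build and push it to your registry.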
  3. Use: When creating the Notebook (or Training Job), specify "Custom Container Image."
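
Before building a custom image, it can help to confirm which system dependencies are actually missing from the current environment. The following is a minimal sketch using only the standard library; the function name is hypothetical, and the names you pass in should be whatever your own code depends on.

```python
import shutil
from ctypes.util import find_library


def missing_system_deps(binaries=(), libraries=()):
    """Return the executables and shared libraries that cannot be found.

    binaries:  executable names looked up on PATH (e.g. "ffmpeg").
    libraries: shared library names without the "lib" prefix,
               as expected by ctypes.util.find_library.
    """
    missing = [name for name in binaries if shutil.which(name) is None]
    missing += [f"lib{name}" for name in libraries if find_library(name) is None]
    return missing


# ffmpeg comes from the example above; whether it is found depends on the machine.
print(missing_system_deps(binaries=["ffmpeg"]))
```

Anything this reports as missing is a candidate for an OS-level install step in the Dockerfile, since pip cannot provide it. Running the same check inside the built container is a quick way to verify the image.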
