
Model Architecture Design: Choosing the Right Brain
CNNs, RNNs, Transformers, or XGBoost? Learn how to map business problems to model architectures, and how to define success metrics.
The Architect's Choice
Before you write code, you must choose the architecture. The exam will give you a scenario and ask: "Which algorithm usually performs best?"
1. The Algorithm Cheat Sheet
| Data Type | Problem | Best Architecture | Why? |
|---|---|---|---|
| Tabular | Classification/Regression | XGBoost / LightGBM (Trees) | Handles unscaled features well; interpretable. |
| Tabular | Very Complex / Massive | Deep Neural Network (Wide & Deep) | Can learn crossed features. |
| Image | Object Detection | CNN (ResNet, EfficientNet) | Translation invariance (a cat is a cat wherever it appears in the frame). |
| Text | Translation / Summary | Transformer (BERT, T5, Gemini) | Attention mechanism handles long-range context. |
| Time Series | Forecasting | LSTM / ARIMA+ | Remembers history (sequence dependence). |
| Recommendation | User Preference | Two-Tower Model / Matrix Factorization | Handles sparse data (users typically interact with under 1% of items). |
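To make the first row concrete, here is a minimal sketch of the tabular default: gradient-boosted trees on deliberately unscaled features. The data is synthetic and purely illustrative:

```python
import numpy as np
import xgboost as xgb

# Synthetic tabular data with wildly different feature scales.
# Tree-based models split on thresholds, so no normalization is needed.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4)) * np.array([1, 100, 10_000, 0.01])
y = (X[:, 1] > 0).astype(int)  # label depends on the unscaled second feature

model = xgb.XGBClassifier(n_estimators=100, max_depth=4, eval_metric="logloss")
model.fit(X, y)
print(model.predict_proba(X[:3]))  # class probabilities for the first 3 rows
```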
2. Transfer Learning
Exam Rule: Never train from scratch if you can fine-tune.
- Training from Scratch: Requires millions of labeled images. Takes weeks.
- Transfer Learning: Take a pre-trained model (e.g., one trained on ImageNet), freeze the early layers, and retrain only the last layer on your own images. Requires ~100 images. Takes minutes.
Vertex AI Model Garden is the library where you find these base models.
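A minimal Keras sketch of this recipe. The model choice, input size, and class count are assumptions for illustration:

```python
import tensorflow as tf

# Load a base model pre-trained on ImageNet, dropping its classification head.
base = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet",
    input_shape=(224, 224, 3), pooling="avg",
)
base.trainable = False  # Freeze the early layers: keep the learned visual features.

# Retrain only a small new head on your own images (3 classes assumed here).
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, epochs=5)  # even ~100 images per class can work
```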
3. Designing for Interpretability
Sometimes accuracy isn't the goal. In banking and healthcare, explainability is a legal requirement.
- White Box Models: Linear Regression, Decision Trees. (Easy to explain: "You were denied because Age < 18").
- Black Box Models: Deep Neural Networks. (Hard to explain).
If the exam scenario says "Must be fully interpretable by non-technical auditors," avoid Deep Learning. Use BigQuery ML Linear Regression or Boosted Trees.
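As a toy illustration of why white-box models satisfy auditors, here is a linear classifier whose entire decision process is a handful of readable coefficients. The loan data below is made up:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical loan-approval data: columns are [age, income_in_thousands].
X = np.array([[17, 20], [25, 55], [40, 90], [16, 10], [35, 70], [50, 120]])
y = np.array([0, 1, 1, 0, 1, 1])  # 1 = approved

clf = LogisticRegression().fit(X, y)

# The whole model is its weights: each one can be read out and explained
# to a non-technical auditor ("approval odds rise with income").
for name, coef in zip(["age", "income_k"], clf.coef_[0]):
    print(f"{name}: {coef:+.3f}")
```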
4. Loss Functions
You must tell the model what "Success" looks like.
| Problem | Loss Function | Metric |
|---|---|---|
| Binary Classification | Binary Cross-Entropy (Log Loss) | Accuracy, AUC, Precision/Recall |
| Multi-Class | Categorical Cross-Entropy | F1-Score |
| Regression | MSE (Mean Squared Error) | RMSE, MAE |
| Outlier-Heavy Regression | Huber Loss / MAE | MAE (robust to outliers; MSE punishes them too harshly) |
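The last row is exactly the mansion scenario in the Knowledge Check below: one huge residual dominates MSE while MAE and Huber stay stable. A quick NumPy sketch (the error values are made up):

```python
import numpy as np

def mse(err):
    return np.mean(err ** 2)

def mae(err):
    return np.mean(np.abs(err))

def huber(err, delta=1.0):
    # Quadratic near zero, linear in the tails: a compromise between MSE and MAE.
    quad = np.minimum(np.abs(err), delta)
    lin = np.abs(err) - quad
    return np.mean(0.5 * quad ** 2 + delta * lin)

errors = np.array([0.5, -0.3, 0.2, 100.0])  # one mansion-sized outlier
print(f"MSE:   {mse(errors):.2f}")    # ~2500, dominated by the outlier
print(f"MAE:   {mae(errors):.2f}")    # 25.25, far less distorted
print(f"Huber: {huber(errors):.2f}")  # ~24.92, also robust
```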
5. Visualizing the "Wide & Deep" Model
This Google-designed architecture is a frequent exam topic. It combines memorization (Wide) with generalization (Deep).
```mermaid
graph TD
    Input[Input Features]
    subgraph "Wide (Memorization)"
        Linear[Linear Model]
    end
    subgraph "Deep (Generalization)"
        Dense1[Dense Layer] --> Dense2[Dense Layer]
    end
    Input --> Linear
    Input --> Dense1
    Linear --> Add[Combine]
    Dense2 --> Add
    Add --> Output[Sigmoid Output]
```
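A minimal functional-API sketch of the same diagram in Keras. The layer sizes and the 20-feature input are assumptions:

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(20,), name="features")

# Wide path: a linear model that memorizes feature crosses.
wide = tf.keras.layers.Dense(1)(inputs)

# Deep path: stacked dense layers that generalize to unseen combinations.
deep = tf.keras.layers.Dense(64, activation="relu")(inputs)
deep = tf.keras.layers.Dense(32, activation="relu")(deep)
deep = tf.keras.layers.Dense(1)(deep)

# Combine both paths and squash to a probability.
combined = tf.keras.layers.Add()([wide, deep])
output = tf.keras.layers.Activation("sigmoid")(combined)

model = tf.keras.Model(inputs, output)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
```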
6. Summary
- Tabular: Trees (XGBoost).
- Perceptual (Vision/Text): Deep Learning (CNN/Transformers).
- Data Scarcity: Use Transfer Learning.
- Regulation: Use Interpretable Models.
In the next lesson, we scale up: how do we train these models on terabytes of data? Distributed Training.
Knowledge Check
You are predicting house prices (Regression). Your dataset contains a few mansions that cost $100M, while most houses cost $500k. You notice your model is obsessing over getting the mansion prices right, ruining its accuracy for normal houses. Which Loss Function helps fix this?