
Hosting for Applications and Services
Where does the code live? Deploying your AI backend to Google Cloud Run, Vercel, or AWS Lambda.
You have main.py. Where do you run it?
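To make the question concrete, assume main.py is a small HTTP service that accepts a prompt and returns a model answer. A stdlib-only sketch (the model call is a stub; a real service would go through the Gemini SDK, and all names here are illustrative):

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

def generate(prompt: str) -> str:
    # Stub for the real model call; swap in your SDK of choice.
    return f"echo: {prompt}"

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body, run the (stubbed) model, return JSON.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        answer = generate(body.get("prompt", ""))
        payload = json.dumps({"answer": answer}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

def main() -> None:
    # Serverless platforms (Cloud Run included) tell the container
    # which port to listen on via the PORT environment variable.
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("", port), Handler).serve_forever()
```

A real main.py would end with `if __name__ == "__main__": main()`; it is left out here so the snippet can be imported without starting a server.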
1. Google Cloud Run (Recommended)
Since you are already in the Google ecosystem:
- What: Serverless containers.
- Pros: Auto-scales to zero when idle, so you pay nothing between requests. Native integration with GCP Secret Manager.
- How:
gcloud run deploy my-ai-service --source .
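The Secret Manager integration means the service never hard-codes keys: the deploy command's `--set-secrets` flag maps a secret to an environment variable, and the code just reads the variable. A minimal sketch, assuming a secret named `gemini-api-key` (the secret name and env var are placeholders):

```python
import os

def load_api_key(var: str = "GEMINI_API_KEY") -> str:
    """Read the key that Cloud Run injects from Secret Manager.

    Deployed with, e.g.:
      gcloud run deploy my-ai-service --source . \
        --set-secrets=GEMINI_API_KEY=gemini-api-key:latest
    """
    key = os.environ.get(var)
    if not key:
        # Fail fast at startup instead of on the first model call.
        raise RuntimeError(f"{var} is not set; check the --set-secrets mapping")
    return key
```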
2. Vercel / Next.js (Edge Functions)
A natural fit if you use the JavaScript SDK from a Next.js app.
- Warning: Vercel enforces a 10-second function timeout on the free (Hobby) tier, and a Gemini Pro completion often takes longer than that, so requests can be cut off mid-generation.
- Fix: stream the response (the connection stays open and bytes flow before the deadline) or upgrade to Vercel Pro for longer timeouts.
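Streaming helps because bytes start flowing to the client as soon as the model emits them, instead of after the full completion. A minimal sketch of the server side as a generator, with `chunks` standing in for the SDK's streaming iterator (a hypothetical stand-in, not a real SDK object):

```python
from typing import Iterable, Iterator

def stream_answer(chunks: Iterable[str]) -> Iterator[bytes]:
    """Yield model output incrementally as server-sent events.

    Each chunk goes out the moment the model produces it, so the
    first byte reaches the client long before a 10-second deadline.
    """
    for text in chunks:
        yield f"data: {text}\n\n".encode()
    # Conventional sentinel so the client knows the stream is done.
    yield b"data: [DONE]\n\n"
```

A web framework would wrap this generator in its streaming response type (chunked transfer encoding) rather than buffering the whole answer.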
3. AWS Lambda
- Pros: Pay-per-invocation pricing is cheap, especially at low traffic.
- Cons: Cold starts add extra latency on top of an already slow model response.
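Cold-start cost can be reduced by initializing expensive clients at module scope: Lambda runs module code once per execution environment, then reuses it across warm invocations. A sketch with a stubbed client (`_make_client` is a placeholder for real SDK setup; the counter just demonstrates the reuse):

```python
INIT_COUNT = 0  # counts how many times the client is built

def _make_client():
    # Placeholder for expensive setup (SDK client, auth, connections).
    global INIT_COUNT
    INIT_COUNT += 1
    return object()

# Module scope runs once per cold start, not once per request.
CLIENT = _make_client()

def handler(event, context):
    # Warm invocations reuse CLIENT and skip _make_client() entirely.
    assert CLIENT is not None
    return {"statusCode": 200, "body": "ok"}
```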
Summary
Cloud Run is usually the sweet spot for Python AI services: it tolerates long-running requests gracefully and scales well.
In the next lesson, we discuss Version Management.