
Deploying AI Studio Models as APIs
Turn your Python script into a microservice: how to wrap Gemini in FastAPI to serve predictions to your frontend.
Your script runs on your laptop. Now, let's put it on the internet.
The Microservice Pattern
The standard pattern is to wrap the Gemini logic in a lightweight HTTP server such as FastAPI or Flask. Here is a minimal FastAPI version:
# main.py
import os

import google.generativeai as genai
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

# Configure the SDK with the key from the environment -- never hardcode it.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")


class PromptRequest(BaseModel):
    text: str


@app.post("/generate")
async def generate(req: PromptRequest):
    try:
        # Use the async variant so the event loop can serve other
        # requests while we wait for Gemini to respond.
        response = await model.generate_content_async(req.text)
        return {"result": response.text}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
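Run the server with uvicorn main:app --reload and any HTTP client can hit the endpoint. As a quick smoke test, here is a hypothetical client script; the URL and port assume uvicorn's local defaults:

# client.py -- smoke test for the /generate endpoint.
# Assumes the server is running locally on uvicorn's default port.
import requests

resp = requests.post(
    "http://127.0.0.1:8000/generate",
    json={"text": "Explain microservices in one sentence."},
    timeout=60,  # generation can take a while
)
resp.raise_for_status()
print(resp.json()["result"])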
Why a Middle Layer?
Why not call Gemini directly from the frontend?
- Security: the API key stays on the server and is never shipped to the browser.
- Rate Limiting: you can throttle users who spam the endpoint (see the sketch after this list).
- Logging: you can persist prompts and responses to your database for auditing.
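As a concrete illustration of the rate-limiting point, here is a minimal sketch of a per-IP throttle written as a FastAPI dependency. The window size, the quota, and the in-memory counters are all assumptions for illustration; production services usually keep this state in Redis or push it to an API gateway.

# rate_limit.py -- hypothetical per-IP throttle (sliding window).
# In-memory state: fine for a single worker, not for real deployments.
import time
from collections import defaultdict

from fastapi import HTTPException, Request

WINDOW_SECONDS = 60   # assumed window
MAX_REQUESTS = 20     # assumed quota per client per window

_hits: dict[str, list[float]] = defaultdict(list)

async def rate_limit(request: Request):
    now = time.time()
    ip = request.client.host if request.client else "unknown"
    # Keep only timestamps still inside the sliding window.
    _hits[ip] = [t for t in _hits[ip] if now - t < WINDOW_SECONDS]
    if len(_hits[ip]) >= MAX_REQUESTS:
        raise HTTPException(status_code=429, detail="Too many requests")
    _hits[ip].append(now)

Wire it into the route with dependencies=[Depends(rate_limit)] in the @app.post decorator (import Depends from fastapi).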
Summary
Build a thin wrapper API. Use FastAPI for async support (handling multiple requests while waiting for Gemini).
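To see that claim in action, fire several requests concurrently and watch them overlap instead of queueing. A hypothetical check using httpx, assuming the server from above is running locally:

# concurrency_check.py -- send five requests at once; with the async
# endpoint they should finish in roughly one round trip, not five.
import asyncio
import httpx

async def main():
    async with httpx.AsyncClient(base_url="http://127.0.0.1:8000") as client:
        tasks = [
            client.post(
                "/generate",
                json={"text": f"Give me fact #{i} about HTTP."},
                timeout=60,
            )
            for i in range(5)
        ]
        for resp in await asyncio.gather(*tasks):
            print(resp.json()["result"][:80])

asyncio.run(main())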
In the next lesson, we discuss Hosting.