Error Handling: Retries and Rate Limits

Production code must not crash. Implement robust error handling for 429 Rate Limits, 500 Server Errors, and Safety Violations.

Cloud APIs fail. They have rate limits, network blips, and safety filters.

1. Rate Limits (429)

If you send too many requests in a short window, Google returns HTTP 429 (Too Many Requests).

  • Fix: Exponential backoff. Wait 1s, then 2s, then 4s, doubling the delay after each failed attempt.
  • Library: tenacity or backoff in Python.
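from tenacity import retry, wait_exponential, stop_after_attempt

# `model` is assumed to be an already-configured Gemini model object.
# Waits roughly 1s, 2s, 4s, ... (capped at 60s) and gives up after 5 attempts.
# Note: with no `retry=` argument, tenacity retries on any exception.
@retry(wait=wait_exponential(multiplier=1, min=1, max=60), stop=stop_after_attempt(5))
def generate_safe(text):
    return model.generate_content(text)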
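
To make the backoff idea concrete without a retry library, here is a minimal hand-rolled sketch that uses only the standard library. The names `call_with_backoff`, `base_delay`, `max_delay`, and `jitter` are illustrative and not part of any SDK, and the `sleep` parameter exists only so the waiting can be swapped out in tests. Like the decorator above, it retries on any exception, which is broader than you would normally want in production.

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      jitter=0.0, sleep=time.sleep):
    """Call fn() until it succeeds or max_attempts is reached.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between attempts,
    capped at max_delay. Hypothetical helper for illustration only.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts:
                raise
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            # Optional random jitter so many clients do not retry in lockstep.
            delay += random.uniform(0, jitter)
            sleep(delay)
```

You would call it as `call_with_backoff(lambda: model.generate_content(text))`. A small non-zero `jitter` is usually worth setting when many workers share one quota.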

2. Safety Blocks

As discussed, check response.prompt_feedback before reading the result. If the prompt was blocked, do not access response.text: it raises a ValueError, which will crash your app if you do not catch it.
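
The check might look something like the sketch below. It assumes the response object exposes `prompt_feedback.block_reason` and that this value is falsy when nothing was blocked; `safe_text` and its `fallback` message are hypothetical names chosen for illustration, not part of the SDK.

```python
def safe_text(response, fallback="Sorry, I can't help with that request."):
    """Return response.text, or a fallback message if the prompt was blocked."""
    feedback = getattr(response, "prompt_feedback", None)
    # Assumption: block_reason is falsy (unset or 0) when nothing was blocked.
    if feedback is not None and getattr(feedback, "block_reason", None):
        return fallback
    try:
        return response.text
    except ValueError:
        # Accessing .text raised ValueError, so there is no usable text.
        return fallback
```

Catching ValueError as a second line of defense means a response with no text still degrades to the fallback message instead of an unhandled exception.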

3. Empty Responses

Sometimes the model returns an empty string even though the finish reason is STOP. Treat this as a failure: retry the request, and if it is still empty, show a generic error message.
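
One way to handle this is to treat blank output as an error and retry a bounded number of times. In the sketch below, `generate` stands in for your own wrapper around the model call (any callable that takes a prompt and returns a string); `EmptyResponseError` and `generate_non_empty` are hypothetical names, not SDK features.

```python
class EmptyResponseError(RuntimeError):
    """Raised when the model keeps returning no usable text."""


def generate_non_empty(generate, prompt, max_attempts=3):
    """Call generate(prompt) until it returns non-blank text.

    `generate` is any callable that takes a prompt and returns a string;
    it is a stand-in for your own wrapper around the model call.
    """
    for _ in range(max_attempts):
        text = generate(prompt)
        # Whitespace-only output is treated the same as an empty string.
        if text and text.strip():
            return text
    raise EmptyResponseError(f"no text after {max_attempts} attempts")
```

The caller can catch `EmptyResponseError` and show the generic error message at that point.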

Summary

  • Wrap API calls in try/except.
  • Use libraries for retries.
  • Never trust the API to be 100% available.

Module 6 Complete! You can now integrate Gemini into any codebase. In Module 7, we unlock the true power: Multimodal Capabilities.
