Module 8 Lesson 1: Ollama REST API

The universal bridge. How to talk to Ollama from any programming language using HTTP requests.

Ollama REST API: Beyond the Terminal

Until now, we've interacted with Ollama through a terminal. But for AI engineering, we need to treat Ollama as a service. Because the Ollama server exposes a standard HTTP (REST) API, you can talk to it from any language—Python, JavaScript, Go, Rust, or even a plain Bash script.

1. The Base URL

By default, the API is hosted at: http://localhost:11434
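A quick sanity check before writing any code: a plain GET to the base URL returns a short health message.

curl http://localhost:11434
# → "Ollama is running"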


2. The /api/generate Endpoint

Use this for "one-shot" interactions where you don't need to keep any chat history.

Request Structure:

{
  "model": "llama3",
  "prompt": "Explain Quantum Physics to a 5-year old.",
  "stream": false
}

CURL Command:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "What is 2+2?",
  "stream": false
}'
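With "stream": false, the server replies with a single JSON object once generation finishes. An abridged response looks like this (the values shown are illustrative; real responses also include timing and token-count fields, omitted here):

{
  "model": "llama3",
  "created_at": "2024-05-01T12:00:00Z",
  "response": "2 + 2 equals 4.",
  "done": true
}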

3. The /api/chat Endpoint

This is what powers modern chat apps. It accepts an array of messages to maintain context.

Request Structure:

{
  "model": "llama3",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "My name is Sudeep." },
    { "role": "assistant", "content": "Nice to meet you, Sudeep!" },
    { "role": "user", "content": "What is my name?" }
  ],
  "stream": false
}
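As a curl call, plus the abridged shape of the reply. Note that the answer comes back as a message object rather than a bare response string (the reply content is illustrative; timing fields are omitted):

curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "My name is Sudeep." },
    { "role": "assistant", "content": "Nice to meet you, Sudeep!" },
    { "role": "user", "content": "What is my name?" }
  ],
  "stream": false
}'

{
  "model": "llama3",
  "message": { "role": "assistant", "content": "Your name is Sudeep." },
  "done": true
}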

4. Other Useful Endpoints

  • GET /api/tags: Returns a JSON list of every model installed locally.
  • POST /api/pull: Allows your app to request a model download programmatically.
  • POST /api/show: Returns the Modelfile and technical parameters of a specific model (see the curl sketches below).
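Minimal curl calls for each. On recent Ollama versions the request-body field is "model"; some older releases used "name" instead:

# List every locally installed model
curl http://localhost:11434/api/tags

# Download a model from the registry (streams progress JSON by default)
curl http://localhost:11434/api/pull -d '{ "model": "llama3" }'

# Show the Modelfile and parameters for one model
curl http://localhost:11434/api/show -d '{ "model": "llama3" }'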

5. Security & Cross-Origin (CORS)

If you are building a website (frontend) that talks to Ollama directly from the browser, the browser will block your requests with a "CORS error." To fix this, set the OLLAMA_ORIGINS environment variable before starting Ollama:

OLLAMA_ORIGINS="http://localhost:3000,http://mysite.com"

This tells Ollama: "It's okay to accept requests from these specific websites."
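A minimal sketch for macOS/Linux, assuming you start the server yourself from a shell. If Ollama runs as a managed background service (e.g. via systemd on Linux), set the variable in the service's environment instead:

# Allow these origins, then start the server in the same session
export OLLAMA_ORIGINS="http://localhost:3000,http://mysite.com"
ollama serve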


Key Takeaways

  • The API is what allows Ollama to power other applications.
  • /api/generate is for single prompts; /api/chat is for multi-turn conversations.
  • Use "stream": false if you want a single JSON response instead of a stream of partial JSON objects.
  • The API runs locally on port 11434 by default.
