Module 9 Lesson 4: JSON and Structured Output
AI that speaks code. How to force Ollama to output valid JSON every single time.
JSON Mode: AI for Apps
If you are a developer, you don't want sentences; you want data. You want to be able to run JSON.parse() on the model's response so you can store it in a database or show it on a dashboard.
Ollama has a built-in feature called JSON Mode that guarantees the output is valid JSON.
1. Enabling JSON Mode (The easy way)
In the Ollama API, simply add "format": "json" to your request.
curl example:
curl http://localhost:11434/api/generate -d '{
"model": "llama3",
"prompt": "Extract the names and ages: Sudeep is 30, Alex is 25",
"format": "json",
"stream": false
}'
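The same request can be made from application code. Here is a minimal Python sketch of the curl call above, using only the standard library; it posts to the default local endpoint and then parses the "response" field (which arrives as a JSON string) into a dictionary:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(prompt: str, model: str = "llama3") -> dict:
    """Assemble the same payload as the curl example."""
    return {"model": model, "prompt": prompt, "format": "json", "stream": False}

def extract(prompt: str) -> dict:
    """POST to Ollama and parse the JSON-mode response into a dict."""
    body = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    # reply["response"] is itself a JSON string; parse it into real data
    return json.loads(reply["response"])

if __name__ == "__main__":
    print(extract("Extract the names and ages: Sudeep is 30, Alex is 25"))
```

Note that two parses happen: one for the API envelope, and one for the model's own JSON string inside it.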
2. In your Modelfile
A caveat here: format is a request-time option, not one of the documented Modelfile PARAMETER keys, so a line like PARAMETER format json is not a reliable way to force JSON mode. What a Modelfile can do is bake in a SYSTEM prompt that describes the JSON you expect; you then still pass "format": "json" with each request:
FROM llama3
SYSTEM "You extract contact info. Respond with {name: string, phone: string}."
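To use such a custom model from code, the safest pattern is to send format with every request regardless of what the Modelfile says. A minimal sketch using the chat endpoint, where "extractor" is a hypothetical name for a model created from the Modelfile above:

```python
import json
import urllib.request

CHAT_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint

def chat_request(user_text: str, model: str = "extractor") -> dict:
    """Build a /api/chat payload. "extractor" is an assumed model name;
    "format" is set per request rather than relying on the Modelfile."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "format": "json",
        "stream": False,
    }

def run(user_text: str) -> dict:
    body = json.dumps(chat_request(user_text)).encode()
    req = urllib.request.Request(
        CHAT_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    # The chat endpoint nests the model's JSON string under message.content
    return json.loads(reply["message"]["content"])

if __name__ == "__main__":
    print(run("Extract: call Dana at 555-0100"))
```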
3. The "Schema" Trick
Even with JSON mode on, the model might still choose its own key names ("person_name" instead of "name"). To prevent this, always spell out the exact keys in your prompt.
Prompt:
"Extract the data into this structure: { 'invoice_id': '', 'amount': '', 'currency': '' }"
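The schema trick is easy to wrap in a couple of helper functions: one that embeds the exact keys in the prompt, and one that checks whether the model actually used them. This is a sketch; the schema keys are taken from the example prompt above:

```python
import json

SCHEMA = {"invoice_id": "", "amount": "", "currency": ""}

def schema_prompt(text: str, schema: dict = SCHEMA) -> str:
    """Embed the exact target keys in the prompt so the model reuses them."""
    return f"Extract the data into this structure: {json.dumps(schema)}\n\n{text}"

def has_expected_keys(raw: str, schema: dict = SCHEMA) -> bool:
    """Cheap post-check: is the reply valid JSON with exactly our keys?"""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and set(data) == set(schema)
```

A post-check like has_expected_keys is a sensible guard even with JSON mode on, since JSON mode guarantees valid JSON but not your particular keys.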
4. Why JSON Mode is Better than "Prompting"
Without JSON mode, the model might say:
"Sure, here is your JSON object: { ... }"
The "Sure, here..." part will break your code because it’s not valid JSON.
With JSON Mode, Ollama intercepts the response and only allows valid JSON tokens to pass through to your application.
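You can see the failure mode directly with Python's standard json module; a conversational preamble makes the whole string unparsable:

```python
import json

# A "chatty" reply a model might produce without JSON mode enabled
chatty = 'Sure, here is your JSON object: {"name": "Alex", "age": 25}'

def try_parse(text: str):
    """Return the parsed object, or None if the text is not pure JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

assert try_parse(chatty) is None                      # preamble breaks the parser
assert try_parse('{"name": "Alex", "age": 25}') == {"name": "Alex", "age": 25}
```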
5. Performance Note
JSON mode can slightly slow down the initial time to first token, because generation is constrained to emit only valid JSON. For simple extractions, the overhead is almost unnoticeable.
Key Takeaways
- Use "format": "json" in the API for guaranteed parsable results.
- Always provide a sample schema in your prompt so the keys are consistent.
- JSON mode is the foundation for building API-driven local AI apps.
- It eliminates the "Sure, here is your JSON" preamble that breaks traditional parsing scripts.