Module 6 Lesson 3: Supported Architectures

Not all models are equal. Understanding which architectures (Llama, Mistral, BERT) work with the Ollama engine.

Supported Architectures: Is Your Model Compatible?

Ollama is built on llama.cpp. While the name suggests it only runs "Llama" models, it actually supports dozens of different mathematical structures (architectures). If you find a model on Hugging Face, you need to check if its "Architecture" is supported before you try to import it.

1. The "Big" Supported Architectures

These architectures are fully supported and run efficiently:

  • Llama / Llama 2 / Llama 3: The standard; the architecture llama.cpp was originally built around.
  • Mistral / Mixtral: High-efficiency attention (sliding-window and grouped-query attention).
  • Gemma: Google's transformer variant.
  • Falcon: TII's open generative models.
  • StarCoder: Designed for code generation.

2. MoE (Mixture of Experts)

Ollama has excellent support for MoE models like mixtral:8x7b.

  • In an MoE model, only a small part of the brain "wakes up" for each word.
  • Challenge: You need enough RAM to hold every expert in memory, even though only a few are active per token, so compute per token is a fraction of the model's full size. Ollama handles this "sparse" routing math automatically.
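The RAM-versus-compute trade-off above can be sketched with some simple arithmetic. This is a minimal illustration, not Ollama's internal accounting: the expert and shared parameter counts below are rough approximations for a Mixtral-8x7B-style model (8 experts, 2 active per token).

```python
# Rough sketch: why an MoE model needs full-model RAM
# but only a fraction of the compute per token.

def moe_footprint(n_experts: int, params_per_expert_b: float,
                  shared_params_b: float, active_experts: int):
    """Return (total params stored, params active per token), in billions.

    All experts must live in memory, but only `active_experts`
    of them run for any given token.
    """
    total = shared_params_b + n_experts * params_per_expert_b
    active = shared_params_b + active_experts * params_per_expert_b
    return total, active

# Illustrative Mixtral-8x7B-style numbers (approximate):
total_b, active_b = moe_footprint(n_experts=8, params_per_expert_b=5.6,
                                  shared_params_b=2.0, active_experts=2)
print(f"Stored: ~{total_b:.0f}B params, active per token: ~{active_b:.1f}B")
```

So you pay for roughly 47B parameters of memory while each token only touches roughly 13B parameters of compute, which is why an 8x7B model feels faster than a dense model of the same size.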

3. What is NOT Supported?

Encoder-Only Models (BERT)

Models like BERT or RoBERTa are used for classification (is this email spam?) but they cannot "chat." Ollama is a "Generative" engine, so it generally does not run these classification-only models.

Image Generators (Stable Diffusion)

Ollama is for LLMs (Text). It cannot run Stable Diffusion or Midjourney-style image generators.

Experimental Non-Transformer Models

New architectures like Mamba (State Space Models) are being added to the registry slowly. Before you pull a Mamba model, check the Ollama GitHub Releases to see if that specific version of Ollama supports it.


4. How to Check Compatibility

When looking at a model on Hugging Face:

  1. Go to the "Files" tab.
  2. Open the config.json file.
  3. Look for the field "model_type".
  4. If it says "llama", "mistral", "gemma", or "qwen", the import will almost certainly succeed.
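The steps above can be automated with a short script. This is a minimal sketch: the `SUPPORTED` set below is a hypothetical subset for illustration, not Ollama's authoritative architecture registry, and the URL pattern assumes the config lives on the repo's main branch.

```python
import json
import urllib.request

# Hypothetical subset of supported architectures -- check Ollama's
# docs/releases for the current list.
SUPPORTED = {"llama", "mistral", "mixtral", "gemma", "falcon", "starcoder", "qwen2"}

def is_supported(config: dict) -> bool:
    """Return True if a parsed config.json declares a supported model_type."""
    return config.get("model_type", "").lower() in SUPPORTED

def fetch_config(repo_id: str) -> dict:
    """Download config.json from a Hugging Face repo (main branch)."""
    url = f"https://huggingface.co/{repo_id}/resolve/main/config.json"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# Usage (requires network):
# config = fetch_config("mistralai/Mistral-7B-v0.1")
# print(is_supported(config))
```

A few seconds of checking here beats discovering an unsupported architecture after a 40GB download.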

Key Takeaways

  • Ollama supports the majority of Transformer-based Generative models.
  • MoE (Mixture of Experts) is supported but requires high RAM.
  • Encoder-only (BERT) and Image models (Stable Diffusion) are not supported.
  • Check the model_type in the config.json on Hugging Face before you waste time downloading 40GB of data.
