Module 3 Lesson 2: Model Naming and Tags
Decoding the colon. Understanding what 'llama3:8b-instruct-q4_K_M' actually means.
Decoding Model Naming and Tags
When you see a command like ollama run llama3:8b-instruct-q4_K_M, it looks like a cat walked across a keyboard. However, every character in that tag has a specific meaning. Mastering this "code" will help you choose the right model for your specific hardware.
The Structure of a Tag
A model name follows this pattern:
[NAME] : [PARAMETER SIZE] - [FLAVOR] - [QUANTIZATION]
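Applied to the command from the top of the lesson, the pieces line up like this (everything here comes straight from the example above):

```sh
# The example tag from the intro, decoded piece by piece:
ollama run llama3:8b-instruct-q4_K_M
# llama3   -> model series (Llama 3)
# 8b       -> parameter size (8 billion parameters)
# instruct -> flavor (fine-tuned to follow instructions)
# q4_K_M   -> quantization (4-bit, "K-Medium" scheme)
```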
1. Parameter Size (The "B")
Examples: 3b, 7b, 8b, 70b.
- B stands for Billion.
- It tells you how many "connections" (parameters) the model has.
- Rule: More parameters = More Intelligence = More RAM needed. (A rough sizing sketch follows this list.)
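As a ballpark sketch only (these are approximations, not official requirements): at typical 4-bit compression, an 8b model generally wants around 8 GB of free RAM, and a 70b model wants 40 GB or more. It is worth checking what your machine actually has before pulling anything large:

```sh
# Ballpark only: the model file has to fit in RAM (or VRAM) with headroom
# left over for the context window. Check what you have before pulling:
free -h              # Linux: total and available memory
sysctl hw.memsize    # macOS: total memory in bytes
nvidia-smi           # NVIDIA GPUs only: VRAM total and current usage
```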
2. The Flavor (The "What it does")
- Instruct: Fine-tuned for chat and following instructions. (Use this for 99% of tasks).
- Text: The raw model. It's good for "predicting the next word" in a book, but bad at answering questions.
- Vision: Can "see" images. You can upload a photo and ask questions about it.
- Code: Specialized training for writing Python, JS, C++, etc. (Example commands for each flavor follow this list.)
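A few illustrative commands, one per flavor. The exact tags that exist differ from model to model, and codellama and the image-in-prompt syntax are assumptions here, so treat these as sketches and check a model's library page before copying them:

```sh
# Instruct: the everyday chat/assistant flavor (the tag from this lesson)
ollama run llama3:8b-instruct-q4_K_M "Explain quantization in one sentence."

# Vision: multimodal models such as llava can take an image path in the prompt
ollama run llava "What is in this image? ./photo.png"

# Code: code-focused models such as codellama for programming tasks
ollama run codellama "Write a Python function that reverses a string."
```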
3. Quantization (The Compression)
Examples: q4, q5, q8, fp16.
- This tells you how much the model has been "squashed" to fit on your disk.
- q4 (4-bit): The most common choice. Small files and fast inference, with only a modest quality loss.
- q8 (8-bit): Noticeably closer to the original model's quality, but roughly twice the size of q4.
- fp16: 16-bit floating point, effectively uncompressed. Very large and slow to run, but the most faithful version of the model. (The size math below shows why the suffix matters so much.)
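To see why the quantization suffix changes the footprint so much, here is the back-of-the-envelope arithmetic (illustrative approximations; real files include some extra overhead):

```sh
# Approximate weight size ≈ parameter count × bits per weight ÷ 8 bits per byte
#   8b at q4   ≈ 8 billion × 4  bits / 8 ≈  ~4 GB
#   8b at q8   ≈ 8 billion × 8  bits / 8 ≈  ~8 GB
#   8b at fp16 ≈ 8 billion × 16 bits / 8 ≈ ~16 GB
# Compare the real numbers on your machine after pulling:
ollama list
```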
The "Latest" Trap
If you run ollama run llama3, you are implicitly running llama3:latest.
Warning: The latest tag is a moving target. If the Ollama team updates the registry entry, the model behind latest can change overnight. For production software, always specify a full tag (e.g., llama3:8b).
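A minimal sketch of pinning a tag in a script (the model name is just this lesson's example, and the prompt is illustrative; substitute whatever tag you have actually tested):

```sh
#!/bin/sh
# Pin the exact build you tested against, never the moving "latest" alias.
MODEL="llama3:8b-instruct-q4_K_M"

ollama pull "$MODEL"   # fetches this exact tag; skips layers already present
ollama run  "$MODEL" "Summarize these release notes in three bullet points."
```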
Common Examples Decoded
| Tag | What it means |
|---|---|
| llama3:8b | The standard 8 billion parameter Llama 3 model. |
| llama3:70b-instruct-q4_K_M | The massive 70B model, tuned for instructions, compressed with "K-Medium" 4-bit quantization. |
| llava:7b-v1.6-mistral-q4_0 | A vision-capable model (LLaVA v1.6) built on Mistral 7B, with basic 4-bit quantization. |
| phi3:mini | The smallest, fastest version of Microsoft's Phi-3 model. |
How to Check Tags on Your Machine
Run ollama list in your terminal. Each entry shows the full name:tag combination, which reflects exactly what version you have stored locally.
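If you want the same information in a script, Ollama's local HTTP API exposes it too (assuming the default localhost:11434 address; adjust if you have changed OLLAMA_HOST):

```sh
# Human-readable listing of every model stored locally
ollama list

# The same list as JSON from the local API
curl http://localhost:11434/api/tags
```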
Key Takeaways
- The Name is the model series.
- The Tag (after the colon) specifies size, purpose, and compression.
- Instruct models are what you usually want for chat applications.
- Parameter count (B) determines the RAM/VRAM requirement.
- Avoid using the latest tag in production scripts; be specific!