Module 3 Lesson 2: Model Naming and Tags
Decoding the colon. Understanding what 'llama3:8b-instruct-q4_K_M' actually means.
Decoding Model Naming and Tags
When you see a command like ollama run llama3:8b-instruct-q4_K_M, it looks like a cat walked across a keyboard. However, every character in that tag has a specific meaning. Mastering this "code" will help you choose the right model for your specific hardware.
The Structure of a Tag
A model name follows this pattern:
[NAME] : [PARAMETER SIZE] - [FLAVOR] - [QUANTIZATION]
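Applied to the command from the top of the lesson, the pieces line up like this (everything here comes straight from the example above):

```sh
# The example tag from the intro, decoded piece by piece:
ollama run llama3:8b-instruct-q4_K_M
# llama3   -> model series (Llama 3)
# 8b       -> parameter size (8 billion parameters)
# instruct -> flavor (fine-tuned to follow instructions)
# q4_K_M   -> quantization (4-bit, "K-Medium" scheme)
```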
1. Parameter Size (The "B")
Examples: 3b, 7b, 8b, 70b.
- B stands for Billion.
- It tells you how many "connections" (parameters) the model has.
- Rule: More parameters = More Intelligence = More RAM needed. (A rough sizing sketch follows this list.)
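As a ballpark sketch only (these are approximations, not official requirements): at typical 4-bit compression, an 8b model generally wants around 8 GB of free RAM, and a 70b model wants 40 GB or more. It is worth checking what your machine actually has before pulling anything large:

```sh
# Ballpark only: the model file has to fit in RAM (or VRAM) with headroom
# left over for the context window. Check what you have before pulling:
free -h              # Linux: total and available memory
sysctl hw.memsize    # macOS: total memory in bytes
nvidia-smi           # NVIDIA GPUs only: VRAM total and current usage
```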
2. The Flavor (The "What it does")
- Instruct: Fine-tuned for chat and following instructions. (Use this for 99% of tasks).
- Text: The raw model. It's good for "predicting the next word" in a book, but bad at answering questions.
- Vision: Can "see" images. You can upload a photo and ask questions about it.
- Code: Specialized training for writing Python, JS, C++, etc. (Example commands for each flavor follow this list.)
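A few illustrative commands, one per flavor. The exact tags that exist differ from model to model, and codellama and the image-in-prompt syntax are assumptions here, so treat these as sketches and check a model's library page before copying them:

```sh
# Instruct: the everyday chat/assistant flavor (the tag from this lesson)
ollama run llama3:8b-instruct-q4_K_M "Explain quantization in one sentence."

# Vision: multimodal models such as llava can take an image path in the prompt
ollama run llava "What is in this image? ./photo.png"

# Code: code-focused models such as codellama for programming tasks
ollama run codellama "Write a Python function that reverses a string."
```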
3. Quantization (The Compression)
Examples: q4, q5, q8, fp16.
- This tells you how much the model has been "squashed" to fit on your disk.
- q4 (4-bit): The most common choice. Small files and fast inference, with only a modest quality loss.
- q8 (8-bit): Noticeably closer to the original model's quality, but roughly twice the size of q4.
- fp16: 16-bit floating point, effectively uncompressed. Very large and slow to run, but the most faithful version of the model. (The size math below shows why the suffix matters so much.)
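To see why the quantization suffix changes the footprint so much, here is the back-of-the-envelope arithmetic (illustrative approximations; real files include some extra overhead):

```sh
# Approximate weight size ≈ parameter count × bits per weight ÷ 8 bits per byte
#   8b at q4   ≈ 8 billion × 4  bits / 8 ≈  ~4 GB
#   8b at q8   ≈ 8 billion × 8  bits / 8 ≈  ~8 GB
#   8b at fp16 ≈ 8 billion × 16 bits / 8 ≈ ~16 GB
# Compare the real numbers on your machine after pulling:
ollama list
```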
The "Latest" Trap
If you run ollama run llama3, you are implicitly running llama3:latest.
Warning: The latest tag is a moving target. If the Ollama team updates the registry entry, the model behind latest can change overnight. For production software, always specify a full tag (e.g., llama3:8b).
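A minimal sketch of pinning a tag in a script (the model name is just this lesson's example, and the prompt is illustrative; substitute whatever tag you have actually tested):

```sh
#!/bin/sh
# Pin the exact build you tested against, never the moving "latest" alias.
MODEL="llama3:8b-instruct-q4_K_M"

ollama pull "$MODEL"   # fetches this exact tag; skips layers already present
ollama run  "$MODEL" "Summarize these release notes in three bullet points."
```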
Common Examples Decoded
| Tag | What it means |
|---|---|
| llama3:8b | The standard 8 billion parameter Llama 3 model. |
| llama3:70b-instruct-q4_K_M | The massive 70B model, tuned for instructions, compressed with "K-Medium" 4-bit quantization. |
| llava:7b-v1.6-mistral-q4_0 | A vision-capable model (LLaVA v1.6) built on Mistral 7B, with basic 4-bit quantization. |
| phi3:mini | The smallest, fastest version of Microsoft's Phi-3 model. |
How to Check Tags on Your Machine
Run ollama list in your terminal. Each entry shows the full name:tag combination, which reflects exactly what version you have stored locally.
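If you want the same information in a script, Ollama's local HTTP API exposes it too (assuming the default localhost:11434 address; adjust if you have changed OLLAMA_HOST):

```sh
# Human-readable listing of every model stored locally
ollama list

# The same list as JSON from the local API
curl http://localhost:11434/api/tags
```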
Key Takeaways
- The Name is the model series.
- The Tag (after the colon) specifies size, purpose, and compression.
- Instruct models are what you usually want for chat applications.
- Parameter count (B) determines the RAM/VRAM requirement.
- Avoid using the latest tag in production scripts; be specific!