Module 3 Lesson 1: What Embeddings Are

How does a computer know that a 'King' is like a 'Queen' but not like a 'Kilometer'? In this lesson, we explore Embeddings: the mathematical heart of AI meaning.

In Module 2, we saw that LLMs turn words into Token IDs (just plain integers). But an ID carries no meaning: if "Big Apple" is token 50 and "Large Fruit" is token 902, the computer has no way of knowing they are related.

Embeddings solve this. They are the magical bridge that turns a single ID into a list of numbers (a vector) that represents Meaning.


1. Mapping Meaning to a Map

Imagine a giant map of every concept in the world.

  • Fruits are in one neighborhood.
  • Cities are in another.
  • Emotions are in a third.

In this map, words that are similar are placed close together. An embedding is simply the "GPS coordinates" for that word on this conceptual map.

Example (Simplified): Let's give words two coordinates: [Sweetness, Size] (a runnable version follows this list)

  • Apple: [0.9, 0.2]
  • Peach: [0.85, 0.21] (Very close to Apple!)
  • Car: [0.0, 0.9] (Very far away!)
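
To make the map concrete, here is a minimal Python sketch of the toy example above. The coordinates are the illustrative values from the list, not output from a real model:

```python
import math

# Toy 2-dimensional "embeddings": [Sweetness, Size].
# Illustrative values only, not from a real model.
embeddings = {
    "apple": (0.9, 0.2),
    "peach": (0.85, 0.21),
    "car":   (0.0, 0.9),
}

def distance(a, b):
    """Euclidean distance between two 2D points on the 'map'."""
    return math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)

print(distance(embeddings["apple"], embeddings["peach"]))  # ~0.05: very close
print(distance(embeddings["apple"], embeddings["car"]))    # ~1.14: far away
```

A small distance means "similar meaning"; that single idea powers everything else in this lesson.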

2. The Multi-Dimensional Reality

In our example, we used 2 numbers. In real LLMs like GPT-4, embeddings use thousands of numbers (dimensions).

Each dimension might represent a subtle concept like:

  • Is it alive?
  • Is it plural?
  • Is it formal?
  • Is it related to technology?

By having 1,536 or more dimensions, the model can capture incredibly complex nuances—the difference between "Slightly annoyed" and "Seething with rage."

```mermaid
graph TD
    Token["Token ID: 'Dog'"] --> Embedding["Embedding Engine"]
    Embedding --> Vector["[0.12, -0.45, 0.88, ... 1536 dims]"]
    Vector --> Space["Point in High-Dimensional Semantic Space"]
```
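
To show the mechanics behind that diagram, here is a minimal sketch of an embedding lookup. The table values are random placeholders and the token ID is hypothetical; in a real model, both come from training and the tokenizer:

```python
import numpy as np

# A toy embedding table: one row of 1,536 numbers per token ID.
# Real models learn these values during training; random placeholders
# are used here just to show the lookup mechanics.
VOCAB_SIZE, EMBED_DIM = 50_000, 1536
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(VOCAB_SIZE, EMBED_DIM))

token_id = 3_412  # hypothetical ID for "Dog"; real IDs depend on the tokenizer
vector = embedding_table[token_id]  # the "GPS coordinates" in 1,536-D space

print(vector.shape)  # (1536,)
print(vector[:3])    # the first 3 of 1,536 dimensions
```

The "Embedding Engine" is really just this table lookup: row number in, list of 1,536 coordinates out.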

3. Why Vectors Matter

Because embeddings are numbers, we can do Math with them. The most common math operation is calculating the "Distance" between two vectors.

If the distance is small, the LLM knows the two phrases mean nearly the same thing, even if the words are different. This is how AI search engines find "How to fix a laptop" when you type "repair portable computer."
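
Cosine similarity is one common way to score that closeness. Here is a minimal sketch using made-up 4-dimensional vectors in place of real sentence embeddings:

```python
import numpy as np

def cosine_similarity(a, b):
    """Near 1.0 = same direction (similar meaning); near 0.0 = unrelated."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Made-up 4-D vectors standing in for real sentence embeddings.
fix_laptop      = np.array([0.8, 0.1, 0.6, 0.2])
repair_computer = np.array([0.7, 0.2, 0.5, 0.3])
bake_a_cake     = np.array([0.1, 0.9, 0.0, 0.7])

print(cosine_similarity(fix_laptop, repair_computer))  # high (~0.98): near-synonyms
print(cosine_similarity(fix_laptop, bake_a_cake))      # low  (~0.26): unrelated
```

This is exactly what a semantic search engine does at scale: embed the query, embed the documents, and return the documents whose vectors score highest.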


4. Visualizing the Space

While we can't see 1,536 dimensions, AI researchers use dimensionality-reduction tools (such as PCA or t-SNE) to "squash" them down to 2D or 3D. When you do this, you see beautiful clusters (a code sketch follows this list):

  • All the countries cluster together.
  • All the programming languages cluster together.
  • All the synonyms of "Happy" cluster together.
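
Here is a minimal sketch of that "squashing" using PCA (this assumes scikit-learn is installed). The 5-dimensional vectors are hand-made stand-ins for real embeddings, arranged so the three pairs cluster:

```python
import numpy as np
from sklearn.decomposition import PCA  # assumes scikit-learn is installed

# Made-up 5-D "embeddings" for six words; real ones would come from a model.
words = ["france", "japan", "python", "java", "happy", "joyful"]
vectors = np.array([
    [0.9, 0.1, 0.0, 0.2, 0.1],  # countries
    [0.8, 0.2, 0.1, 0.1, 0.0],
    [0.1, 0.9, 0.8, 0.0, 0.1],  # programming languages
    [0.0, 0.8, 0.9, 0.1, 0.2],
    [0.1, 0.0, 0.1, 0.9, 0.8],  # synonyms of "happy"
    [0.2, 0.1, 0.0, 0.8, 0.9],
])

# "Squash" 5 dimensions down to 2 so they could be drawn on paper.
coords_2d = PCA(n_components=2).fit_transform(vectors)
for word, (x, y) in zip(words, coords_2d):
    print(f"{word:>7}: ({x:+.2f}, {y:+.2f})")  # similar words land near each other
```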

Lesson Exercise

Goal: Conceptually map three items.

  1. Pick a category (e.g., Vehicles).
  2. Pick three items (e.g., Bicycle, Motorcycle, Cargo Ship).
  3. Assign them two coordinates [Speed, Weight] from 0.0 to 1.0.
  4. Notice how the Bicycle and Motorcycle are "closer" than the Cargo Ship (the sketch below checks this with real numbers).
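
If you want to check your answer in code, here is the exercise worked through with example coordinates (your values may differ):

```python
import math

# Exercise worked in code: [Speed, Weight] coordinates from 0.0 to 1.0.
vehicles = {
    "bicycle":    (0.2, 0.1),
    "motorcycle": (0.6, 0.3),
    "cargo ship": (0.3, 1.0),
}

def distance(a, b):
    return math.dist(a, b)  # Euclidean distance (Python 3.8+)

print(distance(vehicles["bicycle"], vehicles["motorcycle"]))  # ~0.45: close neighbors
print(distance(vehicles["bicycle"], vehicles["cargo ship"]))  # ~0.91: far apart
```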

Summary

In this lesson, we established:

  • Embeddings represent meaning as numerical coordinates (vectors).
  • Similar concepts have vectors that are physically close to each other in "Embedding Space."
  • High-dimensional vectors allow LLMs to capture subtle human nuances.

Next Lesson: We'll look at the "How." How did the model learn where to put "Apple" on the map? We'll explore How Embeddings Are Learned.
