
Module 2 Lesson 1: Why Computers Cannot Understand Text Directly
Before we learn about tokens, we must understand the fundamental gap between how humans see text and how computers process data: the Numerical Gap.
When you read the word "Apple," your brain immediately thinks of a red fruit, a tech company, or perhaps a snack. You see letters—A, P, P, L, E—and you associate them with meaning.
A computer sees none of that.
To a computer, everything is a number. In this lesson, we will explore why this "Numerical Gap" exists and why it is the root of everything we do in Large Language Models.
1. The Computer as a Giant Calculator
At its core, a computer (and specifically its CPU or GPU) is a high-speed calculator. It performs billions of operations like:
A + B = C
If A > B then X
These operations only work on Numbers. You cannot "add" a word to another word mathematically. You can concatenate them (stick them together), but you can't perform the complex matrix multiplication that a neural network requires unless those words are converted into numerical values.
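Here is a minimal Python sketch of the difference (the variable names are purely illustrative):

```python
# Numbers: the operations the hardware is built for.
a, b = 3, 5
print(a + b)            # 8

# Text: "+" only concatenates; the result is glued symbols, not combined meaning.
word_a, word_b = "Dog", "House"
print(word_a + word_b)  # DogHouse

# The kind of math a neural network needs simply fails on raw strings:
# word_a * word_b       # TypeError: can't multiply sequence by non-int
```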
2. Text vs. Meaning
The second challenge is that the symbols we use for text are arbitrary.
- The English word "Dog"
- The Spanish word "Perro"
- The French word "Chien"
To a human, these represent the same fuzzy animal. To a computer processing raw strings, they are three completely different byte sequences. Nothing in the letters themselves tells a computer that a "Dog" is related to a "Puppy" or a "Bone."
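You can see this directly in Python, in a small sketch that encodes each word to its raw bytes:

```python
# Three symbols for the same animal -- three unrelated byte sequences.
for word in ["Dog", "Perro", "Chien"]:
    print(word, "->", list(word.encode("utf-8")))

# Dog   -> [68, 111, 103]
# Perro -> [80, 101, 114, 114, 111]
# Chien -> [67, 104, 105, 101, 110]
```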
3. The Mapping Problem
To bridge this gap, we need a way to Map characters and words to numbers.
Historically, we used systems like ASCII or Unicode. In these systems, every character has a number:
A = 65
B = 66
C = 67
While this helps the computer store the text, it still doesn't help it understand the text. If you add A (65) and B (66), you get 131. Does 131 mean anything? No.
Systems like ASCII tell the computer what the letter looks like on a screen, but they don't tell the computer what the letter means in a sentence.
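A quick sketch using Python's built-in ord() and chr() makes the point concrete:

```python
# ord() returns a character's ASCII/Unicode code point; chr() reverses it.
print(ord("A"), ord("B"))    # 65 66
total = ord("A") + ord("B")
print(total)                 # 131 -- a valid number, but not a meaning
print(repr(chr(131)))        # '\x83', an obscure control character
```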
```mermaid
graph TD
    Human["Human sees: 'Hello'"] --> Brain["Brain Association: Meaning/Greeting"]
    Computer["Computer sees: 'Hello'"] --> Bytes["Bytes: 72, 101, 108, 108, 111"]
    Bytes --> Math["Math Ops: Can't calculate 'Greeting + User'"]
```
4. The Need for a Middle Ground
To build an LLM, we need a more sophisticated mapping system (sketched in code after this list) that:
- Breaks text into manageable pieces (Tokens).
- Assigns each piece a unique numerical ID.
- Places those IDs into a space where the computer can calculate "similarity."
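Here is a toy sketch of the first two steps, using naive whitespace splitting. This is not how real LLM tokenizers work (they are far more sophisticated, as we will see next lesson), but it shows the shape of the idea:

```python
sentence = "the dog chased the ball"

# Step 1: break text into pieces -- here, naive whitespace "tokens".
tokens = sentence.split()          # ['the', 'dog', 'chased', 'the', 'ball']

# Step 2: assign each unique piece a numerical ID.
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
ids = [vocab[tok] for tok in tokens]
print(vocab)  # {'ball': 0, 'chased': 1, 'dog': 2, 'the': 3}
print(ids)    # [3, 2, 1, 3, 0]

# Step 3 (a space where "similarity" is computable) requires embeddings,
# which we cover later in the course.
```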
Lesson Exercise
The ASCII Experiment: Look up an "ASCII Table" online. Find the numbers for your first name.
Observation: Try to imagine teaching a math formula to recognize that "John" and "Johnny" refer to the same person just by looking at those numbers. It's much harder than it sounds!
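If you'd rather run the experiment than read a printed table, here is a two-line Python version:

```python
# The ASCII experiment in code: the raw numbers behind each name.
print([ord(c) for c in "John"])    # [74, 111, 104, 110]
print([ord(c) for c in "Johnny"])  # [74, 111, 104, 110, 110, 121]
# The sequences share a prefix, but nothing in the raw codes says
# "these refer to the same person."
```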
Summary
In this lesson, we established that:
- Computers only understand numbers and math.
- Text is a series of human symbols that have no inherent mathematical relationship.
- We need a way to translate text into numbers that actually represent "meaning" or "context."
Next Lesson: We dive into the specific way LLMs solve this: Tokenization. We will learn how models like GPT-4 break sentences into tiny sub-word pieces to maximize their "numerical memory."