Module 2 Lesson 4: Hallucinations and Bias
Common failure modes. Why AI makes things up and how to detect biased or incorrect outputs.
The Flaws: Hallucinations and Bias
LLMs are impressive, but they are not truth machines. Because they are statistical engines (Lesson 2), they suffer from two major failure modes: hallucinations and bias.
1. Hallucinations: Confident Lies
A "hallucination" is when an AI generates an answer that sounds fluent and confident but is factually wrong.
- Why it happens: The model predicts the most likely next word, not the most truthful one. If the training data makes "San Francisco" slightly more probable than "Sacramento" as the capital of California (because big cities dominate the data), the model will state the wrong answer with total confidence.
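The mechanism above can be sketched in a few lines. This is a minimal illustration of greedy decoding, not a real model: the function name and the probabilities are invented for demonstration.

```python
# Illustrative sketch: greedy decoding picks the most PROBABLE next
# token, not the most TRUTHFUL one. Probabilities here are made up.

def greedy_next_token(probabilities: dict[str, float]) -> str:
    """Return the token with the highest probability (greedy decoding)."""
    return max(probabilities, key=probabilities.get)

# Hypothetical distribution for "The capital of California is ___".
# "San Francisco" edges out the correct answer because big-city
# contexts dominate the (imaginary) training data.
next_token_probs = {
    "San Francisco": 0.51,  # wrong, but statistically favored
    "Sacramento": 0.46,     # correct
    "Los Angeles": 0.03,
}

print(greedy_next_token(next_token_probs))  # → San Francisco
```

Note that the model never "lies" on purpose: the decoder simply has no concept of truth, only of probability.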
2. Algorithmic Bias
AI models inherit the prejudices of the data they are trained on.
- The Problem: If 90% of the computer science books in the training data were written by men, the AI may learn to associate "Software Engineer" with the pronoun "he."
- The Risk: Bias can lead to unfair hiring decisions, offensive content, or the erasure of entire cultures.
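The statistical root of this bias is easy to see in miniature. The toy "corpus" below is fabricated for illustration; real bias audits use far larger datasets and proper statistical tests.

```python
# Toy demonstration: skewed training data produces skewed associations.
# If "he" co-occurs with "engineer" more often than "she", a model
# trained on this text will reproduce that skew.
from collections import Counter

corpus = [
    "The software engineer said he fixed the bug.",
    "Our engineer explained that he prefers Python.",
    "The engineer noted he had reviewed the patch.",
    "The engineer said she deployed the release.",
]

pronoun_counts = Counter(
    word.lower()
    for sentence in corpus
    for word in sentence.split()
    if word.lower() in {"he", "she"}
)

print(pronoun_counts)  # → Counter({'he': 3, 'she': 1})
```

A model has no way to know the skew is an artifact of who wrote the text; it just learns the counts.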
Common Failure Modes
| Failure Mode | Definition | Real-world Example |
|---|---|---|
| Hallucination | Factually incorrect but fluent. | Making up a fake legal case name. |
| Stereotyping | Generalizing based on training data. | Assuming a "CEO" is always male. |
| Logic Gaps | Failing at simple reasoning or arithmetic. | Saying 1,000 + 1,000 = 3,000. |
| Sycophancy | Agreeing with the user too much. | Confirming a user's wrong belief. |
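The "Logic Gaps" row is the easiest failure mode to catch automatically, because arithmetic can be re-checked in code. Below is a minimal "verify before trusting" sketch; the claim format and function name are illustrative assumptions, not a standard API.

```python
# Minimal verification helper: find "X + Y = Z" claims in model
# output and re-check each one with real arithmetic.
import re

def check_addition_claims(text: str) -> list[tuple[str, bool]]:
    """Return (claim, is_correct) pairs for every 'X + Y = Z' found."""
    pattern = re.compile(r"([\d,]+)\s*\+\s*([\d,]+)\s*=\s*([\d,]+)")
    results = []
    for match in pattern.finditer(text):
        x, y, z = (int(n.replace(",", "")) for n in match.groups())
        results.append((match.group(0), x + y == z))
    return results

output = "The model claims 1,000 + 1,000 = 3,000 and 2 + 2 = 4."
print(check_addition_claims(output))
# → [('1,000 + 1,000 = 3,000', False), ('2 + 2 = 4', True)]
```

The same pattern, having a deterministic tool double-check the model, generalizes to dates, citations, and unit conversions.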
💡 Guidance for Learners
Trust, but verify. Never use an LLM for factual research without a second source. Treat the LLM as a highly intelligent but occasionally drunk intern.
Visualizing the Hallucination Risk
```mermaid
graph TD
    User["Query: 'What is the speed of a unicorn?'"] --> AI[AI Engine]
    AI --> Logic{"Is this a logical fact?"}
    Logic -->|No| Stat[Statistically likely words regarding speed]
    Stat --> Result["'A unicorn flies at 50 mph.'"]
    Result --> Warning[Hallucination Detected!]
```
Summary
- Hallucinations are fluent, confident, but false statements.
- Bias is a reflection of the prejudices and skews found in internet-scale training data.
- Verification is the responsibility of the human user.