Module 1 Lesson 1: What ChatGPT Is and How It Works
An introduction to ChatGPT, Large Language Models (LLMs), and the underlying Transformer architecture.
ChatGPT is a conversational AI system developed by OpenAI. It belongs to a class of models known as Large Language Models (LLMs). Unlike traditional software that follows hand-written rules, ChatGPT learns statistical patterns from vast amounts of text and uses them to predict the next token (a word or word fragment) in a sequence.
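To make "learning patterns to predict what comes next" concrete, here is a minimal sketch of a word-pair (bigram) counter. It is a deliberately tiny stand-in, not how ChatGPT is implemented: the corpus and the function names `train_bigram_counts` and `predict_next` are made up for illustration, and a real LLM uses a neural network rather than a lookup table of counts.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical one-sentence "training set".
corpus = "the cat sat on the mat and the cat slept"
counts = train_bigram_counts(corpus)

print(predict_next(counts, "the"))    # cat ("cat" follows "the" twice, "mat" once)
print(predict_next(counts, "zebra"))  # None (never seen in the corpus)
```

Nothing here was told that "cat" follows "the"; the prediction comes entirely from patterns in the text it was given, which is the same principle at a vastly smaller scale.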
1. The Core Architecture: The Transformer
ChatGPT is built on the Transformer architecture, introduced by Google researchers in 2017 in the paper "Attention Is All You Need". Transformers let the model relate each word to the other words in its context at once, rather than processing the text strictly one word at a time.
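The idea of relating every word to every other word can be sketched with a stripped-down version of self-attention. The example below assumes tiny, hand-picked 2-dimensional word vectors and skips the learned projection matrices and multiple heads that real Transformers use; GPT-style models also restrict each position to looking only at earlier positions, which is omitted here for brevity.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head self-attention where queries, keys, and values are the inputs."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this word against every word in the sentence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Blend all word vectors according to the attention weights.
        mixed = [
            sum(w * vec[i] for w, vec in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical embeddings: words 0 and 1 are similar, word 2 is different.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(embeddings)

for row in weights:
    print([round(w, 2) for w in row])
```

Each printed row shows how much one word "attends" to every word in the sentence. Word 0 gives more weight to the similar word 1 than to the dissimilar word 2, which is the weighting behavior attention is meant to capture.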
How it Thinks
When you give ChatGPT a prompt, it doesn't "know" facts the way humans do. Instead, it uses Attention Mechanisms to weigh the importance of different parts of your input.
graph TD
Input[User Prompt] --> Tokenizer[Tokenization: Text to Numbers]
Tokenizer --> Transformer[Transformer Layers: Context & Attention]
Transformer --> Predictor[Probability Distribution: Next Token]
Predictor --> Output[Generated Response]
Output --> Feedback[Iterative Feedback Loop]
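The stages in the diagram can be traced end to end with a toy example. Everything below is an assumption made for illustration: the five-entry vocabulary, the `toy_logits` function standing in for the Transformer layers, and the choice of always picking the most likely token (ChatGPT typically samples from the distribution instead).

```python
import math

# Hypothetical vocabulary; real tokenizers use tens of thousands of subword tokens.
VOCAB = ["<end>", "the", "sky", "is", "blue"]
TOKEN_TO_ID = {token: i for i, token in enumerate(VOCAB)}


def tokenize(text):
    """Tokenization: map each known word to its integer ID."""
    return [TOKEN_TO_ID[word] for word in text.lower().split()]


def toy_logits(token_ids):
    """Stand-in for the Transformer layers: one raw score per vocabulary entry.

    This fake model simply favors the token whose ID follows the last one.
    """
    favored = (token_ids[-1] + 1) % len(VOCAB)
    return [5.0 if i == favored else 0.0 for i in range(len(VOCAB))]


def softmax(logits):
    """Convert raw scores into a probability distribution over the vocabulary."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(prompt, max_new_tokens=10):
    """Repeatedly predict one token, append it, and feed the result back in."""
    token_ids = tokenize(prompt)
    for _ in range(max_new_tokens):
        probs = softmax(toy_logits(token_ids))
        next_id = max(range(len(probs)), key=probs.__getitem__)  # greedy choice
        if VOCAB[next_id] == "<end>":
            break
        token_ids.append(next_id)
    return " ".join(VOCAB[i] for i in token_ids)


print(tokenize("the sky"))  # [1, 2]
print(generate("the sky"))  # the sky is blue
```

The loop inside `generate` is the iterative step: the response is built one token at a time, and each new token becomes part of the input for the next prediction.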
2. Training Phases
ChatGPT goes through three main stages of training:
- Pre-training: Learning grammar, facts, and reasoning patterns by predicting the next token across a very large body of text, much of it from the internet.
- Supervised Fine-Tuning (SFT): Learning to follow instructions from human-provided examples.
- RLHF (Reinforcement Learning from Human Feedback): Humans rank different responses, teaching the model what is helpful, safe, and honest.
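To give a feel for how a human ranking can become a training signal, here is a sketch of a pairwise preference loss that is common in published RLHF work. It is an assumption that this mirrors ChatGPT's exact recipe; the function name `preference_loss` and the example scores are hypothetical. The scores are meant to come from a separate "reward model" that rates responses, and the loss is small when the human-preferred response already scores higher.

```python
import math


def preference_loss(reward_preferred, reward_rejected):
    """Pairwise ranking loss: -log(sigmoid(reward_preferred - reward_rejected))."""
    margin = reward_preferred - reward_rejected
    # Equivalent to log(1 + exp(-margin)), written to avoid overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# The reward model agrees with the human ranking: low loss.
print(round(preference_loss(2.0, 0.0), 4))  # 0.1269

# The reward model disagrees with the human ranking: high loss.
print(round(preference_loss(0.0, 2.0), 4))  # 2.1269
```

Minimizing this loss over many ranked pairs pushes the reward model to agree with human judgments; the language model is then tuned with reinforcement learning to produce responses that score well.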
3. The Interface
The ChatGPT web interface is designed for simplicity:
- Sidebar: History of your previous conversations.
- Model Selector: Switch between versions (e.g., GPT-4o, GPT-3.5).
- Text Box: Where you "prompt" the AI.
Hands-on: Explore the Web Interface
- Open chat.openai.com.
- Look at the sidebar and settings.
- Try asking a simple question like, "Explain gravity to a 5-year-old."
- Notice how you can "regenerate" the response or give feedback (thumbs up/down).
Key Takeaways
- ChatGPT is a predictive engine, not a database.
- It uses the Transformer architecture to understand context.
- RLHF is a major reason its responses feel more helpful and conversational than those of earlier language models.