
The Watchmen: CloudWatch and CloudTrail
Master the tools of observation. Learn the difference between monitoring performance with CloudWatch and auditing actions with CloudTrail.
Observation is Governance
In the AWS shared responsibility model, you are responsible for monitoring your own applications. AWS provides two essential services that act as the "Black Box Recorders" for your AI project.
On the AWS Certified AI Practitioner exam, expect scenario questions that ask you to choose between CloudWatch and CloudTrail. The two are easy to confuse, so let's make sure you can tell them apart.
1. AWS CloudTrail (The Auditor)
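CloudTrail is for "WHO did WHAT." When someone makes an API call in your AWS account (e.g., "Create a SageMaker Notebook" or "Invoke Bedrock Model"), CloudTrail records who made the call, when, and from where. Management events are recorded by default; high-volume data events, such as reads and writes of individual S3 objects, are recorded only if you enable them on a trail.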
- Focus: Governance, Compliance, Operational Auditing.
- Example Log: "The user 'Admin-John' deleted the production model 'Cancer-Detect-v1' at 2:00 PM from IP address 1.2.3.4."
The Memory Hook: CloudTrail is the Security Guard who has a clipboard and writes down everyone who enters or leaves the building.
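To make the "clipboard" concrete, here is a minimal sketch of asking CloudTrail who called a given API action recently. It assumes `cloudtrail` is a boto3 CloudTrail client (for example `boto3.client("cloudtrail")`) and uses its `lookup_events` call; the helper name `who_did_what` and the returned field names are made up for this lesson. It reads only the first page of results, so treat it as an illustration rather than a complete audit tool.

```python
import json
from datetime import datetime, timedelta, timezone


def who_did_what(cloudtrail, event_name, hours=24):
    """Return user, action, time, and source IP for recent calls to one API action.

    `cloudtrail` is assumed to be a boto3 CloudTrail client; any object
    with a compatible lookup_events method also works.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    response = cloudtrail.lookup_events(
        LookupAttributes=[
            {"AttributeKey": "EventName", "AttributeValue": event_name}
        ],
        StartTime=start,
        EndTime=end,
        MaxResults=50,  # first page only; pagination is omitted in this sketch
    )
    findings = []
    for event in response.get("Events", []):
        # CloudTrailEvent carries the full record as a JSON string.
        detail = json.loads(event.get("CloudTrailEvent", "{}"))
        findings.append(
            {
                "user": event.get("Username"),
                "action": event.get("EventName"),
                "time": event.get("EventTime"),
                "source_ip": detail.get("sourceIPAddress"),
            }
        )
    return findings
```

With real credentials, a call such as `who_did_what(boto3.client("cloudtrail"), "InvokeModel")` would list recent callers of that action, provided those events are being recorded in your account.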
2. Amazon CloudWatch (The Observer)
CloudWatch is for "HOW is the system FEELING." It tracks metrics like CPU usage, Latency (Delay), and Error Rates.
- Focus: Performance, Health, Dashboards, and Alarms.
- Example Metric: "Your Bedrock model is taking an average of 4 seconds to respond (Latency is high)."
- CloudWatch Logs: This is where the actual "Output" of your code goes (e.g., prints and errors from your script).
The Memory Hook: CloudWatch is the Dashboard in a car. It tells you how fast you are going and if the engine is getting too hot.
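Reading the "dashboard" from code might look like the sketch below. It assumes `cloudwatch` is a boto3 CloudWatch client and that your model publishes the `InvocationLatency` metric (in milliseconds) in the `AWS/Bedrock` namespace with a `ModelId` dimension; the helper name `average_latency_ms` is hypothetical. Check the metric names your own account actually publishes before relying on it.

```python
from datetime import datetime, timedelta, timezone


def average_latency_ms(cloudwatch, model_id, minutes=10):
    """Average model latency over the last `minutes`, or None if no data.

    `cloudwatch` is assumed to be a boto3 CloudWatch client; the namespace,
    metric name, and dimension below are assumptions about Bedrock metrics.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/Bedrock",
        MetricName="InvocationLatency",
        Dimensions=[{"Name": "ModelId", "Value": model_id}],
        StartTime=start,
        EndTime=end,
        Period=60,  # one datapoint per minute
        Statistics=["Average"],
    )
    datapoints = response.get("Datapoints", [])
    if not datapoints:
        return None
    # Unweighted mean of the per-minute averages: fine for a quick look,
    # not a substitute for a properly weighted statistic.
    return sum(point["Average"] for point in datapoints) / len(datapoints)
```

A result near 4000 would correspond to the "4 seconds" example above.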
3. Comparison for AI Scenarios
| Question | Tool |
|---|---|
| "Which model is costing us the most money today?" | CloudWatch (Metric: Tokens processed, a usage proxy for cost) |
| "Who authorized the deletion of our training data?" | CloudTrail (Action: DeleteObject, an S3 data event that must be enabled on a trail) |
| "The AI is responding too slowly. Why?" | CloudWatch (Metric: Model Latency) |
| "We need to prove that we encrypted our data." | CloudTrail (Action: CreateKey) |
4. Visualizing the Watchmen
graph TD
A[Data Scientist] -->|Action: InvokeModel| B[AWS API]
B -->|Log Action| C[AWS CloudTrail]
B -->|Execute AI| D[Bedrock/SageMaker]
D -->|Send Metrics: Tokens/Errors| E[Amazon CloudWatch]
E -->|If Errors > 5%| F[CloudWatch Alarm: ALERT THE TEAM]
C -->|Retention Policy| G[S3 Bucket: Audit History]
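The "If Errors > 5%" branch in the diagram can be sketched as a CloudWatch alarm built on metric math. The helper below only builds the keyword arguments for the CloudWatch `put_metric_alarm` call, so you can inspect them before creating anything. The function name, the alarm name, and the SNS topic are hypothetical. It also assumes that Bedrock's `Invocations` metric counts only successful requests, which is why errors are added back into the denominator; adjust the expression if your metrics behave differently.

```python
def build_error_rate_alarm(model_id, sns_topic_arn, threshold_pct=5.0):
    """Build keyword arguments for CloudWatch put_metric_alarm.

    Assumes the AWS/Bedrock namespace with a ModelId dimension, and that
    Invocations counts successful requests only.
    """

    def metric(query_id, metric_name):
        # One raw metric, summed over 5 minutes, used only as math input.
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Bedrock",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "ModelId", "Value": model_id}],
                },
                "Period": 300,
                "Stat": "Sum",
            },
            "ReturnData": False,
        }

    return {
        "AlarmName": f"bedrock-error-rate-{model_id}",
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": threshold_pct,
        "EvaluationPeriods": 1,
        "TreatMissingData": "notBreaching",  # no traffic means no alarm
        "AlarmActions": [sns_topic_arn],  # e.g. an SNS topic that pages the team
        "Metrics": [
            metric("invocations", "Invocations"),
            metric("client_errors", "InvocationClientErrors"),
            metric("server_errors", "InvocationServerErrors"),
            {
                "Id": "error_rate",
                "Expression": (
                    "100 * (client_errors + server_errors) / "
                    "(invocations + client_errors + server_errors)"
                ),
                "Label": "Error rate (%)",
                "ReturnData": True,  # the alarm watches this expression only
            },
        ],
    }
```

With boto3 installed and credentials configured, the alarm would be created with `boto3.client("cloudwatch").put_metric_alarm(**build_error_rate_alarm(model_id, topic_arn))`.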
5. Summary: Operational Excellence
- If the question mentions "Compliance," "Auditing," or "Identity," choose CloudTrail.
- If the question mentions "Performance," "Errors," "Latency," or "Monitoring," choose CloudWatch.
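If it helps to see the rule of thumb written down mechanically, the toy sketch below counts keyword hits in a question and returns the matching service. It is a study aid only, not how any AWS tool works; the function name is made up, and the keyword sets extend the list above with a few assumed extras ("audit", "who", "error").

```python
import re

# Keywords from the summary above, plus a few assumed variants.
CLOUDTRAIL_KEYWORDS = {"compliance", "auditing", "audit", "identity", "who"}
CLOUDWATCH_KEYWORDS = {"performance", "errors", "error", "latency", "monitoring"}


def pick_service(question):
    """Guess which service an exam question points to, by keyword count."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    trail_hits = len(words & CLOUDTRAIL_KEYWORDS)
    watch_hits = len(words & CLOUDWATCH_KEYWORDS)
    if trail_hits > watch_hits:
        return "AWS CloudTrail"
    if watch_hits > trail_hits:
        return "Amazon CloudWatch"
    return "unclear"  # a tie or no keywords: read the scenario more closely
```

Real exam questions can mix both kinds of keywords, so use the rule as a first filter and then check what the scenario is actually asking for: an action record or a health measurement.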
Exercise: Identify the Tool
A company's AI chatbot suddenly stops responding to customers. The engineering team needs to see if the "Error Rate" has increased in the last 10 minutes. Which service should they look at?
- A. AWS CloudTrail.
- B. Amazon CloudWatch.
- C. Amazon Inspector.
- D. AWS IAM.
The Answer is B! "Error Rate" is a performance metric, which is the domain of Amazon CloudWatch.
Knowledge Check
Which AWS service would you use to see 'Who' called a specific Amazon Bedrock API and 'When' they called it?
What's Next?
Watching the system is great, but does it follow the "Global Rules"? In the next lesson, we see how AWS helps you pass legal audits. Find out in Lesson 3: Regulatory standards and AWS compliance reports.