
Module 6 Lesson 5: Rollbacks and Error Recovery
The emergency exits. Learn how to perform instant rollbacks when a deployment goes wrong and how to automate recovery using GitLab's built-in tools.
No matter how many tests you have, things will eventually break. A senior DevOps engineer is judged not by whether they have outages, but by how fast they fix them.
1. The Manual Rollback (The "Easy" Button)
Inside the GitLab Operate -> Environments page:
- Each environment lists its past deployments, and a "Rollback" button appears next to the earlier successful ones.
- Clicking it simply re-runs the "Deploy" job from that earlier pipeline, so the older commit is deployed again. No new code is built.
- Tip: This is why "Immutable Artifacts" (Module 3) are so important. You need that old version to still exist on the server to roll back to it!
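To make the tip above concrete, here is a minimal sketch of a deploy job that a rollback can safely re-run. It assumes the build stage pushed an image tagged with the commit SHA, and that a script named scripts/deploy.sh exists in the repository; both the script and the URL are placeholders, not part of this course's earlier material.

```yaml
# Sketch only: the deploy input is an immutable, commit-tagged image.
deploy_production:
  stage: deploy
  environment:
    name: production
    url: https://example.com          # placeholder URL
  script:
    # CI_REGISTRY_IMAGE and CI_COMMIT_SHORT_SHA are predefined GitLab variables.
    # scripts/deploy.sh is a hypothetical script that deploys the given image.
    - ./scripts/deploy.sh "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
```

Because the job is re-run in the context of the older pipeline, CI_COMMIT_SHORT_SHA resolves to the older commit, so the older image is the one deployed. This only works if that image has not been deleted from the registry.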
2. Automated Rollbacks (V14+)
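For Kubernetes and certain cloud environments, GitLab can detect a failing deployment and roll back automatically.
- If the "Health Check" (Module 5) keeps failing after a deployment (for example, for 2 minutes), the previous successful version is redeployed. In GitLab's built-in Auto Rollback feature, the trigger is a critical alert raised against the environment. Availability and exact conditions depend on your GitLab version and tier, so check the documentation for your instance.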
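If the built-in feature is not available on your instance, a similar effect can be approximated inside the pipeline itself. The sketch below is not GitLab's Auto Rollback; it is a plain job that asks Kubernetes to undo the rollout when it does not become healthy within 2 minutes. It assumes the runner has kubectl installed and configured for the cluster, and that the Deployment and its container are both named web (hypothetical names).

```yaml
# Sketch only: in-pipeline rollback using standard kubectl commands.
deploy_k8s:
  stage: deploy
  environment:
    name: production
  script:
    # Point the Deployment "web" (container "web") at the new image.
    - kubectl set image deployment/web web="$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
    # Wait up to 120 seconds for the rollout to finish.
    # If it does not, revert to the previous revision and fail the job.
    - |
      if ! kubectl rollout status deployment/web --timeout=120s; then
        kubectl rollout undo deployment/web
        exit 1
      fi
```

The job still fails after reverting, so the pipeline stays red and the team knows the new version never went live.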
3. "Fix Forward" vs "Roll Back"
- Roll Back: Reverting to the old version. (Best for "Site is Down" emergencies).
- Fix Forward: Pushing a new, quick fix to the main branch. (Best for "Typo in a button" or small UI bugs).
4. The "Post-Mortem"
Once the site is back up, you must use the GitLab Audit Events and Pipeline Logs to find out:
- Why did the tests pass but the deployment fail?
- Was it a server configuration issue?
- How can we add a new "Quality Gate" (Module 5) to ensure this specific bug NEVER happens again?
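One possible shape for such a new quality gate is a smoke test that runs against staging before anything reaches production. This is a sketch under assumptions: the verify stage, the STAGING_URL variable, and the /health path are hypothetical and would need to match your own pipeline and application.

```yaml
# Sketch only: fail the pipeline if staging does not answer its health endpoint.
smoke_test_staging:
  stage: verify                       # assumed stage, placed after the staging deploy
  image: alpine:3.20
  script:
    - apk add --no-cache curl
    # --fail makes curl exit non-zero on HTTP error responses, which fails the job.
    # --retry repeats the request a few times on transient errors.
    - curl --fail --silent --show-error --retry 5 --retry-delay 5 "$STAGING_URL/health"
```

If this job fails, later stages do not run by default, so the broken build is stopped before the production deploy.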
Exercise: The Emergency Drill
- Imagine your "Production" deployment script just deleted the /var/www folder. What is your 5-second plan to fix it?
- Go to Operate -> Environments and find the "Rollback" button. Research: Does the rollback re-run the test stage too?
- Why is the "Roll Back" strategy better for user experience than leaving the site "Broken" while you try to fix it?
- Search: How do you use the when: on_failure keyword in a .gitlab-ci.yml to send an alert to a phone?
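As a starting point for the last exercise item, here is a minimal sketch of an alert job. It assumes you have a chat or paging service that accepts a JSON webhook and forwards it to your phone, and that its URL is stored in a CI/CD variable named ALERT_WEBHOOK_URL (a hypothetical name). The payload format is also an assumption; adapt it to your service.

```yaml
# Sketch only: runs when at least one job in an earlier stage has failed.
notify_failure:
  stage: .post                        # built-in stage that always runs last
  image: alpine:3.20
  when: on_failure
  script:
    - apk add --no-cache curl
    # CI_PROJECT_PATH and CI_PIPELINE_URL are predefined GitLab variables.
    - >
      curl --fail --silent --show-error
      -X POST -H "Content-Type: application/json"
      -d "{\"text\": \"Pipeline failed for $CI_PROJECT_PATH: $CI_PIPELINE_URL\"}"
      "$ALERT_WEBHOOK_URL"
```

Jobs use when: on_success by default, which is why an alert job needs the explicit when: on_failure to run after a failure.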
Summary
You have completed Module 6: Deployment Strategies. You now have the skills to get code to the user using SSH, manage complex staging/production pipelines, use advanced patterns like Blue-Green, and recover instantly when things go wrong.
Next Module: Container transformation: Module 7: Containerized Pipelines.