When a breach happens, the difference between a contained incident and a catastrophic failure comes down to preparation and process. This presentation covers the NIST IR framework end-to-end.
Average time to identify a breach: 204 days. Average time to contain: 73 days. That is a combined 277 days from start to finish. Structured IR cuts both numbers dramatically.
Preparation is the only phase you control before the incident. Every hour invested here saves ten during response.
The biggest mistake in detection: alert fatigue. When everything is critical, nothing is.
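
One way to act on this is to rank alerts so that only a small, high-confidence set pages a human. The sketch below is illustrative only: the `Alert` fields, the severity weights, the 1 to 3 asset criticality scale, and the `PAGE_THRESHOLD` value are all assumptions to tune for your environment, not values from any standard.

```python
from dataclasses import dataclass

# Hypothetical weights and threshold; tune these to your own environment.
SEVERITY_WEIGHT = {"low": 1, "medium": 2, "high": 3, "critical": 4}
PAGE_THRESHOLD = 9  # illustrative cut-off, not a standard value


@dataclass(frozen=True)
class Alert:
    rule: str
    severity: str           # one of SEVERITY_WEIGHT's keys
    asset_criticality: int  # 1 (lab box) to 3 (crown-jewel system)


def priority(alert: Alert) -> int:
    """Score an alert as severity weight times asset criticality."""
    return SEVERITY_WEIGHT[alert.severity] * alert.asset_criticality


def triage(alerts: list[Alert]) -> tuple[list[Alert], list[Alert]]:
    """Split alerts into (page_now, review_queue), highest score first."""
    ranked = sorted(alerts, key=priority, reverse=True)
    page_now = [a for a in ranked if priority(a) >= PAGE_THRESHOLD]
    review_queue = [a for a in ranked if priority(a) < PAGE_THRESHOLD]
    return page_now, review_queue
```

With these assumed weights, a "critical" rule firing on a lab machine scores 4 and goes to the review queue, while a "high" rule on a domain controller scores 9 and pages the on-call analyst. The point is that the rule's label alone does not decide what interrupts a person.
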
Critical balance: contain fast enough to limit damage, but don't tip off the attacker before you understand their full footprint.
| Phase | Actions | Key Metric |
|---|---|---|
| Recovery | Restore from clean backups, verify system integrity, phased reconnection to production | Mean time to recover (MTTR) |
| Monitoring | Enhanced monitoring of affected systems for 30-90 days post-incident | Recurrence rate |
| Post-mortem | Blameless review within 72 hours covering the timeline, root cause, what worked, and what didn't | Action items completed |
| Improvement | Update playbooks, retrain staff, patch process gaps, test fixes | Time to close action items |
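
The recovery metric in the table can be computed directly from incident timestamps. This is a minimal sketch, assuming MTTR is measured from detection to restored service (teams define the start point differently). The function name and the `(detected, recovered)` tuple layout are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_recover(
    incidents: list[tuple[datetime, datetime]],
) -> timedelta:
    """Average of (recovered - detected) over all incidents.

    Each incident is a (detected, recovered) pair of datetimes.
    """
    if not incidents:
        raise ValueError("need at least one incident to compute MTTR")
    durations = [recovered - detected for detected, recovered in incidents]
    return sum(durations, timedelta()) / len(durations)


# Made-up sample data for illustration only.
sample = [
    (datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 20, 0)),   # 12 hours
    (datetime(2024, 2, 10, 9, 0), datetime(2024, 2, 11, 9, 0)),  # 24 hours
]
mttr = mean_time_to_recover(sample)  # timedelta of 18 hours
```

Tracking this value per quarter, rather than per incident, makes it easier to see whether playbook changes are actually shortening recovery.
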
A post-mortem that doesn't produce measurable action items with owners and deadlines is just a meeting.
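
A lightweight check can enforce this before a post-mortem is closed. The sketch below assumes a hypothetical `ActionItem` record; the field names and helper functions are illustrative and not tied to any particular ticketing tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ActionItem:
    description: str
    owner: Optional[str] = None  # person accountable for the fix
    due: Optional[date] = None   # committed deadline
    done: bool = False


def missing_owner_or_deadline(items: list[ActionItem]) -> list[ActionItem]:
    """Return items that cannot be tracked: no owner or no deadline."""
    return [i for i in items if not i.owner or i.due is None]


def overdue(items: list[ActionItem], today: date) -> list[ActionItem]:
    """Return open items whose deadline has already passed."""
    return [
        i for i in items
        if i.due is not None and not i.done and i.due < today
    ]
```

If `missing_owner_or_deadline` returns anything, the review has produced discussion points rather than action items, and the post-mortem should stay open until each one is assigned.
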
Your IR team must include non-technical roles. Legal decides notification timelines. PR controls the narrative. HR handles insider threats.
On June 27, 2017, the NotPetya wiper malware destroyed 49,000 laptops, 3,500 servers, and the entire Active Directory infrastructure at Maersk, the world's largest container shipping company.
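Total damage: an estimated $250 million to $300 million. Operations were disrupted across a company that runs 76 port terminals and operates in 130 countries. The core infrastructure was rebuilt from scratch in 10 days, which was only possible because a single domain controller in Ghana had been offline during the attack and still held an intact copy of Active Directory.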
Key lessons: Maersk had no segmentation between IT and OT networks, so nothing stopped the malware from moving laterally. It entered through a compromised update to M.E.Doc, a Ukrainian tax software package (a supply chain attack), and spread via the EternalBlue exploit and harvested credentials.
The recovery required simultaneous reinstallation of 4,000 servers and 45,000 PCs. Staff worked around the clock using WhatsApp because email was down.
Run tabletop exercises at least quarterly. Rotate scenarios. Include executives — they make the hard calls during a real incident.
The goal isn't to test technical skills. It's to test decision-making, communication, and coordination under pressure.