Tabletops That Find Real Gaps, Not Ones That Flatter the Plan

I've sat through a lot of incident tabletops, and most of them are quietly useless. Not because the people are unprepared — usually the opposite. The room is full of competent engineers and a confident incident commander, the facilitator reads a scenario off a slide, everyone narrates the step they'd take from the runbook, and ninety minutes later we all agree the plan works. We write "no major gaps identified" in the after-action notes and go back to our day. We just paid a dozen senior people to confirm a document.

That's the failure mode I want to name, because it's seductive. A tabletop that flatters the plan feels productive. It generates a clean artifact you can show an auditor or a board. And it teaches you nothing, because you designed it to be survivable. The whole point of a drill is to fail somewhere cheap. If nobody is uncomfortable, you didn't run an exercise — you ran a rehearsal of the happy path.

The real mechanic: you're testing decisions, not procedures

Here's the reframe that changed how I run these. A tabletop is not a test of whether your team knows the steps. It's a test of whether your team knows who decides when the steps run out. Procedures are the easy part — they're written down, they're trainable, and a real incident will blow past them in the first twenty minutes anyway. What actually determines whether an incident goes well is decision-rights: who can declare a Sev1, who can pull a customer-facing service offline, who can authorize talking to a regulator, who owns the call to fail over to a region when the failover itself is risky.

In a calm room, everyone assumes those answers are obvious. They are not. The single most common gap I surface is two people who each believe they own the same decision, or — worse — a decision that everyone assumes someone else owns and nobody actually does. You find this the moment you stop asking "what would you do?" and start asking "who gets to say yes, and what happens if they're asleep?" That second question is where the flinch lives, and the flinch is the finding.
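One way to make that question concrete before the exercise is to write the decision-rights down as data and check them mechanically. What follows is a minimal sketch, not a description of any real tool: the `DecisionRight` structure, the `audit_decision_rights` helper, and the role names are all hypothetical, and "owners" here means everyone who believes they own the call.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionRight:
    """One decision and the people who believe they can make it (hypothetical model)."""
    decision: str
    owners: list = field(default_factory=list)    # everyone who thinks they own the call
    deputies: list = field(default_factory=list)  # who decides if the owner is asleep


def audit_decision_rights(rights):
    """Return (decision, problem) pairs for the gaps a tabletop should probe."""
    findings = []
    for right in rights:
        if len(right.owners) == 0:
            findings.append((right.decision, "unowned"))
        if len(right.owners) > 1:
            findings.append((right.decision, "contested"))
        if right.owners and not right.deputies:
            findings.append((right.decision, "no deputy"))
    return findings


# Illustrative entries only; the roles are assumptions, not a recommended org chart.
rights = [
    DecisionRight("declare a Sev1", ["incident commander"], ["on-call lead"]),
    DecisionRight("pull a customer-facing service offline",
                  ["incident commander", "service owner"], ["on-call lead"]),
    DecisionRight("authorize talking to a regulator"),
    DecisionRight("fail over to another region", ["platform lead"]),
]

findings = audit_decision_rights(rights)
for decision, problem in findings:
    print(f"{problem}: {decision}")
```

The output is less interesting than the argument you have while filling in the `owners` column. If two people both write their own name next to the same decision, you have your first finding before the scenario starts.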

How to design one that actually bites

A few things make the difference between theater and a real exercise. None of them are expensive.

Inject ambiguity, not just badness. A clean "the database is down" prompt gets a clean runbook answer. "Latency is up, error rates are weird in one region, and the dashboard you'd normally trust is one of the things behaving strangely" forces people to act without certainty — which is what real incidents feel like.
Take away a person or a tool mid-exercise. Your on-call lead is unreachable. Your primary alerting channel is the thing that's degraded. The one engineer who understands the legacy payment path is on a plane. Resilience that depends on a named hero isn't resilience; it's a single point of failure wearing a hoodie.
Force a decision with a clock. "You have five minutes to decide whether to fail over, and the failover has a known risk of data inconsistency. Go." Comfortable consensus evaporates under a timer, and that's exactly the condition you want to observe in a room where nothing is actually on fire.
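If you script your scenarios ahead of time, those three ingredients can be checked before anyone enters the room. This is a rough sketch under my own assumptions: the `Inject` structure, the `InjectKind` names, and the `lint_scenario` helper are hypothetical, invented here to illustrate the idea rather than taken from any facilitation tool.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InjectKind(Enum):
    AMBIGUITY = "ambiguity"            # conflicting or untrustworthy signals
    REMOVAL = "removal"                # a person or tool becomes unavailable
    TIMED_DECISION = "timed decision"  # a call that must be made against a clock


@dataclass
class Inject:
    minute: int                        # minutes into the exercise
    kind: InjectKind
    prompt: str
    deadline_minutes: Optional[int] = None


def lint_scenario(injects):
    """List the ways a scenario is too comfortable to teach anything."""
    problems = []
    present = {inject.kind for inject in injects}
    for kind in InjectKind:
        if kind not in present:
            problems.append(f"no {kind.value} inject")
    for inject in injects:
        if inject.kind is InjectKind.TIMED_DECISION and inject.deadline_minutes is None:
            problems.append(f"timed decision at minute {inject.minute} has no deadline")
    return problems


# Illustrative scenario: it has ambiguity and a clock, but nobody is taken away.
scenario = [
    Inject(0, InjectKind.AMBIGUITY,
           "Latency is up, error rates are weird in one region, "
           "and the dashboard you'd normally trust is behaving strangely."),
    Inject(30, InjectKind.TIMED_DECISION,
           "Decide whether to fail over; the failover risks data inconsistency.",
           deadline_minutes=5),
]

problems = lint_scenario(scenario)
print(problems)
```

A scenario that passes this check can still be too gentle. All the check does is catch a script that forgot to include any discomfort at all.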

I run two flavors, and they're not interchangeable. The tabletop is discussion-based — cheap, frequent, good for probing decision-rights and communication paths. The game day is the live version: you actually degrade something in a controlled environment and watch the real tooling, the real alerts, and the real muscle memory respond. Tabletops surface broken assumptions about who and how. Game days surface broken assumptions about whether the thing you wrote down even works. You need both, and in a fintech serving more than 1,500 financial institutions, the gap between "the runbook says" and "the system does" is not a gap you want to discover live at two in the morning.

What to do with what you find

The output of a good exercise is not a feeling of confidence. It's a short list of specific, uncomfortable things that didn't hold. The decision nobody owned. The assumption that the backup region was warm when it was actually cold. The escalation path that routed to a person who had left the company. The fact that the incident commander and the head of comms had genuinely different ideas about who notifies customers and when.

Treat each one as a defect with an owner and a due date, same as you'd treat a bug. If a finding doesn't turn into a code change, a config change, a documented decision-right, or a removed dependency on one person, the exercise didn't count. And measure yourself on the right thing: not how smoothly the tabletop ran, but how many genuine surprises it produced. A drill that surfaces five broken assumptions is a roaring success. A drill that surfaces zero means either you have an extraordinary program or, far more likely, you ran it too safe.
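Here is a small sketch of what "a defect with an owner and a due date" could look like if you tracked findings as data. Everything in it is an assumption made for illustration: the `Finding` structure, the `untracked` helper, the example owners, and the dates are hypothetical. Only the four remediation types come from the paragraph above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# The four outcomes that make a finding "count".
REMEDIATIONS = {
    "code change",
    "config change",
    "documented decision-right",
    "removed single-person dependency",
}


@dataclass
class Finding:
    summary: str
    owner: Optional[str] = None
    due: Optional[date] = None
    remediation: Optional[str] = None


def untracked(findings):
    """Return summaries of findings that lack an owner, a due date, or a real remediation."""
    return [
        f.summary
        for f in findings
        if not f.owner or f.due is None or f.remediation not in REMEDIATIONS
    ]


# Hypothetical after-action list; names and dates are placeholders.
after_action = [
    Finding("IC and head of comms disagree on who notifies customers",
            owner="head of comms", due=date(2030, 1, 15),
            remediation="documented decision-right"),
    Finding("Backup region was cold, not warm", owner="platform lead"),
    Finding("Escalation path routes to a former employee"),
]

open_items = untracked(after_action)
print(f"{len(after_action)} surprises, {len(open_items)} not yet actionable")
```

The count of surprises is the number to be proud of. The count of untracked items is the one to drive to zero before the next exercise.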

So here's the challenge. Before your next tabletop, write down the one thing you're most afraid to test: the dependency, the person, the decision you quietly hope never gets stressed. Then make that the scenario. The gaps a flattering tabletop hides will get found eventually. The only question is whether you find them in a conference room, or whether a real incident finds them for you, on its schedule instead of yours.
