Root Cause Analysis

Why the obvious answer is almost never the right one.


01. The one-liner

Most problems are not what they look like on the surface. The real cause is usually a few layers deeper, and you find it by asking 'but why?' until you get to something small, specific, and fixable.

02. Where it came from

The short history

The 5 Whys is usually credited to Sakichi Toyoda, the founder of Toyoda Automatic Loom Works (the company Toyota Motor later grew out of), in the early decades of the twentieth century, long before the rest of the Lean toolkit was assembled. Taiichi Ohno, the architect of the Toyota Production System, later made it a standard habit on the shop floor. A machine breaks. Someone says, 'It's broken.' Toyoda says, 'But why?' And then 'But why?' again. Roughly five whys in, you stop hearing surface answers and start hearing real ones. The number isn't sacred. Sometimes it's three. Sometimes it's seven. Five is just the rough average it takes to get past the polite answer, past the convenient answer, past the answer that protects you from blame, and into the actual cause.

The other classic root-cause tool, the Fishbone diagram, came out of Japan a few decades later. Kaoru Ishikawa was using it at Kawasaki shipyards by the 1960s. He drew a big horizontal arrow with the problem written at the head, then six diagonal lines branching off as places to investigate: people, methods, machines, materials, measurement, and environment. The point isn't the artwork. The point is that brains find it easier to think about a problem when the categories are sitting on a page in front of them. The two techniques are companions. The 5 Whys digs vertically into one chain of cause. The Fishbone spreads horizontally to make sure you haven't missed a whole category. Most real problems need both.
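
If you think better in code than on paper, the Fishbone can be sketched as a small data structure. This is only an illustration in Python, not a standard tool: the function names and the example causes ('worn bearing', 'no lubrication schedule') are made up here. It shows the one job the diagram does, which is to tell you which of the six categories you haven't looked at yet.

```python
# The six branches named in the text above.
CATEGORIES = ("people", "methods", "machines", "materials", "measurement", "environment")


def new_fishbone(problem):
    """Start a diagram: the problem at the head, six empty branches."""
    return {"problem": problem, "causes": {category: [] for category in CATEGORIES}}


def add_cause(fishbone, category, cause):
    """Hang a candidate cause on one branch."""
    if category not in fishbone["causes"]:
        raise ValueError(f"unknown category: {category}")
    fishbone["causes"][category].append(cause)


def unexplored(fishbone):
    """List the branches nobody has written anything on yet."""
    return [category for category, causes in fishbone["causes"].items() if not causes]


# Hypothetical example: the causes below are invented for illustration.
fishbone = new_fishbone("The machine broke")
add_cause(fishbone, "machines", "worn bearing")
add_cause(fishbone, "methods", "no lubrication schedule")

print(unexplored(fishbone))
```

The empty branches are the useful output. Each one is a category you haven't ruled out yet, and any cause that looks promising is a good place to start a 5 Whys chain.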

03. How it works at work

The Belt textbook version

Your team missed a deadline last week. The obvious answer is, 'We didn't have enough time.' That's almost certainly not the actual cause. Sit down with a piece of paper and walk it through.

  1. Why did we miss the deadline?

    Because the testing phase took twice as long as we estimated. That's the first answer. It's also the easiest one, and it's the one most teams stop at. Don't stop.

  2. Why did testing take twice as long?

    Because we kept finding bugs we hadn't planned for. Already this is more useful than the first answer. But it's still a symptom, not a cause.

  3. Why were there so many unexpected bugs?

    Because the spec didn't define what was supposed to happen at the edges. Most teams fix things here. Most teams are wrong to fix things here, because the spec isn't the root cause either.

  4. Why was the spec unclear about edges?

    Because we wrote it in a one-hour meeting two days before kickoff, under pressure to start. Now we're getting somewhere. This is starting to look fixable, and it doesn't sound like anyone's fault.

  5. Why do we write specs under pressure?

    Because we promise feature dates before we've actually scoped what's involved. That's the root. It's a process problem, not a people problem. The fix is at the front end, not the back end. The fix is not 'test more' or 'work harder.' The fix is to change when scoping happens.
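
The same walk-through can be written down as a chain of question-and-answer pairs. The sketch below is one possible way to record it in Python. The names Why, root_cause, and render are assumptions made for this example, not part of any established tool. The answers are the ones from the list above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Why:
    question: str
    answer: str


def root_cause(chain: list[Why]) -> str:
    """Return the last answer in the chain, which is where the method stops digging."""
    if not chain:
        raise ValueError("ask at least one why")
    return chain[-1].answer


def render(symptom: str, chain: list[Why]) -> str:
    """Lay the chain out as an indented outline, one level deeper per why."""
    lines = [f"Symptom: {symptom}"]
    for depth, why in enumerate(chain, start=1):
        lines.append(f"{'  ' * depth}{depth}. {why.question} {why.answer}")
    return "\n".join(lines)


missed_deadline = [
    Why("Why did we miss the deadline?",
        "The testing phase took twice as long as we estimated."),
    Why("Why did testing take twice as long?",
        "We kept finding bugs we hadn't planned for."),
    Why("Why were there so many unexpected bugs?",
        "The spec didn't define what was supposed to happen at the edges."),
    Why("Why was the spec unclear about edges?",
        "We wrote it in a one-hour meeting two days before kickoff."),
    Why("Why do we write specs under pressure?",
        "We promise feature dates before we've actually scoped what's involved."),
]

print(render("We missed the deadline.", missed_deadline))
print("Root:", root_cause(missed_deadline))
```

Notice that the code can't tell you when to stop. It just treats the last answer as the root. Deciding whether that answer is small, specific, and fixable is still your job.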

04. How it works in real life

The secret self-help version

You're tired all the time. The obvious answer is, 'I need more sleep.' Almost everyone stops there, takes a melatonin, and is just as tired in three weeks. Walk it through honestly.

  1. Why am I tired all the time?

    Because I'm only getting six hours of sleep. True. Also unhelpful as a stopping point.

  2. Why only six hours?

    Because I stay up until midnight, even though I want to be asleep by ten. Getting closer.

  3. Why am I awake until midnight?

    Because I'm on my phone in bed for an hour, watching things I won't remember tomorrow. Most people stop here, leave their phone in the kitchen for a week, find their sleep didn't really improve, and conclude that root cause analysis doesn't work. They stopped too early.

  4. Why am I on my phone for an hour at night?

    Because the bed is the first time all day my brain hasn't been doing something. The phone is filling a transition I never gave myself anywhere else.

  5. Why is bed my only quiet space?

    Because I go from the workday straight into family time straight into dinner straight into bedtime. The phone in bed isn't really about the phone. It's the body trying to find rest it didn't get during the day. The actual fix is a twenty-minute decompression window before family time. Outside if possible. Without a screen. The bed-phone problem will mostly fix itself once that gap exists.

05. The assignment

Try it this week

Time

15 minutes

What you need

A piece of paper. A pen. One thing in your life that keeps coming back wrong.

The exercise

Write the symptom at the top of the page. The thing you actually notice. 'I'm always behind.' 'We fight about the same thing every weekend.' 'I can't stick with workouts.' Underneath, write your first honest answer to 'why is this happening?' Then, in your own handwriting, ask 'and why is that?' four more times. Don't stop early because the answer looks too small. Don't stop early because it points at you. The smallness and the discomfort are usually how you know you found it.

Then

The first three whys are almost always defensive. Surface ones, safe ones, ones that blame the calendar or the kids or 'just how things are.' The actual root cause usually shows up at four or five, and it's almost always something you'd never have guessed at the start. Notice this. It's not a bug in the method. It's the whole point.

06. The honesty check

Explain it to a 12-year-old

When something keeps going wrong, the first answer is almost never the real answer. Ask 'but why?' five times in a row. The thing you find at the end is usually small, weirdly specific, and totally fixable. Most adults stop at step three, fix the wrong thing, and then wonder why nothing changed.

If you can't say it this way, you don't quite have it yet. That's one of the rules of this site.