Just outside the tiny Japanese village of Aneyoshi, high on a mountainside overlooking the Pacific Ocean, stands an ancient stone, a weather-smoothed oval three feet tall engraved with rune-like markings in an archaic script.
It wouldn’t be surprising to learn that this strange monument had long ago risen spontaneously out of the surrounding woodland, because the words written on it read like an exhortation from nature itself: “Do not build your homes beneath this point”.
It is one of hundreds of so-called tsunami stones dotted through the region, strange, semi-mythical folk memories of the great tidal waves that devastated the region in 869, 1896 and 1933 -- and undoubtedly many other times stretching into pre-history -- killing thousands, flattening homes and devastating lives.
Mostly, the stones give simple advice: if the wave strikes, seek higher ground. Those who heeded the Aneyoshi stone remained safe on March 11 2011, when a tidal wave struck again, because the waters stopped just 100m below it. On the coastal plain beneath it, though, was devastation.
Japan’s modern early warning system of concrete flood-defences, satellite surveillance and automated text-messaging proved inadequate. Three reactors at the Fukushima nuclear power station went into meltdown and towns were reduced to rubble. Fifteen thousand people died, 90 percent of them drowned. The cost of the disaster was calculated at $235bn. If only they’d paid attention to the stones.
Investigations following disasters always find systems failures, engineering design-flaws and human error. But they also find something else: that, in hindsight, the disasters were avoidable.
Whether it is the meltdown at the Chernobyl nuclear power station in Ukraine in 1986 that almost made whole swathes of Eastern Europe uninhabitable for centuries, poor testing that resulted in two Boeing 737 Max planes falling from the skies and killing 346 people, the failed risk-management systems that created the 2008 banking crisis or the apparent sluggishness with which Western governments responded to the COVID-19 pandemic, the same question always pops up: why did nobody see it coming, when after the event it seems so inevitable?
The bigger question, of course, is: what can organisations do to prevent disasters occurring in the first place?
It was cold in Florida on the morning of January 28 1986, just 30 degrees Fahrenheit. Ice had formed on the base of the space shuttle Challenger’s launch pad at Cape Canaveral, but the Orbiter was still scheduled to blast off that day. Engineers from Morton Thiokol, the contractor which had built and maintained the solid booster rockets that sat on the sides of the space shuttle, were concerned.
The boosters were built from several lengths and held together by O-rings, rubber hoops that prevented the escape of fuel. But they had never been tested at such low temperatures, and engineers were frantic with worry that they would fail.
Three months earlier one of them, Bob Ebeling, had become so frustrated that management weren’t taking his concerns about the O-rings seriously that he had written a memo with the desperate title “Help!” The night before the launch Ebeling and others held a teleconference with NASA, telling them their worries.
“My God, Thiokol,” responded a NASA man. “When do you want me to launch? Next April?” Another call was held, without the engineers this time, and management decided the launch would go ahead. On the morning of the launch Ebeling told his daughter: “The Challenger’s going to blow up. Everyone’s going to die.” He was right.
Initial investigations into the disaster, which led to the deaths of seven crew, concentrated on the engineering failures. But looking for the widget that failed is not the way to find the real cause.
“Most of these guys who are dealing with these disasters are probably engineers who were good so they went into management,” says Karlene Roberts, head of the Center for Catastrophic Risk Management at the University of California, Berkeley, who has been studying disasters for over 30 years.
“But these people have engineering backgrounds and so they go back to what their base is and what their understanding is.” Engineers are trained to look for engineering problems, so that is what they find.
The analytical, engineering bias goes deeper than that; it can colour the way investigators see the world. An engineer’s whole outlook tends to be mechanistic, and their intellectual apparatus can be geared towards finding chains that end in a “root cause” of a disaster, which Australian disaster expert (and jet pilot) Sidney Dekker describes as looking for “bad people, bad decision, broken parts”.
That is why disasters seem inevitable with hindsight. In the Challenger case, the investigations homed in on Ebeling and poor communication between the engineers and management. The failure can be precisely located, so why did nobody spot it? It feels inexplicable.
But that is to misunderstand the real nature of accidents, and organisations. “Modern process plants are complex, interlocking, intractable systems, designed and run by people -- socio-technical rather than pure technical systems,” Erik Hollnagel and Fiona Macleod, two experts in disasters, wrote in a recent paper. “Linear models are no longer adequate, nor is causal reasoning sufficient.”
Investigators always vow to “leave no stone unturned” when they investigate accidents, but that is irrelevant if they are constrained by the tools they are using. Deadlines, public or political pressure and the need to find someone to blame and punish can all influence an investigation, and push people to find a “cause”.
Hollnagel and Macleod call this the principle of What You Look for is What You Get. When your only tool is a hammer, everything looks like a nail.
Once the focus changes from finding a single cause to understanding the culture that surrounded the failure, things get interesting. The Chernobyl meltdown is a good example.
On Saturday 26 April 1986, reactor number four at the Chernobyl Nuclear Power Plant went into meltdown following a botched test. The initial investigation blamed the engineers who were on duty, and sentenced the ones who hadn’t died from radiation exposure to between two and 10 years in labour camps.
However, a second investigation in 1991 found that this had been unfair and that the main blame lay with the peculiar design of the reactor, which behaved in ways the engineers on duty at the time couldn’t have predicted. But more pertinently, there was a general “lack of safety culture” at both regional and national level. Valery Legasov, chief investigator into the Chernobyl accident, went even further.
“The accident was the inevitable apotheosis of the economic system which had been developed in the USSR over many decades,” he told the International Atomic Energy Agency’s investigation. “Neglect by the scientific management and the designers was everywhere,” he went on. “It is impossible to find a single culprit, a single initiator of events, because it was like a closed circle.”
The system of the USSR might seem uniquely dysfunctional, but some aspects echo the culture that surrounded some of the best-known business disasters. Perhaps the most famous description of the psychology of people living under Soviet communism is the 1978 essay The Power of The Powerless, by Czech playwright (and later president) Vaclav Havel.
He tells the story of a fictional greengrocer who dutifully places a “Workers of the world, unite!” poster in his shop window, glorifying the regime. He does this “because these things must be done if one is to get along in life”, writes Havel.
The greengrocer and his boss and his boss’s boss, right up to the politburo, know that the regime is failing. But criticising it by refusing to display the poster would have very negative consequences for the greengrocer, while doing nothing to change the system. Similar stories emerge from many organisations that failed.
At Enron, the energy giant whose collapse in 2001 was one of the most spectacular in corporate history, employees created and manned a fake trading room to impress visiting Wall Street analysts, and saw executives openly cooking the books to hit quarterly targets that would support the share-price.
They said nothing. As we all know, “these things must be done if one is to get along in life”. People have mortgages to pay, and the bigger picture is not their concern. This is precisely the sort of culture that breeds catastrophe.
In 1966 the Welsh mining village of Aberfan suffered one of the most terrible disasters in 20th century peacetime. On a hillside above the village was a vast “tip”, a pile of fine, discarded rubble from the coal mine that employed many of the villagers. Since it had been started in 1958, tip number seven had grown to a height of 34 metres, and contained 229,300 cubic metres of waste. Beneath it was highly porous sandstone criss-crossed with streams and springs.
Villagers had been worried that the tip was unstable, and had asked the National Coal Board to secure it. The board did not, and following weeks of heavy rain the tip collapsed at 9.15am on 21 October, racing down the hillside at speeds of up to 50mph and crashing straight into several buildings, including the village school where the children had just sat at their desks. In total, 116 children and 28 adults were killed.
A high-profile inquiry followed, in which the NCB was blamed, but no individuals; it explained that the story of Aberfan was “not of wickedness but of ignorance, ineptitude and a failure in communications”. That might be unsatisfying to our instinct to place blame on a particular person, or a mechanical failure, or lapse in judgment, but it captures the truth of the situation.
The sociologist Barry Turner studied Aberfan for a ground-breaking paper called The Organizational and Interorganizational Development of Disasters in which he created the concept of an “incubation period” during which problems build. He identified “a pervasive institutional set of attitudes, beliefs, and perceptions” in the coal industry, resulting in “collective neglect” of the issue of tip safety.
Because the main focus was on mine safety, information about tip-slips was not passed around the industry, even though the information that could have prevented Aberfan had been available since the 1930s. It is true that the direct cause of the disaster was that the tip had been built on a stream, but the deeper -- and more pertinent -- reasons were cultural.
Signs of a deeper problem
It might seem over-optimistic to ask if there are common characteristics of organisations that incubate disasters, but the people who study disasters for a living say -- surprisingly -- that very similar themes recur.
“We go into an accident situation now and my team members and I will look at each other and say: ‘This again?’” says Karlene Roberts. To the trained eye, the signs of a disaster-incubating culture are everywhere. “One of my team once said to me: ‘When I see a dirty rag on the floor under some pipes, I really get worried’,” says Roberts. If an organisation is lax on the small stuff, it is lax on the big stuff too.
Someone else who has been around disasters for decades is Stuart Diamond, a former New York Times journalist who won a Pulitzer prize for his investigation into the Challenger disaster and also covered the partial meltdown at the Three Mile Island nuclear plant, the Bhopal disaster and Chernobyl, and who has trained people at Google and the US military.
He says that disaster-incubating organisations have three recurring themes. The first, he says, is “a lack of training, causing human error.” Often, there is nothing fundamentally wrong with the engineering of the machinery that has gone wrong, but the people operating it are not up to the job.
“There is an old saying that nuclear power plants are designed by geniuses, but run by idiots,” Diamond says. “A lot of the accidents happened when the least senior people are on duty. I interviewed most of the people on duty the night of the accident in Bhopal, and they were operating this high-technology plant and I wouldn't have trusted them to change my light bulb, or put gas in my tank.”
This is also reflected in the so-called “weekend effect” in healthcare, meaning that people are more likely to die if they go to hospital on a weekend, when more junior staff are on duty. The test that led to the Chernobyl disaster happened 10 hours later than scheduled, when a junior team was on duty.
A second common theme Diamond identifies is that in many disasters management did not sufficiently understand what the worst-case scenario could be. This can stem from the “hubris that comes from being an expert”, or other instincts.
With the shadow of Hiroshima and Nagasaki looming over the Japanese psyche, in the post-war period people were understandably nervous about nuclear energy. The upshot was that the government insisted that nuclear power was absolutely safe and there was no risk.
“Discussing a worst-case scenario was feared because it might bring panic to the citizens. And therefore it was omitted from the regulatory discussions,” says Akihisa Shiozaki, who worked on an investigation into Fukushima.
As Diamond says: “If you consider the worst-case scenario you’ve got to disclose it in public hearings, and when you do that, people either oppose the installation or they make you pay more to make it less risky.”
At Fukushima this reluctance to properly discuss a disaster scenario led to back-up generators, which could have prevented the reactor’s meltdown, being built in the wrong place. When the tsunami struck, they were wiped out by the wave.
Another recurring theme in disasters is that people cut corners because of incentives. Usually monetary ones.
The initial investigations into the crashes of two Boeing 737 Max planes in late 2018 and early 2019, which killed 346 people, focused on engineering failures. It was found that the Manoeuvring Characteristics Augmentation System software had malfunctioned and forced the aircraft’s noses forward so far that the pilots lost control and the planes crashed.
Focus then moved on to the way Boeing had saved costs, including outsourcing software development to an Indian subcontractor paying people $9 an hour, and persuading the Federal Aviation Administration that pilots didn’t need expensive flight simulator training before flying the plane.
The root of this, say seasoned Boeing-watchers, was the decision made in 2001 to relocate the company’s 500 top managers to Chicago, thousands of miles from the engineering HQ in Seattle where planes were actually built.
“When the headquarters is located in proximity to a principal business -- as ours was in Seattle -- the corporate centre is inevitably drawn into day-to-day business operations,” said Philip Murray Condit, CEO at the time. He saw this as a problem. Over time the top brass lost connection with the engineers. Condit’s successor Harry Stonecipher said that under him Boeing was “run like a business rather than a great engineering firm.”
Over the past two decades, the stock has rocketed and in recent years Boeing made $43bn of share buybacks which benefitted shareholders -- including senior management. That money largely came from savings made on the development of the 737 Max. The engineers, stranded in Seattle, were reduced to emailing each other to complain.
“It’s systemic. It’s culture. It’s the fact that we have a senior leadership team that understands very little about the business and yet is driving us to certain objectives,” wrote one. “Sometimes you have to let things fail big so that everyone can identify a problem… maybe that’s what needs to happen instead of continuing to just scrape by.” Even today, Boeing is desperately trying to keep its dividend high, to please investors.
It is fashionable to blame a shareholder focus for all the world’s ills, but blaming it is just another version of the instinct to seek out a single cause, which is an artefact of the causal way many of us are programmed to think. The reality is more complicated. Perhaps the most important concept when it comes to understanding disasters is “the normalisation of deviance”, a term coined by the sociologist Diane Vaughan in her investigation into the Challenger disaster.
At the beginning of the shuttle program engineers were worried about fuel leaks from the boosters. But they soon came to accept them as normal. “They redefined evidence that deviated from an acceptable standard so that it became the standard,” Vaughan writes.
As time went on, the standard kept on changing until, without realising it, they were operating right on the edge of disaster. You might call it risk-creep. Academics who study disasters talk about “normal accidents”, which are not products of abnormal or bad behaviour, but happen even though -- or indeed because -- the usual procedures were followed.
Large, complex organisations are perhaps best seen as fuzzy, baggy, messy organisms that muddle through. Sidney Dekker has written that “organisations incubate accidents not because they are doing all kinds of things wrong, but because they are doing most things right.”
That is, they negotiate and resolve conflicting goals and individuals make trade-offs between sometimes-conflicting drives like safety, efficiency and profit. A pilot overrides a glitchy system, a crew covers an oil-rig’s fire-alarms so they don’t go off and wake everybody at night, a nurse realises that the recommended dosage is printed wrongly on a medicine, and uses his common sense to administer the right one.
They are all doing their job and keeping the show on the road. The lack of consequences means they get rewarded. This is success. When something goes wrong, though, their actions are then compared to best practice or guidance, and seen through the lenses of causation and blame: they broke the rules, and should be punished. The tragedy of accidents is that incubating a disaster looks just like business as usual. Doing badly looks just like doing well, until it doesn’t.
This feature was first published in Management Today's Summer 2020 digital magazine.