Editor’s Note: A single gray rhino is difficult enough to wrangle. But when gray rhinos come together, they create a crash, literally: “crash” is the zoologically correct term for a group of rhinos, and it is also what results when many problems, often avoidable ones, converge. Think of complex systems as crashes of gray rhinos. Complex systems theory helps us understand how interconnected failures within a system can create major catastrophes and what we can do to head them off. In this article, Chris Clearfield and András Tilcsik, winners of the 2015 Bracken Bower Prize, explain how systemic challenges are increasing and why it is crucial for organizations to recognize warning signals earlier and develop more effective crisis responses. Chris and András founded the Rethink Risk Project. Their forthcoming book, MELTDOWN, will be published by Penguin Press in 2018.
“This was not our drilling rig, it was not our equipment, it was not our people, our systems or our processes.”
– BP CEO Tony Hayward, 13 days after the explosion aboard Deepwater Horizon
Despite Mr. Hayward’s assertion, it was ultimately BP’s failure to manage the myriad risks of deepwater drilling that caused a tragic loss of life, widespread environmental damage, and a bill upwards of fifty billion dollars. The failure of Deepwater Horizon, and BP’s inability to contain the subsequent oil leak, was not simply a failure; it was a system meltdown.
It’s not hard to find other examples that reveal unexpected fragility in our systems. In the winter of 2009, weather conditions caused the breakdown of Eurostar trains, stranding 2,000 passengers inside the Channel Tunnel and contributing to a transportation standstill across western Europe. In October 2012, emergency generators at New York University’s hospital failed during Hurricane Sandy, forcing the evacuation of critically ill patients. Earlier that same year, incorrectly deployed software at the market maker Knight Capital flooded the stock market with millions of unintended orders and caused Knight to lose nearly half a billion dollars in just 45 minutes. Indeed, the Global Financial Crisis, including the bankruptcy of Lehman Brothers, the near-collapse of AIG, and related liquidity shocks, represents a series of interconnected system failures.
Though these failures look different on the surface, many of the underlying causes are surprisingly similar. Modern systems are steadily becoming both more complex and more interconnected in ways that are not well understood.
In simpler systems, most risks stem from predictable disruptions, and small mistakes tend to have minor and well-understood consequences. In contrast, in many modern systems, small errors can combine in novel ways to yield large failures that are hard to understand even as they unfold. To understand this distinction, compare the physics of throwing a ball with the dynamics of an avalanche. A ball follows a predictable path, and the harder you throw it, the farther it will go. An avalanche, by contrast, can be triggered by a small event that unleashes a wildly more powerful response.
Systemic challenges are proliferating and reshaping the risk landscape. In a recent survey of C-suite executives, nearly 60% reported that the volume and complexity of the risks they face have increased substantially over the past five years. Objective measures, too, suggest that the physical and financial context in which organizations operate has become radically riskier. According to the IMF, the recent worldwide cost of natural disasters has far outpaced the growth of global GDP. Similarly, since 1973, banking and currency crises around the world have been occurring twice as frequently as they did during the Bretton Woods period, leading some economists to conclude that “there is something different and disturbing about our age.”
As systemic challenges proliferate, there are increasing penalties for failing to address technological complexities, organizational weaknesses, and cognitive challenges that organizations might have been able to safely absorb in the past. As a result, the measures that once served organizations well in managing risk—instituting rules and controls, scenario planning, and bringing in additional expertise—are no longer sufficient.
Smart organizations are increasingly using interventions that help them detect early warning signals, reduce the number of errors that can trigger cascading failures, and develop more effective crisis response capabilities. Tracking near misses, for example, is a powerful way to learn from early signals of potential catastrophe, and there are notable success cases, particularly in aviation and healthcare. Likewise, organizations become more resilient when leaders appoint designated skeptics: devil’s advocates who stress test estimates, explore extreme scenarios, and challenge optimistic assumptions.
It is also fundamentally important to avoid pushing through during a crisis. Sticking to an existing plan even in the face of new, contradictory information has played a key role in a variety of failures, including the Deepwater Horizon oil spill, NASDAQ’s handling of the Facebook IPO, and numerous aviation accidents. While there will always be pressures to continue in the face of uncertainty, executives can foster norms that help organization members overcome the psychological challenge of conceding (temporary) defeat by halting an ongoing process or giving up on a planned course of action. At a trading firm we worked with, for example, one junior trader reported that he had never received as much praise from senior managers as when he stopped an apparently profitable trade after realizing that he did not fully understand it. Such timely feedback can create norms that, one day, may prevent catastrophe.
These solutions don’t require large financial investments or expensive technologies. But that does not mean that they are trivial to implement. Organizational cultures often celebrate self-confidence, decisiveness, persistence, accord within a group, and good news. In contrast, reducing the potential for catastrophic failures requires an emphasis on the importance of doubt, hesitation, dissent, and the sharing of bad news. A cultural shift in this direction can be an extremely difficult leadership challenge, especially in high-performing organizations unaccustomed to failure.
Despite these challenges, there are important cases of success. Since the late 1970s, for example, commercial aviation has undertaken radical changes to create an effective risk management culture: reducing hierarchy to encourage dissent, creating designated skeptics, tracking near misses, and fostering norms of stopping. As a result, the industry has achieved massive improvements in safety even as aircraft and operations have become significantly more complex. As the risk landscape continues to shift, the ability to implement such interventions will become one of the defining traits of successful organizations.
Rethinking the Unthinkable: Managing the Risk of Catastrophic Failure (February 8, 2017)