A Preoccupation with Failure

In a previous article on "Decision Making" I briefly acknowledged how catastrophic errors can arise in decision making, and how the post-mortems of such failures provide powerful insights into the interplay of the many issues involved. In this article, I further consider how accidents can happen and how they can be avoided. In a nutshell, it requires a preoccupation with failure at all levels of the organization. I think this is compatible with the HSEQ ambitions of seismic service companies working in various energy industries.


Systems With High Levels of Interactive Complexity

A common theme in the case study analyses of catastrophic accidents is that they involved tightly coupled systems. Tight coupling exists if different elements of an organizational system are highly interdependent and closely linked to one another, such that a change in one area quickly triggers changes in other aspects of the system. 

Tightly coupled systems have four attributes:

  1. time-dependent processes; 
  2. a fairly rigid sequence of activities; 
  3. one dominant path to achieving the goal; and
  4. very little slack.

When such rigidity exists within an organization, with very few buffers among the various parts, small problems can cascade quickly throughout the system, leading to catastrophe.

Another key phrase is interactive complexity. Interactive complexity refers to the extent to which different elements of a system interact in ways that are unexpected and difficult to perceive or comprehend.

Often, these interactions among elements of the system are not entirely visible to the people working in the organization.

Systems with high levels of interactive complexity and tight coupling are especially vulnerable to catastrophic failures. In fact, one can argue that accidents are inevitable in these situations; certain failures constitute “normal accidents”.


Such observations, however, have little prescriptive value and do not move us toward an understanding of how to prevent catastrophic accidents.

What can be observed is that most complex organizational decision-making failures do not trace back to one single cause; they involve a cascading chain of decision failures.

Below, I briefly describe two frameworks for how a series of events can lead to catastrophic failure. Both essentially involve large and complex organizations, but they approach the problem in quite different ways: the first from a behavioral perspective, and the second from an organizational perspective.

One behavioral (or sociological) perspective on decision-making failures is “normalizing deviance”, summarized well in two publications titled "The Challenger Launch Decision" and "The Normalization of Deviance", respectively.

Normalizing Deviance

The term “normalizing” here refers to social normalization, the process through which ideas and actions are made to appear culturally "normal", rather than the adjustments of values or distributions in statistics that we may be used to in mathematical sciences.

Many famous catastrophic accidents such as the Challenger space shuttle accident are characterized by incubation periods that stretched over many years, not days or hours.

The theory of normalizing deviance applied to the Challenger disaster concluded that engineers and managers moved down a dangerous slippery slope in a gradual evolutionary process that took place over many years. The O-ring erosion at fault was neither expected nor anticipated in the original design, and consequently the detection of occasional erosion was initially treated as an anomaly. Further failures happened, and gradually the unexpected became the expected, and ultimately the accepted.

Culture shaped this evolutionary process within a vast organization working under tremendous schedule pressure. It was the convergence of several key factors, notably the unusually cold temperatures on the fatal launch morning, that turned the “manageable” problem into an unmanageable one: the O-rings failed catastrophically rather than merely cracking, with the result we now all know too well.

One organizational perspective on decision-making failures is “practical drift”.

Practical Drift

All organizations establish rules and procedures. Units within the organization engage in practical action that is locally efficient. These locally efficient procedures become accepted practice and perhaps even taken for granted by many people.

Gradually, actual practice drifts from official procedure. Some of that informal action is thoughtful and entrepreneurial. But sometimes communication breakdowns (and other barriers) cause organization members to not understand how their actions may affect others in other units. Unforeseen interactions occur at times, and these can be problematic.

One case study in "The Art of Critical Decision Making" course that I referred to in my previous related article described how US military forces in Iraq failed to recognize that colleagues in a different part of the country had changed the markings on their helicopters for location-specific reasons; when those helicopters flew into another US-occupied area, they were not recognized and were shot down by their own colleagues.

Mitigation and avoidance of practical drift quite logically involves building a climate of candid dialogue, particularly across disconnected teams, offices, and locations; encouraging transparency in organizational structures and systems; developing cross-functional teams; and attacking silo thinking and interdivisional rivalries.

A Relevant Case Study

I saw an excellent presentation by The Hon Sir Charles Haddon-Cave of Her Majesty's High Court of Justice in England, in which he shared the findings of his review into the loss of RAF Nimrod MR2 aircraft XV230 in Afghanistan in 2006.


A “Swiss cheese” model illustrating the contributing factors to the Nimrod disaster.

Haddon-Cave eloquently emphasized how the convergence of many organizational, group, and individual failures over a 30-year period led to the catastrophic accident. His “Swiss cheese” model helps intellectualize the various contributing factors:

  • Poor design in the 1960s.
  • A history of fuel leaks in the 1970s and 1980s, not acted upon.
  • An increase in operational tempo in the 1990s and 2000s.
  • Increasing problems maintaining what had become an ageing aircraft.
  • Distractions from major organizational changes and cuts.
  • The outsourcing of safety management.
  • The advent of air-to-air refueling, which ultimately led to the catastrophic fire and loss of aircraft and those onboard.

A major theme in Haddon-Cave’s review is that complexity should be avoided where possible in how organizations and projects are managed, lest they be incubators for failure and catastrophe.

Indeed, E.F. Schumacher, economist and author of “Small Is Beautiful”, wrote: “Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius—and a lot of courage—to move in the opposite direction. Keep it simple.”

Summary

I have only captured a few miscellaneous aspects here in order to emphasize the complexity of individual, group and organizational dynamics. I began this article today by addressing systems with high levels of interactive complexity and tight coupling, and I think we would all agree that we are faced with such challenges on a regular basis in our lives.

My very simple ambition when sharing these examples is that such a process helps us think about the challenges of our own unique problems. Are we unwittingly being affected by cognitive biases when we make time-constrained decisions? Are we being influenced by organizational and behavioral forces that lead us to lose sight of the things that matter? Are we better able to anticipate problems when designing products and solutions, and better able to anticipate how to support our organization when it receives new products and solutions?

Disclaimer

The content discussed here represents the opinion of Andrew Long only and may not be indicative of the opinions of Petroleum Geophysical AS or its affiliates ("PGS") or any other entity. Furthermore, material presented here is subject to copyright by Andrew Long, PGS, or other owners (with permission), and no content shall be used anywhere else without explicit permission. The content of this website is for general information purposes only and should not be used for making any business, technical or other decisions.
