The New Normal: Failure is a Good Thing

It seems that everyone is talking about microservices as the road to salvation. Why? Why now? The usual explanation is that it's an architectural style that encourages flexibility and makes a company more competitive. However, like agile development, patterns, and object-oriented development before it, the microservices architecture will not deliver on its promises for everyone. If you understand what makes it work, you will know when and how to apply it successfully. Let us look deeper to see what forces really underlie this evolution.

William Warby

In the 17th-century Newtonian view of the world, the universe was compared to a mechanical clock that kept ticking along as a perfect machine, its gears governed by the laws of physics, making every aspect of the machine predictable. In many ways, we’ve designed our traditional IT infrastructures to perform as this perfect machine, minimizing risk by guarding against failure at all costs.

In reality, disruption is the order of the day—hardware breaks, flash mobs hit your site, viruses rampage around the network. I've seen e-commerce systems that couldn't take orders because the front end relied on a VPN that crossed corporate routers. The routers were saturated due to a worm in the corporate network, so the VPN was unavailable. I know of a data center that had a nest of snakes under the raised floor.

Instead of expecting everything to run like clockwork, we should anticipate the opposite. We must embrace failure as a means to build IT infrastructures and organizations that not only withstand threats but profit from them.

Everything breaks. It's just a question of when and how badly.

What we need is a new approach where “continuous partial failure” is the normal state of affairs. Continuous partial failure opens the doors to making big changes happen because you're already good at executing the small stuff.

In subsequent posts, I’ll talk about moving from the mentality of preventing problems to actually promoting them. I’ll look at the aging models for achieving resiliency and introduce microservices as an extension of the concept of antifragility into the design of IT infrastructure, applications, and organizations.

Along the way, I’ll share some stories about Netflix and their classic Chaos Monkey, how Amazon is becoming an increasingly terrifying competitor, the significance of maneuverability and the art of war, the unforeseen consequences of outsourcing, and how Cognitect’s simple and sharp tools play a pivotal role in shaping the new IT blueprint.


Read all of Michael Nygard's The New Normal series here.  
