Introduction
“Some people consider network automation harmful because it blocks the ability for rapid changes in your network? True or False?”
A LinkedIn discussion of the above statement led to a dialog on the pros and cons of rules, flexibility, the problems with automation, and more. In this blog, I unpack some of my thoughts on these topics: automation is great, there is a place for rules, and in software development you get what you focus on, including flexibility.
Automation Rocks
Is automation good or bad? Well, the jury rendered its verdict on this issue some time ago. As long ago as The Wealth of Nations, if not before, it became clear to anyone seriously examining the issue that specialized execution of a repeated task has enormous economic benefits. In my own work, I endeavor to automate everything I can.
The more I automate, the more value my function produces for the company I work for, and the more return my employer gets on their investment in me. The only better return my employer is going to get is from purchasing or licensing software that has already solved how to automate my function, and I welcome that as well. In the area where I work, the stuff that can be automated is usually the stuff that has emerged from increasing complexity, not so much the creativity that comes from partnering with colleagues. Your mileage may vary.
When Automation Does Not Rock
There is a joke in networking, told in various flavors: “automation dramatically reduces the time it takes to bring down the network”. Jokes like this do not exist without some element of truth - or at least some widely retold anecdotes.
A couple of decades ago, the telephone network on the eastern seaboard of the United States had a long outage because the control plane reaction to a single line card failure, in a single switch, created rolling instability. The Internet has seen the occasional, usually partial, meltdown as well. To err is human. To really screw things up requires a computer. Incidents such as these reveal the truth in the above joke.
When you have complex entanglements and dynamic reactions, there is no question the world can become gnarly, quickly and uncontrollably. Flooding, flapping, fuller routing tables, and large-scale fast reroutes have all occasionally created challenges. At the same time, the industry has developed ways of dealing with these issues: architecture and design choices, algorithms, and more. We observe a problem with automated control planes, we discover a mitigation, and we move on. We value a dynamic, automated control plane, so we fix the issues.
It is possible to observe where automation carries little risk, and where automation carries substantial risk. Observing that risk / reward scenario leads to choices. No question some aspects of network automation will be viewed as having much greater risk, while other aspects will be viewed as mundane. These judgments will impact choices and the velocity of automation adoption in some aspects of networking.
Rules, Rigidity, and Flexibility
Software can be written to accommodate change, or not. It is a programming choice. It stretches from the golden rule of not embedding magic numbers in code to many more complicated choices. Sometimes flexibility is simple to implement; sometimes it requires more thought and effort. Flexibility can also introduce complexity, both for the programmer and for the operation of the software. Some programmers are more skilled than others at reducing that complexity, and some focus on it more than others.
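To make the magic number point concrete, here is a minimal Python sketch. The function names, the `LinkPolicy` class, and the 80% figure are all hypothetical and chosen only for illustration. The first function buries the limit in the logic; the second takes it from a small policy object, so changing the limit does not mean editing the check.

```python
from dataclasses import dataclass


def is_overloaded_rigid(utilization_pct: float) -> bool:
    # Rigid: the limit 80.0 is a magic number buried in the logic.
    return utilization_pct > 80.0


@dataclass(frozen=True)
class LinkPolicy:
    """Hypothetical policy object; the name and default are illustrative."""
    max_utilization_pct: float = 80.0


def is_overloaded(utilization_pct: float, policy: LinkPolicy = LinkPolicy()) -> bool:
    # Flexible: the limit is named and supplied by the caller.
    return utilization_pct > policy.max_utilization_pct
```

The flexible version costs a few extra lines and one more concept for the reader, which is the small end of the complexity trade-off described above.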
Rules can be great, and rules can be a disaster. If you can characterize something precisely and match on it, then you can achieve a high-fidelity result. This is often the case in network automation. On the other hand, rules are not always optimal. Want to use rules (thresholds) to detect latency anomalies in a large network with a large spread of different distances and traffic loads? Results are not so good there. We all need the right tool for the job…
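The latency case can be sketched in a few lines of Python. This is an illustration under assumptions, not a recommended detector: the 50 ms threshold, the sample values, the factor `k`, and the 0.5 ms floor are all made up. It contrasts a single network-wide threshold with one possible alternative, comparing each link against its own recent history using the median and the median absolute deviation.

```python
from statistics import median


def static_rule(latency_ms, threshold_ms=50.0):
    # One network-wide threshold (the 50 ms value is an assumed example).
    return latency_ms > threshold_ms


def baseline_rule(latency_ms, history_ms, k=5.0, min_spread_ms=0.5):
    """Flag a sample that sits more than k spreads from this link's own median.

    Spread is the median absolute deviation (MAD) of the history, floored at
    min_spread_ms so that a very quiet link does not alarm on tiny jitter.
    """
    med = median(history_ms)
    mad = median([abs(x - med) for x in history_ms])
    spread = max(mad, min_spread_ms)
    return abs(latency_ms - med) > k * spread


# Assumed sample data: a short metro link and a long-haul link.
metro_history = [2.0, 2.1, 1.9, 2.2, 2.0]
long_haul_history = [148.0, 150.0, 152.0, 149.0, 151.0]
```

With these numbers, a 20 ms sample on the metro link passes the static rule even though it is roughly ten times that link's normal latency, while a perfectly ordinary 151 ms sample on the long-haul link trips it. The per-link baseline flags the first and ignores the second.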
If the tool is right for the job, how easy is it to create, modify, and instantiate? In some shops, a rule change requires a work order to developers in another department who put their own priority on the request. In other shops, network operations teams have tools that make rule changes easy, without requiring any code changes.
There are places where rules make sense, and there are places where they do not make sense. Where they do make sense, the implementation itself, not the rules per se, often makes a big difference. It is all about how the rule management software was designed and implemented.
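As a rough sketch of the second kind of shop, rules can live in data rather than in code. Everything here is hypothetical: the JSON layout, the rule names, and the metric names are invented for illustration, and a real tool would add validation and access control. The point is only that editing the rule text changes behavior without touching the evaluator.

```python
import json
import operator

# Comparison operators a rule may name; anything else raises a KeyError.
OPERATORS = {">": operator.gt, "<": operator.lt, "==": operator.eq}

# Hypothetical rule file contents, editable by operations staff.
RULES_JSON = """
[
  {"name": "high_cpu", "metric": "cpu_pct", "op": ">", "value": 90},
  {"name": "low_optical_power", "metric": "rx_dbm", "op": "<", "value": -20}
]
"""


def load_rules(text):
    """Parse rules from JSON text into a list of dicts."""
    return json.loads(text)


def evaluate(rules, sample):
    """Return the names of rules that match the sample.

    Rules whose metric is absent from the sample are skipped.
    """
    matched = []
    for rule in rules:
        if rule["metric"] not in sample:
            continue
        compare = OPERATORS[rule["op"]]
        if compare(sample[rule["metric"]], rule["value"]):
            matched.append(rule["name"])
    return matched
```

Calling `evaluate(load_rules(RULES_JSON), {"cpu_pct": 95, "rx_dbm": -10})` returns `["high_cpu"]`. Lowering the CPU limit is then an edit to the rule text, not a work order to another department.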
Conclusion
In the world of software there is a seemingly endless set of options, with all manner of intended and unintended consequences, and all manner of engineering trade-offs. Pick your battles, code with intent, iterate when you have gained some experience and wisdom, and resist saying “software is xxx”. Software, and manifestations of software such as automation, are whatever the developer(s) and operations make them.