Policy-Based Segment Routing: Operational Implications
Segment Routing does not support fine-grained traffic engineering (TE), meaning label-switched paths (LSPs) between different source / destination pairs, and variants thereof, based on policy. Segment Routing does not, in my terminology, have the same capabilities as IP/MPLS, especially RSVP-TE-based IP/MPLS (fine-grained TE with bandwidth reservation per LSP). Whether path-based engineering is needed is a matter of preference, even more so for bandwidth reservations. However, drilling down on this issue provides an opportunity to use the Three Olive Martini model to explore the options and tradeoffs between different parts of the network, and what the implications would be for Segment Routing if a network operator did want fine-grained TE.
Figure 1. Segment Routing reduces complexity, but not for the same capability
The old saying in networking is that you can’t reduce complexity, to which I always append “for the same capability”. The conventional wisdom is that Segment Routing shifts information from the control plane to the forwarding plane. There is truth to this. However, whether the complexity/state reduction from the IP/MPLS control plane to the Segment Routing control plane is mostly due to that shifting of information to the forwarding plane, or mostly due to a reduction in capability, is probably difficult to quantify. It is certainly both.
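To make the state side of that trade a little more concrete, here is a back-of-envelope sketch. It assumes a full mesh of RSVP-TE LSPs between edge routers (one unidirectional LSP per ordered pair) on one side, and one node SID per router as the baseline Segment Routing state on the other. The function names and the full-mesh assumption are mine, for illustration only; real networks will differ.

```python
def rsvp_te_full_mesh_lsps(edge_routers: int) -> int:
    """Unidirectional LSPs for one LSP per ordered pair of edge routers."""
    return edge_routers * (edge_routers - 1)


def sr_node_sids(routers: int) -> int:
    """Baseline Segment Routing state: one node SID per router."""
    return routers


if __name__ == "__main__":
    # Illustrative network sizes, not measurements.
    for n in (10, 100, 500):
        print(f"{n} routers: {rsvp_te_full_mesh_lsps(n)} LSPs "
              f"vs {sr_node_sids(n)} node SIDs")
```

With 100 edge routers this comes to 9,900 LSPs against 100 node SIDs. What the count does not show is the other half of the argument: each of those LSPs was an individually engineered path, and that per-pair choice is the capability that goes away with the state.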
For operators that do not want fine-grained TE, that is fine; the conversation ends there. For operators who still want somewhat more fine-grained TE, this opens up the issue of how to get it and what the implications are. One candidate that emerges in this conversation is policy-based routing: in other words, source-routing traffic onto different paths based on inspecting packets and identifying different flows. This creates more information, not back in the control plane, but in configuration and operations.
Figure 2. Segment Routing + Policy-based Routing increases complexity / state in configuration and operations.
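As a rough sketch of what policy-based steering at the edge might look like, the following classifies a packet by destination prefix and DSCP and returns the segment list to impose. Everything here is hypothetical: the rule names, prefixes, and label values are made up for illustration, and a real router would express this in its own configuration language rather than in Python.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dscp: int


@dataclass(frozen=True)
class PolicyRule:
    name: str
    dst_prefix: str
    dscp: Optional[int]             # None matches any DSCP value
    segment_list: Tuple[int, ...]   # ordered labels to push (made-up values)

    def matches(self, pkt: Packet) -> bool:
        if ip_address(pkt.dst) not in ip_network(self.dst_prefix):
            return False
        return self.dscp is None or self.dscp == pkt.dscp


def steer(pkt: Packet, rules: List[PolicyRule],
          default: Tuple[int, ...]) -> Tuple[int, ...]:
    """Return the segment list of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.segment_list
    return default


# Hypothetical policy: voice takes a short path, everything else to the
# same prefix takes a longer one. Rule order matters.
RULES = [
    PolicyRule("voice-short-path", "10.20.0.0/16", 46, (16005, 16009)),
    PolicyRule("bulk-long-path", "10.20.0.0/16", None, (16003, 16007, 16009)),
]
DEFAULT_SEGMENTS = (16009,)
```

Even in this toy form, the ordered rule list is the extra information the paragraph above refers to: it lives in configuration, someone has to maintain it, and its effect depends on rule order.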
This leads to the important discussion of what should be the single/primary source of truth in a network: the network itself, or controllers/management systems.
It has been demonstrated time and time again that the most authoritative/accurate source of information about the network is the network. Companies that have started with the principle that the network management system is the source of truth and the “owner” of network state have sometimes had to start again. If you want to know about the network, then ask the network. If you want to know if a port is UP on a router, ask the router.
As much as routers know the most about their own state, they don’t know anything about the intent of the network operator. This has to be communicated to routers, often through configuration.
Figure 3. Policy being distributed from network manager to routers.
Generally speaking, routers do not have a way, today, to discover / create intent. This is clearly information that is derived from services supported, business models, and engineering / operations philosophies.
A potential problem arises if multiple entities, both controllers/network managers and individual operations people, start playing with policy.
Figure 4. Policy being distributed from network manager to routers.
There are multiple challenges with policy configuration to start with. Generally speaking, it can be difficult to look at policy statements and know individually, and especially collectively, what the reasons behind the configuration are. This can become especially confusing for operations people when doing troubleshooting. It is worse when there are policy statements throughout the network that have to be coherent. In the modern trend, and especially with Segment Routing, the issue of coherency is mitigated by moving policy from the core to the edge / distribution points in the network. The issue of why something is configured the way it is may still remain. Hopefully controllers / network managers will address this challenge going forward.
A potentially bigger problem is if one person does policy configuration via the command line interface (CLI) and another does it via a controller / network manager. It can be hard enough to know what the network should be / is doing with a specific flow when the list of policy statements becomes long. It will be even more difficult if a troubleshooter goes to the controller / network manager and gets one view of what is happening, not knowing about what has been entered through the CLI.
The more non-programmatic configuration there is, the more confusing it may be for operations people to know what the network is doing, which is particularly problematic when troubleshooting.
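One way to picture the CLI-versus-controller problem is as a diff between two views of the same policy set. The sketch below assumes both views have already been reduced to a mapping of policy name to segment list; how the router's running view is retrieved is deliberately left out, and all names and label values are hypothetical.

```python
from typing import Dict, List, Tuple

PolicyView = Dict[str, Tuple[int, ...]]


def diff_policy(controller: PolicyView, router: PolicyView) -> Dict[str, List[str]]:
    """Compare what the controller believes with what the router is running."""
    intended = set(controller)
    running = set(router)
    return {
        # Controller thinks it exists, router does not have it.
        "missing_on_router": sorted(intended - running),
        # Present on the router but unknown to the controller (e.g. CLI-entered).
        "unknown_to_controller": sorted(running - intended),
        # Same name on both sides, different segment list.
        "conflicting": sorted(
            name for name in intended & running
            if controller[name] != router[name]
        ),
    }


# Hypothetical example: someone shortened "bulk" and added "temp-fix" by hand.
CONTROLLER_VIEW = {"voice": (16005, 16009), "bulk": (16003, 16007, 16009)}
ROUTER_VIEW = {
    "voice": (16005, 16009),
    "bulk": (16003, 16009),
    "temp-fix": (16004, 16009),
}

DRIFT = diff_policy(CONTROLLER_VIEW, ROUTER_VIEW)
```

Here `DRIFT` reports `temp-fix` as unknown to the controller and `bulk` as conflicting. A troubleshooter looking only at the controller would see neither, which is the situation described above.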
Conclusion
You cannot reduce complexity / information for the same capability. Segment Routing reduces state in the core, reducing information about individual paths, and thus reducing capability as well. Those network operators that want a more fine-grained approach to TE may decide to use policy-based routing, which may push more state and complexity into configuration / operations. Information / complexity does not only move between the control and forwarding planes, but also between the management plane / configuration / operations and the network.