BGP-LS: What is it and is it still needed?
A generally accepted tradeoff in networking is state versus optimality. Interior Gateway Protocols (IGPs) such as OSPF and IS-IS are segmented into areas, in part to reduce the amount of routing information each router must manage and the amount that is propagated when the topology changes. This segmentation comes at a cost: less information can mean less optimal end-to-end routing. That is a particular problem for source-routed paradigms, which aim to build optimal or disjoint paths end to end. With the emergence of Segment Routing, the issue is getting more attention.
BGP Link State (BGP-LS) transports IGP link-state information via BGP; BGP itself does not use that information in its own routing decisions.
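To make the "transport only" role concrete, here is a minimal sketch of what a consumer of BGP-LS might do with the information once it has been decoded. The AFI/SAFI and NLRI type values are the ones assigned in RFC 7752; everything else (the `LinkNlri` class, its fields, the `build_topology` helper, and the router names) is hypothetical and greatly simplified compared with the real encoding.

```python
from collections import defaultdict
from dataclasses import dataclass

# Address family values assigned to BGP-LS in RFC 7752.
BGP_LS_AFI = 16388
BGP_LS_SAFI = 71

# NLRI types defined in RFC 7752.
NLRI_NODE = 1
NLRI_LINK = 2
NLRI_IPV4_PREFIX = 3
NLRI_IPV6_PREFIX = 4


@dataclass(frozen=True)
class LinkNlri:
    """Hypothetical, simplified stand-in for a decoded BGP-LS Link NLRI."""
    local_node: str   # IGP router ID of the advertising node
    remote_node: str  # IGP router ID of the neighbor
    igp_metric: int   # carried as a link attribute in real BGP-LS
    area: str         # illustrative label for the originating IGP area


def build_topology(links):
    """Merge link NLRIs from any number of areas into one adjacency map."""
    graph = defaultdict(dict)
    for link in links:
        graph[link.local_node][link.remote_node] = link.igp_metric
    return dict(graph)


links = [
    LinkNlri("R1", "ABR1", 10, "area-1"),
    LinkNlri("ABR1", "R1", 10, "area-1"),
    LinkNlri("ABR1", "R2", 5, "area-0"),
    LinkNlri("R2", "ABR1", 5, "area-0"),
]
topology = build_topology(links)
print(topology["ABR1"])  # {'R1': 10, 'R2': 5}
```

The point of the sketch is that the receiver, not BGP, does something with the link state: here it stitches links from two areas into a single graph that no individual router in either area would hold.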
BGP-LS could be used in a controller-less network, but that raises an obvious question: if the architecture prescribes multiple areas, presumably there is a reason, such as state reduction, and sharing full link state across areas works against it.
In an augmented network, one that uses a controller for optimization, the tradeoff pivots a little. Areas may still make sense for managing state within routers, but the controller may be able to manage more state, especially if it has access to gobs of memory and/or can scale out. This naturally assumes that the extra load BGP-LS places on BGP does not cause problems.
Figure 1. BGP-LS Drivers
There are three main drivers for BGP-LS, or a solution like it (Figure 1):
Perceived difficulty in extracting information from OSPF/IS-IS
Potential requirement to extract information from many routers
The importance of optimal/disjoint routing
There was a good discussion of these issues in a recent Packet Pushers episode, between Ethan Banks and Hannes Gredler.
Extracting information from OSPF/IS-IS
The argument for BGP-LS goes something like this:
OSPF/IS-IS do not use TCP
It is much easier for an application programmer to use a TCP-based approach
The general idea is that OSPF and IS-IS lack TCP's pacing mechanisms, which keep a receiver from being overwhelmed while ensuring it still eventually receives all the packets. In addition, integrating OSPF/IS-IS state machines and transport modules requires very specialized programming experience.
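As a rough illustration of why a TCP-based feed is approachable for an application programmer, the sketch below splits a byte stream into BGP messages using the standard 19-byte BGP header (16-byte marker, 2-byte length, 1-byte type). The function name `split_bgp_messages` is made up for this example, and it deliberately ignores everything beyond framing: no session setup, no capability negotiation, and no decoding of BGP-LS attributes.

```python
import struct

BGP_HEADER_LEN = 19          # 16-byte marker + 2-byte length + 1-byte type
BGP_MARKER = b"\xff" * 16
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def split_bgp_messages(buffer: bytes):
    """Return (messages, leftover) from bytes read off a TCP stream.

    Each message is a (type_name, body) tuple. Incomplete trailing bytes are
    returned as leftover so the caller can prepend them to the next read.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= BGP_HEADER_LEN:
        marker = buffer[offset:offset + 16]
        length, msg_type = struct.unpack_from("!HB", buffer, offset + 16)
        if marker != BGP_MARKER or length < BGP_HEADER_LEN:
            raise ValueError("malformed BGP header")
        if len(buffer) - offset < length:
            break  # partial message: wait for the rest of the stream
        body = buffer[offset + BGP_HEADER_LEN:offset + length]
        messages.append((MESSAGE_TYPES.get(msg_type, "UNKNOWN"), body))
        offset += length
    return messages, buffer[offset:]


# One complete KEEPALIVE followed by the first 5 bytes of another.
keepalive = BGP_MARKER + struct.pack("!HB", 19, 4)
msgs, rest = split_bgp_messages(keepalive + keepalive[:5])
print(msgs)       # [('KEEPALIVE', b'')]
print(len(rest))  # 5
```

Because TCP handles ordering, retransmission, and flow control, the application only has to deal with framing. An OSPF or IS-IS listener would have to implement adjacency state machines and the protocol's own flooding and acknowledgment logic before it saw any topology data.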
Image Source: Packets and Stuff
Extracting information from many routers
Most enterprise architecture books recommend only two or three areas. In the 1990s the recommendation was three: core, distribution, and access. The modern trend is toward two: core and distribution/access.
There is at least one well-known OSPF network that has traditionally architected each POP as its own area. In a network like this, information would need to be extracted from hundreds of routers, especially when backup is considered. The need for so many areas may have been more acute in 2013, when BGP-LS was being conceptualized, but it could still be an architectural choice today. There is a certain logic to it: why should other routers be burdened with knowledge of the routes within a POP?
Clearly, the number of areas varies significantly across networks. It is also true that with 64-bit operating systems, multi-core processors, and more processing power and memory, routers are much more capable than they once were. There are exceptions, of course: routers installed many years ago that are still in service, and even some newer access routers.
Your mileage may vary…
The importance of disjoint/optimal routing
In some senses, the entire raison d’être of source routing is the ability for an edge router to precisely select the path through a network, whether augmented or not. It is no surprise, then, that this is a current topic of conversation as Segment Routing gains momentum.
Other drivers for optimal/disjoint routing vary from network to network depending on mission, capacity, and other factors.
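To show why end-to-end visibility matters here, the following sketch computes two link-disjoint paths over a toy topology. It uses a deliberately naive heuristic: find the shortest path, ban its links, and search again. This is not what any particular controller does, and it can miss valid disjoint pairs on some topologies where an algorithm such as Suurballe's would succeed. The router names, costs, and function names are invented for illustration.

```python
import heapq


def undirected(edges):
    """Build graph[node][neighbor] = cost from (a, b, cost) tuples."""
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def shortest_path(graph, src, dst, banned_links=frozenset()):
    """Dijkstra that skips any link listed in banned_links."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if frozenset((node, neighbor)) in banned_links:
                continue
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None


def two_link_disjoint_paths(graph, src, dst):
    """Naive heuristic: shortest path, then shortest path avoiding its links."""
    first = shortest_path(graph, src, dst)
    if first is None:
        return None, None
    _, path = first
    used = frozenset(frozenset(pair) for pair in zip(path, path[1:]))
    return first, shortest_path(graph, src, dst, banned_links=used)


topology = undirected([
    ("PE1", "P1", 10), ("P1", "PE2", 10),
    ("PE1", "P2", 15), ("P2", "PE2", 15),
    ("P1", "P2", 5),
])
primary, backup = two_link_disjoint_paths(topology, "PE1", "PE2")
print(primary)  # (20, ['PE1', 'P1', 'PE2'])
print(backup)   # (30, ['PE1', 'P2', 'PE2'])
```

Whatever algorithm is used, the computation needs the links along the whole path. If PE1 and PE2 sit in different areas, no single router has that view, which is the gap a topology feed to a controller is meant to fill.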
Image Source: Packets and Stuff
Alternatives to BGP-LS
A couple of alternatives to BGP-LS have already been discussed. One is for application programs to act as OSPF/IS-IS listeners. The other is to build networks with routers that have enough control plane capacity to support a small number of areas. In the extreme that number is one, though best practices suggest there should always be at least two, to keep the core focused on high-speed packet forwarding.
When BGP-LS was developed, few of the router interfaces that are now emerging existed: NETCONF/RESTCONF, streaming telemetry, and even interfaces directly into routing tables and protocols. As these interfaces mature in support of augmented routing, will BGP-LS remain the preferred method for digesting routing state? Any answer would be speculation at this point.
Conclusion
There are many networks with only a few areas, a small number with hundreds of areas, and likely a number with tens of areas. Arista, Cisco, Huawei, Juniper, and Nokia all claim support for BGP-LS. So whether or not there is a strong argument for it in a given network, it is there, and it can be used if optimal routing and/or end-to-end visibility for source-routed or augmented routing is desired or required.
Whether BGP-LS is the long-term solution is arguably still to be determined, but it does appear to be the emerging approach of choice today, with support from the major suppliers.
An earlier article focused on some big-picture issues related to Segment Routing Solution Criteria. Not all details were covered there; BGP-LS would be an addition.