The right tool for the job: Choosing where to use RSVP-TE or SR.

2015-06-01 imported Work · MPLS · RSVP · MPLS_TE · IETF · SR

I noted that at NANOG64 this week in San Francisco, there are talks (both from Juniper) about both SPRING/Segment Routing and RSVP-TE. These are both protocols/technology approaches (since one can’t really call SR a protocol) whose evolution I’ve been involved in over the last few years. A question that I’ve been asked more times than I’d like is why we chose to look at a new approach (SR) rather than go with a technology that already exists (RSVP-TE).

The simple facts of the matter are that we aren’t backing just one of these technologies - we have networks that run RSVP-TE today; and we have networks where we don’t. To understand why one might want to consider either, we need to look at what the use cases for explicit paths are:

To allow bandwidth-aware routing of paths in the network: Where we want to place certain demands in the network according to the available resources on a certain link, then it is obvious that someone needs to be aware of where demands are currently placed. To do this, that someone needs to maintain state of LSPs. In cases where the placement is relatively static, or needs global optimisation, then often those paths can be pre-computed, and provisioned onto the network. In other cases, the demand of those paths may have significant temporal variation (think applications that use auto-bandwidth) - and local optimisation of path placement may be OK.

In the former case (especially where we are concerned with global optimisation), one must rely on some element which is external to the ingress PE to calculate the path - and it stands to reason that this device must know about the placement of the existing paths in the network (or the utilisation of the links). At this point, there is very little value in maintaining reservation state on a per-hop basis - since the computing entity (usually an on- or offline PCE) has already done this. Deploying RSVP-TE and refreshing soft-state then doesn’t make any sense - it’s simply work that the network is doing that doesn’t help anybody. SR helps you place these demands on the network - and adds very little overhead in doing so.

In the second case, moving the path computation out of the head-end PE doesn’t buy anything – there is no better path computation happening if we are happy with local placement. Consider the case where we have N ECMPs between two different devices, and we simply want to fill them so that the bandwidth is equally shared across them. At this point, least-fill will do a very good job, without needing to have any external entity. In such cases, keeping the state in the network lets one achieve the particular application that is required - without any external machinery. To get the same effect for SR, an external stateful PCE would be required.
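The least-fill behaviour described above can be sketched in a few lines. This is a minimal illustration rather than any vendor's implementation: the `Path` class, the `place_least_fill` function, the path names and the demand sizes are all hypothetical, and each ECMP is modelled as a single capacity figure rather than a set of links.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One of the N equal-cost paths between two devices (hypothetical model)."""
    name: str
    capacity: float        # Gb/s
    reserved: float = 0.0  # Gb/s already placed on this path

    @property
    def fill(self) -> float:
        # Fraction of the path's capacity that is already reserved.
        return self.reserved / self.capacity


def place_least_fill(paths, demand):
    """Place a demand on the least-filled path that can still carry it.

    Returns the chosen Path, or None if no path has enough free capacity.
    Ties are broken by list order.
    """
    candidates = [p for p in paths if p.capacity - p.reserved >= demand]
    if not candidates:
        return None
    best = min(candidates, key=lambda p: p.fill)
    best.reserved += demand
    return best


# Three parallel 10G paths and an illustrative set of LSP demands (Gb/s).
paths = [Path("ecmp-1", 10.0), Path("ecmp-2", 10.0), Path("ecmp-3", 10.0)]
demands = [4.0, 3.0, 3.0, 2.0, 2.0, 1.0]
placement = [place_least_fill(paths, d).name for d in demands]

print(placement)
print([p.reserved for p in paths])
```

The only input the head-end needs here is the current reservation on each path, which is exactly the state that RSVP-TE already keeps in the network; no external computation is involved.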

Simply - whether you need state in the network depends on what your deployment model is. If you do, then you probably want RSVP-TE. If that state doesn’t add anything, then SR does you a bunch of favours.

Disjoint path placement: This is a use case that I have a lot of interest in. Two services need to be placed on the network where they have no shared fate. Again, it depends on the deployment architecture that you have as to how one might want to consider deploying such a case.

Where there is a need to consider SRLGs across multiple layers (e.g., shared fibre ducts, same subsea cable system), then it can quickly become impractical to encode all this information into the IGP extensions available. Equally, where more complex path routing requirements are needed (‘in the core, these two services may not be in ducts within 3 kilometers of each other’), then it’s simply not possible to encode the right information into the IGP - let alone implement the placement algorithm on the ingress LER (iLER). In other cases, the information needed to make the placement decision isn’t available to the iLER, or it might not be possible to place a service with locally optimal routing - this occurs particularly with path diversity where two paths originate at different ingress LERs. These cases lend themselves very well to placement with SR - one already has to maintain an element with global awareness, which must keep state (if A-B and C-D need to be diverse, then the computing entity needs to know where A-B is to place C-D…; and needs to react to failures that impact the placement of A-B to ensure that it remains diverse to C-D) - so there’s very little value in having state in the network as well as in this entity.
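To make the A-B/C-D example concrete, here is a rough sketch of what the computing entity has to do. Everything in it is an assumption for illustration: the topology, the duct names, the `shortest_path` and `srlgs_of` helpers, and the choice of a plain hop-count breadth-first search. It checks SRLG diversity only (not node diversity), and it says nothing about how a real PCE would learn the duct data.

```python
from collections import deque

# Hypothetical topology: each undirected link carries the set of SRLGs
# (here, fibre ducts) that it rides over.
LINKS = {
    ("A", "P1"): {"duct-1"},
    ("P1", "B"): {"duct-2"},
    ("C", "P2"): {"duct-1"},  # shares a duct with A-P1
    ("P2", "D"): {"duct-3"},
    ("C", "P3"): {"duct-4"},
    ("P3", "D"): {"duct-5"},
}


def shortest_path(links, src, dst, excluded_srlgs=frozenset()):
    """Hop-count shortest path by BFS, skipping links in any excluded SRLG."""
    adj = {}
    for (u, v), srlgs in links.items():
        if srlgs & excluded_srlgs:
            continue
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def srlgs_of(links, path):
    """Union of the SRLGs of every link along a path."""
    used = set()
    for u, v in zip(path, path[1:]):
        key = (u, v) if (u, v) in links else (v, u)
        used |= links[key]
    return used


# The central entity keeps the state: where A-B was placed, and what it uses.
path_ab = shortest_path(LINKS, "A", "B")
path_cd_naive = shortest_path(LINKS, "C", "D")
path_cd_diverse = shortest_path(LINKS, "C", "D",
                                excluded_srlgs=frozenset(srlgs_of(LINKS, path_ab)))

print(path_ab, path_cd_naive, path_cd_diverse)
```

Note that placing C-D diversely depends on remembering where A-B went. If A-B is re-routed after a failure, the same entity has to re-check C-D, which is the state-keeping described above.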

In other cases, where a diverse service might start at the same iLER (e.g., path-protect FRR paths), RSVP-TE with XROs can suffice, and one can rely on the in-network state to ensure that paths are placed diversely, since the head-end has all the knowledge of the other services that are required. In this case, state is being maintained in the network, and the single point of convergence for both paths means that they can be re-placed where required - and RSVP-TE does a fine job of this.

Service paths rather than infrastructure paths: The work that I’ve shared previously concentrated on issues observed with RSVP-TE in networks with full mesh RSVP-TE (without diffserv TE). If one is considering such an architecture in a SP network today, then we are discussing 40,000 tunnels for a network of 200 PEs. If we consider DS-TE and an architecture with 3 different core classes, then 120,000. Whilst pinch points due to fibre routing between regions tend to drive up mid-point scale, we are still talking tens of thousands of LSPs for a single device. However, if we consider tunnels that need to be routed according to service demands, then a similar network with 200 PEs in it might support many, many more connections. Thousands of services on an individual node (bear in mind that a 10-slot device can likely support 500+ edge ports) are not unheard of. At this scale, soft-state that might need to be resignalled during failures can be painful - and can result in message flooding loads that badly hurt pinch-point midpoints. Taking the state away from the router CPU, and having some means to add computational resource or to schedule how LSPs are re-routed, is then advantageous. SR lets you do this relatively easily; whereas RSVP-TE requires that we keep path-setup on-board with the network elements themselves. In these cases, RSVP-TE also consumes a label per service at each mid-point element, whereas SR has the nice property that the number of labels consumed per device is # of devices in the network + # of local adjacencies - significantly lower than the number of midpoint LSPs that might traverse an individual device.
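The arithmetic behind these figures is easy to check. The sketch below is only a back-of-envelope calculation; the function names are made up, and the device and adjacency counts used for the SR case (250 devices, 20 local adjacencies) are assumptions chosen for illustration, not figures from a real network.

```python
def full_mesh_lsps(pes: int, classes: int = 1) -> int:
    """Unidirectional LSPs in a full mesh: one per ordered PE pair, per class."""
    return pes * (pes - 1) * classes


def sr_labels_per_device(devices: int, local_adjacencies: int) -> int:
    """Labels a device holds with SR: one node label per device in the
    network, plus one adjacency label per local adjacency."""
    return devices + local_adjacencies


mesh = full_mesh_lsps(200)                   # roughly the 40,000 quoted above
mesh_dste = full_mesh_lsps(200, classes=3)   # roughly the 120,000 quoted above
sr = sr_labels_per_device(devices=250, local_adjacencies=20)

print(mesh, mesh_dste, sr)
```

The RSVP-TE numbers grow with the square of the PE count (and then with the number of services), whereas the SR label count grows only linearly with the size of the network.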

Explicitly placed multicast: Multicast (from my perspective) is state in the network; we want to exploit the topology of the network to say that between some source, and some destination, there is no real need to carry N copies of a certain packet, we can carry many fewer, and replicate only when we need to. To do this, we either need to carry the information about the topology of the network in the packet (it’s not there right now…) or we need to let the network know something about the paths that are being carried over it. The latter is essentially what we do with P2MP RSVP-TE. We signal to the network that there are paths going over it, and determine where the right point to branch those is (based on S2L sub-LSPs). In this case, RSVP-TE gives us a way to ensure that we exploit the topological information about the network that we already have in the IGP - to meet constraints, and efficiently deliver packets. SR has no real way to deliver this at all - since it simply tries to avoid that state. Unless one has many thousands of multicast groups, keeping this state doesn’t seem hugely problematic, so RSVP-TE continues to seem a very sane choice.
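A small worked example shows what the in-network state buys. The sketch below is illustrative only: the topology and the function names are hypothetical, and each source-to-leaf path simply stands in for an S2L sub-LSP. It compares the number of packet copies carried on links when the source sends one copy per receiver against a tree that replicates only at branch points.

```python
# Hypothetical source-to-leaf paths from source S to three receivers.
S2L_PATHS = [
    ["S", "P1", "P2", "R1"],
    ["S", "P1", "P2", "R2"],
    ["S", "P1", "P2", "R3"],
]


def copies_ingress_replication(paths):
    """Link transmissions if the source sends a separate copy per receiver."""
    return sum(len(p) - 1 for p in paths)


def copies_tree(paths):
    """Link transmissions if each link of the merged tree carries one copy."""
    links = set()
    for p in paths:
        links.update(zip(p, p[1:]))
    return len(links)


def branch_points(paths):
    """Nodes that must replicate: those with more than one downstream hop."""
    children = {}
    for p in paths:
        for u, v in zip(p, p[1:]):
            children.setdefault(u, set()).add(v)
    return sorted(n for n, c in children.items() if len(c) > 1)


print(copies_ingress_replication(S2L_PATHS))
print(copies_tree(S2L_PATHS))
print(branch_points(S2L_PATHS))
```

The saving comes entirely from P2 knowing that it has to replicate, which is state held in the network rather than in the packet.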

As with most new approaches, there is always a bunch of buzz that surrounds the new kid on the block - and there’s a need to give it a push, such that solutions make it out into the real world. It seems that segment routing is getting there - and that’s great news, and certainly something that we’re intending to deploy. However, it’s also given the folks deploying RSVP-TE a bit of a kick (that I must admit, we were struggling to motivate before) such that we’re starting to see solutions to some of the problems that we saw in the real-world c.5 years ago emerge (see: draft-ravisingh-teas-rsvp-setup-retry and draft-beeram-mpls-rsvp-te-scaling-01).

I’m glad that we’re not abandoning RSVP-TE. There are cases, such as those discussed above, where it makes a better tool for the job than SR. However, there are cases where it’s not well suited, and we need SR. Giving operators the choice to pick the right solution for their problem space is always a good thing to achieve. Some networks will roll SR, some will roll RSVP-TE. Some will roll both. As long as it’s fixing problems, and the network is robust and operable, that’s all good.

Either way, non-IGP placed paths have significant complexities in terms of operational model, and need consideration when one is used to the IGP-congruent world - one to discuss another day.

[0]: A very interesting discussion that Mark Townsley and I have had over the last couple of years, whilst I was giving guest lectures at X, was how one defines ‘soft state’ - in this case, I will refer to state that is created in a device due to protocol activity (rather than configuration); and requires ongoing activity to maintain.