32-bit AS numbers introduce a new BGP flaw.
Last Friday, Andy Davidson, Jonathan Oddy, and I published some research with quite worrying repercussions. I’ve heard from a lot of people privately about this matter, but there’s a big flaw here and, as Andy posted on his blog (which is much more informative than mine, I think!), it’s a big problem.
The reason, I think, that we’re getting limited public discussion of this exploit (I hesitate to call it an exploit; it’s really a flaw, because the problem exists as a direct result of the RFC) is that the implementations of 4-byte AS support already out there are generally not standards compliant. Let’s run down the list:
- Juniper - not standards compliant on this matter. Some form of AS4_PATH and AS_PATH munging happens during processing, and the paths come out without the confed segments in them. I’ve verified this by looking at the updates on a Goscomb Technologies Juniper box (look at the monitor traffic output).
- Force10 - Greg Hankins has a test peering box running at route-server.cluepon.net, which shows the following:
Received from : 72.37.255.12 (72.37.255.1) Best AS_PATH : 18508 19151 35320 196629 23456
This looks to be doing the same thing as JunOS: ignoring the invalid parts of the AS4_PATH when building the AS_PATH.
- Cisco - IOS XR/XE - I’ve not yet confirmed this one, but I believe it doesn’t have a standards compliant implementation either. I’m hoping to get some time with someone to discuss what they’re seeing later tonight, or maybe later this week. Given that we haven’t heard anything, and that there are likely to be some SPs learning 91.207.218.0/23 via 35320, this is probably another platform that doesn’t comply.
- Cisco - IOS - There’s just one lonely IOS release that runs AS4 right now, and that’s 12.0(32)S12. This is the release we used to demonstrate how IOS reacts to the flaw in the RFC. When IOS sees AS_CONFED_SET in the AS4_PATH, it drops the session, as required by the RFC.
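To make the difference between those two behaviours concrete, here is a minimal Python sketch. It is a toy model, not any vendor’s code: `handle_strict`, `handle_lenient` and `SessionReset` are names I’ve made up for illustration, the lenient behaviour is only my reading of what the JunOS and Force10 boxes appear to do, and the confed member AS 64512 is invented. The segment type codes are the standard ones for AS path segments.

```python
from enum import IntEnum


class SegType(IntEnum):
    """AS path segment type codes (RFC 4271 and RFC 5065)."""
    AS_SET = 1
    AS_SEQUENCE = 2
    AS_CONFED_SEQUENCE = 3
    AS_CONFED_SET = 4


CONFED_TYPES = {SegType.AS_CONFED_SEQUENCE, SegType.AS_CONFED_SET}


class SessionReset(Exception):
    """Hypothetical stand-in for sending a NOTIFICATION and closing the session."""


def handle_strict(as4_path):
    """Treat any confed segment in AS4_PATH as fatal (the behaviour seen on IOS)."""
    for seg_type, asns in as4_path:
        if seg_type in CONFED_TYPES:
            raise SessionReset(f"{SegType(seg_type).name} found in AS4_PATH: {asns}")
    return list(as4_path)


def handle_lenient(as4_path):
    """Drop confed segments and keep the rest (assumed JunOS/Force10-like behaviour)."""
    return [(seg_type, asns) for seg_type, asns in as4_path
            if seg_type not in CONFED_TYPES]


# A made-up AS4_PATH: one valid sequence plus an AS_CONFED_SET that should not be there.
poisoned = [
    (SegType.AS_SEQUENCE, [35320, 196629]),
    (SegType.AS_CONFED_SET, [64512]),
]

print(handle_lenient(poisoned))
try:
    handle_strict(poisoned)
except SessionReset as exc:
    print("session dropped:", exc)
```

The lenient handler carries on with a usable path; the strict one loses the whole session over a single bad attribute.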
There are a couple of interesting questions here. Why are most vendors not actually complying with this RFC? Does it mean that they spotted what Andy, Jonathan and I have reported on, and dropped this requirement? If so, then I wonder, when IETF IDR is so full of people with @juniper.net and @cisco.com addresses, why this requirement ever appeared in the first place.
This is a serious flaw in the standard. Although we reported today on how the issue has actually come to pass, it is going to remain open unless we fix the RFC.
The issue here is that, with most (if not all) BGP attributes, there’s almost an expectation that the immediate neighbour will sanity-check what its peer has sent. If the problem is one hop away, you can generally talk directly to that neighbour and work out what went wrong. No one else is harmed, as just one session is dropped, between two networks that share an adjacency. A case in point is the problem we saw with Cisco not obeying the RFC on sending UPDATEs before KEEPALIVEs in BGP conversations (CSCsu84268). As far as I saw, this bug only affected directly connected neighbours, and hence there was no major impact.
Now, let’s consider what happens in the case of the AS4_PATH problem we reported. AS4_PATH is an optional transitive attribute in BGP; hence, if you hand it to a non-AS4 speaker, that router will just pass it along to the peers it advertises the route to. This is a reasonably neat solution, one would think: AS4 information is carried across the network without requiring every router between two AS4 speakers to understand it, yet those speakers still get the same information as they would if the path were made up entirely of AS4 speakers. Furthermore, because 23456 (AS_TRANS) is put into AS_PATH in place of each 4-byte ASN, even the non-AS4 speakers can see that there was some AS at that point in the path.
However, this also means that if I announce a completely invalid AS4_PATH to a non-AS4 speaker, it knows nothing about the attribute or its contents, and hence can’t sanity-check it. The result is that I can tunnel my AS4_PATH across the internet.
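The tunnelling effect can be sketched as a toy hop-by-hop walk in Python. This is an illustration under stated assumptions, not a BGP implementation: the `Router` class, the `first_reset` function and the router names are all hypothetical, every AS4 speaker is assumed to behave strictly (resetting the session when it finds a confed segment in AS4_PATH), and the confed member AS 64512 is invented.

```python
from dataclasses import dataclass

# Segment type codes for AS path segments (RFC 4271 and RFC 5065).
AS_SEQUENCE = 2
AS_CONFED_SEQUENCE = 3
AS_CONFED_SET = 4


@dataclass
class Router:
    name: str          # hypothetical label, for illustration only
    speaks_as4: bool   # True if the router parses AS4_PATH


def first_reset(as4_path, chain):
    """Walk an update along the chain.

    Returns (receiver, sender) for the first session that gets reset,
    or None if nobody on the chain objects.
    """
    has_confed = any(seg_type in (AS_CONFED_SEQUENCE, AS_CONFED_SET)
                     for seg_type, _ in as4_path)
    sender = "originator"
    for router in chain:
        # Only an AS4 speaker looks inside AS4_PATH; a non-AS4 speaker
        # forwards the optional transitive attribute without inspecting it.
        if router.speaks_as4 and has_confed:
            return (router.name, sender)
        sender = router.name
    return None


chain = [
    Router("transit-1", speaks_as4=False),
    Router("transit-2", speaks_as4=False),
    Router("victim", speaks_as4=True),
]
bad_path = [(AS_SEQUENCE, [196629]), (AS_CONFED_SET, [64512])]

print(first_reset(bad_path, chain))
```

In this model the session that drops is the one between "victim" and "transit-2", two networks that did nothing wrong, while the originator sits several hops away.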
Great, so now I’ve described, in somewhat chattier language, what we wrote on NANOG and C-NSP. What does this mean to an operator? Well, if I take a prefix, originate it with an invalid AS4_PATH, and announce it to the internet, then I can get the first AS4 speaker I find to tear down whatever session it learned my prefix on. If I combine this with injecting some ASNs into the path, so that some networks don’t accept it (due to loop prevention), then I can probably work out a way to get my copy of the update across to you. In IOS’s logs, you can’t even tell who originated that prefix, and it doesn’t seem to show the whole AS_PATH/AS4_PATH either. Say I send two prefixes, one of which you learn via one transit provider and the other via another: I’ll disconnect your full-table connectivity.
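The steering trick can be sketched with a very rough Python model. Everything here is an assumption for illustration: the function names are mine, the topology and ASNs are invented (taken from the documentation range), all intermediate networks are taken to be non-AS4 speakers, the victim is taken to be a strict AS4 speaker, and a network is assumed to discard any update whose AS_PATH already contains its own ASN.

```python
def surviving_paths(paths, injected_asns):
    """Keep the paths on which no network finds its own ASN in the AS_PATH."""
    injected = set(injected_asns)
    return [path for path in paths if injected.isdisjoint(path)]


def sessions_hit(paths, injected_asns):
    """The victim's neighbour on each surviving path is a session it would reset."""
    return {path[-1] for path in surviving_paths(paths, injected_asns)}


# Hypothetical topology: each list is the chain of ASNs between attacker and
# victim, ending with the victim's directly connected transit provider.
paths_to_victim = [
    [64510, 64500],  # reaches the victim via transit 64500
    [64511, 64501],  # reaches the victim via transit 64501
]

down = set()
# Prefix 1: inject 64511 so that it only arrives via transit 64500.
down |= sessions_hit(paths_to_victim, injected_asns=[64511])
# Prefix 2: inject 64510 so that it only arrives via transit 64501.
down |= sessions_hit(paths_to_victim, injected_asns=[64510])

print(sorted(down))
```

With both prefixes sent, the set of reset sessions covers both transit providers in this model, which is the full-table outage described above.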
This isn’t even a bug; it’s a flaw in the standard. I’d really like to get this fixed, and the way to do that is to gather a bunch of operator experience and views, and take it to the IETF. So, if this concerns you (or you’re going to need to deploy a new point release, as 12.0(32)S12 is, or maybe need hardware support with 12.2SRE…), please put some pressure on your vendor, or drop me a note at rjs@eng.gxn.net.