|
| |
Further to my previous post - I presented this issue at LINX65 - video and slides can be found below.
Video
Fixed Slides - LINX's PowerPoint install seems to have corrupted my slides on the day.
Comments and feedback are most welcome.
|
|
After a late programme committee request, I presented on "Enhancing BGP" at UKNOF 16. The presentation was intended to be an update on the current drafts in the IDR working group, and to give some encouragement to operators to get involved and contribute.
I'll put the video up when Tom at PortFast and Brandon of Bogons have done their excellent job on it. In the meantime, the slides are linked below.
There's also a good add-paths presentation that John Scudder and Dave Ward gave at NANOG here
|
|
Tom Bird of PortFast and Brandon Butterworth of Bogons do a great job of webcasting, and recording UKNOF video. Thanks to them, the video of the presentation I gave at UKNOF16 can be watched here. Or you can download it by clicking the image below!
As always, thoughts/comments/corrections most welcome!
This is also probably a good time to mention that my new work mail address is rob.shakir (at) cw.com
|
|
I spoke at LINX71 about the testing that we (C&W) have been doing in the lab with 100GigE - we got a pre-production card and hence had a look at the technology for real. Thanks to LINX, the presentation video can be seen by clicking on the image below.
Once again, however, whatever LINX use as a presentation laptop didn't render my slides properly - even though I'd submitted PDF too! Hence the slides can be found on this site.
|
|
As I presented at UKNOF 18, I have now written an Internet-Draft to address the requirements of Network Operators for how BGP should handle errors in UPDATE messages. The draft can be found on the IETF site, and I'm currently seeking opinions as to whether this reflects an operational consensus! If you're an Operator (DFZ, MSE or otherwise), it would be great to hear from you!
I'll be presenting the draft at NANOG 51 in Miami on Tuesday - if you're there, feel free to ping me!
|
|
The video from the presentation I gave at NANOG, LINX and UKNOF has now been posted. You can find the video at the following URL - NANOG 51: BGP Error Handling or by clicking on the image below. The full slide deck is also on this site - here.
|
|
It's been quite a while since I updated this blog, very lax of me, sorry!
The lack of updates is mostly a reflection of how busy I have been since presenting the error handling draft work at NANOG (which looks to be the last post!). Since January, I've presented at the IETF in Prague, and then again in Québec City - particularly on a number of aspects of the work that I've been documenting here for some time!
The good news is - we're making some significant progress. Over the last 6 months or so, the work that a number of operators have done, together with focused work from particular vendors, has been steering us towards how robust BGP needs to be to meet the operational requirements of the protocol right now. At IETF 80 in Prague, I presented at both the Global Routing Operations WG and the Inter-Domain Routing WG, on the draft that I've described in the presentations linked in previous pages. For those that are interested, the slides for this are linked below.
The response to this work, both at NANOG and at the IETF meeting, has been very positive - as I've tried to characterise, there are a lot of operators that understand that this is an issue. Also - and perhaps somewhat surprisingly to me - there are a lot of vendors/implementors of protocols that agree that this behaviour is very sub-optimal in numerous network deployments. There is significant appetite in the IDR working group to try and solve this issue in a deployable, scaleable manner - which is fantastic. Since BGP is the signalling glue for the Internet, and most modern IP networks, it's really good that we are able to provide some focus for this issue, which, at the end of the day, will result in a more resilient set of networks.
In addition to such enthusiasm in the IETF IDR working group, GROW accepted the draft that I put together as a working group document - which is great, as GROW's charter is almost to provide IDR with work items that come from the operations area of the IETF. Pushing these requirements from GROW into IDR, whilst it might sound a bit like just internal workings of the IETF, gives some further credence to the fact that this is required by operators. Given all the discussions that I have had with operators about this issue, and how much of an issue I know this to be, I think that having the IETF process work on this the right way is great. This adoption means that the draft is now called draft-ietf-grow-ops-reqs-for-bgp-error-handling, and it is progressing really well. I can't thank a number of people, including Bruno Decraene, Shane Amante, and David Freedman, enough for their excellent discussions and suggestions on this subject. IMHO, such inter-operator collaboration is fantastic to see in terms of improving the operations, robustness, scalability and management of IP networks - and is of huge benefit to both the Internet and general network operations.
But, of course, a requirements draft alone is not going to solve the issues that exist in the protocol - however, it does give us a framework to work around. As such, the point of this post is to point out to any operators that might read here, and not the IETF mailing lists, what actual progress we've been making in the IETF!
- On the issue of preventing all errors having to be responded to with a NOTIFICATION - whilst we don't have a clear draft that says that this will happen with both eBGP and iBGP, there is a clear understanding within the IDR working group that this is the operational demand. The IDR chairs have tasked the WG to produce a single solution 'error handling' draft - this is likely to be heavily based on both the optional transitive error handling draft written by John Scudder (Juniper), and the eBGP errors draft written by Enke Chen and Keyur Patel (Cisco) - this combined error handling document is going to be the cornerstone of the changes that really meet the requirements laid out in the draft I wrote.
- Keyur Patel, Enke Chen and Alton Lo (Cisco) have been doing some fantastic work on the requirement, outlined in my document, for hitless session restart when a non-recoverable error occurs - a number of comments from (amongst others) Chris Morrow prompted some revision of this section of the draft in the -01 version, describing which particular errors are deemed to be non-recoverable. It's safe to say that I've learnt quite a lot about what can go wrong in parsing streams like BGP messages over the last few months - and I've definitely got Alton, Keyur, Jeff Haas, and others to thank on this one. As such, I think the requirements that are now in the draft match up to what the operational requirements are - if you disagree, I'd love to hear from you!
Keyur et al's work has been focused around some discussions that we had in Miami, and then looking at how these ideas would scale (which I know a bunch of us discussed in Prague!) - if you're interested in this, then see GR Notification, and Accelerated Convergence for BGP Graceful Restart - both of these essentially meet the requirement to perform some hitless session restart, whilst also looking to make this as scaleable as possible.
- In terms of prefix recovery during an inconsistent RIB state, there are a couple of drafts that are doing this work - but there's still some opportunity for improvement. Deployment issues with ORF are holding up the two that I am co-authoring with Jie Dong (Huawei), and Jakob Heitz (Ericsson) et al - which are described in One-time Extended Community-based ORF and One-time Address-Prefix-based ORF. Alternatives to this exist in how we might implement Route Target Constraint, and also how we might look at being able to deploy other ROUTE REFRESH-based mechanisms. I think, whilst there are some options here, there are still some unanswered questions!
- The final requirement that is outlined in the requirements draft relates to how the BGP protocol can be managed. This has turned out to be one of the most complicated requirements - as I am not certain that there is a direct agreement as to how much should be integrated into the protocol. Whilst Tom Scholl (nLayer) suggested ADVISORY, a separate proposal introduced DIAGNOSTIC, which offered much more functionality in terms of a query/answer set of mechanisms, along with some similar functionality in terms of giving a means to perform logging. As requested by the IETF IDR chairs at IETF80, Robert Raszuk, David Freedman and I sat down to write a draft combining both ADVISORY and DIAGNOSTIC. The end result (OPERATIONAL), I presented at IETF81 last week - I think we would all be very grateful for any comments (slides linked below)!
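For anyone who prefers code to prose, the split described in the list above, between errors that need not be answered with a NOTIFICATION and errors that are genuinely non-recoverable, can be sketched roughly as below. This is a minimal illustration under my own simplifying assumptions, not the text of any of the drafts: the names `Action`, `classify_update` and `build_update` are hypothetical, only the ORIGIN attribute is validated, and the rule "framing broken means session restart, otherwise treat the routes as withdrawn" is a simplification of the real discussion. The header layout (16-byte marker, 2-byte length, 1-byte type) and the UPDATE length fields are as per RFC 4271.

```python
import struct
from enum import Enum

HEADER_LEN = 19      # 16-byte marker + 2-byte length + 1-byte type (RFC 4271)
MAX_MSG_LEN = 4096   # maximum BGP message size in RFC 4271
UPDATE = 2           # BGP message type code for UPDATE
MARKER = b"\xff" * 16


class Action(Enum):
    """Hypothetical outcomes, named for this sketch only."""
    ACCEPT = "accept"
    TREAT_AS_WITHDRAW = "treat-as-withdraw"   # keep the session up
    SESSION_RESTART = "session-restart"       # ideally a hitless one


def build_update(attrs: bytes, nlri: bytes = b"") -> bytes:
    """Helper for the examples: wrap attributes and NLRI in an UPDATE."""
    body = b"\x00\x00" + struct.pack("!H", len(attrs)) + attrs + nlri
    return MARKER + struct.pack("!HB", HEADER_LEN + len(body), UPDATE) + body


def classify_update(msg: bytes) -> Action:
    """Classify one UPDATE (assumed to be exactly one message's bytes)."""
    # Framing errors: message boundaries can no longer be trusted, so
    # this sketch treats them as non-recoverable.
    if len(msg) < HEADER_LEN or msg[:16] != MARKER:
        return Action.SESSION_RESTART
    length, msg_type = struct.unpack("!HB", msg[16:19])
    if msg_type != UPDATE:
        raise ValueError("this sketch only handles UPDATE messages")
    if not HEADER_LEN <= length <= MAX_MSG_LEN or length != len(msg):
        return Action.SESSION_RESTART

    body = msg[HEADER_LEN:]
    if len(body) < 4:
        return Action.SESSION_RESTART
    (wd_len,) = struct.unpack("!H", body[:2])
    if 2 + wd_len + 2 > len(body):
        return Action.SESSION_RESTART
    (attr_len,) = struct.unpack("!H", body[2 + wd_len:4 + wd_len])
    start = 4 + wd_len
    if start + attr_len > len(body):
        # The NLRI cannot be located, so we cannot know what to withdraw.
        return Action.SESSION_RESTART

    # From here the NLRI is locatable, so attribute-level errors can be
    # handled by withdrawing the affected prefixes instead of a reset.
    attrs = body[start:start + attr_len]
    i = 0
    while i < len(attrs):
        if i + 3 > len(attrs):
            return Action.TREAT_AS_WITHDRAW
        flags, code = attrs[i], attrs[i + 1]
        if flags & 0x10:  # Extended Length bit: 2-byte attribute length
            if i + 4 > len(attrs):
                return Action.TREAT_AS_WITHDRAW
            (alen,) = struct.unpack("!H", attrs[i + 2:i + 4])
            hdr = 4
        else:
            alen = attrs[i + 2]
            hdr = 3
        value = attrs[i + hdr:i + hdr + alen]
        if len(value) != alen:
            return Action.TREAT_AS_WITHDRAW
        # ORIGIN (type 1) must be one octet with value 0, 1 or 2.
        if code == 1 and (alen != 1 or value[0] > 2):
            return Action.TREAT_AS_WITHDRAW
        i += hdr + alen
    return Action.ACCEPT


if __name__ == "__main__":
    nlri = b"\x18\xc0\x00\x02"  # 192.0.2.0/24
    print(classify_update(build_update(bytes([0x40, 1, 1, 0]), nlri)))
    print(classify_update(build_update(bytes([0x40, 1, 1, 9]), nlri)))
```

The point of the split is that an error confined to a single attribute still leaves the NLRI readable, so only those prefixes need to be affected, whereas a framing error gives no safe way to find the next message. A real implementation would need far more checks (missing mandatory attributes, per-attribute rules and so on), all of which are left out here.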
Overall, I think we're really making some good progress on this one - I'm hopeful that the requirements draft can be cut and dried, and go to GROW WGLC prior to IETF 82 in Taipei! From then on, I think, as a community, we're really driving some good solutions that, at the end of the day, are going to improve the stability, robustness and operations of the Internet!
As always, comments (via the re-enabled form below) or via e-mail are greatly appreciated - I'm really keen to ensure that we're hitting the right requirements here!
|
|
On Friday, I presented at the Netnod meeting in Stockholm, Sweden - again about BGP error handling - this time presenting a bit of an update as to why this continues to be a problem for the Internet (and private BGP deployments) - and why this work is still really relevant. In addition, I tried to give an overview of what the solution space looks like. I'm not sure whether there's video, but as usual, the slides are linked below!
As usual, I'm happy to take questions, comments or further queries on this work - please just let me know!
|