Error Handling in BGP (Again!)


It looks like, once again, there's another attribute flying around the global BGP table causing Quagga instances to crash (if based on 0.99.9 - I believe the bug is fixed in 0.99.10). This relates to the 2007 draft that introduced AS_PATHLIMIT - see ietf.org - draft-ietf-idr-as-pathlimit. This attribute is actually relatively interesting from an operator's point of view, in cases where the common no-export or no-advertise communities do not give granular enough control.

As far as I could see, there's only one BGPd that supports this draft (which does appear to have fallen by the wayside in the IDR WG), and that is Quagga. Whilst looking into this, I found the patch in which the attribute was introduced. Looking at the code really reminded me why John Scudder's optional-transitive draft is so useful. If we look at the code in that patch:

+  if (flag != BGP_ATTR_FLAG_TRANS)
+    {
+      zlog (peer->log, LOG_ERR, 
+       "AS-Pathlimit attribute flag isn't transitive %d", flag);
+      bgp_notify_send_with_data (peer, 
+                BGP_NOTIFY_UPDATE_ERR, 
+                BGP_NOTIFY_UPDATE_ATTR_FLAG_ERR,
+                startp, total);
+      return -1;
+    }
+  
+  if (length != 5)
+    {
+      zlog (peer->log, LOG_ERR, 
+       "AS-Pathlimit length, %u, is not 5", length);
+      bgp_notify_send_with_data (peer, 
+                BGP_NOTIFY_UPDATE_ERR, 
+                BGP_NOTIFY_UPDATE_ATTR_FLAG_ERR,
+                startp, total);
+      return -1;
+    }
Note that in both of these cases we're checking for errors in AS_PATHLIMIT, which is an optional transitive attribute. This is another "routing metadata" element, or really just a protocol enhancement, but when an error is found we send a NOTIFICATION message to the other side. Once again, a bug in this element, or incorrect population of it, is going to affect a whole session (which may carry multiple AFIs). I understand that there's no standard mechanism, other than NOTIFICATION, available to implementers right now, but this really does seem quite harmful.
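
To make the contrast concrete, here is a minimal sketch of a more contained response, in which a malformed AS_PATHLIMIT is logged and dropped while the session stays up. It is an illustration only: the names check_as_pathlimit and attr_action are hypothetical and are not Quagga's API, and it assumes that ignoring a bad optional transitive attribute is acceptable, which is not necessarily what any draft specifies. The flag bit values are the standard BGP attribute flag bits.

```c
#include <stdint.h>
#include <stdio.h>

/* Attribute flag bits from the BGP path attribute header. */
#define ATTR_FLAG_OPTIONAL 0x80
#define ATTR_FLAG_TRANS    0x40
#define ATTR_FLAG_PARTIAL  0x20
#define ATTR_FLAG_EXTLEN   0x10

/* Hypothetical outcome of validating one attribute (not a Quagga type). */
enum attr_action {
  ATTR_ACCEPT,   /* attribute is well formed, use it */
  ATTR_DISCARD   /* drop this attribute, keep the session up */
};

/* Validate an AS_PATHLIMIT attribute without tearing down the session.
   Assumption: a malformed optional transitive attribute can be ignored. */
static enum attr_action
check_as_pathlimit (uint8_t flag, uint16_t length)
{
  /* Look only at the optional and transitive bits, so that the partial
     and extended-length bits do not trigger an error on their own. */
  uint8_t kind = flag & (ATTR_FLAG_OPTIONAL | ATTR_FLAG_TRANS);

  if (kind != (ATTR_FLAG_OPTIONAL | ATTR_FLAG_TRANS))
    {
      fprintf (stderr, "AS-Pathlimit flags 0x%02x are not optional transitive\n",
               (unsigned) flag);
      return ATTR_DISCARD;
    }

  if (length != 5)
    {
      fprintf (stderr, "AS-Pathlimit length, %u, is not 5\n",
               (unsigned) length);
      return ATTR_DISCARD;
    }

  return ATTR_ACCEPT;
}
```

With this shape, a caller that gets ATTR_DISCARD would skip the attribute and carry on parsing the UPDATE, so an error in one piece of routing metadata stays local to that attribute rather than resetting every AFI carried on the session.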

I hope that this will be solved as John Scudder's draft is pushed through the IDR WG. As usual, operator support for this is useful, as it ensures that vendors and the WG are well aware of the importance of such a draft!