Evolving Toward a Self-Managing Network
---------------------------------------

Why is network management so difficult? I think the complexity of the individual network elements is a big part of the problem, and a big barrier to fundamental change. Today's routers implement numerous routing protocols (e.g., BGP, OSPF, IS-IS, EIGRP, and RIP) and data-plane mechanisms (e.g., class-based queuing, RED, and access control lists), with innumerable configurable parameters that have a profound influence on the behavior of the network. Despite years of research on protocols and mechanisms, we have relatively few meaningful guidelines for (i) selecting and composing these features to build a network and (ii) setting the tunable parameters to maximize the performance, reliability, and security of a running network. Rather than managing "the network," operators actually manage individual network elements in low-level, device-specific configuration languages. Clearly, we need to raise the level of abstraction to design and manage at the *network* level.

But how should we approach this problem? On the surface, building the abstraction on top of today's devices is an appealing approach. We can design a high-level language for specifying network goals and objectives, and then generate network designs and the low-level configuration commands that should be applied to the individual devices. However, the intrinsic complexity of today's protocols and mechanisms makes it extremely difficult (if not impossible) for this approach to succeed in practice.

An alternative approach is to design the network elements, and their protocols and mechanisms, with network-level management in mind. The "design for manageability" approach starts by asking what kind of network-level abstractions we want, and then designs the device-level interfaces and protocols to support them.
Though conceptually appealing, the "design for manageability" philosophy runs headlong into the practical problem of how to achieve any significant deployment. I think that breaking the stalemate requires us to think creatively about evolutionary approaches to revolutionary change. The key is the sometimes tedious notion of "incremental deployability." To be incrementally deployable, a solution must have two key ingredients:

- Backwards compatibility: To effect change, we need to find ways to move forward without requiring a "flag day" where the legacy equipment and protocols are replaced. For example, approaches that retain the message formats of our existing protocols can allow substantive change while still accommodating legacy equipment. Ethernet is a great example of this principle -- significant change has occurred in the past decade or so while (and arguably *because of*) remaining compatible with the legacy specification of the message format.

- Incentive compatibility: Each step along the way to the target end state needs to offer substantive benefits to the early adopters, and additional incentives for the remainder of the players to join the party. The need for incentive compatibility is so strong that it might dictate the choice of steps and, in some cases, even the target end state itself. A sound treatment of incentive compatibility requires accurate models of the cost of deployment and the benefits at each stage of adoption. The construction of the steps along the way is arguably a research problem in its own right.

As an example, the Internet's interdomain routing system is widely viewed as fraught with difficult management problems. The Border Gateway Protocol (BGP) is hard to configure, slow to converge, prone to serious anomalies (e.g., persistent oscillation, forwarding loops, and black holes), vulnerable to malicious attack, difficult to troubleshoot, and overly sensitive to small topology changes.
Building meaningful abstractions on top of such a system is fundamentally hard. The research community has made progress in creating static-analysis tools for detecting configuration errors, checking whether a collection of routing policies is vulnerable to routing anomalies, and predicting the effects of configuration changes on the flow of traffic. Other research has created tools for analyzing measurement feeds of BGP update messages to detect and diagnose routing problems. These contributions have significantly improved our understanding of BGP and our ability to "work around" some of its limitations. However, raising the level of abstraction for BGP has remained elusive. Moreover, having a "flag day" to replace BGP with a new protocol is not viable in practice. We cannot simply "reboot the Internet."

Yet, BGP has one key feature that makes real change tantalizingly possible: any system that sends BGP messages to a router in the appropriate format can tell the router what routes to use. This enables us to change everything about interdomain routing, while still speaking to the legacy routers in terms they can understand. In our work on the Routing Control Platform (RCP) [1], we exploit this observation and propose a three-step evolution to a new interdomain routing architecture:

1. Path selection in a single domain: In the first phase, the RCP has internal BGP (iBGP) sessions with the operational routers in a single Autonomous System (AS). This requires just a small configuration change on the routers (to exchange iBGP messages with the RCP, rather than one another), and enables the RCP to make customized routing decisions for each destination prefix on behalf of each router. This phase enables more flexible traffic engineering and network maintenance, and allows the AS to avoid routing anomalies such as protocol oscillation, forwarding loops, and black holes by explicitly enforcing correctness constraints.
These capabilities provide a powerful incentive for an AS to deploy the RCP even if other ASes have not.

2. Flexible routing policy in a single domain: In the second phase, the RCP has external BGP (eBGP) sessions directly with the border routers in neighboring ASes. This requires the neighboring domains to make a small change to the configuration of the eBGP sessions on their border routers to exchange BGP messages with the RCP. As a result, the RCP has complete control over the sending and receiving of BGP messages, as well as the policies for path selection and export. The operational routers no longer have any BGP configuration state, except for the iBGP sessions to the RCP. This phase enables the use of new policy specification languages, intelligent route-flap damping, minimization of the number of routing-table entries on the routers, and many other applications.

3. Redefinition of interdomain routing: In the third phase, ASes coordinate interdomain routing directly through their RCPs. This requires the participating ASes to run an interdomain routing protocol between their RCPs, while still communicating with legacy routers via iBGP. Although the RCPs could conceivably run a policy-based, path-vector protocol like BGP, they need not. For example, a new routing protocol could attach prices to advertised routes or explicitly support inter-AS negotiation. RCPs could also base their routing decisions on measured end-to-end performance, as proposed in work on overlay networks, and even make the performance statistics available to end-host overlays through appropriate interfaces. The RCP could also be used to deploy an interdomain routing protocol with better security properties than BGP.

Each step offers strong deployment incentives by simplifying network management and enabling new services, while remaining backwards compatible with the installed base of routers.
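
As a rough illustration of the first phase, the sketch below shows how a central platform might choose a route for every router and destination prefix. It is a simplified model under stated assumptions, not the RCP implementation: the names Route, select_routes, and igp_cost are hypothetical, the ranking (local preference, then AS-path length, then intradomain distance to the egress router, then egress name) is only a reduced version of the BGP decision process, and the one correctness constraint shown is that a router is never assigned an egress point it cannot reach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """A candidate route learned at a border router (hypothetical model)."""
    prefix: str        # destination prefix, e.g. "192.0.2.0/24"
    egress: str        # border router where the route was learned
    local_pref: int    # higher is preferred
    as_path: tuple     # AS numbers on the path; shorter is preferred


def select_routes(routes, igp_cost):
    """Return {router: {prefix: Route}} with one choice per router per prefix.

    igp_cost[router][egress] is the assumed intradomain distance from
    `router` to `egress`; a missing entry means the egress is unreachable.
    """
    by_prefix = {}
    for route in routes:
        by_prefix.setdefault(route.prefix, []).append(route)

    decisions = {}
    for router, costs in igp_cost.items():
        table = {}
        for prefix, candidates in by_prefix.items():
            # Correctness constraint: skip egress points this router cannot reach.
            usable = [r for r in candidates if r.egress in costs]
            if not usable:
                continue
            # Simplified ranking: local_pref, AS-path length, IGP distance, name.
            table[prefix] = min(
                usable,
                key=lambda r: (-r.local_pref, len(r.as_path),
                               costs[r.egress], r.egress),
            )
        decisions[router] = table
    return decisions


# Example inputs (illustrative values only).
routes = [
    Route("192.0.2.0/24", "A", 100, (65001, 65010)),
    Route("192.0.2.0/24", "B", 100, (65002, 65010)),
]
igp_cost = {
    "A": {"A": 0, "B": 10},
    "B": {"A": 10, "B": 0},
    "C": {"A": 3, "B": 7},
    "D": {"A": 9, "B": 2},
}
decisions = select_routes(routes, igp_cost)
for router in sorted(decisions):
    print(router, "->", decisions[router]["192.0.2.0/24"].egress)
```

With these inputs, routers A and C are directed to egress A and routers B and D to egress B, because the two routes tie on local preference and AS-path length and the intradomain distance decides. In a deployment along the lines described above, each per-router choice would be conveyed to that router over its iBGP session; that step is outside this sketch.
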
(In addition, experiments with our prototype implementation [2] show that the RCP is feasible, from a systems perspective; the RCP can be made fast and reliable enough to make routing decisions for a backbone network with hundreds of routers.) If the RCP approach is successful, future routers could be built with much less control software, and with new dissemination protocols for communicating with the RCP (rather than continuing to use iBGP for this purpose).

Stepping back from the specific example of BGP and the idea of the RCP, I think that making significant progress in improving network management requires changes in the division of labor between the network devices and the management systems. Making these changes actually happen requires us to grapple with backwards compatibility with the legacy equipment (e.g., by finding ways to use the existing protocols and message formats to coax the devices) and to identify compelling incentives for incremental deployment (e.g., by solving real problems and enabling new applications for the early adopters).

[1] Nick Feamster, Hari Balakrishnan, Jennifer Rexford, Aman Shaikh, and Jacobus van der Merwe, "The case for separating routing from routers," Proc. ACM SIGCOMM workshop on Future Directions in Network Architecture, August 2004. http://www.cs.princeton.edu/~jrex/papers/rcp.pdf

[2] Matthew Caesar, Donald Caldwell, Nick Feamster, Jennifer Rexford, Aman Shaikh, and Jacobus van der Merwe, "Design and implementation of a Routing Control Platform," Proc. Networked Systems Design and Implementation, May 2005. http://www.cs.princeton.edu/~jrex/papers/rcp-nsdi.pdf