Congestion: Mapping Interconnection in the Internet: Colocation, Connectivity and Congestion

In collaboration with David Clark (MIT/CSAIL), we will characterize the changing nature of the Internet's topology and traffic dynamics, and describe the implications of these changes for network science, architecture, operations, and public policy.

Sponsored by:
National Science Foundation (NSF)

Principal Investigators: kc claffy, Amogh Dhamdhere

Funding source: CNS-1414177
Period of performance: October 1, 2014 - September 30, 2018.


Outcomes Report

Internet engineering, science, and public policy communities have significant interest in understanding the extent and scope of, as well as potential consumer harm induced by, persistent interdomain congestion on the Internet. In this project we developed, deployed, and validated a lightweight active measurement method and system to enable a third party to measure evidence of congestion on thousands of interconnection links between broadband access ISPs and major interconnecting parties, including directly connected content providers. Our original motivation for this work was an increase in heated peering disputes between powerful players in the U.S., which raised questions about intentional degradation of performance as a business strategy to obtain (or avoid) interconnection fees.
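To illustrate the kind of signal a delay-based measurement of an interconnection link can look for, here is a minimal sketch. It assumes round-trip times are already collected to the near and far routers of one link, binned into equal time intervals. The function name, the 5 ms threshold, and the sample values are illustrative assumptions, not the project's actual system or parameters.

```python
def elevated_intervals(near_rtts, far_rtts, threshold_ms=5.0):
    """Return indices of time bins where the far side of a link shows
    elevated RTT relative to its baseline while the near side does not.

    near_rtts / far_rtts: RTT samples (ms), one per time bin, to the
    routers on the near and far side of the link (assumed inputs).
    """
    if len(near_rtts) != len(far_rtts):
        raise ValueError("series must cover the same time bins")
    # Baseline: the minimum RTT over the window, used here as a rough
    # stand-in for delay without queueing.
    near_base = min(near_rtts)
    far_base = min(far_rtts)
    flagged = []
    for i, (near, far) in enumerate(zip(near_rtts, far_rtts)):
        near_up = near - near_base > threshold_ms
        far_up = far - far_base > threshold_ms
        # Elevation only beyond the link points at the link itself;
        # elevation on both sides points at something closer to the probe.
        if far_up and not near_up:
            flagged.append(i)
    return flagged


# Made-up hourly samples: the far side rises in bins 2 and 3 only.
near = [10.0, 10.2, 10.1, 10.3, 10.2, 10.1]
far = [12.0, 12.1, 30.5, 31.0, 12.3, 12.0]
print(elevated_intervals(near, far))  # [2, 3]
```

A single elevated interval is weak evidence. The interest described above is in persistent congestion, so a real analysis would look for elevation that recurs at the same hours across many days.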

This work required us to tackle the tedious and long-unsolved problem of automatically and correctly inferring network boundaries in traceroute data. Lack of progress on this problem has impeded a wide range of research and development efforts for decades. We developed and validated a method that uses targeted traceroutes, knowledge of traceroute idiosyncrasies, and codification of topological constraints in a structured set of heuristics, to correctly identify interdomain links (i.e., between independently operated networks) at the granularity of individual border routers. Our initial algorithm focused on the network boundaries that we are most confident we can infer accurately in the presence of sampling bias: interdomain links attached to the network launching the traceroute. This method formed the cornerstone of the platform we built to map interconnection performance. In the final year of the project, we undertook a collaboration to extend and generalize this algorithm to make boundary inferences beyond the immediately upstream network.
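For contrast, the sketch below shows the naive baseline that such heuristics improve on: map each traceroute hop to an origin AS by longest-prefix match and report the first hop pair where the AS changes. This is not the project's algorithm. The prefix table, AS numbers, and function names are hypothetical, and the addresses come from documentation ranges.

```python
import ipaddress

# Hypothetical prefix-to-AS table; a real one would be derived from BGP data.
PREFIX_TO_AS = {
    "192.0.2.0/24": 64500,
    "198.51.100.0/24": 64501,
    "203.0.113.0/24": 64502,
}


def ip_to_as(ip, table=PREFIX_TO_AS):
    """Longest-prefix match of an IP address against the table."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, asn in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, asn)
    return best[1] if best else None


def first_as_boundary(hops, home_as):
    """Return (near_ip, far_ip) for the first adjacent pair of responsive
    hops where the origin AS changes from home_as to a different AS.
    hops: list of IP strings in path order, None for unresponsive hops."""
    prev = None
    for ip in hops:
        if ip is None:
            prev = None  # do not pair hops across a gap
            continue
        asn = ip_to_as(ip)
        if prev is not None and ip_to_as(prev) == home_as and asn not in (None, home_as):
            return prev, ip
        prev = ip
    return None


hops = ["192.0.2.1", "192.0.2.9", None, "192.0.2.17", "198.51.100.5", "203.0.113.7"]
print(first_as_boundary(hops, home_as=64500))
# ('192.0.2.17', '198.51.100.5')
```

This rule can misplace the boundary by a hop, because the addresses on an interdomain link are commonly assigned from only one of the two networks' address space. That is one reason a per-hop AS lookup is not enough to identify border routers.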

The mapping, measurement, and analysis platform took us three years to build, but the investment was worth it: we were able to address a decades-long gap in a third party's ability to study peering disputes in an open, objective, scientifically validated way. Our measurements revealed ongoing indications of persistently congested transit links, which, regardless of cause, implies clear motivation for large players to engage in direct peering negotiations. Especially in today's deregulatory political climate, such measurement is the most promising strategy for incentivizing transparent and accountable ISP behavior.

The science of Internet cartography is still young, and its measurement tools rudimentary. The intellectual merit of our work lies in the novel approaches we developed to enable comprehensive, pervasive, accurate, and usable measurement capabilities for the Internet; in the process, these approaches addressed pressing and salient needs for a richer understanding of the robustness of this critical communications infrastructure.

The project produced 21 peer-reviewed publications, including one that received the Best Paper Award at ACM SIGCOMM 2018, and 3 workshop reports.

Broader impacts: We used these results to investigate questions related to performance, QoE, scientific modeling, network economics, and public policy. We presented results at academic, operational, and policy venues: ACM SIGCOMM, ACM IMC, PAM, TPRC, BITAG, FCC, NANOG, UCSD, Georgia Tech, the Congressional Internet Caucus Advisory Committee (ICAC), CAIDA's AIMS and WIE workshops, and the NSF/FCC workshop on QoE. Other current and future CAIDA activities will heavily leverage the system we built for collection, analysis, and validation of interdomain congestion.

Project Summary

As the global Internet expands to satisfy the demands and expectations of an ever-increasing fraction of the world's population, profound changes are occurring in its interconnection structure, traffic dynamics, and the economic and political power of different players in the ecosystem. These changes not only impact network engineering and operations, but also present broader challenges for technology investment, future network design, public policy, and scientific study of the Internet itself. And yet, from both scientific and policy perspectives, the evolving ecosystem is largely uncharted territory.

We will focus our attention on two related transformations of the ecosystem: the emergence of Internet exchanges (IXes) as anchor points in the mesh of interconnection, and the growing role of content providers and Content Delivery Networks (CDNs) as major sources of traffic flowing into the Internet. By some accounts, over half the traffic volume in North America now comes from just two content distributors (YouTube and Netflix). This shift constitutes the rise of a new kind of hierarchy in the ecosystem, bringing fundamentally new constraints on existing players who need to manage traffic on their networks to minimize congestion. Evidence of trouble has increased dramatically in the last five years, resulting in tussles among commercial players as well as between the private sector and regulatory bodies, at the expense of users suffering degraded quality of experience.

The proposed research is structured as two foundational tasks and a set of research questions that build on those tasks. First, we will construct a new type of semantically rich Internet map, which will elucidate the role of IXes in facilitating robust and geographically diverse but complex interdomain connectivity. This map will guide our second task: a measurement study of traffic congestion dynamics induced by evolving peering and traffic management practices of CDNs and ISPs. We will use our own measurement infrastructure as well as measurements from four industry collaborators: Akamai, Netflix, Google, and Comcast. Finally, we will use the results of these two tasks to investigate questions related to infrastructure resiliency, scientific modeling, network economics, and public policy.


Project Timeline

Task 1: Create an IX-aware map of the Internet (October 1, 2014 - September 30, 2016). Lead: CAIDA

Description | Projected Date | Assigned to | Status
Task 1.1: Incorporating IX connectivity into an AS-level Internet map
1.1.1 automate our technique extracting multilateral peering (MLP) links from public route servers at IXPs | May 2015 | CAIDA | done
1.1.2 build a colocation database using available data sources (PeeringDB, Euro-IX, Packet Clearing House, published IX member lists and peering matrices) | Sep 2015 | CAIDA | done
1.1.3 use colocation database to identify private peering points | Sep 2015 | CAIDA/MIT | done
1.1.4 conduct targeted traceroute probing from multiple vantage points: Ark | Year 1 | CAIDA | done
1.1.5 conduct targeted traceroute probing from multiple vantage points: Akamai - MIT - targeted traceroute - CoNEXT | Year 1 | MIT | done
1.1.6 conduct targeted traceroute probing from multiple vantage points: (periscope) | Year 1 | CAIDA | done
1.1.7 conduct targeted traceroute probing from multiple vantage points: RIPE Atlas | Year 2 | CAIDA | done
1.1.8 explore techniques to identify IXes not explicitly seen in the path or documented in public databases and peering at them | Year 2 | CAIDA | in progress
Task 1.2: Enrich AS map annotations
1.2.1 incorporate additional inputs into machine learning classifier of AS graph node types | Year 1 | CAIDA | done
1.2.2 improve node classification by using a much larger training data set of ground-truth AS classifications available from PeeringDB | Year 1 | CAIDA | done
1.2.3 improve link classification by analyzing geographic trends in observed BGP announcements to infer regional business AS relationships | Year 1 | CAIDA | done
1.2.4 improve link classification using insights into region-specific peering behavior extracted from our MLP inferences | Sep 2017 | CAIDA | in progress
Task 1.3: Validate the IX-aware map
1.3.1 publish a list of looking glasses hosted at/nearby IXPs and automate querying them to validate the accuracy of our MLP inferences over time | Apr 2016 | CAIDA | done
1.3.2 integrate IX-awareness into AS-Rank and collect feedback from network operators | Sep 2015 | CAIDA | done
1.3.3 test our classifications of region-specific relationships using the BGP community values with region-specific annotations | Year 1 | CAIDA | done
1.3.4 assign a confidence level to each link observed in traceroute data | Year 2 | CAIDA | done
1.3.5 cross-validate inferences from BGP tables, MLP inferences, targeted traceroutes, and published peering matrices | Year 2 | CAIDA | done
1.3.6 explore new validation methods combining information from multiple sources to obtain hints about the likely existence of peering links | Year 3 | CAIDA | done
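
As a rough illustration of how a colocation database (1.1.2) could suggest candidate private peering points (1.1.3), the sketch below merges facility records from several sources and intersects the facility sets of two ASes. The record layout, source labels, facility names, and AS numbers are all hypothetical. A shared facility is only a hint that peering is possible there, not evidence that it exists.

```python
from collections import defaultdict


def build_colo_db(records):
    """Merge (source, asn, facility) records into {asn: set of facilities}.
    Duplicate reports of the same presence from different sources collapse."""
    db = defaultdict(set)
    for _source, asn, facility in records:
        db[asn].add(facility)
    return dict(db)


def common_facilities(db, as_a, as_b):
    """Facilities where both ASes are present, in sorted order."""
    return sorted(db.get(as_a, set()) & db.get(as_b, set()))


# Made-up records; "src1" and "src2" stand in for different data sources.
records = [
    ("src1", 64500, "Facility A"),
    ("src1", 64500, "Facility B"),
    ("src2", 64500, "Facility B"),
    ("src2", 64501, "Facility B"),
    ("src1", 64501, "Facility C"),
    ("src1", 64502, "Facility D"),
]
db = build_colo_db(records)
print(common_facilities(db, 64500, 64501))  # ['Facility B']
print(common_facilities(db, 64500, 64502))  # []
```

In practice the harder part is reconciling facility names and identifiers across sources, which this sketch sidesteps by assuming the names already match.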

Task 2: Inferring congestion at interconnection points (October 1, 2014 - September 30, 2016). Lead: MIT

Description | Projected Date | Assigned to | Status
Task 2.1: Conduct delay-based measurements of congestion
2.1.1 collect and analyze time-series of delay measurements | ongoing | CAIDA | done
2.1.2 study diurnal RTT variations along the paths in question | ongoing | CAIDA/MIT | ongoing
2.1.3 identify and analyze manifestations of interesting events | ongoing | CAIDA | ongoing
2.1.4 provide backend for measurements | May 2015 | MIT | done
Task 2.2: Conduct throughput-based measurements of congestion
2.2.1 confirm experimentally that a rate-limited TCP download probe (RLTP) can detect congestion from a single test measuring each direction separately | Jul 2016 | MIT | done
2.2.2 Use cloud-based nodes and Ark nodes as both clients and servers, conduct rate-limited downloads attempting to cross known interconnection points | Aug 2017 | MIT | done
2.2.3 assess difficulty of using tomography to infer likely location of congestion given current measurement infrastructure deployment; make recommendations for improvement | Aug 2017 | MIT | done
Task 2.3: Conduct passive traffic-based measurements of congestion
2.3.1 map discovered congestion onto interconnection paths using binary tomography | Year 2 | CAIDA | done
2.3.2 compare passive download performance with congestion signal from delay-based measurements | Year 2 | MIT | done
Task 2.4: Validation and automation of congestion-related inferences
2.4.1 automate detection and analysis of congestion in raw data | ongoing | CAIDA/MIT | done
2.4.2 validate congestion inferences using ground truth data provided by collaborating ISPs | ongoing | CAIDA/MIT | done
2.4.3 cross-validate the three methods (2.1, 2.2, and 2.3) of discovering congestion | ongoing | CAIDA/MIT | done
2.4.4 automate finding routers along the path of interest and selecting ping targets | Year 1 | CAIDA/MIT | done
2.4.5 validate congestion inferences using external tools (testing link capacities and identifying bottlenecks) and data sets (TCP tests from M-Lab) | Year 2 | CAIDA | done
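
The binary tomography mentioned in 2.3.1 can be sketched in its simplest textbook form. It assumes each measured path is labeled congested or not, and that a path is congested exactly when at least one of its links is. Links on any uncongested path are then ruled out, and whatever remains on a congested path is a candidate. The function name and link labels are hypothetical, and this is not necessarily the variant used in the project.

```python
def candidate_congested_links(paths):
    """paths: list of (links, congested) pairs, where links is an iterable
    of link identifiers and congested is a bool for the whole path.

    Returns one set per congested path (in input order) holding the links
    on that path that no uncongested path has ruled out."""
    good = set()
    for links, congested in paths:
        if not congested:
            good.update(links)  # every link on a clean path is clean
    return [set(links) - good for links, congested in paths if congested]


# Made-up observations over links a..f.
paths = [
    (["a", "b", "c"], True),
    (["a", "d"], False),
    (["e", "c"], True),
    (["b", "f"], False),
]
for suspects in candidate_congested_links(paths):
    print(sorted(suspects))
# ['c']
# ['c', 'e']
```

Here link c alone explains both congested paths, while e cannot be ruled out without another path that crosses it. Sparse path coverage leaves many links ambiguous in this way, which is the difficulty that 2.2.3 sets out to assess.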

Task 3: Exploring implications for network resiliency, policy, and science (April 1, 2016 - September 30, 2017).

Description | Projected Date | Assigned to | Status
Task 3.1: Investigate how IXes influence Internet resiliency
3.1.1 assess the importance of certain ASes and AS-links in maintaining global or local reachability | Year 3 | CAIDA | done
3.1.2 consider whether networks tend to connect at multiple IXes in a city/region (for redundancy) or in different regions (for geo-diversity) | Year 2 and 3 | CAIDA | created supporting data
3.1.3 characterize the degree to which the loss of a major IX might potentially isolate or impair regions of the Internet | Year 2 and 3 | CAIDA | created supporting data
Task 3.2: Study consequences of regional differences in peering behavior
3.2.1 compare topological structure of different countries/regions | Year 2 | CAIDA | retasked
3.2.2 analyze the extent to which some countries/regions/organizations are essential hubs connecting to the global Internet | Year 3 | CAIDA | retasked
3.2.3 identify ASes or IXes exercising regional control of Internet infrastructure ("choke points") | Year 3 | CAIDA | retasked
3.2.4 reveal routing inefficiencies due to peering issues or unavailability of IXes for local interconnection | Year 3 | CAIDA | retasked
Task 3.3: Inform growing policy concerns such as network transparency, investment incentives, and market power
3.3.1 publish data on the degree and character of the connections between an ISP and the rest of the Internet, including persistent under-provisioning of interconnection links and compare practices across regions; | Year 2 and 3 | CAIDA/MIT | done (1, 2, 3)
3.3.2 define requirements for data disclosures by ISPs under various scenarios | Sep 2016 | MIT | done
3.3.3 extend our past modeling work on fair and stable peering settlements to model performance degradations due to congested interconnections | Year 3 | CAIDA | retasked
Task 3.4: Refine modeling assumptions about Internet topology and routing
3.4.1 produce an AS graph that is an order of magnitude larger than the previous graphs and is both a multi-graph and a hypergraph | Year 3 | CAIDA | done
3.4.2 consider significance of commonly used topological characteristics (degree distribution, clustering, assortativity, etc.) | Year 3 | CAIDA | done
3.4.3 quantify the prevalence of complex region-specific AS-relationships in the real world | Year 3 | CAIDA | in progress
3.4.4 distill relevant statistical properties from the IX-aware map: colocation and peering behavior of different AS types, distributions of the number of physical locations at which two ASes connect, the scope and nature of region-specific business relationships, etc. | Year 3 | CAIDA | created supporting data (1, 2, 3)
Task 3.5: Study trends over time in interconnection and performance
3.5.1 develop, maintain, and archive classic data sets to preserve Internet history | Year 3 | CAIDA | ongoing
3.5.2 track longitudinal trends in the practices and effects of interconnection | Year 3 | CAIDA | done
3.5.3 capture dynamics and evolution of the interaction between CDNs and IXes | Year 3 | CAIDA | done (1, 2)


Additional Content

Mapping Interconnection in the Internet: Colocation, Connectivity and Congestion - Proposal

The proposal for “Mapping Interconnection in the Internet: Colocation, Connectivity and Congestion”.
