Congestion: Mapping Interconnection in the Internet: Colocation, Connectivity and Congestion
In collaboration with David Clark (MIT/CSAIL), we will characterize the changing nature of the Internet's topology and traffic dynamics, and describe the implications of these changes for network science, architecture, operations, and public policy.
Principal Investigators: kc claffy, Amogh Dhamdhere
Funding source: CNS-1414177. Period of performance: October 1, 2014 - September 30, 2018.
Outcomes Report
Internet engineering, science, and public policy communities have significant interest in understanding the extent and scope of, as well as potential consumer harm induced by, persistent interdomain congestion on the Internet. In this project we developed, deployed, and validated a lightweight active measurement method and system to enable a third party to measure evidence of congestion on thousands of interconnection links between broadband access ISPs and major interconnecting parties, including directly connected content providers. Our original motivation for this work was an increase in heated peering disputes between powerful players in the U.S., which raised questions about intentional degradation of performance as a business strategy to obtain (or avoid) interconnection fees.
This work required us to tackle the tedious and long-unsolved problem of automatically and correctly inferring network boundaries in traceroute data. Lack of progress on this problem has impeded a wide range of research and development efforts for decades. We developed and validated a method that uses targeted traceroutes, knowledge of traceroute idiosyncrasies, and codification of topological constraints in a structured set of heuristics to correctly identify interdomain links (i.e., links between independently operated networks) at the granularity of individual border routers. Our initial algorithm focused on the network boundaries that we are most confident we can infer accurately in the presence of sampling bias: interdomain links attached to the network launching the traceroute. This method formed the cornerstone of the platform we built to map interconnection performance. In the final year of the project, we undertook a collaboration to extend and generalize this algorithm to make boundary inferences beyond the immediately upstream network.
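To make the boundary-inference problem concrete, the following is a minimal sketch of the naive baseline, not the project's algorithm: map each traceroute hop to an AS by longest-prefix match and report where the path first leaves the network hosting the vantage point. The prefix table, AS numbers, trace, and function names are all hypothetical (documentation-reserved values). Because addresses on an interdomain link are often assigned from only one of the two networks' address space, this baseline can misplace the border by a hop, which is the kind of traceroute idiosyncrasy the structured heuristics described above are meant to handle.

```python
import ipaddress

# Hypothetical prefix-to-AS table (documentation prefixes and AS numbers).
PREFIX_TO_AS = {
    "192.0.2.0/24": 64500,     # network hosting the vantage point
    "198.51.100.0/24": 64501,  # neighbor network
    "203.0.113.0/24": 64502,   # destination network
}

def ip_to_as(ip, table=PREFIX_TO_AS):
    """Longest-prefix match of an IP address against the table; None if unmapped."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, asn in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, asn)
    return best[1] if best else None

def first_as_change(hops, home_as):
    """Return (near_ip, far_ip) for the first hop that maps outside home_as.

    near_ip is the last responsive, mapped hop inside home_as. Unresponsive
    hops (None) and unmapped addresses are skipped. Returns None if the path
    never leaves home_as or leaves it before any home hop is seen.
    """
    prev_ip = None
    for ip in hops:
        if ip is None:
            continue
        asn = ip_to_as(ip)
        if asn is None:
            continue
        if asn != home_as:
            return (prev_ip, ip) if prev_ip is not None else None
        prev_ip = ip
    return None

# Hypothetical traceroute output: one unresponsive hop in the middle.
trace = ["192.0.2.1", "192.0.2.9", None, "198.51.100.5", "203.0.113.7"]
candidate = first_as_change(trace, home_as=64500)
print(candidate)
```

The returned pair is only a candidate link; a real system would still need to decide which router actually sits on which side of the boundary.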
The mapping, measurement, and analysis platform took us three years to build, but the investment was worth it: we were able to address a decades-long gap in a third party's ability to study peering disputes in an open, objective, scientifically validated way. Our measurements revealed ongoing indications of persistently congested transit links, which, regardless of cause, imply clear motivation for large players to engage in direct peering negotiations. Especially in today's deregulatory political climate, such measurement is the most promising strategy for incentivizing transparent and accountable ISP behavior.
The science of Internet cartography is still young, and its measurement tools rudimentary. The intellectual merit of our work lies in the novel approaches we developed to enable comprehensive, pervasive, accurate, and usable measurement capabilities for the Internet, which in the process addressed pressing and salient needs for a richer understanding of the robustness of this critical communications infrastructure.
The project produced 21 peer-reviewed publications, including one that received the Best Paper Award at ACM SIGCOMM 2018, and 3 workshop reports.
Broader impacts: We used these results to investigate questions related to performance, QoE, scientific modeling, network economics, and public policy. We presented results at academic, operational, and policy venues: ACM SIGCOMM, ACM IMC, PAM, TPRC, BITAG, FCC, NANOG, UCSD, Georgia Tech, the Congressional Internet Caucus Advisory Committee (ICAC), CAIDA's AIMS and WIE workshops, and the NSF/FCC workshop on QoE. Other current and future CAIDA activities will heavily leverage the system we built for collection, analysis, and validation of interdomain congestion.
Project Summary
As the global Internet expands to satisfy the demands and expectations of an ever-increasing fraction of the world's population, profound changes are occurring in its interconnection structure, traffic dynamics, and the economic and political power of different players in the ecosystem. These changes not only impact network engineering and operations, but also present broader challenges for technology investment, future network design, public policy, and scientific study of the Internet itself. And yet, from both scientific and policy perspectives, the evolving ecosystem is largely uncharted territory.
We will focus our attention on two related transformations of the ecosystem: the emergence of Internet exchanges (IXes) as anchor points in the mesh of interconnection, and the growing role of content providers and Content Delivery Networks (CDNs) as major sources of traffic flowing into the Internet. By some accounts, over half the traffic volume in North America now comes from just two content distributors (YouTube and Netflix). This shift constitutes the rise of a new kind of hierarchy in the ecosystem, bringing fundamentally new constraints on existing players who need to manage traffic on their networks to minimize congestion. Evidence of trouble has increased dramatically in the last five years, resulting in tussles among commercial players as well as between the private sector and regulatory bodies, at the expense of users suffering degraded quality of experience.
The proposed research is structured as two foundational tasks and a set of research questions that build on those tasks. First, we will construct a new type of semantically rich Internet map, which will elucidate the role of IXes in facilitating robust and geographically diverse but complex interdomain connectivity. This map will guide our second task: a measurement study of traffic congestion dynamics induced by evolving peering and traffic management practices of CDNs and ISPs. We will use our own measurement infrastructure as well as measurements from four industry collaborators: Akamai, Netflix, Google, and Comcast. Finally, we will use the results of these two tasks to investigate questions related to infrastructure resiliency, scientific modeling, network economics, and public policy.
Project Timeline
Task 1: Create an IX-aware map of the Internet (October 1, 2014 - September 30, 2016). Lead: CAIDA
ID | Description | Projected Date | Assigned to | Status |
---|---|---|---|---|
Task 1.1: Incorporating IX connectivity into an AS-level Internet map | ||||
1.1.1 | automate our technique extracting multilateral peering (MLP) links from public route servers at IXPs | May 2015 | CAIDA | done |
1.1.2 | build a colocation database using available data sources (PeeringDB, Euro-IX, Packet Clearing House, published IX member lists and peering matrices) | Sep 2015 | CAIDA | done |
1.1.3 | use colocation database to identify private peering points | Sep 2015 | CAIDA/MIT | done |
1.1.4 | conduct targeted traceroute probing from multiple vantage points: Ark | Year 1 | CAIDA | done |
1.1.5 | conduct targeted traceroute probing from multiple vantage points: Akamai - MIT - targeted traceroute - CoNEXT | Year 1 | MIT | done |
1.1.6 | conduct targeted traceroute probing from multiple vantage points: Periscope | Year 1 | CAIDA | done |
1.1.7 | conduct targeted traceroute probing from multiple vantage points: RIPE Atlas | Year 2 | CAIDA | done |
1.1.8 | explore techniques to identify IXes not explicitly seen in the path or documented in public databases and peering at them | Year 2 | CAIDA | in progress |
Task 1.2: Enrich AS map annotations | ||||
1.2.1 | incorporate additional inputs into machine learning classifier of AS graph node types | Year 1 | CAIDA | done |
1.2.2 | improve node classification by using a much larger training data set of ground-truth AS classifications available from PeeringDB | Year 1 | CAIDA | done |
1.2.3 | improve link classification by analyzing geographic trends in observed BGP announcements to infer regional business AS relationships | Year 1 | CAIDA | done |
1.2.4 | improve link classification using insights into region-specific peering behavior extracted from our MLP inferences | Sep 2017 | CAIDA | in progress |
Task 1.3: Validate the IX-aware map | ||||
1.3.1 | publish a list of looking glasses hosted at/nearby IXPs and automate querying them to validate the accuracy of our MLP inferences over time | Apr 2016 | CAIDA | done |
1.3.2 | integrate IX-awareness into AS-Rank and collect feedback from network operators | Sep 2015 | CAIDA | done |
1.3.3 | test our classifications of region-specific relationships using the BGP community values with region-specific annotations | Year 1 | CAIDA | done |
1.3.4 | assign a confidence level to each link observed in traceroute data | Year 2 | CAIDA | done |
1.3.5 | cross-validate inferences from BGP tables, MLP inferences, targeted traceroutes, and published peering matrices | Year 2 | CAIDA | done |
1.3.6 | explore new validation methods combining information from multiple sources to obtain hints about the likely existence of peering links | Year 3 | CAIDA | done |
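As an illustration of how a colocation database (item 1.1.2) could be used to narrow down where two networks might peer privately (item 1.1.3), the sketch below intersects per-AS presence records. The record layout, AS numbers, and location names are hypothetical; this is not the schema of PeeringDB or of the project's database. Sharing a location is only a necessary hint, not evidence that a link exists.

```python
from itertools import combinations

# Hypothetical colocation records: AS number -> facilities or IXes where
# that AS is reported to be present.
PRESENCE = {
    64500: {"fac-A", "fac-B", "ix-1"},
    64501: {"fac-B", "ix-1"},
    64502: {"fac-C"},
}

def shared_locations(presence):
    """Map each AS pair to the set of locations where both are present.

    Pairs with no location in common are omitted.
    """
    shared = {}
    for a, b in combinations(sorted(presence), 2):
        common = presence[a] & presence[b]
        if common:
            shared[(a, b)] = common
    return shared

candidates = shared_locations(PRESENCE)
for pair, locations in candidates.items():
    print(pair, sorted(locations))
```

Candidate locations produced this way would still need confirmation, for example through the targeted traceroutes and cross-validation steps listed in the table.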
Task 2: Inferring congestion at interconnection points (October 1, 2014 - September 30, 2016). Lead: MIT
ID | Description | Projected Date | Assigned to | Status |
---|---|---|---|---|
Task 2.1: Conduct delay-based measurements of congestion | ||||
2.1.1 | collect and analyze time-series of delay measurements | ongoing | CAIDA | done |
2.1.2 | study diurnal RTT variations along the paths in question | ongoing | CAIDA/MIT | ongoing |
2.1.3 | identify and analyze manifestations of interesting events | ongoing | CAIDA | ongoing |
2.1.4 | provide backend for measurements | May 2015 | MIT | done |
Task 2.2: Conduct throughput-based measurements of congestion | ||||
2.2.1 | confirm experimentally that a rate-limited TCP download probe (RLTP) can detect congestion from a single test measuring each direction separately | Jul 2016 | MIT | done |
2.2.2 | use cloud-based nodes and Ark nodes as both clients and servers to conduct rate-limited downloads attempting to cross known interconnection points | Aug 2017 | MIT | done |
2.2.3 | assess difficulty of using tomography to infer likely location of congestion given current measurement infrastructure deployment; make recommendations for improvement | Aug 2017 | MIT | done |
Task 2.3: Conduct passive traffic-based measurements of congestion | ||||
2.3.1 | map discovered congestion onto interconnection paths using binary tomography | Year 2 | CAIDA | done |
2.3.2 | compare passive download performance with congestion signal from delay-based measurements | Year 2 | MIT | done |
Task 2.4: Validation and automation of congestion-related inferences | ||||
2.4.1 | automate detection and analysis of congestion in raw data | ongoing | CAIDA/MIT | done |
2.4.2 | validate congestion inferences using ground truth data provided by collaborating ISPs | ongoing | CAIDA/MIT | done |
2.4.3 | cross-validate the three methods (2.1, 2.2, and 2.3) of discovering congestion | ongoing | CAIDA/MIT | done |
2.4.4 | automate finding routers along the path of interest and selecting ping targets | Year 1 | CAIDA/MIT | done |
2.4.5 | validate congestion inferences using external tools (testing link capacities and identifying bottlenecks) and data sets (TCP tests from M-Lab) | Year 2 | CAIDA | done |
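The delay-based items in Task 2.1 can be illustrated with a small sketch. It assumes we already have, for one interdomain link, aligned per-hour minimum RTTs to the router on the near side and the router on the far side, and it flags hours in which only the far side is elevated above its own baseline. The function name, the 10 ms threshold, and the sample values are assumptions made for illustration; this is not the project's detection algorithm, which would also need to establish that the pattern recurs across many days.

```python
def elevated_windows(near_rtts, far_rtts, threshold_ms=10.0):
    """Return indices of windows where only the far side shows elevated RTT.

    near_rtts, far_rtts: per-window minimum RTTs (ms) to the near-side and
    far-side routers of one interdomain link, aligned by index. Each side's
    baseline is its own minimum over all windows.
    """
    near_base = min(near_rtts)
    far_base = min(far_rtts)
    flagged = []
    for i, (near, far) in enumerate(zip(near_rtts, far_rtts)):
        near_up = near - near_base > threshold_ms
        far_up = far - far_base > threshold_ms
        # Elevation on both sides points to a problem before the link,
        # so only far-side-only elevation is reported.
        if far_up and not near_up:
            flagged.append(i)
    return flagged

# Hypothetical hourly minimum RTTs (ms) for one day, hours 0..23.
near = [5.0] * 24
far = [12.0] * 18 + [40.0, 45.0, 47.0, 42.0] + [12.0] * 2
congested_hours = elevated_windows(near, far)
print(congested_hours)
```

Comparing the two sides is the point of the design: a queue building on the interdomain link itself should raise delay to the far router while leaving delay to the near router unchanged.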
Task 3: Exploring implications for network resiliency, policy, and science (April 1, 2016 - September 30, 2017).
ID | Description | Projected Date | Assigned to | Status |
---|---|---|---|---|
Task 3.1: Investigate how IXes influence Internet resiliency | ||||
3.1.1 | assess the importance of certain ASes and AS-links in maintaining global or local reachability | Year 3 | CAIDA | done |
3.1.2 | consider whether networks tend to connect at multiple IXes in a city/region (for redundancy) or in different regions (for geo-diversity) | Year 2 and 3 | CAIDA | created supporting data |
3.1.3 | characterize the degree to which the loss of a major IX might potentially isolate or impair regions of the Internet | Year 2 and 3 | CAIDA | created supporting data |
Task 3.2: Study consequences of regional differences in peering behavior | ||||
3.2.1 | compare topological structure of different countries/regions | Year 2 | CAIDA | retasked |
3.2.2 | analyze the extent to which some countries/regions/organizations are essential hubs connecting to the global Internet | Year 3 | CAIDA | retasked |
3.2.3 | identify ASes or IXes exercising regional control of Internet infrastructure ("choke points") | Year 3 | CAIDA | retasked |
3.2.4 | reveal routing inefficiencies due to peering issues or unavailability of IXes for local interconnection | Year 3 | CAIDA | retasked |
Task 3.3: Inform growing policy concerns such as network transparency, investment incentives, and market power | ||||
3.3.1 | publish data on the degree and character of the connections between an ISP and the rest of the Internet, including persistent under-provisioning of interconnection links, and compare practices across regions | Year 2 and 3 | CAIDA/MIT | done (1, 2, 3) |
3.3.2 | define requirements for data disclosures by ISPs under various scenarios | Sep 2016 | MIT | done |
3.3.3 | extend our past modeling work on fair and stable peering settlements to model performance degradations due to congested interconnections | Year 3 | CAIDA | retasked |
Task 3.4: Refine modeling assumptions about Internet topology and routing | ||||
3.4.1 | produce an AS graph that is an order of magnitude larger than the previous graphs and is both a multi-graph and a hypergraph | Year 3 | CAIDA | done |
3.4.2 | consider significance of commonly used topological characteristics (degree distribution, clustering, assortativity, etc.) | Year 3 | CAIDA | done |
3.4.3 | quantify the prevalence of complex region-specific AS-relationships in the real world | Year 3 | CAIDA | in progress |
3.4.4 | distill relevant statistical properties from the IX-aware map: colocation and peering behavior of different AS types, distributions of the number of physical locations at which two ASes connect, the scope and nature of region-specific business relationships, etc. | Year 3 | CAIDA | created supporting data (1, 2, 3) |
Task 3.5: Study trends over time in interconnection and performance | ||||
3.5.1 | develop, maintain, and archive classic data sets to preserve Internet history | Year 3 | CAIDA | ongoing |
3.5.2 | track longitudinal trends in the practices and effects of interconnection | Year 3 | CAIDA | done |
3.5.3 | capture dynamics and evolution of the interaction between CDNs and IXes | Year 3 | CAIDA | done (1, 2) |
Additional Content
Mapping Interconnection in the Internet: Colocation, Connectivity and Congestion - Proposal
The proposal for “Mapping Interconnection in the Internet: Colocation, Connectivity and Congestion”.