PENMAN: Performance Evaluation Network Measurements and Analytics
The project's goal is to substantially improve the ability of a third party to ascertain the presence of performance bottlenecks along a given path of interest, and to identify physical attributes of those bottlenecks.
Principal Investigator: kc claffy
Funding source: HR00112020014
Period of performance: March 1, 2020 - May 27, 2022
Project Summary
Despite recent innovations in the field of Internet measurement, both the public and private sectors struggle to develop and deploy accurate and usable measurement capabilities for the Internet. This capability gap presents an increasing risk for U.S. Department of Defense (DoD) use of the Internet, especially for complex mission-critical applications. This project will provide empirical grounding for research and development of tools and methods that monitor the performance of complex distributed applications operating on global public Internet infrastructure, and will stimulate improvements in three supporting measurement and analytics capabilities: (1) Internet topology inference; (2) performance bottleneck inference; and (3) geophysical annotation inference. The project consists of three inter-related tasks that will be pursued in parallel.
The first task focuses on developing new techniques to enable more comprehensive visibility into Internet interconnection topology, via powerful on-demand measurement capabilities and a trove of information inferred by applying innovative ML algorithms to historical archives of traceroute data.
The second task concentrates on advancing CAIDA’s prototype Internet congestion measurement and analysis platform. Based on analysis of the prevalence, duration, and location of recurring congestion in networks, we will develop state-of-the-art congestion detection techniques and integrate them into our platform. This will enable more reliable detection of flash congestion events, including those caused by DDoS attacks, outages, and geophysical disasters.
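To make the idea of latency-based congestion detection concrete, the following is a minimal sketch, not the platform's actual algorithm. It assumes that round-trip times are probed to both the near and far routers of a single interdomain link, that samples are already reduced to one minimum RTT per time bin, and that a fixed threshold separates congested from uncongested bins. The function name `flag_congested_bins` and the `threshold_ms` parameter are hypothetical, chosen only for illustration.

```python
def flag_congested_bins(near_rtts, far_rtts, threshold_ms=5.0):
    """Return indices of time bins where the far side alone shows elevated RTT.

    near_rtts, far_rtts: per-bin minimum RTTs in milliseconds to the near and
    far routers of one link, aligned by time bin (illustrative input format).
    """
    if len(near_rtts) != len(far_rtts):
        raise ValueError("near and far series must cover the same time bins")

    # Treat the lowest observed value in each series as the uncongested baseline.
    near_base = min(near_rtts)
    far_base = min(far_rtts)

    flagged = []
    for i, (near, far) in enumerate(zip(near_rtts, far_rtts)):
        near_rise = near - near_base
        far_rise = far - far_base
        # A rise on the far side with a flat near side suggests queueing on
        # the link between the two routers rather than earlier in the path.
        if far_rise >= threshold_ms and near_rise < threshold_ms:
            flagged.append(i)
    return flagged


near = [10.1, 10.0, 10.2, 10.1, 10.3, 10.0]
far = [12.0, 12.1, 25.4, 27.0, 12.2, 12.0]
print(flag_congested_bins(near, far))  # prints [2, 3]
```

A fixed threshold is the simplest possible choice here. Detecting recurring congestion in practice calls for more robust techniques, such as the level-shift and HMM-based approaches this task evaluates.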
The third task is to synthesize methods for annotation and analysis of the topology and performance data gathered by CAIDA, to support rigorous investigation of causes and implications of anomalies, as well as geographic annotation and visualization of congestion phenomena. This task will provide critical context to inform assessments of Internet congestion and its potential impacts.
The results of these three tasks will be integrated into a set of guidelines for developing distributed platforms for Internet measurement, including novel methods to construct incentive-compatible deployment strategies for crowd-sourced Internet measurement.
This project directly addresses a key short-term objective of DARPA -- an improved understanding of the extent to which DoD’s application performance and robustness needs can best be achieved over the commercial Internet. The results will inform DARPA’s investment strategies for cost-effective improvement of network performance of defense-related enterprises.
Statement of Work
This 18-month project involves academic researchers from UC San Diego’s Center for Applied Internet Data Analysis (CAIDA) and the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT). It is structured as three inter-related tasks that are pursued in parallel. It builds on existing CAIDA data, tools, and measurement and analysis platforms, and on the fruitful history of collaboration between CAIDA and CSAIL. The deliverables include open source measurement and analytics software, data sets sharable with academic and government-funded researchers, and reports on advances in methods and their applications. The project timeline is subdivided into two phases: Period 1, March 2020 - February 2021, and Period 2, March 2021 - August 2021.
Period 1
# | Subtask | Milestones | Deliverables |
---|---|---|---|
Task 1: Extend software development and deployment to execute global Internet topology analytics | | | |
1.1 | Extend and deploy topology measurement software | M1, M2 | D1, D2 |
1.2 | Deduce internal and inter-domain network topologies from large-scale traceroute datasets | M3, M4 | D3, D4 |
Task 2: Develop and apply new techniques for inference of congestion in the Internet core | | | |
2.1 | Perform Time-Series Latency Probing (TSLP) to extract Round-Trip Time data over time for targets of interest | M5, M6 | D5, D6 |
2.2 | Characterize and cross-correlate episodes of congestion | M7, M8 | D7, D8 |
Task 3: Practical applications of network topology, performance measurements, and geophysical meta-data | | | |
3.1 | Survey geolocation capability for interconnections and internal network topology | M9, M10, M11 | D9, D10 |
3.2 | Study the relationship among network topology, congestion, and the quality of distributed applications | M12, M13 | D11 |
Milestones - Period 1
# | Task | Milestone | Date | Status |
---|---|---|---|---|
M1 | 1.1 | Extend scamper measurement software platform | May 2020 | done |
M2 | 1.1 | Experimentally deploy new scamper module on 5-10 RouteViews collector vantage points (VPs) | Nov 2020 | done |
M3 | 1.2 | Synthesis of state-of-the-art inter-domain topology analytics | Aug 2020 | done |
M4 | 1.2 | Documentation of data sets and resulting inferences | Feb 2021 | done, done, and done |
M5 | 2.1 | Make available six months of TSLP probing data from available VPs | Aug 2020 | done |
M6 | 2.1 | Import resulting TSLP data into multidimensional database for use in analytics | Feb 2021 | done |
M7 | 2.2 | Develop and document new methods for investigating spatial correlations across massive numbers of links | Aug 2020 | done |
M8 | 2.2 | Evaluate recently developed nonparametric HMM algorithm for detection of congestion | Nov 2020 | done |
M9 | 3.1 | Select targets for geolocation comparison (subset of topology captured in Task) | May 2020 | done |
M10 | 3.1 | Compare performance of geolocation databases for the target list | Aug 2020 | done |
M11 | 3.1 | Develop new data sets of DNS-based hints | Feb 2021 | done |
M12 | 3.2 | Attend relevant meetings as requested by DARPA PM | as needed | done |
M13 | 3.2 | Study of incentive-compatible deployment strategies for crowdsourced Internet measurement | Feb 2021 | done |
Deliverables - Period 1
# | Task | Deliverable | Type | Date | Status |
---|---|---|---|---|---|
D1 | 1.1 | Scamper release | Software | May 2020 | done |
D2 | 1.1 | Results of deployment on RouteViews VPs | Report | Feb 2021 | done |
D3 | 1.2 | Release of updated topology analytics tools | Software | Aug 2020 | done |
D4 | 1.2 | Release of resulting data sets with network boundaries identified | Data | Feb 2021 | done |
D5 | 2.1 | Data sets generated from TSLP probing | Data | Aug 2020 | done |
D6 | 2.1 | Multidimensional database containing integrated TSLP data | Data | Feb 2021 | done* |
D7 | 2.2 | Implementation of a new spatial correlation algorithm | Software | Nov 2020 | done |
D8 | 2.2 | Evaluation of nonparametric HMM algorithm | Report | Feb 2021 | done |
D9 | 3.1 | Identified target infrastructure for geolocation | Data | Jun 2020 | done |
D10 | 3.1 | Inferred geolocations of target links | Data | Feb 2020 | done |
D11 | 3.2 | Analysis of DoD QoS needs | Presentation | as needed | done |
* Requires login credentials.
Period 2
# | Subtask | Milestones | Deliverables |
---|---|---|---|
Task 1: Improve deductions of internal and interdomain network topologies | | | |
1.1 | Improve deductions of internal and interdomain network topologies | M14 | D12, D13 |
Task 2: Operationalize detection of congestion patterns | | | |
2.1 | Operationalize detection of congestion patterns | M15, M16 | D14 |
Task 3: Practical applications of network topology, performance measurements, and geophysical meta-data | | | |
3.1 | Use improved geolocation capabilities to annotate interconnections and internal network topology | M17, M18 | D15, D16 |
3.2 | Develop recommendations for advancing DoD QoS-sensitive applications | M19 | D17 |
Milestones - Period 2
# | Task | Milestone | Date | Status |
---|---|---|---|---|
M14 | 1.1 | Apply VRFinder algorithm on measurements from existing and new CAIDA vantage points | May 2021 | |
M15 | 2.1 | Apply algorithms to detect patterns identifying spatial correlations | May 2021 | done |
M16 | 2.1 | Integrate algorithms into data processing pipeline (CAIDA) | Aug 2021 | |
M17 | 3.1 | Release inferences of links in the same location | May 2021 | |
M18 | 3.1 | Publish study of limitations of existing methods and identify potential improvements for future work | Aug 2021 | |
M19 | 3.2 | Analysis of DoD QoS needs and recommendations for commodity Internet use | Aug 2021 | |
Deliverables - Period 2