PENMAN: Performance Evaluation Network Measurements and Analytics

The project's goal is to substantially improve the ability of a third party to ascertain the presence of performance bottlenecks along a given path of interest, and to identify the physical attributes of those bottlenecks.

Sponsored by:
Defense Advanced Research Projects Agency (DARPA)

Principal Investigator: kc claffy

Funding source: HR00112020014
Period of performance: March 1, 2020 - May 27, 2022.

Project Summary

Despite recent innovations in the field of Internet measurement, both the public and private sectors struggle to develop and deploy accurate and usable measurement capabilities for the Internet. This capability gap presents an increasing risk for U.S. Department of Defense (DoD) use of the Internet, especially for complex mission-critical applications. This project will provide empirical grounding for research and development of tools and methods that monitor the performance of complex distributed applications operating on global public Internet infrastructure, and will stimulate improvements in three supporting measurement and analytics capabilities: (1) Internet topology inference; (2) performance bottleneck inference; and (3) geophysical annotation inference. The project consists of three inter-related tasks that will be pursued in parallel.

The first task focuses on developing new techniques to enable more comprehensive visibility into Internet interconnection topology via powerful on-demand measurement capabilities and a trove of information inferred by applying innovative ML algorithms to historical archives of traceroute data.

The second task concentrates on advancing CAIDA’s prototype Internet congestion measurement and analysis platform. Based on analysis of the prevalence, duration, and location of recurring congestion in networks, we will develop state-of-the-art congestion detection techniques and integrate them into our platform. This will enable more reliable detection of flash congestion events, including those caused by DDoS attacks, outages, and geophysical disasters.

The third task is to synthesize methods for annotation and analysis of the topology and performance data gathered by CAIDA, to support rigorous investigation of causes and implications of anomalies, as well as geographic annotation and visualization of congestion phenomena. This task will provide critical context to inform assessments of Internet congestion and its potential impacts.

The results of these three tasks will be integrated into a set of guidelines for developing distributed platforms for Internet measurement, including novel methods to construct incentive-compatible deployment strategies for crowd-sourced Internet measurement.

This project directly addresses a key short-term objective of DARPA: an improved understanding of the extent to which DoD’s application performance and robustness needs can best be achieved over the commercial Internet. The results will inform DARPA’s investment strategies for cost-effective improvement of network performance of defense-related enterprises.

Statement of Work

This 18-month project involves academic researchers from UC San Diego’s Center for Applied Internet Data Analysis (CAIDA) and the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT). It is structured as three inter-related tasks that are pursued in parallel. It builds on existing CAIDA data, tools, and measurement and analysis platforms, and on the history of fruitful collaboration between CAIDA and CSAIL. The deliverables include open source measurement and analytics software, data sets sharable with academic and government-funded researchers, and reports on advances in methods and their applications. The project timeline is subdivided into two phases: Period 1, March 2020 - February 2021, and Period 2, March 2021 - August 2021.

Period 1

Subtask | Milestones | Deliverables
Task 1: Extend software development and deployment to execute global Internet topology analytics
1.1 Extend and deploy topology measurement software | M1, M2 | D1, D2
1.2 Deduce internal and inter-domain network topologies from large-scale traceroute datasets | M3, M4 | D3, D4
Task 2: Develop and apply new techniques for inference of congestion in the Internet core
2.1 Perform Time-Series Latency Probing to extract Round-Trip Time data over time for targets of interest | M5, M6 | D5, D6
2.2 Characterize and cross-correlate episodes of congestion | M7, M8 | D7, D8
Task 3: Practical applications of network topology, performance measurements, and geophysical meta-data
3.1 Survey geolocation capability for interconnections and internal network topology | M9, M10, M11 | D9, D10
3.2 Study the relationship among network topology, congestion, and the quality of distributed applications | M12, M13 | D11

Milestones - Period 1

# | Task | Milestone | Date | Status
M1 | 1.1 | Extend scamper measurement software platform | May 2020 | done
M2 | 1.1 | Experimentally deploy new scamper module on 5-10 RouteViews collector vantage points (VPs) | Nov 2020 | done
M3 | 1.2 | Synthesis of state-of-the-art inter-domain topology analytics | Aug 2020 | done
M4 | 1.2 | Documentation of data sets and resulting inferences | Feb 2021 | done, done, and done
M5 | 2.1 | Make available six months of TSLP probing data from available VPs | Aug 2020 | done
M6 | 2.1 | Import resulting TSLP data into multidimensional database for use in analytics | Feb 2021 | done
M7 | 2.2 | Develop and document new methods for investigating spatial correlations across massive numbers of links | Aug 2020 | done
M8 | 2.2 | Evaluate recently developed nonparametric HMM algorithm for detection of congestion | Nov 2020 | done
M9 | 3.1 | Select targets for geolocation comparison (subset of topology captured in Task) | May 2020 | done
M10 | 3.1 | Compare performance of geolocation databases for the target list | Aug 2020 | done
M11 | 3.1 | Develop new data sets of DNS-based hints | Feb 2021 | done
M12 | 3.2 | Attend relevant meetings as requested by DARPA PM | as needed | done
M13 | 3.2 | Study of incentive-compatible deployment strategies for crowdsourced Internet measurement | Feb 2021 | done

Deliverables - Period 1

# | Task | Deliverable | Type | Date | Status
D1 | 1.1 | Scamper release | Software | May 2020 | done
D2 | 1.1 | Results of deployment on RouteViews VPs | Report | Feb 2021 | done
D3 | 1.2 | Release of updated topology analytics tools | Software | Aug 2020 | done
D4 | 1.2 | Release of resulting data sets with network boundaries identified | Data | Feb 2021 | done
D5 | 2.1 | Data sets generated from TSLP probing | Data | Aug 2020 | done
D6 | 2.1 | Multidimensional database containing integrated TSLP data | Data | Feb 2021 | done*
D7 | 2.2 | Implementation of a new spatial correlation algorithm | Software | Nov 2020 | done
D8 | 2.2 | Evaluation of nonparametric HMM algorithm | Report | Feb 2021 | done
D9 | 3.1 | Identified target infrastructure for geolocation | Data | Jun 2020 | done
D10 | 3.1 | Inferred geolocations of target links | Data | Feb 2020 | done
D11 | 3.2 | Analysis of DoD QoS needs | Presentation | as needed | done

* requires login credentials

Period 2

Subtask | Milestones | Deliverables
Task 1: Improve deductions of internal and interdomain network topologies
1.1 Improve deductions of internal and interdomain network topologies | M14 | D12, D13
Task 2: Operationalize detection of congestion patterns
2.1 Operationalize detection of congestion patterns | M15, M16 | D14
Task 3: Practical applications of network topology, performance measurements, and geophysical meta-data
3.1 Use improved geolocation capabilities to annotate interconnections and internal network topology | M17, M18 | D15, D16
3.2 Develop recommendations for advancing DoD QoS-sensitive applications | M19 | D17

Milestones - Period 2

# | Task | Milestone | Date | Status
M14 | 1.1 | Apply VRFinder algorithm on measurements from existing and new CAIDA vantage points | May 2021 |
M15 | 2.1 | Apply algorithms to detect patterns identifying spatial correlations | May 2021 | done
M16 | 2.1 | Integrate algorithms into data processing pipeline (CAIDA) | Aug 2021 |
M17 | 3.1 | Release inferences of links in the same location | May 2021 |
M18 | 3.1 | Publish study of limitations of existing methods and identify potential improvements for future work | Aug 2021 |
M19 | 3.2 | Analysis of DoD QoS needs and recommendations for commodity Internet use | Aug 2021 |

Deliverables - Period 2

# | Task | Deliverable | Type | Date | Status
D12 | 1.1 | Results of application of VRFinder algorithm | Report | Jun 2021 |
D13 | 1.1 | Release of VRFinder work product | Software | Aug 2021 |
D14 | 2.1 | Data set annotated with evidence of congestion detected | Data | Aug 2021 |
D15 | 3.1 | Data set of DNS-hints in hostnames for strategic ISPs | Data | Aug 2021 |
D16 | 3.1 | Study of limitations of existing methods and report of potential improvements for future work | Report | Aug 2021 |
D17 | 3.2 | Recommendations on use of Internet measurements to advance DoD QoS-sensitive applications | Report | Aug 2021 |

Acknowledgment of awarding agency's support

Defense Advanced Research Projects Agency (DARPA)

This work is funded by the U.S. Department of Defense (DoD) Defense Advanced Research Projects Agency under Cooperative Agreement HR00112020014. The papers resulting from this award are not subject to publication restriction and must include the following distribution statement: "Approved for public release; distribution is unlimited." All information releases (including, but not limited to, news releases, articles, manuscripts, brochures, advertisements, still and motion pictures, speeches, trade association proceedings, and symposia) must include the following acknowledgment: "This work is sponsored by the Defense Advanced Research Projects Agency. It does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred."
