Designing a Global Measurement Infrastructure to Improve Internet Security (GMI3S)

We propose a project to design and prototype a distributed but integrated infrastructure to measure the Internet, with the objective of improving Internet infrastructure security.

This work is done in collaboration with subcontractors at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) and the University of Oregon's Network Startup Resource Center (NSRC).

Sponsored by:
National Science Foundation (NSF)

Principal Investigators: kc claffy, David Clark, Bradley Huffaker

Funding source: NSF OAC-2131987
Period of performance: October 1, 2021 - September 30, 2024


Project Summary

We propose a project to design and prototype a distributed but integrated infrastructure to measure the Internet, with the objective of improving Internet infrastructure security. The Internet’s central role in society was demonstrated vividly in 2020. While the Internet has become critical infrastructure permeating all aspects of modern society, its security and trustworthy character are subject to constant threats and attacks. The security of the Internet is a high priority for the security research community, but that community is greatly hindered by a lack of relevant data. Researchers, governments and advocates for society need a more rigorous understanding of the Internet ecosystem, a need made more urgent by the rising influence of adversarial actors. We cannot secure what we do not understand, and we cannot understand what we do not measure. As we both design the future Internet for future generations and operate the current Internet, data is lacking. Through the lens of defense systems analysis, observation (and the infrastructure to support those observations) is the basis of all defense systems. We therefore identify Internet measurement, data curation and making data usable by the research community as critical research infrastructure.

We recognize the need for an infrastructure project to support measurement of the global Internet, similar to how governments support large-scale measurements of the oceans, atmosphere, and various critical infrastructure. But the Internet sits in contrast to other critical systems, such as health care, transportation, agriculture, and commerce, where the government plays a role that complements the role of the private sector: it monitors the state of those systems, and acts as necessary to ensure that they are meeting the needs of society. The first step in this process is gathering data to understand how the system is actually working. Unfortunately, far more than in other domains, the scientific enterprise of Internet security is mired in interdisciplinary challenges: complexity and scale of the infrastructure; information-hiding properties of the routing system; security and commercial sensitivities; costs of storing and processing the data; and a lack of incentives to gather or share data in the first place, including a lack of cost-effective ways to use it operationally. As a result, operators, policy makers and citizens today have no consensus view of the Internet to drive decision-making, understand the implications of current or new policies, assess the resilience of the Internet infrastructure in times of crisis, or know if the Internet is being operated in the best interests of society. Governments could gather data directly, but the trans-national character of the Internet raises challenges for government coordination. An accepted approach to data gathering and analysis is to make data available to neutral third parties such as academic researchers, who can independently pursue their efforts, draw their own conclusions, subject these to comparison and peer review, and present their results as advice to governments.
We come to this challenge with open eyes: we recognize the scope of the aspiration, and thus propose a substantial 3-year MSR1 Design Project to design a Global Measurement Infrastructure to Improve Internet Security (hereafter, GMI3S-Design).

We do not intend to tackle all Internet security problems. The Internet has a layered structure, which (in its simplest form) is a data transport layer on top of which run a wide range of applications. Our focus is on the Internet as a data transport service, and vulnerabilities specific to that layer: attacks on Internet routing that deflect traffic to bogus destinations (a persistent problem), abuses of the Domain Name System (a widespread, pernicious problem), attacks on the key management system (the Certificate Authorities) that underpins identity and authentication on the Internet, and spoofing of Internet addresses to disrupt regions of the Internet with untraceable traffic (Denial of Service attacks). Security challenges at this layer seem to get less publicity than attacks on end-points (malware, ransomware, etc.), or design features in applications that lead to risky user experiences. But the challenges at the data transport layer are foundational: they affect the reliable operation of every application that operates over the Internet. Any enterprise or service can have traffic intended for it deflected to a masquerading site that attempts to mimic the legitimate site, steal user credentials, disrupt security, or defraud users. This perhaps surprising fact is the critical driver for this proposal.

Thus, the immediate target for this infrastructure is the research community that measures the Internet and tries to improve its security; the intermediate beneficiary will be the operator community and the service providers of the Internet; and the ultimate beneficiary will be all of society. We recognize that better data alone will not improve the security of the Internet; this proposal is part of a larger community agenda of research and outreach to industry and governments. The proposed infrastructure and its community of users will enable wide engagement of academic groups as well as private-sector security researchers in developing innovative, efficient, and robust capabilities to tackle the challenges of known and emerging Internet vulnerabilities. Since the Internet is a designed artifact (as opposed to a natural phenomenon), it might seem that one could understand its operational character from analysis of its specifications. This is not so. The Internet is composed of tens of thousands of independent networks and the overall behavior of the Internet is determined by the independent decisions of the operators of those networks. Moreover, in most of the world, the Internet infrastructure is the product of the private sector. Economic considerations that drive the private sector shape the character of the Internet, including key aspects of its resilience, security, and privacy, and its overall future trajectory. The only way to understand the behavior of the Internet is to measure it.

Consistent with NSF’s Blueprint for a National Cyberinfrastructure Ecosystem, we view infrastructure holistically, and aim to integrate a range of resources, from sustainable, production data acquisition, to tools for curation, meta-data generation, efficient storage and dissemination, to community services to support accessibility and extensibility of the infrastructure, to domain expertise to enable use of the infrastructure for transformative discoveries.

We organize our proposed work into four tasks. Our first task is to design, prototype, test and evaluate a new highly distributed network measurement platform capable of capturing several types of data relevant to security research, as well as hosting new vetted experiments. This task will require consideration of both dedicated hardware and virtualized software deployments, in a modular architecture that allows hosting sites to opt in to measurements as policy allows. Our second task includes many facets of data management: meta-data ontologies; standardizing data exchange formats; tools to support data curation and documentation; and techniques for efficient data sharing, discovery, use, and dissemination.

Our third task focuses on community-oriented infrastructure that will enable use of the data for a broad set of cybersecurity research and beyond. This task will tackle issues with sensitive data that raises privacy or corporate concerns. One subtask is to bridge the current gap between emerging data disclosure control technologies and measurement and security practitioners. We will explore the relevance of computer science advances such as differential privacy and secure multi-party computation to current and emerging cybersecurity research priorities. We will design a set of legal enablers, e.g., normalization of data-sharing agreements, and socialize these among our partners and the larger community as part of our fourth task, outreach. Task four will include workshops, curriculum development, and STEM/cybersecurity workforce training. To prototype our design, we will work with the community of Research and Education (R&E) networks, which interconnect campuses and research centers across the globe. The largest R&E networks in the U.S. and the EU (Internet2 and GEANT), along with ten other academic networks, have agreed to collaborate for testing and evaluation. This is a Design proposal, so many details are as yet unresolved. Reaching agreement on the specifics of the design, informed by prototype deployments, and finding and documenting working solutions is exactly the scope of this project.

Projected Timeline

Description Projected Date
Task 1: Design Infrastructure for Data Acquisition
1.1 Report on Internet infrastructure security vulnerabilities Mar 2022
1.2 Complete data needs report (based on 1.1.1) Sep 2022
1.3 Draft monitor requirements report Sep 2022
1.4 Draft monitor hardware specifications report Mar 2023
1.5 Prototype monitor software Sep 2023
1.6 Initiate monitor deployment pilot Sep 2023
1.7 Evaluation report of Data Acquisition Component Sep 2024
1.8 Evaluate and prototype virtualization capabilities Mar 2024
Task 2: Design Infrastructure for Data Management
2.1 Document data storage hardware requirement Sep 2022
2.2 Document data storage systems specifications Dec 2022
2.3 Data and metadata standards specifications (annual revisions) Sep 2022, Sep 2023, Sep 2024
2.4 Report on tools for data curation and documentation Mar 2024
2.5 Report on Data and metadata APIs Sep 2024
2.6 Evaluate SDK Libraries Sep 2024
2.7 Prototype and document data discovery tools Sep 2023
2.8 Document approaches to dissemination design Sep 2023
Task 3: Design Community Infrastructure for Broad Usability
3.1 Prototype and document tools for integrating additional data sources Sep 2024
3.2 Document software disclosure control approaches Sep 2024
3.3 Evaluate and document policy tools for disclosure control Sep 2024
3.4 Document extensibility case studies Sep 2024
Task 4: Infrastructure for Outreach
4.1 Host bi-annual workshops Feb 2022, Aug 2022, Feb 2023, Aug 2023, Feb 2024, Aug 2024
4.2 Launch virtual collaboration environment Oct 2021
4.2 Evaluate and report on virtual collaboration environment Sep 2024
4.3 Create online course on Network Infrastructure Sep 2023
4.4 Create, test, evaluate, and report on course materials Sep 2024

Milestones

# Description Date
1 Complete Design for Data Acquisition Infrastructure Oct 1, 2022
2 Complete Design for Data Management Infrastructure Oct 1, 2023
3 Complete Design for Infrastructure for Broad Usability Oct 1, 2024
4 Complete Outreach and STEM Development Support Activities Oct 1, 2024
5 Complete infrastructure evaluation Jun 30, 2024
6 GMI3S-Design project completion report Sep 30, 2024

Collaborators

GMI3S involves a number of collaborating institutions (listed alphabetically):

Data Providers

  • RIPE NCC
  • University of Oregon RouteViews
  • NLnet Labs

Commercial Partners

  • Farsight Security
  • Kentik

R&E Cyberinfrastructure

  • CENIC
  • Great Plains Networks
  • GÉANT Vereniging
  • Indiana University
  • Internet2
  • Network Startup Resource Center (NSRC)
  • UC San Diego
  • UC San Diego/San Diego Supercomputer Center (SDSC)
  • UC Santa Cruz
  • University of Hawaii
  • University of Illinois Urbana-Champaign (UIUC)
  • University of Memphis
  • University of Virginia

Security

  • Army Cyber Institute at West Point
  • Brigham Young University (BYU)
  • Case Western
  • Colgate
  • Columbia University
  • Grenoble INP
  • HAW Hamburg
  • Indiana University
  • MIT/Lincoln Laboratory
  • Northeastern University
  • UC Davis
  • UC San Diego
  • UIUC
  • University of Twente
  • University of Waikato
  • Virginia Tech

Policy/Legal/Privacy

  • Internet Society
  • Kelley Drye & Warren LLP
  • UC Irvine

Acknowledgement of awarding agency’s support

This project is the result of funding provided by the National Science Foundation under NSF OAC-2131987.

The published material represents the position of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of NSF.


Additional Content

Designing a Global Measurement Infrastructure to Improve Internet Security (GMI3S) Proposal

Proposal for Designing a Global Measurement Infrastructure to Improve Internet Security.
