Integrated Library for Advancing Network Data Science - (ILANDS)

We propose to enhance our infrastructure to handle packet rates on 100 Gb/s links and projected routing table growth, including deploying enhanced storage and compute resources to support long-term use of the data.

This work is done in collaboration with subcontractors at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) and the University of Oregon's Network Startup Resource Center (NSRC).

Sponsored by:
National Science Foundation (NSF)

Principal Investigators: kc claffy, David Clark

Funding source: CNS-2120399
Period of performance: October 1, 2021 - September 30, 2026.


Project Summary

Understanding the Internet’s changing character is impossible without realistic, representative datasets and measurement infrastructure that supports both sustained longitudinal measurements and new experiments, with the resulting data made available to scientific researchers. But good data to support research is scarce, for several reasons: the complexity, scale, and cost of measurement instrumentation; the information-hiding properties of the routing system; security and commercial sensitivities; the costs of storing and processing the data; and a lack of incentives to gather data in the first place. This lack of data hinders our ability to understand and reason about real-world properties of the Internet such as robustness, resilience, security, and stability.

CAIDA and NSRC propose to upgrade and integrate two of our measurement capabilities – 100 Gb/s traffic capture and BGP routing data collection – to enable a community of researchers across many institutions to collaborate on a high-level focused agenda.

Our proposed approach integrates the community into the process from the beginning, to align the research goals and optimize NSF’s investment toward achievement of these goals. Our outreach coordination process will have five objectives: (1) shape what data we collect and store, (2) find new users of the infrastructure, especially from underrepresented groups, (3) bring our focus research collaborators together, (4) publish research results and analysis methods, and (5) establish a sustainability plan.

Projected Timeline

Task  Description                          Projected Date  Organizations         Status

Traffic Data Infrastructure Enhancements
1.1   Build 100 Gb/s traffic monitor       Year 1          CAIDA                 Done
1.2   Test and evaluate monitors           Year 2          CAIDA                 Done
1.3   Deploy monitors                      Year 2          CAIDA, DREN, Kentik   Done
1.4   Establish data enclave at CAIDA      Year 4          CAIDA
1.5   Manage and share traces              Year 3          CAIDA                 Done
1.6   Augment with Kentik data sources     Year 4          CAIDA, Kentik
1.7   User training and support            Year 5          CAIDA

BGP Routing Data Infrastructure Enhancements
2.1   Enhance BGPStream service broker     Year 3          NSRC, CAIDA           Done
2.2   Interface to scamper at RouteViews   Year 2          CAIDA, NSRC, Waikato  Done
2.3   Scale up BGP data collection         ongoing         CAIDA, NSRC           ongoing
2.4   Upgrade libbgpstream                 Year 4          CAIDA, NSRC           Done
2.5   Data integrity and quality controls  ongoing         CAIDA, NSRC           ongoing
2.6   Authentication                       Year 3          CAIDA                 Done
2.7   RouteViews infrastructure updates    ongoing         NSRC                  ongoing

Outreach and Community Engagement
3.1   Catalog management                   ongoing         CAIDA                 ongoing
3.2   Ongoing user support                 ongoing         CAIDA                 ongoing
3.3   Biannual newsletters                 ongoing         CAIDA                 ongoing
3.4   Biannual community meetings          ongoing         CAIDA                 ongoing
3.5   Annual community workshops           ongoing         CAIDA                 ongoing
3.6   Annual community surveys             ongoing         CAIDA                 ongoing
3.7   Sustainability plan report           ongoing         CAIDA                 ongoing

Collaborators

ILANDS relies on substantial involvement of CISE researchers to advance our focused research agenda. Collaborating organizations, listed alphabetically:

  • Army Cyber Institute at West Point
  • California Institute of Technology (Caltech)
  • Canadian Internet Registration Authority (CIRA)
  • Carnegie Mellon University (CMU)
  • Colgate University
  • Columbia University
  • Freie Universität Berlin
  • HAW Hamburg
  • Indiana University
  • International Computer Science Institute (ICIR)
  • Kentik
  • Microsoft Research
  • MIT/Computer Science and Artificial Intelligence Laboratory (CSAIL)
  • MIT/Lincoln Laboratory
  • Princeton University
  • Purdue University
  • RIPE NCC
  • UC Davis
  • Universidad de Buenos Aires
  • University of Illinois Urbana-Champaign (UIUC)
  • University of Minnesota
  • University of Oregon/Network Startup Resource Center (NSRC)
  • University of Waikato
  • USC/ISI

Acknowledgment of awarding agency’s support

National Science Foundation (NSF)

This material is based on research sponsored by the National Science Foundation (NSF) under grant CNS-2120399. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of NSF.


Additional Content

Proposal for CCRI: Integrated Library for Advancing Network Data Science - (ILANDS)