Sustainable data-handling and analysis methodologies for the IRNC networks
The goal of this IRNC Special Project is to identify and support the measurement priorities of the International Research Network Connections community.
Principal Investigator: kc claffy
Funding source: OCI-0963073
Period of performance: March 1, 2010 - February 28, 2014
Project Summary
Effective Internet measurement raises daunting issues for the research community and funding agencies. Improved understanding of the structure and dynamics of Internet topology, routing, workload, performance, and vulnerabilities remains a disturbingly elusive priority, in part for lack of large-scale distributed network measurements available to scientific researchers. Ironically, even the research community networks struggle to make progress on these essential obstacles to cyberinfrastructure research. The data dearth is understandable. Measurement of operational Internet infrastructure involves navigating more complex and interconnected dimensions (logistical, financial, methodological, technical, legal, and ethical) than measurement in most scientific disciplines. CAIDA has been navigating these challenges with modest success for fifteen years, collecting, coordinating, curating, and sharing data sets for the Internet research and operational community in support of Internet science.
We propose three concrete contributions to the IRNC community's measurement efforts: to foster and distill discussion of how best to make IRNC data and statistics available, to adapt two CAIDA measurement technologies for IRNC community needs, and to experiment with two innovations in data-handling procedures applied to existing IRNC measurements. We will accomplish the first task by organizing and hosting a series of workshops, including half-day workshops at IRNC PI meetings to discuss IRNC measurement priorities and identify how CAIDA and other researchers can support them. Between IRNC PI meetings we will hold annual two-day workshops dedicated to measurement activities, to build on and extend previous efforts of the IRNC measurement group and to explore in depth how the community can make better use of perfSONAR, metadata, and other data-handling and data-protection technologies.
Second, we propose to improve two CAIDA measurement technologies we already know can better serve the IRNC community: (1) We will upgrade our traffic reporting software to be more IRNC-friendly by adding functionality to recognize IPv6 and DNSSEC, support anonymization and aggregation for privacy protection, and read data formats used by the majority of the IRNC operators, such as NetFlow output from routers. (2) We will (optionally) install, deploy, and manage IPv6-capable active measurement nodes at each interested IRNC site. IPv6 Internet reachability measurements are of particular interest since available data suggest the educational and government-supported communities are deploying IPv6 before the commercial sector.
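To make "anonymization and aggregation for privacy protection" concrete, the following is a minimal Python sketch of one common approach: truncating each address to its enclosing prefix and summing byte counts per prefix. It is an illustration only, not the traffic reporting software's implementation. The function names, the sample flow records, and the /24 and /48 prefix lengths are all assumptions chosen for the example; the addresses come from the reserved documentation ranges.

```python
import ipaddress
from collections import Counter

def truncate_address(addr: str, v4_bits: int = 24, v6_bits: int = 48) -> str:
    """Replace a host address with its enclosing prefix (hypothetical helper).

    The prefix lengths are illustrative defaults, not values from the project.
    """
    ip = ipaddress.ip_address(addr)          # accepts IPv4 and IPv6 literals
    bits = v4_bits if ip.version == 4 else v6_bits
    # strict=False zeroes the host bits instead of raising ValueError.
    return str(ipaddress.ip_network(f"{ip}/{bits}", strict=False))

def aggregate_bytes(flows, v4_bits: int = 24, v6_bits: int = 48) -> dict:
    """Sum byte counts per truncated source prefix (hypothetical helper)."""
    totals = Counter()
    for src, nbytes in flows:
        totals[truncate_address(src, v4_bits, v6_bits)] += nbytes
    return dict(totals)

# Made-up (source address, byte count) records for illustration.
flows = [
    ("192.0.2.17", 1200),
    ("192.0.2.200", 300),
    ("2001:db8:1:2::5", 500),
    ("2001:db8:1:ffff::9", 700),
]

print(aggregate_bytes(flows))
# {'192.0.2.0/24': 1500, '2001:db8:1::/48': 1200}
```

Truncation trades precision for privacy: shorter prefixes hide more about individual hosts but also blur the statistics, so the appropriate lengths would depend on each operator's data-sharing policy.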
Third, we propose to apply two innovations in data-handling procedures to existing IRNC measurement data. The first is a recently proposed framework for privacy-sensitive data sharing, which we will apply to data that are not appropriate for public posting but are explicitly requested through designated channels for use in clearly defined research. The second is a landmark reporting deliverable that illustrates our community-building effort: a prototype of a "Bureau of Internet Statistics" report, which we hope will inspire other network infrastructure communities to join in this effort.
Intellectual merit. The proposed work will help IRNC operators better understand their networks by making more effective use of data they already collect as well as newer technologies for measurement and visibility of their networks. The data, tools, and distillations resulting from this effort will be made available to researchers using a privacy-sensitive sharing framework and will advance research in a number of sub-disciplines of network science.
Broader impact. Contributions from this project promise to strengthen activities in network modeling, simulation, analysis, and theoretical research, enabling the IRNC program to play a formative role in the emerging discipline of network science, and enhancing NSF's leading role in sustainable stewardship of cyberinfrastructure.
Management Plan
CAIDA personnel will be responsible for working on the proposed tasks. The requested budget supports 12.8 person-months of effort per year. The schedule of work below shows how we plan to accomplish the proposed tasks over three years of the project.
Task Number | Task Description | Projected Timeline | Status |
---|---|---|---|
Task 1 | Conduct the 1st workshop introducing the IRNC community to available CAIDA (and other) measurement tools and techniques | Year 1 (1st quarter) | done |
Task 2 | Create project web pages and start regular updates with relevant information | Year 1 (1st and 2nd quarter) | done |
Task 3 | Start deploying Ark monitors at interested collaborating IRNC sites | Year 1 | done (syd-au) |
Task 4 | Implement proposed modifications to CoralReef software suite | Year 1 (1st and 2nd quarter) | done |
Task 5 | Update report generator to include statistics of interest for the IRNC community | Year 1 (2nd and 3rd quarter) | done |
Task 6 | Introduce the Privacy-Sensitive Data Sharing (PSS) Framework to IRNC members | Year 1 (2nd quarter) | done |
Task 7 | Continue deploying Ark monitors at collaborating IRNC sites | Year 2 | done (per-au, sao2-br, bjl-gm) |
Task 8 | Create web pages showing per-node connectivity and performance monitored by Ark at participating IRNC sites | Year 2 (1st quarter) | done |
Task 9 | Organize an add-on workshop co-located with the IRNC PI meeting to report progress and discuss ongoing measurement issues | Year 2 (1st quarter) | done |
Task 10 | Assist the IRNC Data Providers with preparing MOUs and MOAs for their data sharing projects | Year 2 | done |
Task 11 | Continue CoralReef modifications to implement user feedback and requests | Year 2 | done |
Task 12 | Co-host (with ISC) a workshop to discuss novel case studies of network and security data analysis and data sharing | Year 3 (3rd quarter) | done |
Task 13 | Publish the workshop report | Year 3 (4th quarter) | done |
Task 14 | Maintain Ark monitors at collaborating IRNC sites and the corresponding statistics web pages | Year 3 | in progress (hnl-us) |
Task 15 | Conduct an add-on workshop co-located with the IRNC PI meeting to report progress and discuss ongoing measurement issues | Year 3 (4th quarter) | done |
Task 16 | Refine the Privacy-Sensitive Data Sharing Framework concept based on feedback and experience of IRNC Data Providers | Year 3 (2nd quarter) | done |
Task 17 | Lead discussion at workshop regarding how to begin data collection toward a sample report actualizing the "Bureau of Internet Statistics" (BIS) concept | Year 3 (4th quarter) | done |
Task 18 | Attend the final workshop to discuss and finalize the usage and value measurement report | Year 3 (4th quarter) | done |
Presentations
- IRNC-SP: Sustainable data-handling and analysis methodologies for the IRNC networks (NSF IRNC Program Kickoff Workshop)
- IRNC-SP: Sustainable data-handling and analysis methodologies for the IRNC networks (NSF-IRNC Workshop)
- Legal Aikido: A Data-Sharing Framework to Advance Network & Security Research