Internet Laboratory for Empirical Network Science (iLENS)
The iLENS project proposes to upgrade and extend the active measurement infrastructure Archipelago (Ark), to provide academic researchers an unprecedented laboratory in which to quickly design, implement, and easily coordinate the execution of experiments across a widely distributed set of dedicated monitors.
Principal Investigator: kc claffy
Funding source: CNS-0958547
Period of performance: March 1, 2010 - February 28, 2014.
Project Summary
Effective Internet measurement raises daunting issues for the research community and funding agencies. Improved understanding of the structure and dynamics of Internet topology, routing, workload, performance, and vulnerabilities remains a disturbingly elusive priority, in part for lack of large-scale distributed network measurement infrastructure available to scientific researchers. The dearth is understandable; measurement of operational Internet infrastructure involves navigating more complex and interconnected dimensions than measurement in most scientific disciplines: logistical, financial, methodological, technical, legal, and ethical. CAIDA has been navigating these challenges with modest success for fifteen years, collecting, coordinating, curating, and sharing data sets for the Internet research and operational community in support of Internet science. With previous NSF (CRI) and other funding, we have been able to design, implement, deploy, and operate a relatively small but secure platform capable of performing various types of Internet infrastructure measurements and assessments. We propose to upgrade and extend -- in geographic scope as well as function -- this active measurement instrument (Ark) to provide academic researchers an unprecedented laboratory in which to quickly design, implement, and easily coordinate the execution of experiments across a widely distributed set of dedicated monitors.
In September 2007 Ark began to support ongoing global Internet topology measurement and mapping, and Ark now gathers the largest set of IP topology data for use by academic researchers. We are using the best available, but still rudimentary, techniques for IP topology mapping, and we also make several processed data sets (AS-links, AS relationships) available as "soft infrastructure" to researchers. We propose to deploy new techniques, as well as supporting software for analysis, annotation, topology generation, and interactive visualization of resulting annotated Internet graphs.
More importantly, we have demonstrated, and now wish to operationalize, the ability for this infrastructure to serve other researchers undertaking macroscopic studies of the Internet. Our first two experiments with external use of the infrastructure resulted in publications in the Internet Measurement Conference in 2008 and 2009.
We look forward to a broad cross-section of research communities making substantial use of our Internet measurement infrastructure. Our top infrastructure development priorities are: (1) add monitors in geographic and topological areas where we lack coverage; (2) improve tools for processing raw topology data, to enable an unprecedented range of Internet mapping research while reducing the burden on individual researchers and students to achieve results; (3) enhance existing software modules and develop new ones to support new types of experiments and validation. We propose to conduct annual workshops to collect, synthesize, and plan implementation of feedback on infrastructure operation.
Sustainable funding for large-scale measurement instrumentation past the span of a given funded research project has eluded the Internet research community, which has inhibited the creation of an underlying discipline that formalizes our observations and understanding of this complex networked system. By lowering the cost in time and effort needed to implement a measurement idea, Ark allows researchers to test and evaluate more experimental, sophisticated, and risky ideas, and facilitates integration of measurements and data into course curricula. The data currently provided by our infrastructure has strengthened the intellectual merit of a wide range of network modeling, simulation, analysis, and theoretical research activities. The broader impacts of the proposed work are reflected in the new types of research and data enabled, including historical Internet studies, evaluation of future Internet architectures, and empirical grounding for the emerging discipline of network science.
Management Plan
Throughout the project we will emphasize support for external researchers wishing to run experiments on Ark. We will provide data storage, analysis tools, Internet measurement expertise and advice, and a system for continuous feedback to improve the operation of our experimental infrastructure and to increase user satisfaction. The labor effort includes: 1) maintenance and support of the central server and remote monitors, 2) integration and deployment of new monitors and coordination with remote sites, 3) integration of data and compute servers and the network switch, 4) software development including bulk DNS queries, interactive visualization, and integration of real-time routing data, 5) curation, archival, and distribution of the data, 6) development of supporting documentation, web pages, surveys, and educational materials, and 7) organization of annual workshops and publication of resulting reports.
CAIDA personnel will be responsible for accomplishing all proposed tasks. The detailed project timeline follows. Note that the submitted budget will support a full-time effort for only one system administrator, and only part-time effort for the other five researchers involved, so we spread some of the proposed tasks, particularly software development, over longer intervals than they would otherwise require.
Task | Description | Projected Timeline | Status |
---|---|---|---|
1 | Integrate 8 new Ark monitors into Ark platform | Year 1 | done |
2 | Acquire fast network switch, upgrade and re-configure CAIDA network | Year 1 (1st quarter) | done |
3 | Acquire a new data server and put it into production mode | Year 1 (3rd quarter) | done |
4 | Develop project web pages and post updates on data collection status, list ongoing Ark experiments, and other project-related information | Year 1 (1st quarter) | done |
5 | Develop a web-based survey to assess the level of user satisfaction and to provide a communication channel between users and CAIDA personnel | Year 1 (2nd quarter) | done |
6 | Implement mper measurement engine | Year 1 (1st, 2nd, and 3rd quarters) | done |
7 | Design interactive visualization to support validation with network operators | Year 1 | done |
8 | Conduct a pilot study on integrating Ark topology measurements and real-time routing data | Year 1 | done |
9 | Organize a workshop on measurement needs for validation of modeling and simulation | Year 1 (3rd quarter) | done |
10 | Integrate 8 new Ark monitors into Ark platform | Year 2 | done |
11 | Publish the workshop report | Year 2 (1st quarter) | done |
12 | Upgrade the code for AS relationship and AS ranking calculations to fully utilize advanced computational capabilities of a new machine | Year 2 (1st, 2nd, and 3rd quarters) | done |
13 | Develop external API to our bulk DNS lookup service | Year 2 (1st and 2nd quarters) | done |
14 | Refine mper measurement engine based on experience | Year 2 (4th quarter) | done |
15 | Using the interactive visualization and related web forms, collect feedback from network operators regarding the completeness and veracity of data about their networks | Year 2 | done |
16 | Integrate Ark topology measurements and real-time Internet routing data | Year 2 | done |
17 | Publish our IRB application for active measurement experiments | Year 2 (4th quarter) | done |
18 | Organize a workshop to introduce and gather feedback on the first version of the data correlation tools, including on which formatting and post-processed forms of the data are most useful to researchers | Year 2 (4th quarter) | done |
19 | Integrate 8 new Ark monitors into Ark platform | Year 3 | done |
20 | Publish the workshop report | Year 3 (1st quarter) | done |
21 | Develop code to automatically annotate topology data with hostnames | Year 3 (1st and 2nd quarters) | done |
22 | Improve topology mapping techniques based on the feedback received from network operators | Year 3 | done |
23 | Based on our experience with external researchers using Ark infrastructure, prepare guidelines for Internet measurement data sharing | Year 3 (2nd quarter) | done |
24 | Start serving integrated topology and routing data to the research community | Year 3 (3rd quarter) | done |
25 | Consolidate, refine, and generalize tool set for active measurement workflows | Year 3 (3rd and 4th quarters) | done |
26 | Widely circulate and publish project results | Year 3 (3rd and 4th quarters) | done |
27 | Organize a workshop to discuss technology transfer possibilities of Ark-based technology, e.g., stand-alone Ark measurement tool kit | Year 3 (4th quarter) | done |
28 | Prepare plans for maintaining funding for the developed large-scale network measurement infrastructure after the end of this project | Year 3 | done |