Internet Laboratory for Empirical Network Science (iLENS)

The iLENS project proposes to upgrade and extend the active measurement infrastructure Archipelago (Ark), to provide academic researchers an unprecedented laboratory in which to quickly design, implement, and easily coordinate the execution of experiments across a widely distributed set of dedicated monitors.

Sponsored by:
National Science Foundation (NSF)

Principal Investigator: kc claffy

Funding source: NSF CNS-0958547
Period of performance: March 1, 2010 - February 28, 2014.


Project Summary

Effective Internet measurement raises daunting issues for the research community and funding agencies. Improved understanding of the structure and dynamics of Internet topology, routing, workload, performance, and vulnerabilities remains a disturbingly elusive priority, in part for lack of large-scale distributed network measurement infrastructure available to scientific researchers. The dearth is understandable; measurement of operational Internet infrastructure involves navigating more complex and interconnected dimensions than measurement in most scientific disciplines: logistical, financial, methodological, technical, legal, and ethical. CAIDA has been navigating these challenges with modest success for fifteen years, collecting, coordinating, curating, and sharing data sets for the Internet research and operational community in support of Internet science. With previous NSF (CRI) and other funding, we have been able to design, implement, deploy, and operate a relatively small but secure platform capable of performing various types of Internet infrastructure measurements and assessments. We propose to upgrade and extend -- in geographic scope as well as function -- this active measurement instrument (Ark) to provide academic researchers an unprecedented laboratory in which to quickly design, implement, and easily coordinate the execution of experiments across a widely distributed set of dedicated monitors.

In September 2007 Ark began to support ongoing global Internet topology measurement and mapping, and Ark now gathers the largest set of IP topology data for use by academic researchers. We are using the best available, but still rudimentary, techniques for IP topology mapping, and we also make several processed data sets (AS-links, AS relationships) available as "soft infrastructure" to researchers. We propose to deploy new techniques, as well as supporting software for analysis, annotation, topology generation, and interactive visualization of resulting annotated Internet graphs.

More importantly, we have demonstrated, and now wish to operationalize, the ability for this infrastructure to serve other researchers undertaking macroscopic studies of the Internet. Our first two experiments with external use of the infrastructure resulted in publications in the Internet Measurement Conference in 2008 and 2009.

We look forward to a broad cross-section of research communities making substantial use of our Internet measurement infrastructure. Our top infrastructure development priorities are: (1) add monitors in geographic and topological areas where we lack coverage; (2) improve tools for processing raw topology data, to enable an unprecedented range of Internet mapping research while reducing the burden on individual researchers and students to achieve results; (3) enhance and develop new software modules to support new types of experiments and validation. We propose to conduct annual workshops to collect, synthesize, and plan implementation of feedback on infrastructure operation.

Sustainable funding for large-scale measurement instrumentation past the span of a given funded research project has eluded the Internet research community, which has inhibited the creation of an underlying discipline that formalizes our observations and understanding of this complex networked system. By lowering the cost in time and effort needed to implement a measurement idea, Ark allows researchers to test and evaluate more experimental, sophisticated, and risky ideas, and facilitates integration of measurements and data into course curricula. The data currently provided by our infrastructure has strengthened the intellectual merit of a wide range of network modeling, simulation, analysis, and theoretical research activities. The broader impacts of the proposed work are reflected in the new types of research and data enabled, including historical Internet studies, evaluation of future Internet architectures, and empirical grounding for the emerging discipline of network science.

Management Plan

Throughout the project we will emphasize support for external researchers wishing to run experiments on Ark. We will provide data storage, analysis tools, Internet measurement expertise and advice, and a system for continuous feedback to improve the operation of our experimental infrastructure and to increase user satisfaction. The labor effort includes: (1) maintenance and support of the central server and remote monitors; (2) integration and deployment of new monitors and coordination with remote sites; (3) integration of data and compute servers and a network switch; (4) software development, including bulk DNS queries, interactive visualization, and integration of real-time routing data; (5) curation, archival, and distribution of the data; (6) development of supporting documentation, web pages, surveys, and educational materials; and (7) organization of annual workshops and publication of resulting reports.

CAIDA personnel will be responsible for accomplishing all proposed tasks. The detailed project timeline follows. Note that the submitted budget will support a full-time effort for only one system administrator, and only part-time effort for the other five researchers involved, so we spread some of the proposed tasks, particularly software development, over longer intervals than they would otherwise require.

Task | Description | Projected Timeline | Status
1 | Integrate 8 new Ark monitors into Ark platform | Year 1 | done
2 | Acquire fast network switch, upgrade and re-configure CAIDA network | Year 1 (1st quarter) | done
3 | Acquire a new data server and put it into production mode | Year 1 (3rd quarter) | done
4 | Develop project web pages and post updates on data collection status, list ongoing Ark experiments, and other project-related information | Year 1 (1st quarter) | done
5 | Develop a web-based survey to assess the level of user satisfaction and to provide a communication channel between users and CAIDA personnel | Year 1 (2nd quarter) | done
6 | Implement mper measurement engine | Year 1 (1st, 2nd, and 3rd quarters) | done
7 | Design interactive visualization to support validation with network operators | Year 1 | done
8 | Conduct a pilot study on integrating Ark topology measurements and real-time routing data | Year 1 | done
9 | Organize a workshop on measurement needs for validation of modeling and simulation | Year 1 (3rd quarter) | done
10 | Integrate 8 new Ark monitors into Ark platform | Year 2 | done
11 | Publish the workshop report | Year 2 (1st quarter) | done
12 | Upgrade the code for AS relationship and AS ranking calculations to fully utilize advanced computational capabilities of a new machine | Year 2 (1st, 2nd, and 3rd quarters) | done
13 | Develop external API to our bulk DNS lookup service | Year 2 (1st and 2nd quarters) | done
14 | Refine mper measurement engine based on experience | Year 2 (4th quarter) | done
15 | Using the interactive visualization and related web forms, collect feedback from network operators regarding the completeness and veracity of data about their networks | Year 2 | done
16 | Integrate Ark topology measurements and real-time Internet routing data | Year 2 | done
17 | Publish our IRB application for active measurement experiments | Year 2 (4th quarter) | done
18 | Organize a workshop to introduce and gather feedback on the first version of the data correlation tools, including on which formatting and post-processed forms of the data are most useful to researchers | Year 2 (4th quarter) | done
19 | Integrate 8 new Ark monitors into Ark platform | Year 3 | done
20 | Publish the workshop report | Year 3 (1st quarter) | done
21 | Develop code to automatically annotate topology data with hostnames | Year 3 (1st and 2nd quarters) | done
22 | Improve topology mapping techniques based on the feedback received from network operators | Year 3 | done
23 | Based on our experience with external researchers using Ark infrastructure, prepare guidelines for Internet measurement data sharing | Year 3 (2nd quarter) | done
24 | Start serving integrated topology and routing data to the research community | Year 3 (3rd quarter) | done
25 | Consolidate, refine, and generalize tool set for active measurement workflows | Year 3 (3rd and 4th quarters) | done
26 | Widely circulate and publish project results | Year 3 (3rd and 4th quarters) | done
27 | Organize a workshop to discuss technology transfer possibilities of Ark-based technology, e.g., stand-alone Ark measurement tool kit | Year 3 (4th quarter) | done
28 | Prepare plans how to maintain funding for the developed large-scale network measurement infrastructure after the end of this project | Year 3 | done
