The Impact of the Archipelago Measurement Platform
The Ark Measurement Infrastructure
As a measurement infrastructure, the Archipelago (Ark) platform has just begun to hit its stride on many fronts. With the adoption of the Raspberry Pi as a hardware platform, we have deployed 52 Pis since January 2013, a rate of nearly three monitors per month. We have implemented and deployed improvements in process automation, monitor management, and the underlying software that enables team communication and coordination (implemented as a distributed shared memory tuple space structure). This dedicated, reliable, and robust platform supports ongoing, scheduled, and ad-hoc experiments.
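To make the tuple-space coordination model mentioned above concrete, the following is a minimal single-process sketch in Python. It is not Ark's implementation: the class name `TupleSpace`, its `write`/`read`/`take` methods, and the tuple layouts are assumptions chosen for illustration, and a distributed version would add networking on top of the same matching idea.

```python
import threading


class TupleSpace:
    """Toy in-process tuple space. None in a template matches any value."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def write(self, tup):
        """Add a tuple and wake any callers blocked in read() or take()."""
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    def _find(self, template):
        # Return the first stored tuple that matches the template, else None.
        for tup in self._tuples:
            if len(tup) == len(template) and all(
                want is None or want == have for want, have in zip(template, tup)
            ):
                return tup
        return None

    def _wait(self, template, timeout, remove):
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(template) is not None, timeout
            )
            if not found:
                return None  # timed out with no matching tuple
            tup = self._find(template)
            if remove:
                self._tuples.remove(tup)
            return tup

    def read(self, template, timeout=None):
        """Block until a tuple matches, then return it without removing it."""
        return self._wait(template, timeout, remove=False)

    def take(self, template, timeout=None):
        """Block until a tuple matches, then remove and return it."""
        return self._wait(template, timeout, remove=True)


# Hypothetical usage: the tuple fields below are illustrative only.
space = TupleSpace()
space.write(("probe", "monitor-a", "192.0.2.1"))  # a scheduler posts work
job = space.take(("probe", "monitor-a", None))    # a monitor claims it
space.write(("result", job[1], job[2], "done"))   # and posts its outcome
```

Because participants only share tuples and templates, the party that posts work does not need to know which monitor will pick it up, which is the property that makes this model convenient for coordinating many measurement nodes.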
Research Enabled by Ark Data Products
Our web site lists publications known to us by non-CAIDA authors that make use of CAIDA data (summarized in Figure 1 below) [1]. This list is a lower bound, since we cannot enforce the reporting requirements of our AUP. Researchers have requested CAIDA's topology data to support research in the areas of: modeling IPv4 and IPv6 AS-level topology and BGP behavior; alias resolution, router-level, and PoP-level topology discovery; improving anycast implementations; new metrics for describing scale-free networks; peer-to-peer system scalability; improving visualization of complex systems; geolocation; modeling of delay; improved trace-back for network attacks; and improved packet marking/filtering. Publications reported back to us have covered a variety of topics related to the security and trustworthiness of the Internet as critical infrastructure [2, 3, 4]: risks of Internet partitioning [5]; prefix hijacking [6, 7, 8]; DDoS attack countermeasures [9, 10]; complex network robustness in the face of epidemics [11]; complex network theory [12, 13]; future Internet architectures; CDN architectures [14]; and a geographic database (Atlas) of the Internet [15].
Figure 1: External (non-CAIDA) papers reported to CAIDA as using our topology data (a lower bound, since reporting is not enforced).
CAIDA searches twice per year to compile a list of papers that make use of CAIDA data. We currently find 70 papers making use of Ark data, 49 of them since 2011. Of these papers, 50% used the IPv4 Routed /24 Topology Dataset and 25% used the IPv4 Routed /24 AS Links Dataset. We also found 11 papers that make use of the Macroscopic Internet Topology Data Kit (ITDK). (CAIDA derives these data sets from the raw data gathered by Ark.)
Early in 2014, we loosened restrictions on topology data older than two years and now make it publicly downloadable. Users have downloaded 248.6 GiB of non-restricted Ark data since its release. Users have downloaded a total of 14.0 TiB of restricted Ark data since its inception in 2007.
Recent Requests for Ark Topology Data
Topology Requests Received

Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Total
---|---|---|---|---|---|---|---|---|---|---|---|---|---
2013 | 16 | 19 | 10 | 13 | 11 | 8 | 21 | 12 | 10 | 17 | 14 | 11 | 162
2014 | 13 | 6 | 3 | 3 | 4 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 31
Total | | | | | | | | | | | | | 193
Topology Approved Count

Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Total
---|---|---|---|---|---|---|---|---|---|---|---|---|---
2013 | 13 | 15 | 9 | 9 | 10 | 6 | 13 | 9 | 6 | 13 | 10 | 5 | 118
2014 | 9 | 3 | 2 | 2 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 19
Total | | | | | | | | | | | | | 137
Top Institutions Using Restricted Topology Datasets
Direct Use of Ark Infrastructure
In addition to access to the data collected by ongoing experiments running on the Ark infrastructure, we provide several methods of access for researchers to conduct their own coordinated measurements and experiments.
We provide this access primarily through the topo-on-demand (tod) service that provides a scriptable interface for performing IPv4 and IPv6 traceroutes and pings on Ark. It allows measurements from 100+ globally distributed monitors (36 with IPv6) in both commercial and R&E networks. The tod service supports varying levels of user sophistication and needs. A user accesses the service through remote client measurement requests via a single access point. Clients, whether web-based or command-line, can issue multiple concurrent requests and receive the results asynchronously.
There are two interfaces to tod: a command-line interface and a web-based interface. The command-line interface is useful for conducting large-scale, feedback-driven, dynamic measurements under the full control of the user's own program (written in any language of the user's choosing, such as Perl, Python, Ruby, or C).
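As an illustration of what a feedback-driven measurement program looks like in outline, the sketch below issues several concurrent requests, collects the results as they arrive, and uses them to choose a follow-up measurement. It does not use the tod protocol or client: the `request` coroutine is a local stand-in, and the monitor names and RTT values are simulated assumptions, not Ark data.

```python
import asyncio

# Hypothetical monitor names with simulated RTTs in ms (not real Ark data).
SIMULATED_RTT_MS = {"mon-a": 42.0, "mon-b": 17.5, "mon-c": 88.1}


async def request(monitor, method, target):
    """Stand-in for one measurement request; a real client would contact
    the measurement service here instead of sleeping."""
    rtt = SIMULATED_RTT_MS[monitor]
    await asyncio.sleep(rtt / 10000)  # make results arrive at different times
    return {"monitor": monitor, "method": method, "target": target, "rtt_ms": rtt}


async def ping_all_then_trace(target):
    # Step 1: issue one ping per monitor concurrently.
    pending = [
        asyncio.create_task(request(monitor, "ping", target))
        for monitor in SIMULATED_RTT_MS
    ]

    # Step 2: handle each result as soon as it completes.
    results = []
    for finished in asyncio.as_completed(pending):
        results.append(await finished)

    # Step 3: feedback; trace only from the monitor with the lowest RTT.
    best = min(results, key=lambda r: r["rtt_ms"])
    follow_up = await request(best["monitor"], "traceroute", target)
    return results, follow_up


results, follow_up = asyncio.run(ping_all_then_trace("192.0.2.1"))
print(follow_up["monitor"], follow_up["method"])
```

The same structure applies in any of the languages mentioned above; the essential point is that the user's program, not the service, decides what to measure next based on the results received so far.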
The web interface, Vela, provides an easy way for researchers to perform on-demand, ad-hoc exploratory topology measurements on Ark. Users can conduct ping and traceroute measurements in IPv4 and IPv6 using ICMP, UDP, or TCP from any Ark monitor. The Vela interface is still in early development and beta testing. At this point, we have created accounts associated with one Regional Internet Registry, five accounts for government agencies (including the Naval Postgraduate School), nine accounts for academic researchers, and two accounts with commercial entities.
Groups Using Ark Infrastructure
- Robert Beverly and colleagues at the Naval Postgraduate School use Ark's topo-on-demand service as part of the Spoofer Project, which measures the Internet's susceptibility to spoofed source address IP packets. The group also collaborated on Speedtrap [16]. Additionally, Beverly is working with colleagues at ICSI and SPAWAR on methods of revealing packet header manipulation to both endpoints of a TCP connection [17].
- Emile Aben and colleagues at the RIPE NCC used Ark to augment measurements on World IPv6 Day. Similarly, the RIPE NCC used 20 dual-stacked Ark vantage points to compare RTT values in IPv4 and IPv6.
- David Clark (MIT CSAIL Information Policy Project), working with CAIDA researchers, uses the Ark infrastructure to measure Internet interdomain congestion. The two groups hope the research in Internet traffic congestion can inform the FCC policy debate over network neutrality. The research has sparked some heated debate over interconnection links and the need for more transparency from both sides of the debate.
- Benoit Donnet and colleagues at the Université de Strasbourg, the Université catholique de Louvain, and the University of Waikato ran alias resolution experiments with the probing tool MERLIN. The group also used 1.2 million IP addresses, sourced from CAIDA's traceroute measurements conducted on Ark, as destination addresses in measurements to discover MPLS and fingerprint networks on the Internet. A paper describing the results of the experiment was published in the proceedings of the Conference on Next Generation Internet [18].
- We host accounts for users at several federal agencies related to securing cyberinfrastructure, including the Department of Homeland Security and the National Security Agency.
- Stephen Eichler and colleagues at the WAND Research Group at the University of Waikato used Ark to run an experiment, conducted over approximately eight weeks, to examine load balancer turnover and packet field sensitivity. They carried out the traceroutes using the Multipath Discovery Algorithm (MDA) in TCP source port, UDP source port, and ICMP echo modes. The traces take several measurements per destination and investigate whether fields outside the standard flow 5-tuple influence successor forwarding decisions. The experiment also studies the efficiency of MDA analysis under different modes of flow ID selection.
- We host accounts for numerous institutions that use the Ark infrastructure for their own topology and performance research. This list includes: Ethan Katz-Bassett and colleagues at the Computer Science Department at the University of Southern California; Nick Feamster's group in the School of Computer Science in the College of Computing at Georgia Tech; John Heidemann and colleagues at the Information Sciences Institute at the University of Southern California; and Mehmet Gunes at the Department of Computer Science & Engineering at the University of Nevada, Reno.
Approximately 20 hosting sites hold local shell accounts on the monitors.
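The load balancer experiment described in the list above turns on how a per-flow load balancer maps packet header fields to a next hop. The following Python sketch is a toy model of that idea, not the WAND group's method or the MDA itself: the hash-based balancers, the hop names, and the choice of the TOS byte as the field outside the 5-tuple are all assumptions made for illustration.

```python
import hashlib
from collections import namedtuple

# The standard flow 5-tuple.
Flow = namedtuple("Flow", "src dst proto sport dport")

HOPS = ["hop-1", "hop-2", "hop-3", "hop-4"]  # hypothetical next hops


def _pick(fields):
    """Deterministically map a sequence of header fields to one next hop."""
    key = "|".join(str(f) for f in fields).encode()
    digest = hashlib.sha256(key).digest()
    return HOPS[int.from_bytes(digest[:4], "big") % len(HOPS)]


def five_tuple_balancer(flow, tos):
    """Model of a balancer that hashes only the 5-tuple and ignores TOS."""
    return _pick(tuple(flow))


def tos_sensitive_balancer(flow, tos):
    """Model of a balancer that also hashes a field outside the 5-tuple."""
    return _pick(tuple(flow) + (tos,))


def hops_seen(balancer, probes):
    """Set of next hops revealed by sending the given (flow, tos) probes."""
    return {balancer(flow, tos) for flow, tos in probes}


base = Flow("198.51.100.7", "192.0.2.1", "tcp", 33000, 80)

# Probes that vary the TCP source port, i.e. the flow ID, with TOS fixed.
port_probes = [(base._replace(sport=33000 + i), 0) for i in range(64)]

# Probes that keep the 5-tuple fixed and vary only the TOS byte.
tos_probes = [(base, tos) for tos in range(64)]

print(len(hops_seen(five_tuple_balancer, port_probes)))
print(len(hops_seen(five_tuple_balancer, tos_probes)))
print(len(hops_seen(tos_sensitive_balancer, tos_probes)))
```

In this model, varying the flow ID exposes multiple next hops, while varying only a non-5-tuple field exposes more than one next hop only if the balancer is sensitive to that field. That difference is the kind of signal a packet field sensitivity experiment looks for.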
References
[1] | Cooperative Association for Internet Data Analysis (CAIDA), Papers Published by non-CAIDA Authors Using CAIDA Datasets. https://catalog.caida.org/search?query=types=paper%20!links=tag:caida%20links=tag:used_caida_data |
[2] | G. Yan, S. Eidenbenz, S. Thulasidasan, P. Datta, and V. Ramaswamy, "Criticality analysis of Internet infrastructure", Computer Networks, vol. 54, May 2010. |
[3] | W. Deng, M. Karaliopoulos, W. Muhlbauer, P. Zhu, X. Lu, and B. Plattner, "k-Fault tolerance of the Internet AS graph", Computer Networks, vol. 55, no. 10, 2011. |
[4] | A. Haeberlen, I. Avramopoulos, J. Rexford, and P. Druschel, "NetReview: Detecting when interdomain routing goes wrong", in USENIX Symposium on Networked Systems Design & Implementation (NSDI), Apr 2009. |
[5] | M. Wachs, C. Grothoff, and R. Thurimella, "Partitioning the Internet", in International Conference on Risk and Security of Internet and Systems (CRiSIS), pp. 1-8, 2012. |
[6] | Y. Liu, B. Dai, P. Zhu, and J. Su, "Whom to Convince? It Really Matters in BGP Prefix Hijacking Attack and Defense", in Future Information Technology (J. J. Park, L. Yang, and C. Lee, eds.), vol. 184 of Communications in Computer and Information Science, 2011. |
[7] | Y. Liu, B. Zhang, W. Fei, and J. Su, "Evaluation of Prefix Hijacking Impact Based on Hinge-Transmit Property of BGP Routing System", Journal of Next Generation Information Technology (JNIT), vol. 1, no. 3, 2010. |
[8] | W. Deng, P. Zhu, N. Xiong, Y. Xiao, and X. Hu, "How resilient are individual ASes against AS-level link failures?", in Workshop on Security in Computers, Networking and Communications (SCNC), 2011. |
[9] | V. Kambhampati, C. Papadopoulos, and D. Massey, "Epiphany: A location hiding architecture for protecting critical services from DDoS attacks", in IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2012. |
[10] | M. D. D. Moreira, R. P. Laufer, N. C. Fernandes, and O. C. M. B. Duarte, "A Stateless Trace-back Technique for Identifying the Origin of Attacks from a Single Packet", in IEEE International Conference on Communications (ICC), 2011. |
[11] | M. Youssef, R. Kooij, and C. Scoglio, "Viral conductance: Quantifying the robustness of networks with respect to spread of epidemics", Journal of Computational Science, vol. 2, no. 3, 2011. |
[12] | M. Boguñá, F. Papadopoulos, and D. Krioukov, "Sustaining the Internet with Hyperbolic Mapping", Nature Communications, vol. 1, no. 62, Oct 2010. |
[13] | F. Papadopoulos, C. Psomas, and D. V. Krioukov, "Replaying the Geometric Growth of Complex Networks and Application to the AS Internet", ACM SIGMETRICS Performance Evaluation Review, vol. 40, no. 3, pp. 104-106, 2012. |
[14] | M. Yu, W. Jiang, H. Li, and I. Stoica, "Tradeoffs in CDN designs for throughput oriented traffic", in ACM CoNEXT, 2012. |
[15] | R. Durairajan, S. Ghosh, X. Tang, P. Barford, and B. Eriksson, "Internet Atlas: A Geographic Database of the Internet", in ACM HotPlanet Workshop, August 2013. |
[16] | M. Luckie, R. Beverly, W. Brinkmeyer, and k. claffy, "Speedtrap: Internet-Scale IPv6 Alias Resolution", in Internet Measurement Conference (IMC), Oct 2013, pp. 119-126. |
[17] | R. Craven, R. Beverly, and M. Allman, "A Middlebox-Cooperative TCP for a non End-to-End Internet", Proceedings of ACM SIGCOMM 2014 Conference, August 2014 (to appear). |
[18] | P. Mérindol, B. Donnet, J. Pansiot, M. Luckie, and Y. Hyun, "MERLIN: MEasure the Router Level of the INternet", in Conference on Next Generation Internet, Jun 2011. |