Archipelago (Ark) Measurement Infrastructure

Archipelago (Ark): CAIDA's active measurement infrastructure serving the network research community since 2007.

Please send questions or comments regarding Ark to ark-info@caida.org.


Current Monitor Status and Statistics

Interactive Ark monitors map
(click on map for interactive monitors map and graphs for statistics)

Introduction

CAIDA deploys and maintains a globally distributed measurement platform we call Archipelago (Ark). We grow the infrastructure by distributing hardware measurement nodes (typically Raspberry Pi systems) with as much geographical and topological diversity as we can to improve our view of the global Internet. Our primary goals with the Ark infrastructure are to:
  • reduce the effort needed to develop and deploy sophisticated large-scale measurements, and
  • provide a step toward a community-oriented measurement infrastructure by allowing collaborators to run their vetted measurement tasks on a security-hardened distributed platform.

Ark is tailored specifically for active network measurement. This allows Ark to be simpler than some other general-purpose distributed experimental platforms, and it allows us to concentrate on providing facilities that directly address the needs of networking research. In particular, we provide a facility for communication and coordination that makes it easier to write distributed measurements that must work together to achieve a goal. We are working on providing a high-level API to ease the challenges of writing measurement tools. Our goal is to lower the barrier to bringing novel and interesting measurement techniques to life.

Community Measurements

The following lists ongoing, ad hoc, and hosted measurements and experiments conducted on the Ark measurement infrastructure. In addition to scheduled measurements, we provide users with access and tools to enable execution of ad hoc measurements from the command-line or via a web interface through the Vela Ark Topo-on-Demand Service. Follow this link to see a list of historical measurements and experiments.

Hosted Measurements

  1. The Spoofer Project: Ark monitors participating in the Spoofer Project help measure the Internet's susceptibility to spoofed source address IP packets. The monitors gather data on IP spoofing by receiving potentially spoofed traffic and forwarding it on to the Spoofer Project's server for analysis.

  2. TCP Behavior Inference (TBIT): Assessing the deployment of TCP algorithms is integral to understanding the ability of TCP to perform. We are using Ark infrastructure to infer the deployment of TCP algorithms and features in the modern Internet using an approach based on the TCP Behavior Inference Tool (TBIT), with a current focus on the deployment of slow-start behaviors, as well as feasibility of blind in-window TCP attacks.

  3. IPv4 and IPv6 stability: Working with researchers at Simula Research Laboratory, we are studying IPv4 and IPv6 stability and performance. From dual-stacked Ark monitors, we run measurements (high-frequency pings and traceroutes over IPv4 and IPv6) towards dual-stacked servers from the Alexa list. The objective is to compare the reachability and performance (in terms of RTT) of dual-stacked targets over IPv4 and IPv6.

  4. Domain Name System (DNS) Health: Working with researchers at Verisign Labs, we are using nodes from CAIDA's Ark project to establish regular diagnostic monitoring from diverse perspectives, to establish a baseline of response behavior, and to quantify current connectivity issues as well as those that might emerge with a root key rollover. The Ark nodes not only provide the diverse path sampling needed to achieve our objectives but are also suited to rapid and flexible deployment, because they allow the installation of existing tools rather than requiring refactored or platform-specific versions, a constraint of similar platforms.

    From each Ark node we periodically issue a series of diagnostic queries to each of the servers to which a top-level domain (TLD) namespace has been delegated. The queries include transport-layer tests to detect connectivity issues over both TCP and UDP, PMTU bounding tests to identify servers affected by a smaller PMTU than the payload they attempt to send, version consistency tests to detect stale zone data from servers that are out of sync, and DNSSEC correctness and consistency tests. The diagnostic queries are performed three times daily, and results are aggregated and stored for both point-in-time analysis (simultaneous consistency) and temporal analysis (behavioral changes over time).

  5. TCP-HICCUPS: HICCUPS (Handshake-based Integrity Check of Critical Underlying Protocol Semantics), developed by researchers at the Naval Postgraduate School, is a lightweight extension to TCP that helps end-hosts infer when their communication is being misinterpreted due to middlebox packet header modifications. By using HICCUPS, end-nodes can better cooperate with middleboxes and improve application performance. Measurement results from a SIGCOMM 2014 publication used HICCUPS on 58 Ark vantage points as part of an Internet-wide survey of Internet path behavior.

  6. Middlebox Policy Taxonomy: Researchers at the Université de Liège, Belgium, have deployed tracebox on dual-stack Ark vantage points to compare middleboxes in IPv4 and IPv6 environments. From Ark vantage points, we probed all dual-stack servers among the Alexa top 1 million web sites. The collected dataset allowed the researchers to build a path-impairment-oriented middlebox taxonomy that aims to categorize the initial purpose of a middlebox policy as well as its potential unexpected complications. Measurement results are described in the NetSciCom 2015 publication.

  7. Localizing Middleboxes: The University of Waterloo uses Ark infrastructure to accurately detect and localize middleboxes (MBs) in the Internet for the purposes of debugging the network and identifying misbehaving middleboxes. Other applications include detecting censorship and ICMP blackholes. The project also aims to study the extensibility of TCP, in particular the deployability of multipath TCP. The experiments make use of the boxMap tool, a client-server application that uses a probe-feedback mechanism to probe different Internet paths and post the packet impairment results to a master node.

  8. Transport Evolution: A group of researchers from Université de Liège, Belgium, ETH Zurich, Switzerland, and RIPE NCC, Amsterdam, Netherlands study how the increasing use of middleboxes (e.g., NATs, firewalls) in the Internet has made it more difficult to deploy new transport or higher-layer protocols. They use the Ark nodes to conduct measurements that examine the use of UDP for Internet transport evolution.
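
The version consistency test mentioned under DNS Health (item 4 above) can be illustrated offline. The following is only a minimal sketch, assuming the SOA serial of a zone has already been collected from each of its name servers. The function name and the sample data are hypothetical and do not reflect the actual tooling used in that measurement. It compares serials as plain integers and ignores serial-number wraparound.

```python
from typing import Dict, List


def find_stale_servers(serials: Dict[str, int]) -> List[str]:
    """Return the servers whose SOA serial is behind the highest serial seen."""
    if not serials:
        return []
    newest = max(serials.values())
    # A server is considered stale if it serves an older serial than its peers.
    return sorted(name for name, serial in serials.items() if serial < newest)


# Hypothetical serials observed for one zone from a single vantage point.
observed = {
    "a.nic.example": 2024010203,
    "b.nic.example": 2024010203,
    "c.nic.example": 2024010201,
}
print(find_stale_servers(observed))
```

Running the same comparison from many vantage points at the same time corresponds to the point-in-time analysis described above, while repeating it across runs corresponds to the temporal analysis.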

Ongoing Measurements

To provide timely, regular data, several categories of measurements run on an ongoing basis. They are driven in part by CAIDA's mission to provide macroscopic insights into Internet infrastructure, behavior, usage, and evolution.

  1. Internet Topology Discovery: Using multiple teams of geographically distributed Ark monitors, we dynamically and strategically divide up the probing work among teams to conduct coordinated, large-scale traceroute-based topology measurements. Supported by DHS S&T contract N66001-12-C-0130, Cartographic Capabilities for Critical Cyberinfrastructure, we integrate these measurements with data analysis capabilities to provide comprehensive annotated Internet topology maps that will improve our ability to identify, monitor, and model critical cyberinfrastructure.

    • IPv4: Ark's parallelization allows a team of 100 monitors probing at 100 pps to obtain a traceroute measurement to every routed /24 network in the IPv4 address space (over 10 million /24s, that is, the full routed address space subdivided into /24s) in about half a day. We make these measurements available for download as The Ark IPv4 Routed /24 Topology Dataset.

    • IPv6: For each probed path, we collect the IP address, RTT, reply TTL, and ICMP responses for all hops, including intermediate hops. Each Ark monitor probes all announced IPv6 prefixes (/48 or shorter) once every 48 hours. One probing pass through all announced prefixes is called a cycle. In each cycle, a monitor probes only a single random destination in each prefix. Different monitors probe prefixes in independently-chosen random orders and probe to an independently-chosen random destination in each prefix. Prefixes are randomly ordered in such a way that a given monitor never probes the same prefix within 16 hours across cycle boundaries (a monitor can never re-probe a prefix within the same cycle, by definition). We make these measurements available for download as The Ark IPv6 Topology Dataset.
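
The per-cycle target selection described for IPv6 can be sketched as follows. This is an illustrative sketch only, not Ark's implementation: the function name `plan_cycle` and the example prefixes are hypothetical, and the sketch does not enforce the 16-hour spacing across cycle boundaries. Each monitor would use its own random generator, so orders and destinations are chosen independently.

```python
import ipaddress
import random
from typing import List, Tuple


def plan_cycle(prefixes: List[str], rng: random.Random) -> List[Tuple[str, str]]:
    """Return (prefix, destination) pairs for one cycle, in random order."""
    plan = []
    for text in prefixes:
        net = ipaddress.ip_network(text)
        # Pick one uniformly random address inside the prefix.
        offset = rng.randrange(net.num_addresses)
        plan.append((text, str(net[offset])))
    # Randomize the order in which this monitor visits the prefixes.
    rng.shuffle(plan)
    return plan


# Hypothetical announced prefixes (taken from the IPv6 documentation range).
announced = ["2001:db8::/48", "2001:db8:1::/48", "2001:db8:2::/48"]
for prefix, destination in plan_cycle(announced, random.Random()):
    print(prefix, destination)
```

Because every prefix appears exactly once in the returned plan, a monitor cannot re-probe a prefix within the same cycle.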

The team-probing experiment performs traceroute measurements using scamper, a powerful and flexible active measurement tool supporting IPv4, IPv6, traceroute, and ping. Scamper supports TCP-, UDP-, and ICMP-based measurements and Paris traceroute variations.

  2. Congestion: As described in NSF award CNS-1414177, Mapping Interconnection in the Internet: Colocation, Connectivity and Congestion, we are running measurements to detect congestion on interdomain links of the networks hosting the Ark monitors. These measurements will inform analysis of traffic congestion dynamics induced by evolving interconnection and traffic management practices of CDNs and ISPs.

    • Time-Sequence Latency Probing (TSLP) involves sending a crafted sequence of pings along the path in question and looking for a diurnal variation in delay, which potentially indicates congestion on a link.
    • Border mapping uses Ark vantage points to research and develop measurement techniques that accurately infer the presence of interdomain links for the network hosting the Ark vantage point. Our approach combines our experience in Internet-scale topology discovery and alias resolution with our algorithms for inferring routing relationships between networks, in order to accurately label interdomain routers with their owners.
We distribute the results of these measurements, as well as others, in various topology-related datasets. For a complete list of CAIDA data, please see the CAIDA Data Overview page.
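
The idea of looking for a diurnal variation in delay, as in TSLP above, can be sketched with a simple summary statistic. This is a minimal sketch under stated assumptions, not the analysis CAIDA performs: it assumes RTT samples are already available as (unix_time, rtt_ms) pairs, groups them by UTC hour of day, and reports the spread between the highest and lowest hourly median. All names and the synthetic data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import median
from typing import Dict, Iterable, Tuple


def hourly_median_rtt(samples: Iterable[Tuple[float, float]]) -> Dict[int, float]:
    """Group (unix_time, rtt_ms) samples by UTC hour of day; return the median per hour."""
    by_hour = defaultdict(list)
    for ts, rtt in samples:
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
        by_hour[hour].append(rtt)
    return {hour: median(rtts) for hour, rtts in by_hour.items()}


def diurnal_swing(samples: Iterable[Tuple[float, float]]) -> float:
    """Difference between the highest and lowest hourly median RTT, in ms."""
    medians = hourly_median_rtt(samples)
    if not medians:
        return 0.0
    return max(medians.values()) - min(medians.values())


# Synthetic example: three days of hourly samples with elevated delay
# between 18:00 and 22:59 UTC each day.
synthetic = [
    (day * 86400 + hour * 3600, 20.0 + (30.0 if 18 <= hour <= 22 else 0.0))
    for day in range(3)
    for hour in range(24)
]
print(diurnal_swing(synthetic))
```

A large swing that recurs at the same hours every day is the kind of pattern that would merit a closer look. Deciding whether it actually reflects congestion on a particular link requires further analysis that this sketch does not attempt.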

    On-Demand Measurements

    In addition to ongoing measurements, researchers can execute ad hoc measurements on the Ark monitors via either a command-line interface or a web-browser interface.

    1. The tod-client (topology on-demand) gives users working at a command-line shell a scriptable interface for performing IPv4 and IPv6 traceroute and ping measurements.

    2. Vela: Web Interface: The Vela service provides access to Ark's topo-on-demand functionality via a web browser. Organizations access the Ark platform via the Vela web interface to run "one-off" measurements. The interface allows users to select a subset of monitors (e.g., all Asian monitors, or one Ark monitor from each continent with IPv6 connectivity) and run ping or traceroute measurements from them.

    The following organizations have run on-demand measurements on Ark:

    • Department of Homeland Security (DHS S&T)
    • Naval Postgraduate School (NPS)
    • The Réseaux IP Européens Network Coordination Centre (RIPE NCC)
    • Jacobs University, Bremen, Germany
    • Eurocom, France
    • Fraunhofer AISEC
    • The University of Cape Town

    Presentations

    Feb 2016 Ark Topology Query System slideset
    Mar 2015 Ark Update: Present and Future
    Jun 2014 Dolphin: Bulk DNS Resolution Tool
    Feb 2013 Ark Update: Vela and Raspberry Pi's
    Feb 2012 Archipelago: On-Demand IPv4 and IPv6 Topology Measurements
    Feb 2011 Archipelago: Updates
    Feb 2010 Archipelago: Updates and Case Study
    Feb 2009 Archipelago: update and analyses
    Aug 2008 Archipelago Measurement Infrastructure: Status and Experiences
    Apr 2007 Archipelago: A Coordination-Oriented Measurement Infrastructure
    Nov 2006 The Archipelago Measurement Infrastructure

    Funding Support

    Defense Advanced Research Projects Agency (DARPA), Department of Homeland Security (DHS), National Science Foundation (NSF)

    Support for the Archipelago Measurement Infrastructure project is provided by the Defense Advanced Research Projects Agency (DARPA) cooperative agreement HR00112020014 Performance Evaluation Network Measurements and Analytics, the Department of Homeland Security (DHS) cooperative agreement FA8750-18-2-0049 Advancing Scientific Study of Internet Security Topological Stability, S&T contract HHSP 233201600012C Science of Internet Security: Technology Experimental Research, S&T contract NBCHC070133 Supporting Research Development of Security Technologies through Network Security Data Collection, and S&T cooperative agreement FA8750-12-2-0326 Supporting Research and Development of Security Technologies through Network and Security Data Collection, and the National Science Foundation (NSF) grants CNS-0958547 Internet Laboratory for Empirical Network Science, CNS-1513283 Internet Laboratory for Empirical Network Science: Next Phase, CNS-1901517 Strategies for Large-Scale IPv6 Active Mapping, CNS-1925729 Facilitating Advances in Network Topology Analysis, and OAC-1724853 Integrated Platform for Applied Network Data Analysis. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA, DHS, NSF, or the U.S. Government.

    Related Objects

    See https://catalog.caida.org/search?query=ark to explore objects related to this document in the CAIDA Resource Catalog.

    Additional Content

    Archipelago Monitor Locations

    Archipelago (Ark): CAIDA’s active measurement infrastructure serving the network research community since 2007. This page provides an interactive map of Ark monitor locations and individual monitor attributes.

    Internet Topology Datasets Collected on the Archipelago (Ark) Infrastructure

    This page provides a listing of Internet topology-related datasets directly collected or derived from measurements conducted on the Ark infrastructure.

    The Ark Platform: Hardware, Software, and Tools

    This page describes the underlying hardware and software that make the Ark infrastructure unique: the combination of distributed, dedicated measurement nodes, advanced coordination facilities, and state-of-the-art measurement tools and execution.

    Historical Measurements Running on the Archipelago (Ark) Infrastructure

    This page provides a listing of historical measurements and experiments conducted on the Ark measurement infrastructure. We provide users with access and tools to enable execution of measurements from the command-line or via a web interface through the Vela Ark Topo-on-Demand Service.

    The Impact of the Archipelago Measurement Platform

    This page provides a compendium of data, datasets, publications and other impacts the Archipelago (Ark) measurement platform has made on the nascent science of Internet topology discovery, measurement, and analysis.

    Archipelago Monitor Data Coverage

    This page shows the duty cycle of all currently active Archipelago monitors. The data coverage graphs indicate the times for which IPv4 traceroute data and/or IPv6 data are available. The size graphs indicate the growth of the raw data in both compressed and uncompressed form.

    Vela: On-Demand Topology Measurement Service

    Vela is the on-demand topology measurement service of CAIDA’s Archipelago (Ark) infrastructure.

    Archipelago Memorandum of Cooperation (MOC) Between Hosting Sites and CAIDA

    This MOC concerns data collection and node usage between CAIDA and your organization as a Hosting Site for an Archipelago (Ark) node(s). CAIDA and your organization understand and agree that you are providing Ark node hosting without fee in exchange for its use in research.

    Archipelago Monitor Statistics

    Statistical information for the topology traces taken by Ark monitors is displayed here for each individual monitor. Summary graphs for all monitors are also available. Read the overview for more information.

    What exactly is skitter?

    An explanation of the multiple meanings of the term skitter and how Archipelago differs from skitter.

    Frequently Asked Questions for sites interested in hosting an Ark monitor

    Archipelago (Ark): CAIDA’s active measurement infrastructure serving the network research community since 2007.

    Archipelago (Ark) Site Acceptable Use Policy (AUP)

    CAIDA seeks sites interested in becoming a part of our next generation measurement infrastructure named Archipelago (Ark). CAIDA has more than nine years of experience in collection, curation, and distribution of topology data. This is a reference to those who accepted the AUP before 2010.

    CAIDA Active Probe Information

    As part of our Cartographic Capabilities for Critical Cyberinfrastructure and iLENS activities, CAIDA runs a number of periodic and ongoing macroscopic topology surveys. This page describes the probes sent out by our monitors.
