CAIDA: Cooperative Association for Internet Data Analysis
Archipelago Measurement Infrastructure

Archipelago (Ark for short) is CAIDA's next-generation active measurement infrastructure and represents an evolution of the skitter infrastructure that has been serving the network research community for more than 8 years.

Introduction

Archipelago (Ark) is CAIDA's newest active measurement infrastructure and the successor to the skitter infrastructure that CAIDA operated for nearly a decade. Its primary goals are to

  • reduce the effort needed to develop and deploy sophisticated large-scale measurements, and
  • provide a step toward a community-oriented measurement infrastructure by allowing collaborators to run their vetted measurement tasks on a security-hardened distributed platform.

Ark is tailored specifically for active network measurement. This allows Ark to be simpler than some other general-purpose distributed experimental platforms, and it allows us to concentrate on providing facilities that directly address the needs of networking research. In particular, we provide a facility for communication and coordination that makes it easier to write distributed measurements that must work together to achieve a goal. We are working on providing a high-level API to ease the challenges of writing measurement tools. Our goal is to lower the barrier to bringing novel and interesting measurement techniques to life.

Current Measurements

The initial focus of Ark is coordinated, large-scale, traceroute-based topology measurement using a process called team probing. In team probing, we group monitors into teams and dynamically divide the measurement work among the members of each team. This parallelization lets us collect a traceroute measurement to every routed /24 in a short period of time: about 48-56 hours for a team of 13 monitors probing 7 million /24s (that is, the full routed address space subdivided into /24s) at 100 pps. Two teams are currently active, and each team probes independently.
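As a rough illustration of team probing, the following Python sketch hands out /24 prefixes one at a time to whichever team member asks next, so that each prefix is probed by exactly one monitor. It is not Ark's implementation: the function names, the random choice of "next idle monitor", and the policy of picking one random address per /24 are all assumptions made for the example. The last function assumes that the 100 pps rate applies to each monitor individually, which the text above does not state.

```python
import ipaddress
import random
from collections import deque

def pick_destination(prefix, rng):
    """Pick one random address inside a /24 (hypothetical selection policy)."""
    net = ipaddress.ip_network(prefix)
    # Offsets 1..254 skip the .0 and .255 addresses of the /24.
    return str(net.network_address + rng.randint(1, 254))

def team_probe(prefixes, monitors, rng):
    """Divide the prefixes dynamically among the monitors of one team.

    The random choice below is a stand-in for "the next monitor that
    becomes idle"; a real system would be driven by actual completions.
    """
    work = deque(prefixes)
    assignments = {m: [] for m in monitors}
    while work:
        monitor = rng.choice(monitors)
        prefix = work.popleft()
        assignments[monitor].append((prefix, pick_destination(prefix, rng)))
    return assignments

def packets_per_trace(num_prefixes, team_size, pps, hours):
    """Average packet budget per traceroute implied by a cycle time.

    Assumes every monitor sends at `pps` for the whole cycle.
    """
    packets_per_monitor = pps * hours * 3600
    traces_per_monitor = num_prefixes / team_size
    return packets_per_monitor / traces_per_monitor
```

Under the per-monitor reading of the rate, `packets_per_trace(7_000_000, 13, 100, 48)` and the same call with 56 hours give an average of roughly 32 to 37 packets per traceroute.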

We perform traceroute measurements using scamper, a powerful and flexible active measurement tool supporting IPv4, IPv6, traceroute, and ping. Scamper supports TCP-, UDP-, and ICMP-based measurements and Paris traceroute variations. Our collaborator Matthew Luckie at the University of Waikato has been developing scamper for several years.

The end product is the IPv4 Routed /24 Topology Dataset. These measurements have been ongoing since September 2007, and as of mid-August 2008, we have collected 1.3 billion traceroutes and 519 GB of data.

We augment the Routed /24 Topology Dataset with automated lookups of DNS names. We have an in-house bulk DNS lookup service called HostDB that can look up millions of addresses per day. We look up all intermediate addresses and responding destinations seen in the Topology Dataset.

We are working on combining the three alias resolution techniques currently available (Mercator, Ally, APAR) into a unified tool and system that we will use to generate router-level topology from the IP Topology Dataset.

Finally, we provide the IPv4 Routed /24 AS Links Dataset, which contains Autonomous System (AS) links derived from the IP paths of the Topology Dataset. This AS links dataset is useful for studying the peering relationships of autonomous systems. An AS is, roughly, a network or group of networks under a single administrative control.
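To make the idea of deriving AS links from IP paths concrete, here is a minimal Python sketch. It maps each hop to an AS by longest-prefix match and records a link wherever two adjacent hops fall in different ASes. The prefix-to-AS table is hypothetical (documentation prefixes and private-use AS numbers), and the choice to infer no link across an unresponsive or unmapped hop is a simplification made for this example, not a description of how the dataset is actually built.

```python
import ipaddress

# Hypothetical prefix-to-AS table; real mappings would come from BGP data.
PREFIX_TO_AS = {
    "192.0.2.0/24": 64500,
    "198.51.100.0/24": 64501,
    "203.0.113.0/24": 64502,
    "203.0.113.128/25": 64503,
}

def build_table(prefix_to_as):
    """Parse the prefixes and sort them so the longest prefix is tried first."""
    table = [(ipaddress.ip_network(p), asn) for p, asn in prefix_to_as.items()]
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table

def lookup_as(addr, table):
    """Return the AS of the longest matching prefix, or None if unmapped."""
    ip = ipaddress.ip_address(addr)
    for net, asn in table:
        if ip in net:
            return asn
    return None

def as_links(ip_path, table):
    """Return the set of AS links seen between adjacent hops of one path.

    A hop of None stands for an unresponsive hop. No link is inferred
    across an unresponsive or unmapped hop (a conservative simplification).
    """
    links = set()
    prev = None
    for hop in ip_path:
        asn = lookup_as(hop, table) if hop is not None else None
        if asn is not None and prev is not None and asn != prev:
            links.add((min(prev, asn), max(prev, asn)))
        prev = asn
    return links
```

For example, a path whose hops map to AS 64500, 64500, 64501, an unresponsive hop, and then 64503 yields the single link (64500, 64501), because the gap hides whatever lies between 64501 and 64503.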

Tuple Space

One of the distinguishing features of Ark is its focus on coordination. Coordination, broadly speaking, is concerned with planning, executing, and controlling an ensemble of distributed computations. Coordination is the meta-activity that surrounds a computation.

To facilitate coordination, Ark provides a new implementation, called Marinda, of the well-known tuple-space coordination model first introduced by David Gelernter in his Linda coordination language. A tuple space is a distributed shared memory combined with a small number of easy-to-use operations. The tuple space stores tuples, which are arrays of simple values (strings and numbers). Clients retrieve tuples by pattern matching.
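The tuple-space model can be illustrated with a small, single-process Python sketch. This is not Marinda and does not reflect its API: the class name, the method names (`write`, `read`, `take`), and the use of `None` as a wildcard are assumptions chosen for the example, and the distributed and blocking aspects of a real tuple space are left out.

```python
class TupleSpace:
    """A toy in-memory tuple space with pattern matching."""

    def __init__(self):
        self._tuples = []

    def write(self, *values):
        """Store a tuple of simple values (strings and numbers)."""
        self._tuples.append(tuple(values))

    @staticmethod
    def _matches(pattern, tup):
        # A pattern matches a tuple of the same length when every field
        # is either the wildcard None or equal to the tuple's field.
        return len(pattern) == len(tup) and all(
            p is None or p == v for p, v in zip(pattern, tup))

    def read(self, *pattern):
        """Return the first matching tuple without removing it, else None."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def take(self, *pattern):
        """Remove and return the first matching tuple, else None."""
        for i, tup in enumerate(self._tuples):
            if self._matches(pattern, tup):
                return self._tuples.pop(i)
        return None
```

A client that has written `("rtt", "192.0.2.1", 12.5)` can later retrieve it with the pattern `("rtt", None, None)`, without knowing which participant produced it.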

The tuple space is a many-to-many communication and coordination medium. Over this medium, measurement clients can interact in sophisticated ways, such as exchanging state and triggering actions among monitors. The tuple space abstraction leads to a peer-to-peer architecture, in which participants can be both a client and a server seamlessly. For example, it is simple to write a traceroute service that takes requests and sends responses over the tuple space. We can then layer on top of these traceroute services clients that trigger traceroutes when certain conditions are met. By lowering the barrier to writing and deploying services to just a few lines of code, the tuple space abstraction allows a rich ecosystem of measurement services to thrive, in the same way that HTML empowered users by allowing anyone to become a publisher on the Internet.
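The traceroute service described above might look like the following Python sketch, in which a service thread and a client exchange request and response tuples. Everything here is illustrative: `BlockingTupleSpace` is a toy single-process stand-in rather than Marinda, the tuple layouts (`"TRACE_REQ"`, `"TRACE_RESP"`) are invented for the example, and `fake_traceroute` returns canned hops instead of sending any probes.

```python
import threading

class BlockingTupleSpace:
    """Toy tuple space whose take() waits until a matching tuple appears."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def write(self, *values):
        with self._cond:
            self._tuples.append(tuple(values))
            self._cond.notify_all()  # wake any waiting take() calls

    def _find(self, pattern):
        # None in a pattern is a wildcard; lengths must agree.
        for i, tup in enumerate(self._tuples):
            if len(tup) == len(pattern) and all(
                    p is None or p == v for p, v in zip(pattern, tup)):
                return i
        return None

    def take(self, *pattern, timeout=5.0):
        """Remove and return a matching tuple, or None on timeout."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout)
            if not found:
                return None
            return self._tuples.pop(self._find(pattern))

def fake_traceroute(dest):
    """Stand-in for a real measurement; returns hops as one string."""
    return " ".join(["10.0.0.1", "10.0.1.1", dest])

def traceroute_service(space, max_requests):
    """Serve up to max_requests traceroute requests, then exit."""
    for _ in range(max_requests):
        req = space.take("TRACE_REQ", None, None)
        if req is None:
            return  # no request arrived before the timeout
        _, req_id, dest = req
        space.write("TRACE_RESP", req_id, fake_traceroute(dest))

def request_trace(space, req_id, dest):
    """Client side: post a request and wait for the matching response."""
    space.write("TRACE_REQ", req_id, dest)
    resp = space.take("TRACE_RESP", req_id, None)
    return resp[2] if resp else None
```

To try it, start `traceroute_service` in a `threading.Thread` and call `request_trace` from the main thread. Because the client matches on its own request id, several clients and several service instances can share the same space without being wired to one another, which is the many-to-many property described above.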

For more information, see the list of coordination references below.

Future Plans

  • We will release the source code of the Marinda tuple space implementation under the GPL.
  • We will continue implementing the Ark infrastructure software, including a high-level API for performing network measurements and the security layers needed to allow semi-trusted third parties to conduct measurements.
  • We will conduct IPv6 topology measurements from the 6 monitors that have IPv6 connectivity as of August 2008. We also hope to perform DNS open resolver surveys.

References

Has your computer received a probe from an Ark monitor?

Learn more about the probes sent by CAIDA for these experiments.

Questions about Ark?

Please send questions or comments regarding Ark to ark-info@caida.org.


Cooperative Association for Internet Data Analysis (CAIDA)
  Last Modified: Mon Nov-17-2008 15:0:8 PDT
  Maintained by: Young Hyun
  Page URL: http://www.caida.org/projects/ark/index.xml