This page contains information useful to sites interested in hosting an Archipelago node.
Questions about Archipelago (Ark)?
Please send questions or comments regarding Ark to ark-info@caida.org.
Why should I host an Ark node?
- Institutions with interest in hosting an Ark node should take a look at our brochure "Why should my network host an Ark node?" that details the benefits that hosting an Ark node can provide for the local network as well as the Ark project.
- For those requiring more background and a more formal format, we prepared this document explaining the Ark project on official University letterhead.
MOC and Spoofer
- A site wishing to host an Ark monitor should first review and approve the Memorandum of Cooperation (MOC), which outlines the obligations and expectations of both CAIDA and the hosting site. If needed, CAIDA can work with a prospective site to craft and implement a custom MOC that allows our monitor to conduct measurements within the bounds of a site's local policies.
- The Spoofer project is studying the prevalence of IP source-address filtering (BCP 38) among networks attached to the Internet. A hosting site wishing to allow their hosted Ark monitor to receive (not generate) Spoofer measurement traffic should first review and approve the AUP for Spoofer measurements.
Hardware and Power
- We are now deploying small, inexpensive network measurement nodes based on the Raspberry Pi. A Raspberry Pi consumes under 3 watts of power and draws around 700 mA. No special cooling is required. These systems can be placed anywhere that is convenient for a hosting site, including on someone's desk.
- In the past, we have deployed measurements on 1U servers, including under virtualization, but we now prefer to use Raspberry Pis. If, for whatever reason, a hosting site cannot deploy a Raspberry Pi, then we may consider using a traditional server (possibly under virtualization) if the hosting location is particularly beneficial in increasing the topological/geographical diversity of available vantage points.
Usage Patterns
- We run our current traceroute measurements to every routed /24 prefix at 100 pps for about 35 kbps of outgoing traffic. That produces about 5 MB of trace data per hour, which we download to a CAIDA host concurrently with the measurements. This exemplifies typical bandwidth requirements for an active measurement. We might run a few measurements concurrently, each having about that much bandwidth usage (or more likely less). We do not plan to host any services (web, content, or distributed hash table) that can potentially generate a lot of traffic. Nor will we do any high volume bandwidth measurements (a la Iperf), since we would like to avoid generating complaints from recipients of measurement traffic.
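As a rough sanity check of these figures, the arithmetic can be sketched as follows. The unit conventions (1 kbps = 1000 bits per second, 1 MB = 10^6 bytes) are assumptions, and the variable names are illustrative only.

```python
# Back-of-the-envelope check of the traffic figures quoted in the text.
PROBE_RATE_PPS = 100      # probes per second (from the text)
OUTGOING_KBPS = 35        # outgoing probe traffic in kbps (from the text)
TRACE_MB_PER_HOUR = 5     # trace data produced per hour (from the text)

# Assumption: 1 kbps = 1000 bits/s and 1 MB = 10**6 bytes.
outgoing_bytes_per_sec = OUTGOING_KBPS * 1000 / 8

# Average size of one outgoing probe implied by the two quoted rates.
bytes_per_probe = outgoing_bytes_per_sec / PROBE_RATE_PPS

# Volume of outgoing probe traffic over one hour.
outgoing_mb_per_hour = outgoing_bytes_per_sec * 3600 / 1e6

# Average rate needed to download the trace data as it is produced.
download_kbps = TRACE_MB_PER_HOUR * 1e6 * 8 / 3600 / 1000

print(f"average probe size:     {bytes_per_probe:.2f} bytes")
print(f"outgoing probe traffic: {outgoing_mb_per_hour:.2f} MB per hour")
print(f"trace download rate:    {download_kbps:.1f} kbps")
```

Under these assumptions, a single measurement stays in the tens of kilobits per second in each direction.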
- Our goal is for Ark monitors to be used for a wide variety of measurements, and the set of measurements will evolve over time. However, current measurements are about Internet topology and, therefore, they employ similar types of low-level probe packets and receive similar types of responses even though the exact details and goals of the measurements may differ. We provide below a list of these low-level probe and response packets in the form of firewall-like rules. We will contact each hosting site to request permission to conduct any measurements that are significantly different from these current measurements (for example, for Spoofer measurements). Current measurement traffic consists mostly of outgoing topology probe packets (e.g., ICMP echo request) and their expected responses (e.g., ICMP echo reply). We also need to perform traceroute and ping measurements to Ark monitors themselves, so a firewall should freely allow ICMP request/response traffic in both incoming and outgoing directions (that is, for both measurements to and from monitors).
In addition to measurement traffic, an Ark monitor will need to open a TCP connection to a central server at CAIDA (this will always be an outgoing connection from the Ark box to CAIDA's server), and a monitor will need to allow incoming SSH connections from CAIDA's /24 (but from nowhere else). In general, we do not run any network service except SSH on an Ark box, to increase security.
The following is a summary of the expected traffic in firewall-like rules:
Outgoing:
- ntp (123/udp) to your local NTP server, to CAIDA's NTP server, or to the nearest NTP Pool server
- dns (53/udp) to your local DNS server(s) or to CAIDA's DNS server
- TCP connection to CAIDA's tuple-space server from any local (ephemeral) port (for Ark's tuple space communication)
- ICMP echo request, echo reply, port unreachable to any host (for ICMP-based topology measurements to and from the monitor)
- no ICMP rate limiting
- UDP probes from any local port to any host and any port (for UDP-based topology measurements from the monitor)
- TCP probes from any local port to any host and any port regardless of connection state (for non-SYN based TCP measurements such as sending a TCP ACK probe, which won't establish a connection nor be part of an existing connection)
Incoming:
- NTP and DNS responses
- ssh (22/tcp) from only CAIDA's /24 prefix
- ICMP echo request, echo reply, time exceeded, and destination unreachable (type 3, code any) from any host
- no ICMP rate limiting
- TCP packets (SYN, ACK, RST, etc.) from any host and any port regardless of connection state (for TCP-based topology measurements)
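To make the incoming side of this summary concrete, here is a minimal sketch of the rules as a Python predicate. It is an illustration, not a firewall configuration. `CAIDA_PREFIX` is a hypothetical placeholder (a documentation prefix), since CAIDA's actual /24 is not given here, and the function and parameter names are invented for the example. Giving the SSH restriction precedence over the general TCP rule is also an interpretation of the text.

```python
from ipaddress import ip_address, ip_network

# Hypothetical placeholder: a documentation prefix, NOT CAIDA's real /24.
CAIDA_PREFIX = ip_network("192.0.2.0/24")

# ICMP message types the summary lists as allowed from any host.
ALLOWED_INCOMING_ICMP = {
    "echo-request",
    "echo-reply",
    "time-exceeded",
    "destination-unreachable",
}


def incoming_allowed(proto, src, sport=None, dport=None, icmp_type=None):
    """Return True if an incoming packet matches the incoming rules above."""
    source = ip_address(src)
    if proto == "icmp":
        return icmp_type in ALLOWED_INCOMING_ICMP
    if proto == "udp":
        # NTP and DNS responses arrive from the server's well-known port.
        return sport in (123, 53)
    if proto == "tcp":
        if dport == 22:
            # Assumed precedence: SSH is accepted only from CAIDA's /24.
            return source in CAIDA_PREFIX
        # Other TCP packets are accepted regardless of connection state.
        return True
    return False


print(incoming_allowed("tcp", "192.0.2.10", dport=22))
print(incoming_allowed("tcp", "198.51.100.7", dport=22))
```

A site would translate the same logic into whatever rule syntax its own firewall uses.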
IPv6 Network Requirements and Recommendations (Desirable)
- CAIDA has plans for continued strategic deployment of IPv6-capable Ark measurement nodes. If your site enjoys IPv6 connectivity, the following describes the minimum requirements for conducting measurements on an Ark node.
Currently, we only accept native IPv6 connectivity.
An autoconfigured IPv6 address works fine (that is, the address does not have to be manually assigned to the host to keep it stable across a change in hardware).
A DNS PTR record for the IPv6 address should be available, served over either IPv4 or IPv6 transport.
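A hosting site could pre-check a candidate address against these requirements with something like the sketch below. The helper names are hypothetical, and the "native" test is only a heuristic: it rejects the well-known 6to4 and Teredo tunnel prefixes but cannot detect other kinds of tunnels. The second helper returns the `ip6.arpa` name under which the PTR record would be published.

```python
from ipaddress import IPv6Address


def looks_native(addr: str) -> bool:
    """Heuristic only: reject 6to4 (2002::/16) and Teredo (2001::/32) addresses."""
    ip = IPv6Address(addr)
    return ip.sixtofour is None and ip.teredo is None


def ptr_query_name(addr: str) -> str:
    """Return the ip6.arpa name that should carry the address's PTR record."""
    return IPv6Address(addr).reverse_pointer


# 2001:db8::/32 is the IPv6 documentation prefix, used here as an example.
print(looks_native("2001:db8::1"))
print(looks_native("2002:c000:204::1"))
print(ptr_query_name("2001:db8::1"))
```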
Mitigation and Handling of Complaints
- In general, we try to perform measurements in ways that reduce the likelihood of complaints. For example, we do relatively low volume and low frequency measurements (from the point of view of individual destinations) and prefer to avoid probing the same destinations repeatedly.
Because complaints do occasionally occur, we try our best to direct the complaints to us rather than to the site hosting a monitor. An important way is by setting up the reverse mapping for a monitor IP address to either monitor.ark.caida.org (e.g., san-us.ark.caida.org) or monitor.ark.caida.site-domain (e.g., san-us.ark.caida.ucsd.edu).
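The two naming patterns can be sketched as a small check that a site might run against its own reverse DNS. The function names are hypothetical. `lookup_ptr` performs a live query through the standard `socket.gethostbyaddr` call, so it needs network access and is not invoked in the example itself.

```python
import socket
from typing import List, Optional


def expected_ptr_names(monitor: str, site_domain: Optional[str] = None) -> List[str]:
    """Build the reverse-mapping names described above for one monitor."""
    names = [f"{monitor}.ark.caida.org"]
    if site_domain:
        names.append(f"{monitor}.ark.caida.{site_domain}")
    return names


def ptr_matches(ptr: str, monitor: str, site_domain: Optional[str] = None) -> bool:
    """Compare a PTR result with the expected names, ignoring case and a trailing dot."""
    normalized = ptr.rstrip(".").lower()
    return normalized in expected_ptr_names(monitor, site_domain)


def lookup_ptr(ip: str) -> str:
    """Live reverse lookup; raises an OSError subclass if no PTR record is found."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    return hostname


print(expected_ptr_names("san-us", "ucsd.edu"))
print(ptr_matches("san-us.ark.caida.ucsd.edu.", "san-us", "ucsd.edu"))
```

With a reverse mapping like this in place, an operator who looks up the source of a probe sees a CAIDA-related name.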
We have weighed the possibility of running a lightweight webserver on the monitors themselves that would describe the measurements, but based on an evaluation of the security vs. benefits tradeoff (something we have considered for many years in the context of the skitter infrastructure), it is not our general policy to set up such a webserver. We can, however, do so upon request by the site hosting a monitor.
Hosting sites should simply forward any complaints they receive to us. We will respond to the complaints, and if necessary, add destinations to our no-probe list which will prevent future complaints from the same destination.