This dataset contains information useful for studying the topology
of the Internet. Data is collected by a globally
distributed set of Ark monitors. The monitors
use team-probing to distribute the work of probing the
destinations among the available monitors.
We collect data by sending scamper probes continuously
to destination IP addresses. Destinations are selected randomly
from each routed IPv4 /24 prefix on the Internet such that a random
address in each prefix is probed approximately every 48 hours (one
probing cycle). Because team-probing distributes the probing work
across all monitors, a single destination /24 will be probed by
only one monitor in each probing cycle. The current list of routed
IPv4 prefixes was created using RouteViews BGP tables from
Aug 27-Sep 2, 2008. Rather than having a static list of IP addresses
to probe, we dynamically pick a new random address in each /24
prefix for every new cycle of probing. The current prefix list
includes approximately 7.4 million prefixes (6.5 million for
data collected before November 2007).

Scamper:
- Measures Forward IPv4 Paths
  - scamper records the IPv4 address seen at each hop from
    a source to a destination by incrementing the "time to live"
    (TTL) of each IPv4 packet header and recording the replies
    from each router leading to the destination host.
- Measures Round Trip Times (RTT)
  - scamper collects the round trip time measured to each
    intermediate router as well as to the destination host.
In the current configuration, scamper probes with ICMP packets,
using the Paris traceroute technique (ICMP-paris) to improve
measurement integrity across load-balanced links. Data collected
before November 2007 used an alternate UDP traceroute method.

Data collected for each path probed includes:
- RTTs, including both intermediate hops and the destination
- IPID, TOS, TTL, and size fields of response packets
- IP length, TTL, and TOS fields of the probe packet that reached each hop (extracted from the response packet)
- The ICMP type and code of responses
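As a rough illustration of the per-hop fields listed above, the following Python sketch groups them into one record. The class, field, and method names are hypothetical and are not taken from scamper or the warts format; only the ICMP type numbers (0 for echo reply, 11 for time exceeded) are standard values.

```python
from dataclasses import dataclass
from typing import Optional

# Standard ICMP type numbers.
ICMP_ECHO_REPLY = 0
ICMP_TIME_EXCEEDED = 11


@dataclass
class HopResponse:
    """One response seen for one probe (hypothetical layout, not the warts schema)."""
    address: str      # IPv4 address that sent the response
    rtt_ms: float     # round trip time to this responder
    reply_ipid: int   # IPID of the response packet
    reply_tos: int    # TOS of the response packet
    reply_ttl: int    # TTL of the response packet
    reply_size: int   # size of the response packet
    icmp_type: int    # ICMP type of the response
    icmp_code: int    # ICMP code of the response
    # Probe fields are recovered from the probe header quoted inside an
    # ICMP error, so they are absent for responses that quote nothing.
    probe_size: Optional[int] = None
    probe_ttl: Optional[int] = None
    probe_tos: Optional[int] = None

    def is_intermediate_hop(self) -> bool:
        # Routers on the path answer an expired probe with "time exceeded".
        return self.icmp_type == ICMP_TIME_EXCEEDED

    def is_destination_reply(self) -> bool:
        # With ICMP probing, the destination answers with an echo reply.
        return self.icmp_type == ICMP_ECHO_REPLY


# Made-up values using documentation addresses.
router_hop = HopResponse("192.0.2.254", 0.512, 4321, 0, 254, 56, 11, 0,
                         probe_size=44, probe_ttl=1, probe_tos=0)
destination = HopResponse("198.51.100.7", 12.3, 1, 0, 60, 44, 0, 0)
```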
Scamper can also collect Path MTU information, but current
measurements do not include it. A sample binary warts file is available.
Data has been collected continuously since September 13, 2007,
and is made available in hour-duration files for the most recent
ten days, as well as a historical archive of 24-hour-duration files
for the duration of the IPv4 Routed /24 Topology project.
Caveats that apply to this dataset:
- Unlike CAIDA's previous skitter macroscopic topology
  measurements, the IPv4 Routed /24 Topology Dataset uses a dynamic
  destination list. Measurements to consistent IPv4 addresses are
  not available in this dataset.
- Because team-probing distributes measurements across many
  monitors, the randomly selected IP addresses in a given routed
  prefix are not probed by the same set of monitors consistently
  over time.
- Scamper outputs traces in a different file format than
skitter.
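The first two caveats follow from how destinations are chosen in each cycle. Below is a minimal Python sketch of that selection, under stated assumptions: the function name `plan_cycle`, the monitor names, and the documentation prefixes are hypothetical, and the uniformly random monitor assignment merely stands in for Ark's actual team-probing coordination, which is not described here.

```python
import ipaddress
import random


def plan_cycle(prefixes, monitors, rng):
    """Return {prefix: (destination address, monitor)} for one probing cycle.

    Each routed /24 gets one freshly chosen random address and is probed
    by exactly one monitor in the cycle.
    """
    plan = {}
    for prefix in prefixes:
        network = ipaddress.ip_network(prefix)
        # Assumption: any of the 256 addresses in the /24 may be picked.
        address = network[rng.randrange(network.num_addresses)]
        # Assumption: random assignment stands in for the real scheduling.
        monitor = rng.choice(monitors)
        plan[prefix] = (str(address), monitor)
    return plan


# Hypothetical inputs: documentation prefixes and made-up monitor names.
prefixes = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]
monitors = ["monitor-a", "monitor-b", "monitor-c"]

rng = random.Random(0)
cycle_1 = plan_cycle(prefixes, monitors, rng)
cycle_2 = plan_cycle(prefixes, monitors, rng)

for prefix in prefixes:
    print(prefix, cycle_1[prefix], cycle_2[prefix])
```

Successive calls generally yield a different address and monitor for the same prefix, which is why per-address or per-monitor time series cannot be built from this dataset. With about 7.4 million prefixes and a 48-hour cycle, one trace per prefix works out to roughly 43 traces per second across all monitors (7.4 million / 172,800 s).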
Reading Topology Data
You can analyze this data (available in the warts format) with the
sc_analysis_dump tool included in the scamper distribution package.
The sc_analysis_dump tool prints out information about each trace in
an easy-to-parse textual format (one trace per line). You would
typically write a Perl script to analyze the output of
sc_analysis_dump.
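Any scripting language works for this step. The Python sketch below parses one line of sc_analysis_dump output. It assumes the tab-separated layout described in the comment header that the tool prints (key "T", source, destination, list ID, cycle ID, timestamp, destination-replied flag, destination RTT, request TTL, reply TTL, halt reason, halt data, path-complete flag, then one field per hop as "IP,RTT,nTries", with "q" for a non-responding hop). Check that layout against the header emitted by your scamper version. The `Trace` type, the function name, and the sample line are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Trace(NamedTuple):
    src: str
    dst: str
    timestamp: int
    dest_replied: bool
    # One entry per hop: (address, RTT in ms), or None if no response.
    hops: List[Optional[Tuple[str, float]]]


def parse_trace_line(line: str) -> Optional[Trace]:
    """Parse one trace line; return None for comments and other lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    # Assumed layout: 13 fixed fields followed by per-hop fields.
    if fields[0] != "T" or len(fields) < 13:
        return None
    hops: List[Optional[Tuple[str, float]]] = []
    for hop in fields[13:]:
        if hop == "q":
            hops.append(None)
            continue
        # Several responders at one hop are assumed to be ';'-separated;
        # this sketch keeps only the first.
        address, rtt, _tries = hop.split(";")[0].split(",")
        hops.append((address, float(rtt)))
    return Trace(
        src=fields[1],
        dst=fields[2],
        timestamp=int(fields[5]),
        dest_replied=(fields[6] == "R"),
        hops=hops,
    )


# Made-up sample line using documentation addresses.
sample = "\t".join([
    "T", "192.0.2.1", "198.51.100.7", "1", "1", "1200000000",
    "R", "12.345", "4", "61", "S", "0", "C",
    "192.0.2.254,0.512,1", "q", "203.0.113.9,8.25,1",
])
trace = parse_trace_line(sample)
print(trace)
```

To process a whole file, the same function can be applied to each line of the tool's output, for example by piping that output into a script that iterates over standard input.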
Another tool you may want to consider is the warts-dump tool,
which is also included in the scamper distribution. The output of
warts-dump is somewhat harder to parse, but it includes practically
all of the information contained in a warts file.
Finally, you can write your analysis scripts in the Ruby language
using rb-wartslib, an easy-to-use Ruby binding to the warts I/O library.
Data Use Restrictions
Acceptable Use Policy for the files of the IPv4 Routed /24 Topology Dataset
- Macroscopic Topology data will not be distributed beyond
authorized users.
- I will notify CAIDA of the names and email addresses of
any persons (and their respective affiliations) assisting
me in research using the macroscopic topology data. This includes
graduate students and interns.
- At the end of the research, or semi-annually (whichever
is more frequent), a summary of the research and any
findings/conclusions will be reported to CAIDA. If any
research is described on the WWW, a URL will be provided.
This information is primarily used in reports to our funding
agencies.
- Insofar as possible, research findings and conclusions
using the topology data will be published and/or made publicly
available.
- All users who publish a document (including web pages
and papers) using data from the IPv4 Routed /24 Topology Dataset must provide
CAIDA with a copy of the publication.
- All users who publish a document (including web pages and papers) using data
  from this dataset must provide CAIDA with a copy of the publication and must cite:
  The CAIDA IPv4 Routed /24 Topology Dataset - <dates
  used>, Young Hyun, Bradley Huffaker, Dan Andersen, Emile
  Aben, Colleen Shannon, Matthew Luckie, and kc claffy,
  http://www.caida.org/data/active/ipv4_routed_24_topology_dataset.xml.
- Users are encouraged, but not required, to include the following
  attribution in the acknowledgments section of their document:
  Support for the IPv4 Routed /24 Topology Dataset is
  provided by the National Science Foundation, the US Department
  of Homeland Security, the WIDE Project, Cisco Systems, and
  CAIDA Members.
- All users who create a publicly available presentation using
  data from this dataset must provide CAIDA with a copy of the
  presentation and must use the full name of the dataset ("The
  CAIDA IPv4 Routed /24 Topology Dataset") in the
  presentation. Users are further encouraged, but not required,
  to include the URL for the dataset
  (http://www.caida.org/data/active/ipv4_routed_24_topology_dataset.xml) in
  their presentation.
IPv4 Routed /24 Topology Dataset Access
Request Access to IPv4 Routed /24 Topology Data
Acknowledgments
Special thanks to Matthew Luckie for development of and assistance with scamper.
The IPv4 Routed /24 Topology Dataset was sponsored by: