This dataset contains information useful for studying the topology
of the Internet. Data is collected by a globally
distributed set of Ark monitors. The monitors
use team-probing to distribute the work of probing the
destinations among the available monitors.
We collect data by sending scamper probes continuously
to destination IP addresses. Destinations are selected randomly
from each routed IPv4 /24 prefix on the Internet such that a random
address in each prefix is probed approximately every 48 hours (one
probing cycle). Because team-probing distributes the probing work
across all monitors, a single destination /24 will be probed by
only one monitor in each probing cycle. The current list of routed
IPv4 prefixes was created using RouteViews BGP tables from
October 14-20, 2007. Rather than having a static list of IP addresses
to probe, we dynamically pick a new random address in each /24
prefix for every new cycle of probing. The current prefix list
includes approximately seven million prefixes (approximately
6.5 million for data collected before November 2007).
Scamper:
- Measures forward IPv4 paths: scamper records the IPv4 address
seen at each hop from a source to a destination by incrementing
the "time to live" (TTL) in the header of each successive IPv4
probe packet and recording the replies from each router leading
to the destination host.
- Measures round trip times (RTT): scamper collects the round trip
time measured to each intermediate router as well as to the
destination host.
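The TTL-incrementing procedure described above can be sketched as a small simulation. This is not scamper code and sends no packets: `simulated_probe`, the `Hop` record, and the example path (built from documentation-only addresses) are hypothetical stand-ins, assuming only that each probe with TTL n draws a reply from the n-th router, or from the destination once the TTL is large enough.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Hop:
    ttl: int                  # TTL of the probe that elicited this reply
    addr: Optional[str]       # replying address, or None if nothing answered
    rtt_ms: Optional[float]   # round trip time, or None if nothing answered

# Hypothetical forward path: (address, RTT in ms) per hop, destination last.
# A hop that never answers is written as (None, None).
PATH: List[Tuple[Optional[str], Optional[float]]] = [
    ("192.0.2.254", 0.6),
    (None, None),
    ("203.0.113.9", 24.8),
]

def simulated_probe(path, ttl):
    """Stand-in for sending one probe; returns (addr, rtt_ms, reached_dest)."""
    index = min(ttl, len(path)) - 1
    addr, rtt_ms = path[index]
    return addr, rtt_ms, index == len(path) - 1

def trace(path, max_ttl=32):
    """Probe with TTL 1, 2, 3, ... and record one Hop per TTL."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        addr, rtt_ms, reached_dest = simulated_probe(path, ttl)
        hops.append(Hop(ttl, addr, rtt_ms))
        if reached_dest:
            break  # the destination answered, so the path is complete
    return hops

hops = trace(PATH)
```

Each `Hop` pairs an address with its RTT, which mirrors the two measurements listed above: the forward path and the round trip time to every hop.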
In the current configuration, scamper probes with ICMP packets,
using the Paris traceroute technique (ICMP-paris) to improve
measurement integrity across load-balanced links. Data collected
before November 2007 used an alternate UDP traceroute method.
Data collected for each path probed includes:
- RTTs to both intermediate hops and the destination
- IPID, TOS, TTL, and size fields of response packets
- IP length, TTL, and TOS fields of the probe packet that reached each hop (extracted from the response packet)
- The ICMP type and code of responses
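As background for the ICMP-paris method named above: the idea behind Paris traceroute is to hold constant the header fields that a per-flow load balancer hashes, so that every probe of one trace takes the same branch. The toy simulation below illustrates that idea only. It is not scamper code, and the hash function, the flow-key strings, and the branch addresses are all assumptions made for the example.

```python
import zlib

# Hypothetical next-hop routers behind one per-flow load balancer.
BRANCHES = ["198.51.100.1", "198.51.100.2", "198.51.100.3", "198.51.100.4"]

def next_hop(flow_key: bytes, branches):
    """Toy load balancer: hash the flow key to choose one branch."""
    return branches[zlib.crc32(flow_key) % len(branches)]

def varying_keys(n):
    """Flow key changes with every probe (a per-probe field leaks into it)."""
    return [b"src|dst|icmp|" + str(seq).encode() for seq in range(n)]

def constant_keys(n):
    """Flow key is held constant across all probes, as in the Paris approach."""
    return [b"src|dst|icmp|fixed" for _ in range(n)]

# Set of branches visited by 16 probes under each keying scheme.
varying_hops = {next_hop(key, BRANCHES) for key in varying_keys(16)}
constant_hops = {next_hop(key, BRANCHES) for key in constant_keys(16)}
```

With a constant key the probes all land on a single branch, whereas varying keys scatter them across branches, which is how a trace can end up stitching together hops from different paths.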
Scamper can also collect Path MTU information, but the current
measurements do not include it. A sample binary warts file is available.
Data has been collected continuously since September 13, 2007,
and is made available in one-hour files for the most recent
ten days, as well as a historical archive of 24-hour files
for the duration of the IPv4 Routed /24 Topology project.
Caveats that apply to this dataset:
- Unlike CAIDA's previous skitter macroscopic topology
measurements, the IPv4 Routed /24 Topology Dataset uses a dynamic
destination list. Measurements to consistent IPv4 addresses are
not available in this dataset.
- Because team-probing distributes measurements across many
monitors, the randomly selected IP addresses in a given routed
prefix are not probed by the same set of monitors consistently
over time.
- Scamper outputs traces in a different file format than
skitter.
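The first two caveats follow from how destinations are chosen and assigned. A minimal sketch of one probing cycle, under stated assumptions, is given here: the monitor names and prefixes are hypothetical, the random monitor assignment is only one possible way to give each /24 to exactly one monitor (the actual team-probing scheduler is not described here), and every address in the /24, including .0 and .255, is treated as eligible.

```python
import ipaddress
import random

MONITORS = ["monitor-a", "monitor-b", "monitor-c"]  # hypothetical names
PREFIXES = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]  # stand-ins

def pick_destination(prefix, rng):
    """Pick a fresh random address inside the given /24 prefix."""
    net = ipaddress.ip_network(prefix)
    return net.network_address + rng.randrange(net.num_addresses)

def plan_cycle(prefixes, monitors, cycle, seed=0):
    """Map each prefix to (monitor, destination) for one probing cycle.

    Every prefix gets exactly one monitor and one new random address,
    so neither is stable from one cycle to the next.
    """
    rng = random.Random(f"{seed}:{cycle}")
    return {
        prefix: (rng.choice(monitors), pick_destination(prefix, rng))
        for prefix in prefixes
    }

# Aggregate rate implied by about seven million prefixes per 48-hour cycle.
CYCLE_SECONDS = 48 * 3600
aggregate_rate = 7_000_000 / CYCLE_SECONDS  # destinations per second

plan = plan_cycle(PREFIXES, MONITORS, cycle=1)
```

With the figures quoted earlier, about seven million prefixes per 48-hour cycle works out to roughly 40 destinations per second summed over all monitors.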
Reading Warts Data
The scamper distribution includes the sc_analysis_dump tool, which
reads both skitter-format (arts++) and scamper-format (warts)
files and outputs data in a textual format already familiar to
researchers who analyze skitter files. Warts files can also be
viewed with the warts-dump utility, also found in the scamper distribution.
warts-dump prints a warts file in a human-readable format, although
its output is more difficult to parse than that of
sc_analysis_dump.
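Once sc_analysis_dump output has been saved as text, it can be post-processed with a short script. The sketch below assumes a particular layout: tab-separated lines whose first field is "T", followed by source, destination, list id, cycle id, timestamp, and a destination-replied flag ("R" or "N"), with per-hop fields of the form "address,RTT,tries" starting at the fourteenth column, "q" for a silent hop, and ";" between multiple replies at one hop. Treat that layout as an assumption and check it against the header comments the tool prints. The sample line is synthetic.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Trace:
    src: str
    dst: str
    timestamp: int
    dest_replied: bool
    # One entry per hop; each entry lists (address, rtt_ms) replies,
    # and is empty when no router answered at that hop.
    hops: List[List[Tuple[str, float]]]

def parse_line(line: str) -> Optional[Trace]:
    """Parse one text line; return None for comments and non-trace lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if fields[0] != "T" or len(fields) < 13:
        return None
    hops = []
    for field in fields[13:]:          # assumed start of per-hop data
        replies = []
        if field != "q":               # "q" is assumed to mean no reply
            for item in field.split(";"):
                addr, rtt, _tries = item.split(",")
                replies.append((addr, float(rtt)))
        hops.append(replies)
    return Trace(fields[1], fields[2], int(fields[5]), fields[6] == "R", hops)

# Synthetic example line built from documentation-only addresses.
SAMPLE = "\t".join([
    "T", "192.0.2.1", "203.0.113.9", "1", "1", "1190000000",
    "R", "25.3", "3", "62", "S", "0", "C",
    "192.0.2.254,0.5,1", "q", "198.51.100.7,12.1,1;198.51.100.8,12.4,1",
])

trace = parse_line(SAMPLE)
```

Keeping every reply per hop, rather than only the first, preserves the cases where more than one address answered at the same TTL.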
Data Use Restrictions
Acceptable Use Policy for the files of the IPv4 Routed /24 Topology Dataset
- Macroscopic Topology data will not be distributed beyond
authorized users.
- I will notify CAIDA of the names and email addresses of
any persons (and their respective affiliations) assisting
me in research using the macroscopic topology data. This includes
graduate students and interns.
- The privacy of end users (hosts) and networks monitored
by the IPv4 Routed /24 Topology project will be respected by
the data users. All publications (papers, web pages,
presentations, etc.) will anonymize IP addresses, network
names, and domain names.
- At the end of the research, or semi-annually (whichever
is more frequent), a summary of the research and any
findings/conclusions will be reported to CAIDA. If any
research is described on the WWW, a URL will be provided.
This information is primarily used in reports to our funding
agencies.
- Insofar as possible, research findings and conclusions
using the topology data will be published and/or made publicly
available.
- All users who publish a document (including web pages
and papers) using data from the IPv4 Routed /24 Topology Dataset must provide
CAIDA with a copy of the publication.
- All users who publish a document (including web pages and papers) using data
from this dataset must provide CAIDA with a copy of the publication and must cite:
The CAIDA IPv4 Routed /24 Topology Dataset - <dates used>,
Young Hyun, Bradley Huffaker, Dan Andersen, Emile
Aben, Colleen Shannon, and Matthew Luckie,
http://www.caida.org/data/active/ipv4_routed_24_topology_dataset.xml.
- Users are encouraged, but not required, to include the following
attribution in the acknowledgments section of their document:
Support for the IPv4 Routed /24 Topology Dataset is
provided by the National Science Foundation, the US Department
of Homeland Security, the WIDE Project, Cisco Systems, and
CAIDA Members.
- All users who create a publicly available presentation using
data from this dataset must provide CAIDA with a copy of the
presentation and must use the full name of the dataset ("The
CAIDA IPv4 Routed /24 Topology Dataset") in the
presentation. Users are further encouraged, but not required,
to include the URL for the dataset
(http://www.caida.org/data/active/ipv4_routed_24_topology_dataset.xml) in
their presentation.
Acknowledgments
Special thanks to Matthew Luckie for development of and assistance with scamper.