On Feb 8, 2008, after 10 years of data collection and 4 TB of data, we deactivated skitter data collection and transitioned to our next-generation topology measurement infrastructure, Archipelago (Ark). We already perform large-scale topology measurements on Ark, and we recommend that researchers use this new dataset, which employs an improved measurement methodology. The new IPv4 Routed /24 Topology Dataset collected on Ark extends back to Sep 13, 2007 and overlaps with the last five months of skitter data.
Utilizing skitter Data
The data collected by each skitter host form a spanning tree rooted at the polling host and extending into the infrastructure toward the destination hosts in the polling set. We then aggregate the data into a centralized database for correlation and depiction as a top-down, macroscopic view of a cross-section of the Internet from at least a small set of sources. Juxtaposition of such data sets has received remarkably little attention given its potential utility in Internet engineering and modeling. Analysis of real-world trends in routing behavior across the Internet has direct implications for next-generation networking hardware, software, and operational policies. Observations of macro-level traffic patterns provide insights into:
Mapping Dynamic Changes in Internet Topologies
by comparing collapsed skitter graphs across time
Tracking Related Performance Effects in Real-Time
using skitter's RTT data to indicate regions of the infrastructure experiencing abnormal delay.
Identifying Critical Paths
in the infrastructure, e.g., routers or exchange points that might be sources of significant network vulnerability
Performance Testing of Fielded Internet Hardware
e.g., initial skitter data identified statistically significant problems on certain routers using network route cache technology.
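As an illustration of the first item above, comparing collapsed graphs across time, the following minimal Python sketch collapses hop-by-hop paths from one source into a set of links and reports which links appeared or disappeared between two snapshots. The function names, the path representation (a list of hop names per destination), and the sample data are assumptions made for this example; they are not skitter's actual data format or tooling.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (near hop, far hop), hypothetical representation


def paths_to_edges(paths: Iterable[List[str]]) -> Set[Edge]:
    """Collapse hop-by-hop paths from one source into a set of directed links."""
    edges: Set[Edge] = set()
    for path in paths:
        # Pair each hop with the hop that follows it on the same path.
        for near, far in zip(path, path[1:]):
            edges.add((near, far))
    return edges


def diff_snapshots(old: Set[Edge], new: Set[Edge]) -> Dict[str, Set[Edge]]:
    """Report links that appeared, disappeared, or persisted between snapshots."""
    return {
        "added": new - old,
        "removed": old - new,
        "unchanged": old & new,
    }


# Hypothetical paths from one polling host on two different days.
day1 = [["src", "r1", "r2", "dstA"], ["src", "r1", "r3", "dstB"]]
day2 = [["src", "r1", "r2", "dstA"], ["src", "r1", "r4", "dstB"]]

changes = diff_snapshots(paths_to_edges(day1), paths_to_edges(day2))
print("added:", sorted(changes["added"]))
print("removed:", sorted(changes["removed"]))
```

In this sample, the path to dstB moves from r3 to r4, so the two links through r3 show up as removed and the two links through r4 as added.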
skitter also offers promise in potential correlation to BGP data, to allow engineers to discern who is announcing what to whom over specific paths. Although such information will not answer why given events occur or whether such traffic behavior is optimal, it will provide real-world inputs to traffic models and simulations designed to answer such questions. We hope to eventually integrate skitter data with a comprehensive database of physical topologies (prototypes include CAIDA's Java-based topology mapping tools for ISP backbones, the Mbone, and caching hierarchy topologies). These data can also help pinpoint routing instabilities and other anomalies, and track their secondary, downstream effects, e.g., on round trip times, availability, and packet loss across specific paths. A repository of these data and analyses will significantly enhance our predictive capabilities on the Internet, and holds promise for insights into the infrastructure as a whole.
Infrastructural Application: Root DNS Server Placement
RSSAC, the DNS root server technical advisory committee to ICANN, includes existing root server operators, institutional representatives (from IESG, IANA, DOC, etc.), and technical measurement experts (CAIDA). One of the committee's responsibilities is to provide ICANN with recommendations regarding optimal locations for root name servers. There are currently 13 root name servers. RSSAC has asked CAIDA for assistance in gathering data to help determine architecturally strategic locations within the Internet for planned root name servers. Internic's 1997 list of root name servers is also available online.
CAIDA's skitter project will support RSSAC as follows. Since August 1999, a skitter host co-located with the F root name server, maintained by Paul Vixie at the Digital Palo Alto Internet Exchange, has measured connectivity and round trip latency to a target list of hosts taken from F's DNS query logs (hereafter called the F root client set). We hope to expand to other current or proposed root server locations as we refine the analysis process. In addition to ensuring that the measurements do not impact the operation of F, we are still investigating how much data we need to gather in order to analyze and interpret it in useful ways. In October 1999 we provided preliminary sample graphs to the root server operators.
The primary goal of the measurement effort is to assess two metrics of connectivity: round trip time and hop count from the root name server to the hosts in the target set. We will specifically explore three possible topological results:
- Clusters of hosts that are particularly far, measured by latency, from all of the roots, and that might thus suggest a region that merits a new root server.
- Insufficient redundancy in the root server architecture might be reflected in skitter topologies from multiple roots that suggest that the failure of a strategic intermediate router or sub-path would render many end hosts unable to reach any root.
- Conversely, excessive redundancy in the infrastructure might be reflected in a set of skitter topologies from different roots where a large set of destination hosts are quite close to several of these roots.
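The three checks above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis CAIDA performed: the table layout (root, then host, then RTT in milliseconds), the thresholds, the function names, and all sample values are hypothetical, as is the second root labelled "K".

```python
from typing import Dict, List, Set

RttTable = Dict[str, Dict[str, float]]  # root -> host -> RTT in ms (assumed layout)


def far_from_all_roots(rtt: RttTable, far_ms: float) -> List[str]:
    """Hosts whose lowest RTT to any root still exceeds far_ms."""
    if not rtt:
        return []
    # Only consider hosts that were measured from every root.
    hosts = set.intersection(*(set(per_host) for per_host in rtt.values()))
    return sorted(h for h in hosts
                  if min(rtt[root][h] for root in rtt) > far_ms)


def close_to_many_roots(rtt: RttTable, near_ms: float, min_roots: int) -> List[str]:
    """Hosts within near_ms of at least min_roots roots (possible excess redundancy)."""
    hosts: Set[str] = set()
    for per_host in rtt.values():
        hosts.update(per_host)
    result = []
    for h in sorted(hosts):
        # A host not measured from a root is treated as infinitely far from it.
        near = sum(1 for root in rtt
                   if rtt[root].get(h, float("inf")) <= near_ms)
        if near >= min_roots:
            result.append(h)
    return result


def shared_intermediate_hops(paths_by_root: Dict[str, List[str]]) -> Set[str]:
    """Routers on the path from every root to one host: candidate single points of failure."""
    interiors = [set(path[1:-1]) for path in paths_by_root.values()]
    return set.intersection(*interiors) if interiors else set()


# Hypothetical measurements from two instrumented roots.
rtt = {
    "F": {"h1": 20.0, "h2": 310.0, "h3": 35.0},
    "K": {"h1": 25.0, "h2": 280.0, "h3": 180.0},
}
print(far_from_all_roots(rtt, far_ms=250.0))
print(close_to_many_roots(rtt, near_ms=40.0, min_roots=2))
print(shared_intermediate_hops({
    "F": ["F", "r1", "r9", "h2"],
    "K": ["K", "r5", "r9", "h2"],
}))
```

Here h2 is far from both roots, h1 is close to both, and r9 lies on every path to h2, so its failure would cut h2 off from both roots in this toy data.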
Note that the methodology used here will be relevant beyond the DNS system, and is applicable to location research for any type of data server of strategic infrastructural relevance. Since we will not have that many root name server locations instrumented before January 2000, we may do some comparisons to our current skitter sources using the F root client set.
Current skitter Sources
As of 2008, CAIDA maintained 19 skitter hosts all over the world. However, not all skitter monitors ran the full destination set at all times.
Our short-term future plans for the skitter project include:
- 3D visualizations of skitter measurements
- enhancement and porting of the Arts++ binary file format library that is used to store active, passive, and routing data
- deployment of additional active and passive measurement hosts throughout the global infrastructure
- spectral analysis of delay data to identify periodicity in specific routers or paths
- correlation of active skitter data with passive measurements from Coral (OCxmon) monitors and flow statistics
- analysis of trends and identification of further measurement and analysis requirements
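To make the spectral analysis item above concrete, here is a minimal Python sketch that looks for a dominant period in an evenly sampled delay series using a plain discrete Fourier transform. It is an illustration only: the function name, the assumption of a fixed sampling interval, and the synthetic data are all hypothetical and do not describe an existing skitter tool.

```python
import cmath
import math
from typing import List, Optional


def dominant_period(samples: List[float], interval_s: float) -> Optional[float]:
    """Return the period (in seconds) of the strongest non-constant frequency
    component of an evenly sampled series, or None if there is none."""
    n = len(samples)
    if n < 2:
        return None
    mean = sum(samples) / n
    centered = [x - mean for x in samples]  # remove the constant offset
    best_k, best_power = 0, 0.0
    # Evaluate DFT bins 1..n/2; higher bins mirror these for real-valued input.
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    if best_k == 0:
        return None
    # Bin k completes k cycles over the n * interval_s seconds observed.
    return n * interval_s / best_k


# Synthetic RTT series: 50 ms baseline plus a 10 ms swing that repeats
# every 8 samples, with one sample per 60 seconds (hypothetical values).
rtts = [50.0 + 10.0 * math.sin(2 * math.pi * t / 8) for t in range(64)]
print(dominant_period(rtts, interval_s=60.0))
```

This direct transform costs O(n^2) operations, which is fine for short series; a longer trace would call for an FFT implementation instead.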
Many thanks to Bill Cheswick and Hal Burch (Lucent/Bell Laboratories) for providing us with their 2D graph layout code; http://www.cheswick.com/ches/map/index.html has information on Lucent's activities in this area.
CAIDA's skitter project is partially funded by DARPA NGI Cooperative Agreement N66001-98-2-8922 and NSF ANIR grant NCR-9711092 (CAIDA), with equipment support from both Sun Microsystems and Digital Equipment. There are currently 16 skitter sources; we expect to have 25 by January 2000. Countries with sources include Korea, Japan, Great Britain, Canada, Singapore, New Zealand, and the United States. ISPs sponsoring skitter boxes include MCI/Worldcom (at Mae-West), AboveNet, Qwest, Canarie, and APAN.