distance metrics
Bradley Huffaker
CAIDA, SDSC, UC San Diego bradley@caida.org
ISMA, Routing and Topology Analysis, December 2001
Overview
- methodology
- metrics
- success rate
- conclusion
background
- skitter project (data collection): IP forward path topology and RTT
- methodology: scoring the rate at which a metric can successfully predict low RTT
skitter project
- monitors
- trace = forward IP path and RTT between monitor and destination
- cycle = a single run launching traces to the entire destination list
- destination lists
- DNS clients
- one DNS client per prefix
- 53% of prefixes
- 8 to 14 cycles per day
- IPv4 list
- one IP address per /24 in routable prefix
- 54% of prefixes and 5.7% of /24s
- 1 cycle per day
methodology
- scoring the ability of different metrics to correctly select servers with the lowest RTT
- client: IP address trying to select the best server from a set of servers
- server: IP address offering the desired service
- CAIDA monitors act as clients and destinations act as servers
- allows different metrics to be compared on the basis of a single value
evaluation algorithm
if (rttA == rttB || serverA == serverB)
    throw out this data point
elsif (metricA == metricB)
    unusable++      # no predictive value
elsif (((metricA < metricB) && (rttA < rttB)) || ((metricA > metricB) && (rttA > rttB)))
    successful++
else
    failures++
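The comparison above can be sketched in Python. This is a minimal illustration, not the study's code: the names `score_pair` and `tally`, the argument order, and the outcome labels are assumptions made for the example.

```python
from collections import Counter

def score_pair(server_a, metric_a, rtt_a, server_b, metric_b, rtt_b):
    """Classify one comparison of two candidate servers for a single client."""
    if server_a == server_b or rtt_a == rtt_b:
        return "discarded"    # no RTT winner to predict, throw out the data point
    if metric_a == metric_b:
        return "unusable"     # metric cannot tell the two servers apart
    if (metric_a < metric_b) == (rtt_a < rtt_b):
        return "successful"   # metric ordering agrees with RTT ordering
    return "failure"

def tally(pairs):
    """Count outcomes over an iterable of 6-tuples in score_pair's argument order."""
    return Counter(score_pair(*pair) for pair in pairs)
```

Because the equal cases are handled first, comparing the two orderings for agreement is equivalent to the two-branch test in the pseudocode.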
metrics
- IP path length: lowest TTL for a request to which the destination responded
- AS path length: number of times the AS changed in an AS path
- geographic distance: great circle distance from source to destination
- median RTT: median RTT from previous day
IP path length
- how to estimate without probing?
- shortest path in a shared IP topology
- collected by global infrastructure
- study
- lowest TTL which got a response
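As a rough illustration of the study's definition, the sketch below takes the IP path length to be the lowest probe TTL at which the destination itself answered. The function name `ip_path_length` and the `(ttl, responder_ip)` input layout are assumptions for the example, not skitter's actual data format.

```python
def ip_path_length(responses, destination):
    """Return the lowest TTL at which `destination` responded, or None.

    responses: iterable of (ttl, responder_ip) pairs from one trace.
    """
    ttls = [ttl for ttl, responder in responses if responder == destination]
    return min(ttls) if ttls else None
```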
AS path length
- how to estimate without probing?
- collect BGP tables that store AS paths
- requires only a single connection to a local router (no burden on wider infrastructure)
- study
- no BGP table at most monitors
- abstracted AS path from IP path (using routeviews BGP tables)
- AS path length = number of times AS changed in the path
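The abstraction from IP path to AS path can be sketched as follows. This is only an illustration under stated assumptions: the prefix table is a hypothetical `{prefix: origin AS}` dictionary standing in for a BGP table, each hop is mapped by longest-prefix match, and hops that match no prefix are simply skipped (the slides do not say how the study treated them).

```python
import ipaddress

def origin_as(ip, prefix_table):
    """Longest-prefix match of `ip` against {prefix_string: asn}; None if unmatched."""
    addr = ipaddress.ip_address(ip)
    best = None  # (prefix length, asn) of the most specific match so far
    for prefix, asn in prefix_table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, asn)
    return best[1] if best else None

def as_path_length(ip_path, prefix_table):
    """Number of times the AS changes along an IP path (unmapped hops skipped)."""
    ases = [origin_as(ip, prefix_table) for ip in ip_path]
    ases = [asn for asn in ases if asn is not None]
    return sum(1 for prev, cur in zip(ases, ases[1:]) if prev != cur)
```

A linear scan over the table keeps the example short; a real lookup over a full BGP table would normally use a trie or similar structure.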
skitter's AS path length vs BGP's AS path length
- the negative spike for the Filtered BGP Table curve indicates no matches
- the Filtered - Skitter curve shows the difference between the AS path seen by skitter and that in the BGP table
geographic distance
- great circle distance between geographic locations of destination and monitor
- how to estimate without probing?
- location of client should be known by client
- database for IP-to-geographic location mapping for servers
- study
- location of monitors known and fixed
- location of destination obtained from IPMapper, a commercial geographic mapping tool (see: www.ipmapper.com)
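Given latitude and longitude for monitor and destination, the great circle distance can be computed with the haversine formula. The sketch below is one common way to do this, not necessarily the computation used in the study; the function name and the spherical-Earth radius of 6371 km are assumptions of the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def great_circle_km(lat1, lon1, lat2, lon2):
    """Great circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine formula
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```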
geographic distance and RTT
- three clusters of high density on both RTT and geography
- West Coast, East Coast, Europe/Asia
median RTT
- median RTT from sample of the previous day
- how to estimate without probing?
- requires the client to continuously monitor all servers
- generates much unnecessary traffic
- study
- same: median RTT from the previous day's samples
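A minimal sketch of this metric, assuming each RTT sample carries a timestamp and that "previous day" means the previous calendar day. The function name and the `(datetime, rtt_ms)` sample layout are illustrative choices, not taken from the study.

```python
from datetime import date, datetime, timedelta
from statistics import median

def previous_day_median(samples, today):
    """Median RTT over samples taken on the calendar day before `today`.

    samples: iterable of (datetime, rtt_ms) pairs; today: a datetime.date.
    Returns None when the previous day has no samples.
    """
    yesterday = today - timedelta(days=1)
    rtts = [rtt for ts, rtt in samples if ts.date() == yesterday]
    return median(rtts) if rtts else None
```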
percentage of successful trials
- median RTT, unsurprisingly, provides over 90% success rate
- geographic distance provides 75% success within the US, but only two of the five non-US monitors achieved this
- IP path length is a little better than random
- AS path length is only as good as random
percentage of unusable trials
- AS path length has a large percentage of unusable trials due to tightness of AS path length's distribution
- even for those trials where AS path length was useful as a differentiating metric, it was no better than 60%
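The difference between a metric's success rate over all trials and over usable trials only can be made concrete with a small calculation. The denominators below are assumptions of this sketch (the slides do not spell them out), and the function name `rates` is illustrative.

```python
def rates(successful, failures, unusable):
    """Percentages derived from the three trial counters of the evaluation."""
    total = successful + failures + unusable
    usable = successful + failures  # trials where the metric differed between servers
    return {
        "unusable_pct": 100.0 * unusable / total if total else 0.0,
        "success_pct_all": 100.0 * successful / total if total else 0.0,
        "success_pct_usable": 100.0 * successful / usable if usable else 0.0,
    }
```

With made-up counts of 30 successes, 20 failures, and 50 unusable trials, a metric succeeds in 60% of usable trials but in only 30% of all trials, which is why a tightly distributed metric can look weaker overall than its usable-trial rate suggests.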
stability of results
- all metrics highly stable, with only minor local fluctuation
- IP path length, AS path length, geographic distance, median RTT
RTT accumulation
- success rate of each metric as number of cycles increases
- median enjoys greatest success up to 24 hour period
- taking a single RTT value near the current time of day is more effective than averaging all values in between
- storing RTT over a 24 hour period is not helpful for use in median calculation
unusual metrics
- 1st-to-3rd quartile average: average across the values between the 1st and 3rd quartiles
- median group: median using only the traces taken within 6 hours either side of the current time of day
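Both unusual metrics can be sketched in a few lines. The details here are assumptions: quartile boundaries are approximated by dropping the lowest and highest quarter of the sorted samples, samples are keyed by hour of day, and the 6-hour window wraps around midnight. The function names are illustrative.

```python
from statistics import mean, median

def interquartile_mean(rtts):
    """Average of the values between the 1st and 3rd quartiles (non-empty input)."""
    ordered = sorted(rtts)
    n = len(ordered)
    lo, hi = n // 4, n - n // 4   # drop the lowest and highest quarter
    return mean(ordered[lo:hi])

def median_group(samples, hour_now, window_hours=6):
    """Median RTT of samples taken within `window_hours` of the current hour of day.

    samples: iterable of (hour_of_day, rtt_ms) pairs. Returns None if no sample qualifies.
    """
    def circular_gap(hour):
        d = abs(hour - hour_now) % 24
        return min(d, 24 - d)     # distance on a 24-hour clock
    rtts = [rtt for hour, rtt in samples if circular_gap(hour) <= window_hours]
    return median(rtts) if rtts else None
```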
conclusions
- geography provides a reasonable indicator of low RTT within the US, with no probing of (zero cost to) the network
- AS path length is easy to collect but not statistically useful
- when calculating median, storing RTT over a 24 hour period is not helpful