Comparison of end-to-end distance metrics

Bradley Huffaker
CAIDA, SDSC, UC San Diego bradley@caida.org

ISMA, Routing and Topology Analysis, December 2001

Overview

methodology
metrics
success rate
conclusion

background

skitter project (data collection): IP forward path topology and RTT
methodology: scoring the rate at which a metric can successfully predict low RTT

skitter project

monitors
- trace = forward IP path and RTT between monitor and destination
- cycle = a single run launching traces to the entire destination list
destination lists
- DNS clients
  - one DNS client per prefix
  - 53% of prefixes
  - 8 to 14 cycles per day
- IPv4 list
  - one IP address per /24 in routable prefix
  - 54% of prefixes and 5.7% of /24s
  - 1 cycle per day

methodology

scoring the ability of different metrics to correctly
select servers with the lowest RTT
- client: IP address trying to select the best server (IP address) from a set of servers
- server: server (IP address) offering desired service
CAIDA monitors are clients and destinations are servers
allows different metrics to be compared on the basis
of a single value

evaluation algorithm


if (rttA == rttB || serverA == serverB)
    throw out this data point
elsif (metricA == metricB)
    unusable++   # no predictive value
elsif (((metricA < metricB) && (rttA < rttB))
  || ((metricA > metricB) && (rttA > rttB)))
    successful++;
else
    failures++;
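
The pairwise comparison above can be sketched in Python. This is a minimal illustration, not the study's code: the names `score_pair` and `score_metric`, the counter keys, and the sample tuples are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def score_pair(server_a, metric_a, rtt_a, server_b, metric_b, rtt_b):
    """Classify one pair of servers as seen from a single client."""
    if rtt_a == rtt_b or server_a == server_b:
        return "discarded"    # no RTT winner, or the same server twice
    if metric_a == metric_b:
        return "unusable"     # the metric cannot tell the servers apart
    if (metric_a < metric_b) == (rtt_a < rtt_b):
        return "successful"   # metric order agrees with RTT order
    return "failures"         # metric order contradicts RTT order


def score_metric(observations):
    """Tally outcomes over every pair of (server, metric, rtt) tuples."""
    counts = Counter()
    for (sa, ma, ra), (sb, mb, rb) in combinations(observations, 2):
        counts[score_pair(sa, ma, ra, sb, mb, rb)] += 1
    return counts


# Hypothetical sample: (server, IP path length in hops, RTT in ms)
sample = [("s1", 10, 40.0), ("s2", 14, 90.0), ("s3", 10, 70.0), ("s4", 12, 30.0)]
counts = score_metric(sample)
```

Because each pair collapses to one of four outcomes, any metric that yields an orderable value per server can be scored the same way.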

metrics

IP path length: lowest TTL of a request to which the destination responded
AS path length: number of times the AS changed in an AS path
geographic distance: great circle distance from source to destination
median RTT: median RTT from previous day

IP path length

how to estimate without probing?
- shortest path in a shared IP topology
- collected by global infrastructure
study
- lowest TTL which got a response
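
As a rough sketch of the definition used in the study, IP path length can be taken as the smallest probe TTL that drew a reply from the destination itself. The probe record layout `(ttl, responder)` and the function name `ip_path_length` are assumptions for illustration, not skitter's actual data format.

```python
def ip_path_length(probes, destination):
    """Return the lowest TTL whose reply came from the destination.

    probes: iterable of (ttl, responder_ip) pairs; responder_ip is None
    when the probe timed out. Returns None if the destination never replied.
    """
    ttls = [ttl for ttl, responder in probes if responder == destination]
    return min(ttls) if ttls else None


# Hypothetical trace using documentation-range addresses
probes = [
    (1, "192.0.2.1"),
    (2, "192.0.2.2"),
    (3, None),            # hop did not answer
    (4, "203.0.113.9"),   # first reply from the destination
    (5, "203.0.113.9"),
]
hops = ip_path_length(probes, "203.0.113.9")
```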

AS path length

how to estimate without probing?
- collect BGP tables that store AS paths
- requires only a single connection to a local router (no burden on wider infrastructure)
study
- no BGP table at most monitors
- abstracted AS path from IP path (using routeviews BGP tables)
- AS path length = number of times AS changed in the path
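
The abstraction from IP path to AS path can be sketched as a longest-prefix lookup followed by a count of AS changes. This is a minimal sketch under stated assumptions: the prefix table, the AS numbers, and the choice to skip hops with no matching prefix are illustrative and are not taken from the study.

```python
import ipaddress


def origin_as(ip, prefix_table):
    """Longest-prefix match of ip against a {prefix: asn} table."""
    addr = ipaddress.ip_address(ip)
    best = None  # (prefix length, asn) of the most specific match so far
    for prefix, asn in prefix_table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, asn)
    return best[1] if best else None


def as_path_length(ip_path, prefix_table):
    """Number of times the AS changes along an IP path.

    Assumption: hops that match no prefix are skipped rather than
    counted as a separate AS.
    """
    asns = [origin_as(ip, prefix_table) for ip in ip_path]
    known = [asn for asn in asns if asn is not None]
    return sum(1 for prev, cur in zip(known, known[1:]) if prev != cur)


# Hypothetical table and path (documentation prefixes, private-use ASNs)
table = {"192.0.2.0/24": 64500, "198.51.100.0/24": 64501, "203.0.113.0/24": 64502}
path = ["192.0.2.1", "192.0.2.9", "10.0.0.1", "198.51.100.7", "198.51.100.8", "203.0.113.5"]
length = as_path_length(path, table)
```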

skitter's AS path length vs BGP's AS path length

Negative spike for Filtered BGP Table indicates no matches
Filtered - Skitter is the curve for the difference between AS path seen by Skitter and that in the BGP table

geographic distance

great circle distance between geographic locations of destination and monitor
how to estimate without probing?
- location of client should be known by client
- database for IP-to-geographic location mapping for servers
study
- location of monitors known and fixed
- location of destination received from IPMapper commercial geographic mapping tool (See: www.ipmapper.com)
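
Great circle distance between two latitude/longitude pairs can be computed with the haversine formula. The sketch below assumes a spherical Earth with a mean radius of 6371 km; the function name and the sample coordinates are illustrative and are not part of the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great circle distance in km; inputs in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical monitor and destination locations (approximate coordinates)
monitor = (32.88, -117.24)     # near San Diego
destination = (51.51, -0.13)   # near London
distance_km = great_circle_km(*monitor, *destination)
```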

geographic distance and RTT

three clusters of high density on both RTT and geography
West Coast, East Coast, Europe/Asia

median RTT

median RTT from sample of the previous day
how to estimate without probing?
- requires the client to continuously monitor all servers
- generates much unnecessary traffic
study
- same
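
A minimal sketch of the median RTT metric: group the previous day's samples by server, take the median of each group, and prefer the server with the lowest median. The sample layout `(server, rtt_ms)` and the function names are assumptions for illustration.

```python
from collections import defaultdict
from statistics import median


def median_rtt_by_server(samples):
    """samples: iterable of (server, rtt_ms) pairs from the previous day."""
    by_server = defaultdict(list)
    for server, rtt in samples:
        by_server[server].append(rtt)
    return {server: median(rtts) for server, rtts in by_server.items()}


def pick_server(samples):
    """Select the server with the lowest previous-day median RTT."""
    medians = median_rtt_by_server(samples)
    return min(medians, key=medians.get)


# Hypothetical previous-day samples
yesterday = [
    ("s1", 40.0), ("s1", 55.0), ("s1", 42.0),
    ("s2", 30.0), ("s2", 120.0), ("s2", 110.0),
]
medians = median_rtt_by_server(yesterday)
best = pick_server(yesterday)
```

Note that `s2` has the single lowest sample but loses on the median, which is the kind of outlier the median is meant to damp.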

percentage of successful trials

median RTT, unsurprisingly, provides over 90% success rate
geographic distance provides 75% success within the US, but only two of the five non-US monitors achieved this
IP path length is a little better than random
AS path length is only as good as random

percentage of unusable trials

AS path length has a large percentage of unusable trials due to tightness of AS path length's distribution
even for those trials where AS path length was useful as a differentiating metric, its success rate was no better than 60%
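
The two ways of reading a success rate, over all trials or over usable trials only, can be made explicit. This sketch assumes that the overall rate divides by successes, failures, and unusable trials together, while the usable-only rate drops the unusable ones; the slides do not spell out the denominators, and the counts below are hypothetical.

```python
def success_rates(successful, failures, unusable):
    """Return (rate over all trials, rate over usable trials only)."""
    total = successful + failures + unusable
    usable = successful + failures
    overall = successful / total if total else 0.0
    among_usable = successful / usable if usable else 0.0
    return overall, among_usable


# Hypothetical counts for a metric with many ties
overall, among_usable = success_rates(successful=30, failures=20, unusable=50)
```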

stability of results

all metrics highly stable, with only minor local fluctuation
IP path length, AS path length, geographic distance, median RTT

RTT accumulation

success rate of each metric as number of cycles increases
median enjoys greatest success up to 24 hour period
taking a single RTT value near the current time of day is more effective than averaging all values in between
storing RTT over a 24 hour period is not helpful for use in median calculation

unusual metrics

1st-to-3rd quartile average: average of the values lying between the 1st and 3rd quartiles
median group: median that uses only the traces taken within 6 hours either side of the current time of day
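
Both unusual metrics can be sketched with Python's `statistics` module. The quartile method (`inclusive`), the treatment of time of day as a circular 24-hour clock, and the sample values are assumptions; the slides do not specify these details.

```python
from statistics import mean, median, quantiles


def interquartile_mean(rtts):
    """Average of the values between the 1st and 3rd quartiles (inclusive)."""
    q1, _, q3 = quantiles(rtts, n=4, method="inclusive")
    return mean(v for v in rtts if q1 <= v <= q3)


def median_group(samples, now_hour, window_hours=6.0):
    """Median of RTTs taken within window_hours of now_hour on a 24h clock.

    samples: iterable of (hour_of_day, rtt_ms) pairs.
    """
    kept = []
    for hour, rtt in samples:
        d = abs(hour - now_hour) % 24.0
        d = min(d, 24.0 - d)          # wrap around midnight
        if d <= window_hours:
            kept.append(rtt)
    return median(kept) if kept else None


# Hypothetical samples
rtts = [10, 20, 30, 40, 50, 60, 70, 80, 1000]
iq_mean = interquartile_mean(rtts)

timed = [(1.0, 50), (5.0, 60), (12.0, 200), (20.0, 70), (23.0, 40)]
grouped = median_group(timed, now_hour=2.0)
```

The interquartile mean drops the outlier of 1000, and the grouped median drops the midday sample that lies more than 6 hours from the current time of day.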

conclusions

geography provides a reasonable indicator of low RTT within the US and requires no probing of (zero cost to) the network
AS path length is easy to collect but not statistically useful
when calculating median, storing RTT over a 24 hour period is not helpful


Related Objects

See https://catalog.caida.org/media/2001_isma01_brad/ to explore related objects to this document in the CAIDA Resource Catalog.