Bradley Huffaker, Marina Fomenkov, Daniel J. Plummer, David Moore
and k claffy
CAIDA, SDSC, UC San Diego
{bradley,marina,djp,info,kc}@caida.org
IEEE International Telecommunications Symposium (ITS2002)
overview
- server selection problem
- data used in this study
- distance metrics
- metric success rates
- conclusions
server selection
- Many Internet services are provided by multiple servers.
- Clients want to select a server that optimizes their access to
a given service.
- Possible optimizations:
server load, available bandwidth, loss rate, transit time.
- We will look at the problem of minimizing transit time,
or Round Trip Time (RTT).
Approach for selecting minimum RTT
- A common solution to this problem is a selection system
which is local to a client and can do the selection for it.
- This system requires a metric of distance in order to sort
the list of potential servers.
- We will present four metrics which represent the distance between
two nodes on the Internet.
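The selection step described above can be sketched as a sort over candidate servers. This is only an illustration: `rank_servers`, `table_distance`, the host names, and the distance values are hypothetical and do not come from the study. Any of the four metrics could be plugged in as the `distance` callable.

```python
from typing import Callable, Iterable, List


def rank_servers(
    client: str,
    servers: Iterable[str],
    distance: Callable[[str, str], float],
) -> List[str]:
    """Return servers ordered from smallest to largest distance from client."""
    return sorted(servers, key=lambda server: distance(client, server))


# Hypothetical lookup table standing in for a real distance metric.
EXAMPLE_DISTANCE = {
    ("client", "mirror-a"): 120.0,
    ("client", "mirror-b"): 35.0,
    ("client", "mirror-c"): 80.0,
}


def table_distance(client: str, server: str) -> float:
    """Look up the made-up distance between a client and a server."""
    return EXAMPLE_DISTANCE[(client, server)]


best_first = rank_servers(
    "client", ["mirror-a", "mirror-b", "mirror-c"], table_distance
)
print(best_first)  # ['mirror-b', 'mirror-c', 'mirror-a']
```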
study background
- data: CAIDA Macroscopic Topology Project
- IP forward path topology and RTT
- Collected at 9 monitors around the world.
- Continuously monitors many thousands of destinations
- methodology: success rate
- Providing a single value which represents the rate
at which a metric successfully predicts low RTT.
data: CAIDA Macroscopic Topology Project
- monitors
- trace: forward IP path and RTT between monitor and destination
- cycle: a single run through the destination list
- destination lists
- DNS clients
- one DNS client per routable prefix
- 8 to 14 cycles per day
- 58,000 destinations
- IPv4 list
- one IP address per /24 in routable prefix
- 1 cycle per day
- 300,000 destinations
methodology: success rate
- For each pair of traces we compare RTTs and the metrics.
- lower metric & lower RTT = success
- lower metric & higher RTT = failure
- equal metric = useless
(RTTs are never equal.)
- For each metric, we count the total number of successes, failures,
and useless trials.
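The pairwise comparison above can be sketched as follows. This is a minimal illustration, not the study's code: `tally_metric` and the sample values are hypothetical, and it assumes all traces in the list belong to one client choosing between destinations.

```python
from itertools import combinations
from typing import NamedTuple, Sequence, Tuple


class Tally(NamedTuple):
    success: int
    failure: int
    useless: int


def tally_metric(traces: Sequence[Tuple[float, float]]) -> Tally:
    """Count trial outcomes over every pair of (metric_value, rtt_ms) traces."""
    success = failure = useless = 0
    for (m1, r1), (m2, r2) in combinations(traces, 2):
        if m1 == m2:
            # The metric cannot rank the two destinations.
            useless += 1
        elif (m1 < m2) == (r1 < r2):
            # The destination with the lower metric also has the lower RTT.
            success += 1
        else:
            # Lower metric but higher RTT. RTTs are assumed never to be equal.
            failure += 1
    return Tally(success, failure, useless)


# Made-up traces: (metric value, RTT in ms).
example = [(3, 40.0), (5, 90.0), (5, 60.0), (7, 50.0)]
print(tally_metric(example))  # Tally(success=3, failure=2, useless=1)
```

Dividing each count by the total number of trials gives the percentages compared across metrics.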
distance metrics
metrics
- IP path length (number of IP addresses): the number of routers,
each represented by its IP address.
- AS path length (number of ASes): the number of ISPs,
each represented by its AS.
- geographic distance (km): the great circle distance from the client
to the server.
- median RTT (ms): the median RTT sampled from midnight GMT to midnight
GMT on the previous day.
metric description
- possible deployable system: a system which could be created to give
end users access to a given metric.
- our approximation: the method used to estimate a given metric in our study.
IP path length
- possible deployable system
- Shortest path found between client and server in an IP graph.
- This IP graph can be built by a remote system and shared
between multiple distance resolvers.
- our approximation
- The actual forward IP path seen in the Internet.
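For the deployable variant, the shortest path in an IP graph could be found with a breadth-first search. The sketch below assumes an undirected graph built from observed links. `build_graph`, `ip_path_length`, and the addresses (taken from documentation ranges) are hypothetical. It counts links, so the number of IP addresses on the path is one more than the returned value.

```python
from collections import deque
from typing import Dict, Iterable, Optional, Set, Tuple


def build_graph(links: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from (ip, ip) link pairs."""
    graph: Dict[str, Set[str]] = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def ip_path_length(graph: Dict[str, Set[str]], src: str, dst: str) -> Optional[int]:
    """Return the fewest links between src and dst, or None if unreachable."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, hops = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == dst:
                return hops + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None


# Made-up links between illustrative addresses.
graph = build_graph([
    ("192.0.2.1", "192.0.2.2"),
    ("192.0.2.2", "192.0.2.3"),
    ("192.0.2.3", "192.0.2.4"),
    ("192.0.2.1", "192.0.2.5"),
    ("192.0.2.5", "192.0.2.4"),
])
print(ip_path_length(graph, "192.0.2.1", "192.0.2.4"))  # 2
```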
AS path length
- possible deployable system
- AS paths can be collected from Border Gateway Protocol
(BGP) announcements.
- This information is already distributed by the Internet routers
and so would not introduce additional traffic to the network.
- our approximation
- We could not collect BGP data for all our monitors.
- So we converted our IP paths to AS paths using information collected by Oregon's Routeviews Project (www.routeviews.org).
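One way to convert an IP path into an AS path is to map each hop to the origin AS of its longest matching prefix and then collapse consecutive repeats. The sketch below assumes this approach; the slides do not give the exact procedure. The prefix table, AS numbers (from the documentation range), and the choice to skip unmapped hops are all hypothetical.

```python
import ipaddress
from typing import Dict, Iterable, List, Optional


def origin_as(ip: str, table: Dict[str, int]) -> Optional[int]:
    """Return the AS of the longest prefix in table that contains ip."""
    addr = ipaddress.ip_address(ip)
    best_len, best_as = -1, None
    for prefix, asn in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len, best_as = net.prefixlen, asn
    return best_as


def as_path(ip_path: Iterable[str], table: Dict[str, int]) -> List[int]:
    """Map each hop to its AS and collapse consecutive duplicates."""
    path: List[int] = []
    for ip in ip_path:
        asn = origin_as(ip, table)
        if asn is None:
            continue  # Assumption: hops with no matching prefix are skipped.
        if not path or path[-1] != asn:
            path.append(asn)
    return path


# Made-up prefix-to-AS table and forward IP path.
table = {
    "192.0.2.0/24": 64500,
    "198.51.100.0/24": 64501,
    "198.51.100.128/25": 64502,
}
hops = ["192.0.2.1", "192.0.2.9", "198.51.100.5", "198.51.100.200", "10.0.0.1"]
print(as_path(hops, table))       # [64500, 64501, 64502]
print(len(as_path(hops, table)))  # 3
```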
geographic distance
- possible deployable system
- A service that knows the geographic location of IP addresses can
be used to find the location of servers.
- The location of the client should already be known (or can be
retrieved from the same geographic service).
- Then the distance between these two points can be calculated.
- our approximation
- The location of the skitter monitors is already known.
- We used IxMapper, a commercial geographic service, to find
geographic locations.
- three clusters of high density on both RTT and geography
- West Coast, East Coast, Europe/Asia
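Once both locations are known, the great circle distance can be computed with the haversine formula. This is a minimal sketch: `great_circle_km` is a hypothetical helper, and the mean Earth radius of 6371 km is an assumed constant, not a value taken from the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # Assumed mean Earth radius.


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine great circle distance in km between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Two points on the equator 90 degrees apart: a quarter of the circumference.
quarter = great_circle_km(0.0, 0.0, 0.0, 90.0)
print(round(quarter, 1))
```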
median RTT
- median RTT from sample of the previous day
- possible deployable system
- Set of monitors which systematically sample possible client RTT.
- This monitoring increases traffic on the network.
- Due to high variability in RTT values, previous values cannot
precisely predict the next RTT value.
- our approximation
- We used our sampled data from the previous day and calculated
the median value from these samples.
percentage of successful trials
- RTT median provides over 90% success rate
- geographic distance provides 75% success within the US,
but only two of the five non-US monitors achieved this
- IP path length is a little better than random
- AS path length is only as good as random
- all metrics highly stable, with only minor local fluctuation
(IP path length, AS path length, geographic distance, median RTT)
RTT based metrics
- median RTT: the value in the middle of the sorted list of values
- single value RTT: a single RTT value previously observed
- average RTT: the sum of all values divided by the number of values
- the success rate of RTT metrics as number of cycles increases
- median has the greatest success up to a 24 hour period
- taking a single RTT value near the current time of day
is more effective than averaging all values in between
- storing RTT for longer than a 24 hour period does not improve the
success rate of the median RTT metric
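The three RTT based estimators defined above can be computed from a list of earlier samples as in the sketch below. `rtt_estimates` and the sample values are hypothetical, and treating the most recent sample as the "single value" is an assumption made for illustration.

```python
from statistics import mean, median
from typing import Dict, Sequence


def rtt_estimates(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Return median, single value, and average RTT from earlier samples."""
    return {
        "median": median(samples_ms),
        "single": samples_ms[-1],  # Assumption: use the most recent sample.
        "average": mean(samples_ms),
    }


# Made-up RTT samples in ms, including one large outlier.
samples = [80.0, 82.0, 81.0, 400.0, 79.0]
estimates = rtt_estimates(samples)
print(estimates["median"])   # 81.0
print(estimates["single"])   # 79.0
print(estimates["average"])  # about 144.4
```

In this made-up sample the single outlier moves the average far from the typical value while leaving the median almost unchanged.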
conclusions
- RTT based metrics provide a high success rate for server ranking.
- No more than 24 hours of data should be collected.
- The success rate is better than the next best metric even
when a single RTT value is used.
- But these are hard to collect.
- Geography provides a reasonable indicator of low RTT within the US.
- This can be done at no cost to the network.
- AS Path length has no predictive power when selecting low RTT.