Center for Applied Internet Data Analysis
Their share: diversity and disparity in IP traffic
A. Broido, Y. Hyun, R. Gao, and k. claffy, "Their share: diversity and disparity in IP traffic", in Passive and Active Network Measurement Workshop (PAM), Apr 2004, pp. 113--125.

Support for this work is provided by DARPA NMS (N66001-01-1-8909), DOE Contract No: DE-FC02-01ER 25466, and by NSF ANI-0221172, with support from the DHS/NCS.



Andre Broido (1)
Young Hyun (1)
Ruomei Gao (2)
kc claffy (1)

(1) CAIDA, San Diego Supercomputer Center, University of California San Diego
(2) Georgia Institute of Technology

The need to service populations of high diversity in the face of high disparity affects all aspects of network operation: planning, routing, engineering, security, and accounting. We analyze diversity and disparity from the perspective of selecting a boundary between mice and elephants in IP traffic aggregated by route, e.g., by destination AS. Our goal is to find a concise quantifier of size disparity for IP addresses, prefixes, policy atoms and ASes, similar to the oft-quoted 80/20 split (e.g., 80% of volume in 20% of sources). We define crossover as the fraction c of total volume contributed by a complementary fraction 1 - c of large objects. Studying sources and sinks at two Tier 1 backbones and one university, we find that splits of 90/10 and 95/5 are common for IP traffic. We compare the crossover diversity to common analytic models for size distributions such as Pareto/Zipf. We find that AS traffic volumes (by byte) are top-heavy and can only be approximated by Pareto with alpha = 0.5, and that empirical distributions are often close to Weibull with shape parameter 0.2-0.3. We also find that fewer than 20 ASes send or receive 50% of all traffic in both backbones' samples, a disparity that can simplify traffic engineering. Our results are useful for developers of traffic models, generators and simulators, and for router testers and operators of high-speed networks.
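
The crossover definition above can be illustrated with a minimal sketch. It assumes per-object traffic volumes (for example, bytes per destination AS) are available as a plain list of numbers. The function names `crossover` and `objects_for_share` are hypothetical and do not come from the paper. Taking the first rank k at which the top-k volume share reaches 1 - k/n is only one possible discrete reading of the definition, not necessarily the authors' procedure.

```python
from itertools import accumulate


def crossover(volumes):
    """Return (c, k) for the first rank k where the top k objects carry
    a volume share c that is at least 1 - k/n.

    Assumption: this discrete stopping rule is an illustrative choice.
    """
    sizes = sorted((v for v in volumes if v > 0), reverse=True)
    if not sizes:
        raise ValueError("need at least one positive volume")
    n = len(sizes)
    total = sum(sizes)
    for k, running in enumerate(accumulate(sizes), start=1):
        # Same test as running/total >= 1 - k/n, written without division
        # so that integer inputs are compared exactly.
        if running * n >= (n - k) * total:
            return running / total, k
    # Not reachable: at k == n the condition above always holds.
    raise AssertionError("unreachable")


def objects_for_share(volumes, share=0.5):
    """Return the smallest number of largest objects that together carry
    at least `share` of the total volume."""
    sizes = sorted((v for v in volumes if v > 0), reverse=True)
    if not sizes:
        raise ValueError("need at least one positive volume")
    total = sum(sizes)
    for k, running in enumerate(accumulate(sizes), start=1):
        if running >= share * total:
            return k
    return len(sizes)


if __name__ == "__main__":
    # Made-up toy data: one elephant and four mice.
    toy = [80, 5, 5, 5, 5]
    c, k = crossover(toy)
    print(f"crossover c = {c:.2f}, reached by the top {k} of {len(toy)} objects")
    print("objects needed for 50% of volume:", objects_for_share(toy, 0.5))
```

On the toy list, one object out of five (20%) carries 80% of the volume, so the split is 80/20. A 90/10 or 95/5 split as reported in the abstract would correspond to c = 0.90 or c = 0.95. The second helper mirrors the abstract's count of how many ASes account for 50% of traffic.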

Keywords: passive data analysis
  Last Modified: Wed Oct-11-2017 17:03:50 PDT