Analysis of routing & topology data
Archived MagicPoint presentation slides, compiled into a single PDF document.
2001_routingtopology2001.pdf (96 slides, 5.2 MB)
Slide text transcript
Slide 1: analysis of routing & topology data
analysis of routing & topology data
sept/oct 2001
no aphorism is more frequently repeated...
than that we must ask Nature few questions,
or ideally, one question at a time.
...this view is wholly mistaken.
Nature will best respond to a logically
and artfully thought out questionnaire;
indeed if we ask her a single question, she will often refuse
to answer until some other topic has been dicussed.
-- perspectives in medicine and biology 1973 sir ronald a fisher
broido@caida.org
kc@caida.org
ucsd/sdsc/caida
Slide 2: outline
outline motivation: macroscopic topology study infrastructure-relevant research questions led us to analyze BGP data on its own realized that routing & topology inextricably related describe both kinds of data sets introduce techniques for analysis of Internet `core' graph theoretic model combinatorial core vs giant component dual graph subprefix connected components introduce metrics for analysis indegree/outdegree characteristics transit versus origin role new contributions to the field of topology analysis new granularities: BGP atoms, dual AS graph, ramified atoms size distributions (weibull) with explanatory model (coalesence) reachability functions topological resilience
Slide 3: motivating questions (a sample)
motivating questions (a sample)
number of network prefixes in table
reduction beneficial to [routing system] infrastructure
trends in routability of IP space given by registries
which ASes are most highly:
connected, vulnerable, controlling connectivity
is there an Internet `core'
does the core grow due to multihoming?
are more-specifics what is driving up the table size?
hop diameter of AS, AS diameter of IP paths
optimizing active measurement architectures
can topology be inferred from BGP tables
for the purposes of modeling, simulation,
and infrastructure analysis?
Slide 4: motivating questions (more specific)
motivating questions (more specific) how many prefixes does it take to cover the IPv4 address space? how many routing entries would be required if arbitrary intervals (as opposed to CIDR blocks) in IP space would be allowed as ranges of network addresses? what is the complexity of the system of AS paths associated with individual prefix? how many different types of networks are globally distinguishable with respect to routing policies? how many routing policies are applied by the Internet to addresses originated by one AS?
Slide 5: global AS topology (from skitter data)
global AS topology (from skitter data)
Slide 6: topology data: skitter
topology data: skitter
about 26 sources around world
500k destinations (diff by box)
over 60% of prefixes covered (still building lists)
parallel ICMP echo request reply
lightweight, coarse temporal granularity
used for variety of macroscopic studies
most comprehensive in world (low bar)
data available to researchers
www.caida.org/tools/measurement/skitter/
Slide 7: topology data: dynamic properties
topology data: dynamic properties all graphs change with time new equipment (nodes, links, firewalls) added new IP blocks alocated renumbering out of old IP blocks "death rate" of IP addresses paths oscillate (load balancing) paths fluctuate (routing instability) paths flip (manual config) outages, cuts, blackouts okay so data is messy...
Slide 8: topology data: ambiguities
topology data: ambiguities non-responding hops (no IP returned) rate-limited response private addresses multicast addresses addresses in 0.-2./8 blocks no matching BGP prefix prefixes with multiple origin AS okay so data is icky, too...
Slide 9: private address use
private address use Internet is partitioned by > 2 layers of NAT private addresses = 1.7% of skitter replies (18K out of 1.07M, Jul.02-31,2001) assumption: private IPs are misconfigs we assume a generic misconfig rate of 1% inadvertent transit ASes: 1.25% multi-origin prefixes: 1.3% (assume 100x more private addrs than we observe) => private/unique IPs ~ 1.7 (order of magnitude)
Slide 10: correlation among data sets (perf., topology, routing)
correlation among data sets (perf., topology, routing) preliminary findings ~1% IP destinations disappearing monthly (re-addressing, firewalls) route announced path not matching forward path indication of potential routing configuration errors by no means automatic paths much less static than expected
Slide 11: macroscopic question: DNS root server placement
macroscopic question: DNS root server placement RSSAC, DNS technical advisory committee to ICANN goal: optimize root nameserver locations (13) co-locate skitter hosts w root servers demonstrate root server performance in serving target community develop techniques for evaluating architectual optimality for root server placement visualization to correlate data sources/types collaborative project to encourage proactive participation (network operators, researchers, others) (www.caida.org/tools/measurement/skitter/)
Slide 12: skitter on-going daily summaries
skitter on-going daily summaries http://www.caida.org/tools/measurement/skitter/summary path length (in IP hops) distribution RTT distribution RTT versus longitude, path dispersion AS & country granularity AS core graph
Slide 13: inter-domain routing (BGP) data
inter-domain routing (BGP) data not much instrumentation/diagnostics/data sources UO's www.route-views RIPE's www.ripe.net/ris Merit's IPMA /www.merit.edu/ipma really need real-time instrumentation (w/o impacting forwarding performance) ideal: route-lookups in real-time w/o kernel need data for realistic inter-domain routing models identification/vis of flaps, outages, critical pts correlation of perf. w/some measure of path `length' comparison of forward path with BGP path shortest path reverse path
Slide 14: other uses for inter-domain BGP data
other uses for inter-domain BGP data
mapping topology data
aggregating IP to network prefixes
aggregating prefixes to origin AS
inferring contractual relations
"bird's eye view" of the net - (AScore)
predicting AS path taken by a packet
important question: can you get essentially the same information from either dataset? (hint: no)
indeed, even though often covering fewer ASes than a full BGP table,
skitter data shows bidirectional and transit connectivity
for a significantly more ASes than BGP data of the
best available quality and sampling. (totally non-obvious...)
Slide 15: RouteViews BGP data format
RouteViews BGP data format 192.172.226.0/24 134.24.127.30 64 1740 195 1909 i CAIDA network peer (cerf) med cerf sdsc caida 192.172.226.0/24 204.147.128.141 - 145 195 1909 i CAIDA network peer (vbns) med vbns sdsc caida AS path: 1740 195 1909 Grows from tail to head - "prepending": CAIDA advertizes prefix to SDSC SDSC advertizes CAIDA's prefix to CERF ergo, CERF "knows" how to send traffic to CAIDA
Slide 16: new notion: policy-constrained dual AS graph
new notion: policy-constrained dual AS graph nodes: peering sessions links: pairs of sessions w/common AS and followed by traffic in skitter or BGP data derived from observed AS triples example: 1909 195 1740 - AS path (skitter,BGP) CAIDA SDSC CERF nodes: 1909-195 and 195-1740 ordered AS pairs directed link: from 1909-195 to 195-1740 allows for capturing more policy
Slide 17: dual AS graph features
dual AS graph features
AS hop count = #sessions in a path
easy to compare with AS hops
If necessary, add: 0-AS & AS-0 for traffic origination/termination
adds rigidity to the graph
constrain path by continuity and direction
like differential equation, but more choice at each step
path continuation non-unique
general case:
nodes: AS n-tuples, links: n+1-tuples
Slide 18: RouteViews data characteristics, 25 nov 2000
RouteViews data characteristics, 25 nov 2000 2.627M lines (prefix, peer, AS path) 34 contributing peers 28 peers with 81K-95K prefixes 2.534M lines in 28 tables Of those: 2,243,524 (88.5%) have no repeated AS 290,392 (11.5%) lines with repeated AS 176,576 (7%) end with repeated AS 2/5 repeated had it only in the middle 0.45% suppressed, dampened or history
Slide 19: taxonomy of RouteViews BGP tables (aug 2001)
taxonomy of RouteViews BGP tables (aug 2001) table `type' prefix count # RV tables full size backbone 103K 37 filtered backbone 89-96K 6 non-backbone <50K 8 filtered tables only partly useful
Slide 20: analysis of BGP tables
analysis of BGP tables data is clean, easy to get, parse appears a reasonable substitute for traceroute data fluctuations NLANR's archives corrupted Dec.2000-Feb.2001 random daily fluctuations (~3%) if a single /8 is not announced - 0.4% IP addresses misconfigs routing loops (2 loops in 91 pref-peer pairs) discrepancies e.g. multi-origin prefixes, 1.3% caveats: extremely sensitive to peer choice
Slide 21: AS path length
AS path length primary knob of BGP hammer (default metric) All AS versus unique AS count: prepending - repeating of AS makes AS path look longer reduces traffic via this path bumps at AS path lengths 12,14 e.g., 202.183.247.0/24 3561 5400 5400 5727 4651 7568 7568 7568 7568 7568 5 unique + 5 repeated ASes == AS path length 10 routes are compared/chosen locally other metrics: multiexit discriminator (MED) communities
Slide 22
Slide 23: AS path length measured in unique AS
AS path length measured in unique AS Much smaller tail: exp(-1.8x) vs. exp(-0.84x) Much smoother, close to Gamma distrib. (reachability function has same shape, we'll see later) invariance across levels of aggregation (surprising)
Slide 24: evolution of average AS Path length
evolution of average AS Path length for 2 years 2000-2001 common Route View peers some peer AS increase average length slightly some other slightly decreased overall close to being invariant nov 2000: 3.60 hops may 2001: 3.64 hops same as noise level (1%) no statistically significant change reachability function invariant as a whole, 1999-2001 (exception: RIPE)
Slide 25: Internet evolution - BGP table view
Internet evolution - BGP table view
Internet trends:
expansion (numeric growth)
sophistication (structural complexity)
refinement/fragmentation
churn (objects grow, split, recombine, die)
multiple causes of changes compose,
so net change does not reflect trends
Internet is a "two nines" system
1% noise a rule, not exception
=> trends w/in few % are likely noise/uncertainty
Slide 26: Building an AS topology graph from BGP data
Building an AS topology graph from BGP data directed from backbone downstream to router more hierarchical, captures less `lateral' peering Gao e.a. find 90% provider-customer links In reality, # peers ~ #customers (order of magnitude) Zocalo: medium-sized ISP in Bay Area 8 transit providers 100 downstream networks 12 have their own AS 225 peers => total degree ~ 250 in sum of 43 backbone tables: indeg.11, outdeg.63, total 74 => 3 times less
Slide 27: building a `core' AS graph from two data types
building a `core' AS graph from two data types deriving topological combinatorial core recursively strip outdeg.0 and 2-loops bidirectional connectivity remains should be 100% if sufficient coverage compare using (1) BGP and (2) forward IP topology data (1) BGP routing data AS graph (RouteViews tables) 2.77% in combinatorial core (327 out of 11823) accumulation does not change much adding links from less to more specifics: adds 4400 links to 25K increases combin.core from 300 to 900 AS (2) probed forward topology AS graph, 1 Aug 2001: 9858 nodes out of 11.5K 2556 (26%) in comb.core 2102 (21%) in g.c. which reaches 9835 caveat: traceroute links with third party addresses multipath noise (may be 1% cases, exact number unknown)
Slide 28: transit versus origin AS
transit versus origin AS AS 701 is in 31% of all paths (and is origin for about 2.5% of prefixes)
Slide 29
Slide 30: AS in- and outdegrees in Route Views data
AS in- and outdegrees in Route Views data
Slide 31: navigating complexity of BGP AS graph
navigating complexity of BGP AS graph measured by downstream peering richness entropy of outbound links distribution (for ~2K transit AS) sum p log p where p ~ #prefixes routed via an AS link takes into account how often each link is taken nov 2000: 4.23 bits may 2000: 4.24 bits lots of individual ASes changed though opposite changes w.net effect 0 another invariant?
Slide 32: taxonomy of prefixes
taxonomy of prefixes standalone (no more specifics, no less specifics) root (have more specifics, are not more specific) more specifics - subsets subsets of other prefixes express fine-grained policies litter routing space? getting free ride? taxonomy by prevalence in tables: globally visible: all BGP tables highly visible: in most BGP tables semi-global: in 1/2 of all BGP tables local (a few tables) highly visible ~= semi-global "bathtub curve" phenonenon (fuzzy dichotomy)
Slide 33: BGP table prefix growth
BGP table prefix growth route-views data does not confirm conjecture that more specifcs are currently driving up the BGP table sizes indeed, for 1.3 years since aug 2001, standalone prefixes grow faster than more specifics (standalone 38%, more spec 32%) currently at parity more specifics do churn (appear/disappear) faster also convert to standalones but overall change in more specifics not statistically significant increase in # of top prefixes is caused by: coverage of new address space 55% deaggregation/contraction of existing prefixes 37.5% expansion of existing prefixes 5% aggregation 2.5% but growth was in top prefixes, not more specifics
Slide 34: BGP origination policies
BGP origination policies AS indeg >=3, pf indeg >=3 26.2% AS indeg 1, pf indeg 1 14.8% AS indeg 2, pf indeg 2 11.6% AS indeg 2, pf indeg 1 6.0% AS indeg 1,od.2, pf indeg 1 4.6%
Slide 35: new unit of connectivity analysis: BGP atoms
new unit of connectivity analysis: BGP atoms
BGP atoms are:
equivalence class of prefixes that
share same set of AS paths as seen
e.g. by peers in Oregon Route Views
BGP tables
unit of granularity in between prefixes and ASes
grow proportionally
new framework to analyze routing system
reduce granularity
strong potential
Slide 36: atom sizes
atom sizes about 50% of atoms contain just one prefix largest atom contains over 2200 prefixes size distribution is close to Weibull curve or power fn address space size distribution more Kantor-like ramified atom several paths to same prefix branch or loop at an AS maps to graph with positive outdegree/cycl.# reducing the number of atoms focal point AS present in all paths of atom, at same relative position crown point AS with largest number of following ASes among focal points crown atom atom whose system of AS paths has been truncated at crown point
Slide 37: Crown and ramified atoms in RouteViews
Crown and ramified atoms in RouteViews
(01 september 2001 tables)
crown atoms
take advantage of leaf and AS/leaf prefixes
chop away non-branching "stalk" from the atom
reduce to 72% atoms (Sept.01: from 24270 to 17566)
ramified atoms
different AS paths from the same-AS peers
0 4578
1 8636
2 2028
3 70
different path lengths in ramified atom
1 22142
2 5887
3 14
1 1
Slide 38: atom size in prefixes
atom size in prefixes
Slide 39: address space size of atoms
address space size of atoms
Slide 40: atoms #AS percent
atoms #AS percent 1 4943 59.84 2 1647 19.94 3 750 9.08 4 347 4.20 5 191 2.31 6 111 1.34 7 67 0.81 8 49 0.59 9 34 0.41 10 18 0.22 11 23 0.28 12 9 0.11 13 10 0.12 14 11 0.13 15 13 0.16 16 9 0.11 17 3 0.04 18 4 0.05 19 5 0.06 21 2 0.02 22 1 0.01 23 2 0.02 ... Total 8261
Slide 41: new unit of connectivity analysis: cones
new unit of connectivity analysis: cones set of nodes that start at a given node go through only non-core nodes reachable from this node. nodes that wholly or in part depend on a given node (tip of cone) for connectivity
Slide 42: new connectivity unit: connected subcomponents
new connectivity unit: connected subcomponents
intra-prefix
motivation
aggregating IPs into prefixes can create artificial connectivity
152.130 (customer gateways for large backbone, not intraprefix-connected)
prefix outdegree thus misleading/specious
...so: deaggregate prefixes to find (really)
connected subcomponents
Slide 43
Slide 44: BGP observations
BGP observations estimates for number of distinct policies somewhat independent of peer selection internal details of global routing varies between 2% and 70% of atoms ramified, depending on peer choice suggests vastly different transit policies among/within networks BGP tangle atom with directed loop (apparent routing loop) should not happen... use of atoms can reduce table size in half while still preserving AS path and specificity relationships theoretical result...
Slide 45: prefix indegree counts
prefix indegree counts
Slide 46: trees of more specifics
trees of more specifics
Slide 47: tangent vectors (prefix indegree multisets)
tangent vectors (prefix indegree multisets)
Slide 48: global invariant: Broido's `200 formula'
global invariant: Broido's `200 formula'
Slide 49: end part 1
end part 1
Slide 50: global Internet topology analysis on IP level
global Internet topology analysis on IP level caida probed topology data of Jul.15-18 2001 826265 nodes 60329 (7%) in combinatorial core stripped in 24 iterations 52330 nodes in giant connected component bidirectionally connected components of the core 200 times larger than any other connected component almost 90% of the core largest of small components are /24s, /25s
Slide 51: global Internet topology analysis on IP level
global Internet topology analysis on IP level
Most Internet object size distributions
are closer to Weibull
than to power functions.
P(X>x) = a exp(-(x/b)^c)
decreases faster than power function, slower than exponential
exception: AS outdegree in RouteViews BGP AS graph is
on some days closer to power function
another possible family more commonly used by physicists
a x^c exp(-x/b)
Slide 52: full IP graph degree distribution, Jul 2001
full IP graph degree distribution, Jul 2001
Slide 53: full IP graph degree distribution, jul 01
full IP graph degree distribution, jul 01 Approximation properties: outdegree interval 1-300 Relative error: max(y(x)/a, a/y(x) | 1<=x<=300) - 1 Weibull with three parameters: 5.4% Weibull with 1 parameter: 19.2% Shifted power function: 20.3%
Slide 54: tree size distribution
tree size distribution 55% IP nodes are in stub trees (Trees node counts do not include roots)
Slide 55: cone size distribution
cone size distribution
Slide 56: Size of [least specific and all] prefixes, and of ASes
Size of [least specific and all] prefixes, and of ASes
Slide 57: why do Weibull/power functions arise?
why do Weibull/power functions arise? simple model: full binary tree 2^k nodes k hops from the root subtree for each node has size 2^(n-k) #subtrees = 2^n / subtree size
Slide 58: Why do all size distributions look Weibull?
Why do all size distributions look Weibull?
Where do Weibull or shifted/cut-off power functions come from?
Suggestion: Use Coalescence Theory
used to model size distributions of populations of
physical objects that merge in nature
applied to: formation of cloud droplets, bubbles, polymers,
stars, Universe...
Idea: Objects merge together, size = sum of both sizes
Rate of aggregation depends upon both objects' sizes
Special case: multiplicative coalescence, where
probability of merging ~ product of sizes
Slide 59: Why do all size distributions look Weibull?
Why do all size distributions look Weibull?
Math model: Smoluchowski coagulation equation
M. v.Smoluchowski, Drei Vortraege ueber Diffusion,
Brownische Bewegung und Koagulation von Kolloidteilchen.
Physik. Z., 17:557-585, 1916
D.J.Aldous. Deterministic and stochastic models for coalescensce
(aggregation, coagulation): a review of mean-field
theory for probabilists. July 10, 1997
www.stat.berkeley.edu/users/aldous
(Thanks to Reka Albert for the reference)
Slide 60: Coalescence probability is size-dependent
Coalescence probability is size-dependent (rationally moderated chaos) merging 100,000 objects of size 1 randomly select two objects to merge fractional multiplicative coalescence P(x1,x2) = P(x1) P(x2|x1) P(x1) ~ x1^a P(x2|x1) ~ x2^a, with x1 is excluded (=> asymmetric) a has the meaning of inverse temperature higher a => lower temperature => larger clusters => freezing (gelation) sooner a=0 (size-independent): infinite temperature very small clusters take snapshots every 10K size distributions (after 10K, 20K,...99.99K mergers)
Slide 61: Coalescence simulation
Coalescence simulation next few graphs will show for coalescence simulation object size distributions maximum object size distribution for size-independent or multiplicative or submultiplicative (sqrt(multipl)) coalescence resulting size distribution: exponential as predicted by analytic solution to Smoluchowski equation
Slide 62: Coalescence object size (additive)
Coalescence object size (additive)
Slide 63: Coalesence: evolution of maximum obj size
Coalesence: evolution of maximum obj size
Slide 64: If coalescence probability ~ product of sizes
If coalescence probability ~ product of sizes size distribution is close to power function giant component emerges after 50% are aggregated
Slide 65: Coalescence: evolution of maximum obj size
Coalescence: evolution of maximum obj size probability ~ product of sizes
Slide 66: Coalescence prob. ~ sqrt(product of sizes)
Coalescence prob. ~ sqrt(product of sizes) size distribution is close to power function giant component emerges after 90% are aggregated size distributions close to Weibull (rel. err. < 20% for generic sizes)
Slide 67: Coalesence: evolution of maximum obj size
Coalesence: evolution of maximum obj size
Slide 68: topology: combinatorial core vs gcc
topology: combinatorial core vs gcc filtering of the graph combinatorial core remove nodes of outdegree 0 (no outgoing links) destinations have outdegree 0 strip nodes of outdegree 0 stub network's routers now have outdegree 0 repeat (recursive) stripping of outdegree 0 giant connected component bidirectionally connected components of the graph
Slide 69: topology data: connected components
topology data: connected components
Slide 70: hop metric of `centrality'
hop metric of `centrality'
shortest path distance on IP graph
find maximum distance from any given node to all core
center of the core: those with min maximum distance
e.g., take top 50 by this metric
top 3 prefixes
Prefix Outdegree
157.130.0.0/16 1172
144.232.0.0/16 757
4.0.0.0/8 655
Slide 71: reachability functions
reachability functions previous work: counts of nodes at shortest path distance x (H.Tangmunarunkit, R.Govindan) here: reachability functions are for individual nodes.
Slide 72: reachability functions
reachability functions
Slide 73: neighbourhood size vs. hop distance
neighbourhood size vs. hop distance `neighbourhood': nodes at radius R hypothesis: there exists a mother function that fits all these curves such a function would provide a parameteric method for finding topological distance from internet core
Slide 74: Neighbourhood size
Neighbourhood size
Slide 75: Neighbourhood size: (step 1000 thru giant comp.)
Neighbourhood size: (step 1000 thru giant comp.) Skitter data of Jul.02-29, 2001
Slide 76: Neighbourhood size: (step 1000 thru giant comp.)
Neighbourhood size: (step 1000 thru giant comp.) Skitter data of Jul.02-29, 2001
Slide 77: So... neighbourhood size suggests:
So... neighbourhood size suggests: mother function is invariant of graph shifted and stretched for each node shift, stretch = average, std dev of hop distance from a node likely implications: there IS a core of the graph neighbourhood is small before we reach core starts growing as soon as we reach core "Gaussian" because minimums are a limiting case of addition
Slide 78: shortest path distances in GCC
shortest path distances in GCC
Slide 79: topology: forward & reverse expansion sets
topology: forward & reverse expansion sets
Slide 80: topological resilience to outbound link removal
topological resilience to outbound link removal
algorithm:
forwarding deactivated (outbound links removed)
on all nodes with outdegree over x
rationale:
will still have these nodes reachable
(do not remove inbound links and/or nodes)
removing all nodes with outdegree > x is unambiguous
Slide 81: topological resilience to outbound link removal
topological resilience to outbound link removal IP giant component (skitter, Jul 2001)
Slide 82: topological resilience to outbound link removal
topological resilience to outbound link removal IP giant component (skitter, Jul.02-29 2001) giant component breaks only when outdegree 1-6 remain maximum outdegree was 329 forwarding deactivated on 21% nodes 70% links removed astonishingly robust graph
Slide 83: topological resilience to outbound link removal
topological resilience to outbound link removal IP giant component (skitter, Jul 2001)
Slide 84: topological resilience to outbound link removal
topological resilience to outbound link removal combinatorial IP core (skitter, 1 Aug 2001, 10 days)
Slide 85: topological resilience to outbound link removal
topological resilience to outbound link removal combinatorial IP core (skitter, 1 Aug 2001)
Slide 86: topological resilience to outbound link removal
topological resilience to outbound link removal Giant component of skitter AS core (skitter, Jul 2001)
Slide 87: topological resilience to outbound link removal
topological resilience to outbound link removal Giant component of skitter AS core (skitter, Jul 2001) Contains 30% of AS nodes (BGP AS graph's core: 3.3%, Route Views Aug.01) Giant component breaks only when outdeg.1-10 remain maximum outdegree was 949 forwarding deactivated on 10% nodes 68% links removed
Slide 88: topological resilience to outbound link removal
topological resilience to outbound link removal symmetrized skitter AS graph (Jul 2001)
Slide 89: topological resilience to outbound link removal
topological resilience to outbound link removal symmetrized skitter AS graph (Jul 2001) Giant component breaks only when outdeg.1-10 remain maximum outdegree was 949 forwarding deactivated on 6.4% nodes 66% links removed
Slide 90: (end part 2 (end talk...))
(end part 2 (end talk...))
Slide 91: note on forward IP topology vs BGP RouteViews data
note on forward IP topology vs BGP RouteViews data
for july 2001
BGP routeviews: 34 peers, 9.4k ASes, ~100K prefixes
skitter: 17 monitors, 300k dsts, 50K prefixes
to equalize, have to reduce skitter to AS graph:
0) pick one trace per prefix per day
ignore nodes next to non-responding hops
1) strip IP leaves
2) convert IP -> prefixes
3) strip prefix leaves
4) convert prefix -> AS
5) strip AS leaves
Slide 92: forward IP topology vs BGP RouteViews data
forward IP topology vs BGP RouteViews data
basically a ton of decimation on a day of skitter data
skitter AS graph: 6949 AS nodes, 16145 AS links (peering sessions)
remove any potential advantage of skitter probing
frequency
finer (IP) granularity
nonetheless, skitter captures much larger share of the Internet's bidirectional connectivity.
skitter combinatorial core (full-transit portion) contains 988 AS nodes,
as opposed to BGP's 299 nodes (3.3X more)
Slide 93: ranked list of skitter vs RV outdegrees
ranked list of skitter vs RV outdegrees level 2 transit AS (those with at least one outbound connection to another transit AS, as oppposed to stub AS (customers)). Of those: 732 in Oregon Nov.29 BGP data and 1189 in skitter data.
Slide 94: upshot
upshot even the best available inter-domain routing (BGP) data serves very weak substitute for IP probed topology data and yet this BGP data is an essential tool for sensible comprehensive macroscopic Internet topology analysis
Slide 95: ongoing/future
ongoing/future continue building infrastructure sources destination lists dynamically discover topology, b/w correlation among sources with perf, topology identify critical pieces of infrastructure continue w metrics and analysis `center', `core', `close' (optimize dist. server architectures) filter new root locations? determination of connectivity metrics persistence of paths metrics for `change' better GLS support model of the Internet (no, really) graphing/layout techniques....
Slide 96: www.caida.org/Presentations/
www.caida.org/Presentations/ andre broido kc claffy UCSD/SDSC/CAIDA broido@caida.org kc@caida.org www.caida.org

