Analysis of routing & topology data

Archived MagicPoint presentation slides, compiled into a single PDF document.

2001_routingtopology2001.pdf (96 slides, 5.2 MB)

Slide text transcript

Slide 1: analysis of routing & topology data

analysis of routing & topology data 

sept/oct 2001

no aphorism is more frequently repeated...  
than that we must ask Nature few questions, 
or ideally, one question at a time.  
...this view is wholly mistaken.
Nature will best respond to a logically 
and artfully thought out questionnaire; 
indeed if we ask her a single question, she will often refuse 
to answer until some other topic has been dicussed.
             -- perspectives in medicine and biology 1973 sir ronald a fisher


broido@caida.org
kc@caida.org
ucsd/sdsc/caida

Slide 2: outline

outline

motivation: macroscopic topology study
infrastructure-relevant research questions
led us to analyze BGP data on its own
realized that routing & topology inextricably related
describe both kinds of data sets
introduce techniques for analysis of Internet `core'
graph theoretic model
combinatorial core vs giant component
dual graph
subprefix connected components
introduce metrics for analysis
indegree/outdegree characteristics
transit versus origin role
new contributions to the field of topology analysis
new granularities: BGP atoms, dual AS graph, ramified atoms
size distributions (weibull) with explanatory model (coalesence) 
reachability functions
topological resilience

Slide 3: motivating questions (a sample)

motivating questions (a sample)

number of network prefixes in table
reduction beneficial to [routing system] infrastructure
trends in routability of IP space given by registries
which ASes are most highly:
connected, vulnerable, controlling connectivity
is there an Internet `core'
does the core grow due to multihoming?
are more-specifics what is driving up the table size?
hop diameter of AS, AS diameter of IP paths
optimizing active measurement architectures

     can topology be inferred from BGP tables 
     for the purposes of modeling, simulation, 
     and infrastructure analysis?

Slide 4: motivating questions (more specific)

motivating questions (more specific)


how many prefixes does it take to cover the IPv4 address space?

how many routing entries would be required if arbitrary intervals (as opposed to CIDR blocks) in IP space would be allowed as ranges of network addresses?

what is the complexity of the system of AS paths associated with individual prefix?

how many different types of networks are globally distinguishable with respect to routing policies?

how many routing policies are applied by the Internet to addresses originated by one AS?

Slide 5: global AS topology (from skitter data)

global AS topology (from skitter data)

Slide 6: topology data: skitter

topology data: skitter


about 26 sources around world
500k destinations (diff by box) 
over 60% of prefixes covered (still building lists)
parallel ICMP echo request reply
lightweight, coarse temporal granularity
used for variety of macroscopic studies
most comprehensive in world (low bar)
data available to researchers
    www.caida.org/tools/measurement/skitter/

Slide 7: topology data: dynamic properties

topology data: dynamic properties


all graphs change with time
new equipment (nodes, links, firewalls) added
new IP blocks alocated
renumbering out of old IP blocks
"death rate" of IP addresses
paths oscillate (load balancing)
paths fluctuate (routing instability)
paths flip (manual config)
outages, cuts, blackouts

okay so data is messy...

Slide 8: topology data: ambiguities

topology data: ambiguities


non-responding hops (no IP returned)
rate-limited response
private addresses
multicast addresses
addresses in 0.-2./8 blocks
no matching BGP prefix
prefixes with multiple origin AS

okay so data is icky, too...

Slide 9: private address use

private address use 


Internet is partitioned by > 2 layers of NAT
private addresses = 1.7% of skitter replies
(18K out of 1.07M, Jul.02-31,2001)
assumption: private IPs are misconfigs
we assume a generic misconfig rate of 1%
inadvertent transit ASes: 1.25%
multi-origin prefixes: 1.3%
(assume 100x more private addrs than we observe)
=> private/unique IPs ~ 1.7 (order of magnitude)

Slide 10: correlation among data sets (perf., topology, routing)

correlation among data sets (perf., topology, routing)


preliminary findings
~1% IP destinations disappearing monthly 
(re-addressing, firewalls)
route announced path not matching forward path
indication of potential routing configuration errors
by no means automatic
paths much less static than expected

Slide 11: macroscopic question: DNS root server placement

macroscopic question: DNS root server placement

RSSAC, DNS technical advisory committee to ICANN

goal: optimize root nameserver locations (13)
co-locate skitter hosts w root servers
demonstrate root server performance in serving target community
develop techniques for evaluating architectual optimality for root server placement
visualization to correlate data sources/types

collaborative project to encourage proactive participation (network operators, researchers, others)

(www.caida.org/tools/measurement/skitter/)

Slide 12: skitter on-going daily summaries

skitter on-going daily summaries


http://www.caida.org/tools/measurement/skitter/summary

path length (in IP hops) distribution
RTT distribution
RTT versus longitude,
path dispersion
AS & country granularity
AS core graph

Slide 13: inter-domain routing (BGP) data

inter-domain routing (BGP) data  


not much instrumentation/diagnostics/data sources
UO's www.route-views
RIPE's www.ripe.net/ris
Merit's IPMA /www.merit.edu/ipma
really need real-time instrumentation
(w/o impacting forwarding performance)
ideal: route-lookups in real-time w/o kernel
need data for realistic inter-domain routing models
identification/vis of flaps, outages, critical pts
correlation of perf. w/some measure of path `length'
comparison of forward path with
BGP path
shortest path
reverse path

Slide 14: other uses for inter-domain BGP data

other uses for inter-domain BGP data 
        

mapping topology data
aggregating IP to network prefixes
aggregating prefixes to origin AS
inferring contractual relations 
"bird's eye view" of the net - (AScore) 
predicting AS path taken by a packet

important question: can you get essentially the same information from either dataset?   (hint:  no)

indeed, even though often covering fewer ASes than a full BGP table,
skitter data shows bidirectional and transit connectivity
for a significantly more ASes than BGP data of the
best available quality and sampling. (totally non-obvious...)

Slide 15: RouteViews BGP data format

RouteViews BGP data format


192.172.226.0/24   134.24.127.30   64  1740 195  1909 i
CAIDA network      peer (cerf)   med  cerf sdsc caida

192.172.226.0/24   204.147.128.141  -   145 195  1909 i
CAIDA network      peer (vbns)   med vbns sdsc caida

AS path: 1740 195 1909
Grows from tail to head - "prepending":

CAIDA advertizes prefix to SDSC
SDSC advertizes CAIDA's prefix to CERF
ergo,
CERF "knows" how to send traffic to CAIDA

Slide 16: new notion: policy-constrained dual AS graph

new notion: policy-constrained dual AS graph


nodes: peering sessions
links: pairs of sessions w/common AS and followed by traffic in skitter or BGP data
derived from observed AS triples
example:
1909   195 1740 - AS path (skitter,BGP)
CAIDA SDSC CERF
nodes: 1909-195 and 195-1740
ordered AS pairs
directed link: from 1909-195 to 195-1740
allows for capturing more policy

Slide 17: dual AS graph features

dual AS graph features


AS hop count = #sessions in a path
easy to compare with AS hops
If necessary, add: 0-AS & AS-0 for traffic origination/termination
adds rigidity to the graph
constrain path by continuity and direction
like differential equation, but more choice at each step
path continuation non-unique

general case:
        nodes: AS n-tuples, links: n+1-tuples

Slide 18: RouteViews data characteristics, 25 nov 2000

RouteViews data characteristics, 25 nov 2000


2.627M lines (prefix, peer, AS path)
34 contributing peers
28 peers with 81K-95K prefixes 
2.534M lines in 28 tables
Of those:
2,243,524 (88.5%) have no repeated AS
290,392 (11.5%) lines with repeated AS
176,576 (7%) end with repeated AS 
2/5 repeated had it only in the middle
0.45% suppressed, dampened or history

Slide 19: taxonomy of RouteViews BGP tables (aug 2001)

taxonomy of RouteViews BGP tables (aug 2001)


table `type'	      prefix count	# RV tables 
full size backbone       103K		37 
filtered backbone     89-96K 		6 
non-backbone             <50K  		8

filtered tables only partly useful

Slide 20: analysis of BGP tables

analysis of BGP tables


data is clean, easy to get, parse
appears a reasonable substitute for traceroute data
fluctuations
NLANR's archives corrupted Dec.2000-Feb.2001
random daily fluctuations (~3%)
if a single /8 is not announced - 0.4% IP addresses 
misconfigs
routing loops (2 loops in 91 pref-peer pairs)
discrepancies
e.g. multi-origin prefixes, 1.3%
caveats: extremely sensitive to peer choice

Slide 21: AS path length

AS path length


primary knob of BGP hammer  (default metric)
All AS versus unique AS count:
prepending - repeating of AS 
makes AS path look longer
reduces traffic via this path
bumps at AS path lengths 12,14 
e.g., 202.183.247.0/24  3561 5400 5400 5727 4651 7568 7568 7568 7568 7568
5 unique + 5 repeated ASes == AS path length 10

routes are compared/chosen locally 
other metrics: 
multiexit discriminator (MED)
communities

Slide 22

Slide 23: AS path length measured in unique AS

AS path length measured in unique AS

Much smaller tail: exp(-1.8x) vs. exp(-0.84x)
Much smoother, close to Gamma distrib. 
(reachability function has same shape, we'll see later)
invariance across levels of aggregation (surprising)

Slide 24: evolution of average AS Path length

evolution of average AS Path length 


for 2 years 2000-2001
common Route View peers
some peer AS increase average length slightly
some other slightly decreased
overall close to being invariant
nov 2000: 3.60 hops
may 2001: 3.64 hops
same as noise level (1%)
no statistically significant change
reachability function invariant as a whole, 1999-2001
(exception: RIPE)

Slide 25: Internet evolution - BGP table view

Internet evolution -  BGP table view

Internet trends:
expansion (numeric growth)
sophistication (structural complexity)
refinement/fragmentation
churn (objects grow, split, recombine, die)

multiple causes of changes compose, 
      so net change does not reflect trends  

Internet is a "two nines" system
1% noise a rule, not exception
=> trends w/in few % are likely noise/uncertainty

Slide 26: Building an AS topology graph from BGP data

Building an AS topology graph from BGP data


directed from backbone downstream to router
more hierarchical, captures less `lateral' peering 
Gao e.a. find 90%  provider-customer links
In reality, # peers ~ #customers (order of magnitude)

Zocalo: medium-sized ISP in Bay Area
8 transit providers
100 downstream networks
12 have their own AS
225 peers
=> total degree ~ 250
in sum of 43 backbone tables: indeg.11, outdeg.63, total 74
=> 3 times less

Slide 27: building a `core' AS graph from two data types

building a `core' AS graph from two data types

deriving topological combinatorial core
recursively strip outdeg.0 and 2-loops
bidirectional connectivity remains
should be 100% if sufficient coverage 
compare using (1) BGP and (2) forward IP topology data
(1) BGP routing data AS graph (RouteViews tables)
2.77% in combinatorial core (327 out of 11823)
accumulation does not change much
adding links from less to more specifics:
adds 4400 links to 25K
increases combin.core from 300 to 900 AS
(2) probed forward topology AS graph, 1 Aug 2001: 
9858 nodes out of 11.5K
2556 (26%) in comb.core
2102 (21%) in g.c. which reaches 9835
caveat: traceroute links with third party addresses
multipath noise
(may be 1% cases, exact number unknown)

Slide 28: transit versus origin AS

transit versus origin AS 

AS 701 is in 31% of all paths (and is origin for about 2.5% of prefixes)

Slide 29

Slide 30: AS in- and outdegrees in Route Views data

AS in- and outdegrees in Route Views data

Slide 31: navigating complexity of BGP AS graph

navigating complexity of BGP AS graph

measured by downstream peering richness
entropy of outbound links distribution (for ~2K transit AS)
sum p log p where p ~ #prefixes routed via an AS link
takes into account how often each link is taken
nov 2000: 4.23 bits 
may 2000: 4.24 bits
lots of individual ASes changed though
opposite changes w.net effect 0
another invariant?

Slide 32: taxonomy of prefixes

taxonomy of prefixes

standalone (no more specifics, no less specifics)
root (have more specifics, are not more specific)
more specifics - subsets
subsets of other prefixes
express fine-grained policies
litter routing space?
getting free ride?

taxonomy by prevalence in tables:
globally visible: all BGP tables
highly visible: in most BGP tables
semi-global: in 1/2 of all BGP tables
local (a few tables)

highly visible ~= semi-global 
"bathtub curve" phenonenon (fuzzy dichotomy)

Slide 33: BGP table prefix growth

BGP table prefix growth

route-views data does not confirm conjecture that more specifcs are currently driving up the BGP table sizes
indeed, for 1.3 years since aug 2001, standalone prefixes grow faster than more specifics (standalone 38%, more spec 32%)
currently at parity
more specifics do churn (appear/disappear) faster
also convert to standalones
but overall change in more specifics not statistically significant

increase in # of top prefixes is caused by:
coverage of new address space 		55%
deaggregation/contraction of existing prefixes 37.5%
expansion of existing prefixes	                5%
aggregation				2.5% 

but growth was in top prefixes, not more specifics

Slide 34: BGP origination policies

BGP origination policies 


AS indeg >=3, pf indeg >=3	26.2%
AS indeg 1,   pf indeg   1	14.8% 
AS indeg 2,   pf indeg   2	11.6% 
AS indeg 2,   pf indeg   1	6.0% 
AS indeg 1,od.2, pf indeg   1	4.6%

Slide 35: new unit of connectivity analysis: BGP atoms

new unit of connectivity analysis: BGP atoms


BGP atoms are:

equivalence class of prefixes that 
      share same set of AS paths as seen 
      e.g. by peers in Oregon Route Views 
      BGP tables

unit of granularity in between prefixes and ASes
grow proportionally 

new framework to analyze routing system
reduce granularity
strong potential

Slide 36: atom sizes

atom sizes 

about 50% of atoms contain just one prefix
largest atom contains over 2200 prefixes
size distribution is close to Weibull curve or power fn
address space size distribution more Kantor-like
ramified atom
several paths to same prefix branch or loop at an AS
maps to graph with positive outdegree/cycl.# 

reducing the number of atoms
focal point
AS present in all paths of atom, at same relative position
crown point
AS with largest number of following ASes among focal points
crown atom
atom whose system of AS paths has been truncated at crown point

Slide 37: Crown and ramified atoms in RouteViews

Crown and ramified atoms in RouteViews 


(01 september 2001 tables) 

crown atoms
take advantage of leaf and AS/leaf prefixes
chop away non-branching "stalk" from the atom
reduce to 72% atoms (Sept.01: from 24270 to 17566)

ramified atoms
different AS paths from the same-AS peers
                       0  4578
                       1  8636
                       2  2028
                       3     70
different path lengths in ramified atom
                      1  22142
                      2    5887
                      3       14
                      1         1

Slide 38: atom size in prefixes

atom size in prefixes

Slide 39: address space size of atoms

address space size of atoms

Slide 40: atoms #AS percent

atoms   #AS  percent

1	4943	59.84
2	1647	19.94
3	750	9.08
4	347	4.20
5	191	2.31
6	111	1.34
7	67	0.81
8	49	0.59
9	34	0.41
10	18	0.22
11	23	0.28
12	9	0.11
13	10	0.12
14	11	0.13
15	13	0.16
16	9	0.11
17	3	0.04
18	4	0.05
19	5	0.06
21	2	0.02
22	1	0.01
23	2	0.02
...
Total   8261

Slide 41: new unit of connectivity analysis: cones

new unit of connectivity analysis: cones


set of nodes that 
start at a given node 
go through only non-core nodes reachable from this node. 
nodes that wholly or in part depend on a given node (tip of cone) for connectivity

Slide 42: new connectivity unit: connected subcomponents

new connectivity unit: connected subcomponents


intra-prefix 

motivation
aggregating IPs into prefixes can create artificial connectivity 
152.130 (customer gateways for large backbone, not intraprefix-connected)
prefix outdegree thus misleading/specious

    ...so: deaggregate prefixes to find (really) 
           connected subcomponents

Slide 43

Slide 44: BGP observations

BGP observations

estimates for number of distinct policies
somewhat independent of peer selection
internal details of global routing varies
between 2% and 70% of atoms ramified, depending on peer choice
suggests vastly different transit policies among/within networks
BGP tangle
atom with directed loop (apparent routing loop)
should not happen...

use of atoms can reduce table size in half while still 
preserving AS path and specificity relationships
theoretical result...

Slide 45: prefix indegree counts

prefix indegree counts

Slide 46: trees of more specifics

trees of more specifics

Slide 47: tangent vectors (prefix indegree multisets)

tangent vectors (prefix indegree multisets)

Slide 48: global invariant: Broido's `200 formula'

global invariant: Broido's `200 formula'

Slide 49: end part 1

end part 1

Slide 50: global Internet topology analysis on IP level

global Internet topology analysis  on IP level


caida probed topology data of Jul.15-18 2001

826265 nodes

60329 (7%) in combinatorial core
stripped in 24 iterations

52330 nodes in giant connected component
bidirectionally connected components of the core
200 times larger than any other connected component
almost 90% of the core
largest of small components are /24s, /25s

Slide 51: global Internet topology analysis on IP level

global Internet topology analysis  on IP level


Most Internet object size distributions
are closer to Weibull 
than to power functions.


P(X>x) = a exp(-(x/b)^c)
decreases faster than power function, slower than exponential
exception: AS outdegree in RouteViews BGP AS graph is 
         on some days closer to power function
another possible family more commonly used by physicists 
         a x^c exp(-x/b)

Slide 52: full IP graph degree distribution, Jul 2001

full IP graph degree  distribution, Jul 2001

Slide 53: full IP graph degree distribution, jul 01

full IP graph degree  distribution, jul 01


Approximation properties:
outdegree interval 1-300 

Relative error: max(y(x)/a, a/y(x) | 1<=x<=300) - 1

Weibull with three parameters: 5.4%
Weibull with 1 parameter:        19.2%
Shifted power function:            20.3%

Slide 54: tree size distribution

tree size distribution 

55% IP nodes are in stub trees

(Trees node counts do not include roots)

Slide 55: cone size distribution

cone size distribution

Slide 56: Size of [least specific and all] prefixes, and of ASes

Size of [least specific and all] prefixes, and of ASes

Slide 57: why do Weibull/power functions arise?

why do Weibull/power functions arise?


simple model:  full binary tree
2^k nodes k hops from the root
subtree for each node has size 2^(n-k)
#subtrees = 2^n / subtree size

Slide 58: Why do all size distributions look Weibull?

Why do all size distributions look Weibull?


Where do Weibull or shifted/cut-off power functions come from?
Suggestion: Use Coalescence Theory
used to model size distributions of populations of
         physical objects that merge in nature 
applied to: formation of cloud droplets, bubbles, polymers, 
         stars, Universe...

Idea: Objects merge together, size = sum of both sizes
Rate of aggregation depends upon both objects' sizes
Special case: multiplicative coalescence, where
         probability of merging ~ product of sizes

Slide 59: Why do all size distributions look Weibull?

Why do all size distributions look Weibull?


Math model: Smoluchowski coagulation equation

M. v.Smoluchowski, Drei Vortraege ueber Diffusion,
     Brownische Bewegung und Koagulation von Kolloidteilchen.
     Physik. Z., 17:557-585, 1916

D.J.Aldous. Deterministic and stochastic models for coalescensce
     (aggregation, coagulation): a review of mean-field
     theory for probabilists. July 10, 1997
     www.stat.berkeley.edu/users/aldous

      (Thanks to Reka Albert for the reference)

Slide 60: Coalescence probability is size-dependent

Coalescence probability is size-dependent

(rationally moderated chaos)

merging 100,000 objects of size 1
randomly select two objects to merge
fractional multiplicative coalescence
P(x1,x2) = P(x1) P(x2|x1)
P(x1) ~ x1^a
P(x2|x1) ~ x2^a, with x1 is excluded (=> asymmetric)
a has the meaning of inverse temperature
higher a => lower temperature => larger clusters
=> freezing (gelation) sooner
a=0 (size-independent):
infinite temperature
very small clusters
take snapshots every 10K size distributions 
(after 10K, 20K,...99.99K mergers)

Slide 61: Coalescence simulation

Coalescence simulation


next few graphs will show for coalescence simulation
object size distributions
maximum object size distribution
for size-independent or multiplicative or submultiplicative (sqrt(multipl)) coalescence

resulting size distribution: exponential 
as predicted by analytic solution to Smoluchowski equation

Slide 62: Coalescence object size (additive)

Coalescence object size (additive)

Slide 63: Coalesence: evolution of maximum obj size

Coalesence: evolution of maximum obj size

Slide 64: If coalescence probability ~ product of sizes

If coalescence probability ~ product of sizes

size distribution is close to power function
giant component emerges after 50% are aggregated

Slide 65: Coalescence: evolution of maximum obj size

Coalescence: evolution of maximum obj size

probability ~ product of sizes

Slide 66: Coalescence prob. ~ sqrt(product of sizes)

Coalescence prob. ~ sqrt(product of sizes)

size distribution is close to power function
giant component emerges after 90% are aggregated
size distributions close to Weibull  (rel. err. < 20% for generic sizes)

Slide 67: Coalesence: evolution of maximum obj size

Coalesence: evolution of maximum obj size

Slide 68: topology: combinatorial core vs gcc

topology: combinatorial core vs gcc


filtering of the graph

combinatorial core
remove nodes of outdegree 0 (no outgoing links)
destinations have outdegree 0
strip nodes of outdegree 0
stub network's routers now have outdegree 0
repeat (recursive) stripping of outdegree 0

giant connected component
bidirectionally connected components of the graph

Slide 69: topology data: connected components

topology data: connected components

Slide 70: hop metric of `centrality'

hop metric of `centrality'


shortest path distance on IP graph
find maximum distance from any given node to all core
center of the core: those with min maximum distance
e.g., take top 50 by this metric

top 3 prefixes
          Prefix          Outdegree
          157.130.0.0/16  1172
          144.232.0.0/16    757
          4.0.0.0/8              655

Slide 71: reachability functions

reachability functions


previous work: counts of nodes at shortest path distance x
(H.Tangmunarunkit, R.Govindan)

here: reachability functions are for individual nodes.

Slide 72: reachability functions

reachability functions

Slide 73: neighbourhood size vs. hop distance

neighbourhood size vs. hop distance


`neighbourhood': nodes at radius R
hypothesis: there exists a mother function that fits all these curves
such a function would provide a parameteric method for finding topological distance from internet core

Slide 74: Neighbourhood size

Neighbourhood size

Slide 75: Neighbourhood size: (step 1000 thru giant comp.)

Neighbourhood size: (step 1000 thru giant comp.)

Skitter data of Jul.02-29, 2001

Slide 76: Neighbourhood size: (step 1000 thru giant comp.)

Neighbourhood size: (step 1000 thru giant comp.)

Skitter data of Jul.02-29, 2001

Slide 77: So... neighbourhood size suggests:

So...  neighbourhood size suggests:


mother function is invariant of graph
shifted and stretched for each node
shift, stretch = average, std dev of hop distance from a node

likely implications:
there IS a core of the graph
neighbourhood is small before we reach core
starts growing as soon as we reach core
"Gaussian" because minimums are a limiting case of addition

Slide 78: shortest path distances in GCC

shortest path distances in GCC

Slide 79: topology: forward & reverse expansion sets

topology: forward & reverse expansion sets

Slide 80: topological resilience to outbound link removal

topological resilience to outbound link removal


algorithm:
forwarding deactivated (outbound links removed)
     on all nodes with outdegree over x

rationale: 

will still have these nodes reachable
    (do not remove inbound links and/or nodes)

removing all nodes with outdegree > x is unambiguous

Slide 81: topological resilience to outbound link removal

topological resilience to outbound link removal

IP giant component (skitter, Jul 2001)

Slide 82: topological resilience to outbound link removal

topological resilience to outbound link removal


IP giant component (skitter, Jul.02-29 2001)

giant component breaks only when outdegree 1-6 remain
maximum outdegree was 329
forwarding deactivated on 21% nodes
70% links removed

astonishingly robust graph

Slide 83: topological resilience to outbound link removal

topological resilience to outbound link removal

IP giant component (skitter, Jul 2001)

Slide 84: topological resilience to outbound link removal

topological resilience to outbound link removal

combinatorial IP core (skitter, 1 Aug 2001, 10 days)

Slide 85: topological resilience to outbound link removal

topological resilience to outbound link removal

combinatorial IP core (skitter, 1 Aug 2001)

Slide 86: topological resilience to outbound link removal

topological resilience to outbound link removal

Giant component of skitter AS core (skitter, Jul 2001)

Slide 87: topological resilience to outbound link removal

topological resilience to outbound link removal


Giant component of skitter AS core (skitter, Jul 2001)
Contains 30% of AS nodes
(BGP AS graph's core: 3.3%, Route Views Aug.01)


Giant component breaks only when outdeg.1-10 remain
maximum outdegree was 949
forwarding deactivated on 10% nodes
68% links removed

Slide 88: topological resilience to outbound link removal

topological resilience to outbound link removal

symmetrized skitter AS graph (Jul 2001)

Slide 89: topological resilience to outbound link removal

topological resilience to outbound link removal


symmetrized skitter AS graph (Jul 2001)

Giant component breaks only when outdeg.1-10 remain 
maximum outdegree was 949
forwarding deactivated on 6.4% nodes
66% links removed

Slide 90: (end part 2 (end talk...))

(end part 2 (end talk...))

Slide 91: note on forward IP topology vs BGP RouteViews data

note on forward IP topology vs BGP RouteViews data


for july 2001
BGP routeviews: 34 peers, 9.4k ASes, ~100K prefixes
skitter:  17 monitors, 300k dsts, 50K prefixes

to equalize, have to reduce skitter to AS graph:
0) pick one trace per prefix per day
         ignore nodes next to non-responding hops
1) strip IP leaves
2) convert IP -> prefixes
3) strip prefix leaves
4) convert prefix -> AS
5) strip AS leaves

Slide 92: forward IP topology vs BGP RouteViews data

forward IP topology vs BGP RouteViews data


basically a ton of decimation on a day of skitter data
skitter AS graph: 6949 AS nodes, 16145 AS links (peering sessions)
remove any potential advantage of skitter probing
frequency
finer (IP) granularity 
nonetheless, skitter captures much larger share of the Internet's bidirectional connectivity. 
skitter combinatorial core (full-transit portion) contains 988 AS nodes, 
      as opposed to BGP's 299 nodes (3.3X more)

Slide 93: ranked list of skitter vs RV outdegrees

ranked list of skitter vs RV outdegrees 

level 2 transit AS (those with at least one outbound connection to
another transit AS, as oppposed to stub AS (customers)). 
Of those: 732 in Oregon Nov.29 BGP data and 1189 in skitter data.

Slide 94: upshot

upshot



even the best available inter-domain 
routing (BGP) data serves very weak 
substitute for IP probed topology data 

and yet this BGP data is an essential tool 
for sensible comprehensive macroscopic
Internet topology analysis

Slide 95: ongoing/future

ongoing/future

continue building infrastructure 
sources
destination lists
dynamically discover topology, b/w
correlation 
among sources
with perf, topology 
identify critical pieces of infrastructure
continue w metrics and analysis 
`center', `core', `close'
(optimize dist. server architectures)
filter new root locations?
determination of connectivity metrics
persistence of paths
metrics for `change'
better GLS support
model of the Internet (no, really)
graphing/layout techniques....

Slide 96: www.caida.org/Presentations/

www.caida.org/Presentations/

andre broido 
kc claffy
UCSD/SDSC/CAIDA
broido@caida.org
kc@caida.org
www.caida.org

Related Objects

See https://catalog.caida.org/media/2001_routingtopology2001/ to explore related objects to this document in the CAIDA Resource Catalog.