Introduction to Relationship-based AS Ranking

| 
|
|
Analysis of the the Internet Service Provider (ISP) hierarchy
is critical to a deeper understanding of technical, economic and
regulatory aspects of the Internet inter-domain routing system. As
part of our research agenda to measure and analyze macroscopic
Internet structure, we have developed a new procedure to rank
Autonomous Systems (AS Rank) by their location in the Internet
hierarchy. Our ranking relies upon AS relationship information that
we discover using our new inference algorithms. Our approach is rooted
in economic AS relationships, ranking each AS as a function of the number
of IP prefixes advertised by this AS, its customer ASes, their customers ASs,
and so on.
| 
|

|
Goal
The goal of this project is to provide
ISP ranking based on business relationships between ASes.
Accurate knowledge of AS business relationships is relevant to both
technical and economic aspects of the Internet's inter-domain
structure. First, AS relationships determine routing policies that
introduce a non-trivial set of constraints to paths over which
Internet traffic can flow, with implications for network
robustness, traffic engineering, macroscopic topology measurement
strategies, and other research and operational considerations.
Second, macroscopic analysis of AS relationships not only yields
insight into the economic underpinnings of business realities in the
current Internet, but also provides a solid validation framework for
economy-based modeling of Internet topology evolution. Indeed, the
Internet AS-level topology and its evolutionary dynamics are
consequences of business decisions that Internet players make.
Therefore, the ability to infer AS relationships is a promising
tool for understanding and modeling the economic forces
that drive the evolution of the Internet topology and its hierarchy.
AS relationships
Figure 1. Types of AS relationships. The ASes at the bottom of the graph,
D, E, and F, are customers of those above. ISPs in the middle, B and C,
are both providers of ASes below and customers of ISPs above. ISPs B and C
are also peers of each other. ISP A at the top is a provider to B and C
and a customer of no one.
Background
Although business agreements between ISPs can be complicated,
the model introduced by Gao [GAO] disregards peculiar details
and identifies the following three most common types of relationships:
customer-to-provider
(c2p) (or if looked at from the opposite direction, provider-to-customer
p2c), peer-to-peer (p2p), and sibling-to-sibling (s2s).
The justification for this classification is that an AS must buy
transit services for any traffic destined to parts of the Internet
that this AS neither owns nor can reach through its customers. In
Figure 1, where arrow directions reflect flows of money, ASes at
lower levels pay ISPs at higher levels in exchange for access to the
rest of the Internet. The ISP who pays is the customer and the
ISP who is being paid is the provider. Links between a
customer and a provider are c2p (p2c) links. In Figure
1, examples of c2p links are D->B, E->B, F->C, B->A, and C->A.
A p2p link connects two ISPs who have agreed to
exchange traffic on a quid pro quo basis. Peers exchange traffic
only between each other and each other's customers. This relationship
allows peering ISPs to save money on transit costs they would
otherwise have to pay to their transit providers for such traffic.
In Figure 1, B-C is a p2p link. Note the lack of direction of the
link, indicating that neither B nor C is paying each other for
the traffic they exchange.
An s2s link connects two ASes administratively belonging to the same ISP.
Such links usually appear as a result of mergers and acquisitions, or under
certain network management scenarios.
Figure 2. The top two paths 1 and 2 are valid, while the bottom two
3 and 4 are invalid.
We also use the notion of money transfers between ASes to define valid
and invalid AS paths. A valid path between source and
destination ASes is one in which for every ISP providing transit (a
transit provider), there is a payee. The payee of the transit provider must be
its immediate neighbor in the path. An invalid path is one in which there
is at least one transit provider not paid by a neighbor in the path.
In Figure 2 the top two examples are valid paths, while the bottom two are
invalid. In Example 1 the transit providers are A, B, and C. ISPs B and C pay to
A, D pays to B, and F pays to C. In Example 2 the transit providers are B and C,
and they are paid by D and F respectively. In contrast, in Example 3 the transit
provider is B, but not only does no one pay B, but B itself pays both A and Z.
Example 4 also illustrates a situation where nobody pays transit provider B.
We conclude that a valid path must have the following valid path pattern:
zero or more c2p links, followed by zero or one p2p link, followed by zero or
more p2c links. In addition, s2s links can appear in any number anywhere in the path.
History of inference algorithms
Service providers consider the policy details of their business
relationships as proprietary information and do not
generally make them public. Therefore, Internet researchers have
to rely upon indirect AS relationship inference algorithms in
order to build a picture of Internet business structure.
Gao's pioneering work [GAO] inspired many researchers to seek
approaches to inferring ISP business relationships using information
from publicly available BGP routing tables. Gao used the concept of
valid paths as the basis for her inference heuristic and identified
the top provider in a given path based on AS degree (the number of
ASes connected to a given AS).
Subramanian et al. [SARK] slightly relaxed the problem by not inferring
s2s links, and provided a more elegant mathematical formulation based on
the concept of valid paths. Assuming maximization of the
number of valid paths as a natural objective, they formulated the
AS relationship inference problem as a combinatorial optimization
problem: given an undirected graph G derived from a set of
BGP paths P, assign the edge type (c2p or p2p) to every
edge in G such that the total number
of valid paths in P is maximized. [SARK] called the
problem the type-of-relationship (ToR) problem, conjectured
that it is NP-complete, and provided a heuristic solution.
Di Battista et al. [DPP] and independently
Erlebach et al. [EHS] proved that the ToR problem is
indeed NP-complete. EHS proved also that it is even harder,
APX-complete. More importantly for practical purposes, both
DPP and EHS demonstrated that p2p links cannot be
inferred in the ToR problem formulation and developed
mathematically rigorous approximate solutions
to the ToR problem but inferring only c2p and p2c links.
We note that neither [SARK] nor [GAO] technique offers a solution to the
problem of reliable identification of p2p links due to their
low accuracy as demonstrated by Xia et al. [XG].
In addition to its inability to infer p2p links, there are other issues with
the ToR formulation that we identified in [DKH1]. In particular,
for some links either relationship (c2p or p2c)
results in the same number of invalid paths. As a result, ToR labels such links
randomly, classifying them as c2p or p2c with 50%-50% probability. In some cases
this approach leads to obviously incorrect inferences, e.g., well-known
large providers are inferred as customers of small ASes. In [DKH1] we resolved
this issue by using multiobjective optimization techniques incorporating both
the notion of valid paths and AS importance as reflected in AS degree. In
[DKH2] we introduce new improved algorithms to determine not only c2p but also
p2p links. These improvements achieve high levels of accuracy of AS relationship
inference as we demonstrate via direct validation with network administrators
of a set of ASes.
ISP hierarchy
We implemented the following methodology of the relationship-based
AS ranking.
-
Build an AS-level graph of the Internet from publicly available
BGP table data and annotate links in this graph with inferred AS relationships.
-
Define the AS customer cone of an AS A as the AS A itself plus
all the ASes that can be reached from A following only p2c and s2s links
(but not c2p or p2p!). In other words, AS A's customer cone is A, plus A's
customers, plus its customers' customers, and so on.
-
Rank ASes by the following three customer cone size metrics:
the number of ASes in the cone, the number of
unique prefixes advertised by these ASes, and the
number of /24 blocks in the union of these prefixes.
The size of the AS customer cone in terms of the number of ASes in the
cone is a coarse measure since individual AS sizes can be drastically
different. A more precise measure weighs each AS by the number of IP prefixes
the AS advertises. The prefix customer cone of an AS is the
union of its own prefix set and the prefix set of all the ASes in
its AS customer cone. Note that small ASes with little IP address
space allocated to them, as well as older ASes with large IP
blocks, tend to advertise few prefixes.
On the contrary, large ASes with relatively high IP address
utilization and diversified routing policies tend to
advertise many prefixes of various lengths.
Not only ASes but IP prefixes are of drastically different sizes as well.
Advertised IPv4 prefixes range in size from a /8, with 224 IP
addresses, down to a /32, with just one. The prefix length of 24 bits is
generally the smallest IPv4 prefix size that is globally routed. This cut-off
justifies our last metric of the customer cone size: the number of unique /24
prefixes found within the union of address space covered by an AS's
prefix customer cone. The number of /24 prefixes in the
cone better captures the contribution from smaller ASes that use
and advertise smaller portions of IP address space. However, this
metric is likely to overestimate the contribution of many older ASes
(those that appeared in the Internet early--e.g., some military and
academic networks) since they often have huge chunks of assigned IP address
space that they utilize scarcely.
ASes with larger customer cones have an especially
important role in the Internet's capital and governance structure. At the top of
this hierarchy are ISPs commonly known as Tier-1 ISPs. They do not pay for
any transit to any upstream providers at all; instead they peer with
each other to guarantee their connectivity to all destinations in the
Internet. At the bottom of the hierarchy are customer ASes who do not
have their own customers and pay their providers to reach all
destinations in the Internet.
Sample AS Ranking
We provide a CGI script that
calculates AS ranking for different dates. AS
relationship information for all pairs of AS neighbors is also available for download at
http://as-rank.caida.org/data.
The table below illustrates our AS ranking results. In this example, we rank ASes by the
four metrics in the following order:
1) the number of /24s in their customer cone;
2) the number of prefixes in their customer cone;
3) the number of ASes in their customer cone; and 4) AS degrees.
| metric |
|
description |
|
| ASes |
number of ASes in the customer cone (ASes that can be reached from a given AS by following c2p
links first through to its customers, then on to its customers' customers,
and so on)
|
| prefixes |
number of unique prefixes announced by all ASes in the
customer cone
|
| /24s |
number of unique /24 prefixes in the IP address space covered by
the customer cone
|
|
| degree |
number of unique ASes connected to this AS via any kind of links (p2c, c2p, p2p, or s2s)
|
| rank |
|
AS number |
|
AS information |
|
customer cone |
|
degree |
| ISP's name | country | /24s | prefixes | ASes |
|
| 1 | 1239 | Sprint | US | 5,273,498 | 165,870 | 18,843 | 1,710 |
|
| 2 | 701 | UUNET Technologies, Inc. | US | 5,213,989 | 163,413 | 18,588 | 2,346 |
|
| 3 | 7018 | AT&T WorldNet Services | US | 5,081,471 | 154,688 | 17,970 | 1,917 |
|
| 4 | 3356 | Level 3 Communications, LLC | US | 4,829,166 | 147,442 | 16,987 | 1,228 |
|
| 5 | 209 | Qwest | US | 4,602,598 | 143,905 | 16,625 | 1,105 |
|
| 6 | 174 | Cogent Communications | US | 4,178,620 | 137,960 | 15,902 | 840 |
|
| 7 | 3549 | Global Crossing | US | 4,178,620 | 137,960 | 15,902 | 634 |
|
| 8 | 3561 | Savvis | US | 4,178,620 | 137,960 | 15,902 | 568 |
|
| 9 | 702 | MCI EMEA | NL | 4,178,620 | 137,960 | 15,902 | 548 |
|
| 10 | 7132 | SBC Internet Services | US | 4,178,620 | 137,960 | 15,902 | 545 |
| 10 | 3303 | Swisscom Enterprise Solutions Ltd | CH | 4,178,620 | 137,960 | 15,902 | 545 |
|
ranking mode: relationship based
alpha parameter: 0.01
Whois: June 10, 2004 - APNIC, ARIN, and RIPE
AS links: BGP RIBs from RouteViews (rv2)
AS links: 5 days starting on Mar 1, 2005 (15 snapshots at 8 hour intervals)
prefix-to-AS mappings: RouteViews BGP snapshot on Mar 1, 2005
|
Caveats and limitations
Although we know of no more rigorous empirical analysis of macroscopic
Internet topology enriched with AS relationships, we recognize that resource limitations
constrain the quality of the science we can do.
- AS relationships are more complex than allowed for in
our approach. The semantics of routing relationships between the same two ASes
can differ by peering location or even by prefix; our model oversimplifies
these cases by assigning a single relationship to each pair of ASes.
-
A truly accurate picture of the Internet topology would
require collection of data from every AS, while our automated ranking procedure
is limited to the measurement points publicly available at RouteViews.
-
As in all analyses of massive datasets, our heuristics have a number of
associated external parameters. We fine tune the values of these parameters
based on our pre-existing (but limited) notion of the correct answer as
well as experience with the algorithm that suggests auspicious ranges.
More monitor points, more probing, cross-correlative analysis in conjunction
with other sources of data, and more powerful data processing techniques to
support larger topology samples would improve the integrity and utility
of the relationship-based AS ranking.
References
| [GAO] |
L. Gao,
On Inferring Autonomous System Relationships in the Internet,
IEEE/ACM Transactions on Networking, December 2001.
|
| [SARK] |
L. Subramanian, S. Agarwal, J. Rexford, and R. H. Katz,
Characterizing the Internet Hierarchy from Multiple Vantage
Points,
IEEE INFOCOM, 2002.
|
| [DPP] |
G. Di Battista, M. Patrignani, and M. Pizzonia,
Computing the Types of the Relationships between Autonomous
Systems,
IEEE INFOCOM, 2003.
|
| [EHS] |
T. Erlebach, A. Hall, and T. Schank,
Classifying Customer-Provider Relationships in the Internet,
Proceedings of the IASTED International Conference on
Communications and Computer Networks (CCN), 2002.
|
| [XG] |
J. Xia and L. Gao,
On the Evaluation of AS Relationship Inferences,
IEEE Globecom, 2004.
|
| [DKH1] |
X. Dimitropoulos, D. Krioukov, B. Huffaker, kc claffy, and G. Riley,
Inferring AS Relationships: Dead End or Lively Beginning,
International Workshop on Efficient and Experimental Algorithms (WEA), 2005.
|
| [DKH2] |
X. Dimitropoulos, D. Krioukov, M. Fomenkov, B. Huffaker, Y. Hyun,
kc claffy, and G. Riley,
AS Relationships: Inference and Validation,
submitted to CCR, 2006.
|
Support for this work was provided by the Cisco University Research
program and by NSF CNS-0434996, with previous support from DHS/NCS.
|
|