CAIDA Home
 Current Research | Historical Research  
 www.caida.org > research : topology : : rank_as
    visit     contact     search:
CAIDA: Cooperative Association for Internet Data Analysis
Introduction to Relationship-based AS Ranking

-----summary of contents-----
Analysis of the the Internet Service Provider (ISP) hierarchy is critical to a deeper understanding of technical, economic and regulatory aspects of the Internet inter-domain routing system. As part of our research agenda to measure and analyze macroscopic Internet structure, we have developed a new procedure to rank Autonomous Systems (AS Rank) by their location in the Internet hierarchy. Our ranking relies upon AS relationship information that we discover using our new inference algorithms. Our approach is rooted in economic AS relationships, ranking each AS as a function of the number of IP prefixes advertised by this AS, its customer ASes, their customers ASs, and so on.
-----end summary of contents-----

Goal

The goal of this project is to provide ISP ranking based on business relationships between ASes.

Accurate knowledge of AS business relationships is relevant to both technical and economic aspects of the Internet's inter-domain structure. First, AS relationships determine routing policies that introduce a non-trivial set of constraints to paths over which Internet traffic can flow, with implications for network robustness, traffic engineering, macroscopic topology measurement strategies, and other research and operational considerations. Second, macroscopic analysis of AS relationships not only yields insight into the economic underpinnings of business realities in the current Internet, but also provides a solid validation framework for economy-based modeling of Internet topology evolution. Indeed, the Internet AS-level topology and its evolutionary dynamics are consequences of business decisions that Internet players make. Therefore, the ability to infer AS relationships is a promising tool for understanding and modeling the economic forces that drive the evolution of the Internet topology and its hierarchy.

AS relationships


Figure 1. Types of AS relationships. The ASes at the bottom of the graph, D, E, and F, are customers of those above. ISPs in the middle, B and C, are both providers of ASes below and customers of ISPs above. ISPs B and C are also peers of each other. ISP A at the top is a provider to B and C and a customer of no one.

Background

Although business agreements between ISPs can be complicated, the model introduced by Gao [GAO] disregards peculiar details and identifies the following three most common types of relationships: customer-to-provider (c2p) (or if looked at from the opposite direction, provider-to-customer p2c), peer-to-peer (p2p), and sibling-to-sibling (s2s).

The justification for this classification is that an AS must buy transit services for any traffic destined to parts of the Internet that this AS neither owns nor can reach through its customers. In Figure 1, where arrow directions reflect flows of money, ASes at lower levels pay ISPs at higher levels in exchange for access to the rest of the Internet. The ISP who pays is the customer and the ISP who is being paid is the provider. Links between a customer and a provider are c2p (p2c) links. In Figure 1, examples of c2p links are D->B, E->B, F->C, B->A, and C->A.

A p2p link connects two ISPs who have agreed to exchange traffic on a quid pro quo basis. Peers exchange traffic only between each other and each other's customers. This relationship allows peering ISPs to save money on transit costs they would otherwise have to pay to their transit providers for such traffic. In Figure 1, B-C is a p2p link. Note the lack of direction of the link, indicating that neither B nor C is paying each other for the traffic they exchange.

An s2s link connects two ASes administratively belonging to the same ISP. Such links usually appear as a result of mergers and acquisitions, or under certain network management scenarios.


Figure 2. The top two paths 1 and 2 are valid, while the bottom two 3 and 4 are invalid.

We also use the notion of money transfers between ASes to define valid and invalid AS paths. A valid path between source and destination ASes is one in which for every ISP providing transit (a transit provider), there is a payee. The payee of the transit provider must be its immediate neighbor in the path. An invalid path is one in which there is at least one transit provider not paid by a neighbor in the path.

In Figure 2 the top two examples are valid paths, while the bottom two are invalid. In Example 1 the transit providers are A, B, and C. ISPs B and C pay to A, D pays to B, and F pays to C. In Example 2 the transit providers are B and C, and they are paid by D and F respectively. In contrast, in Example 3 the transit provider is B, but not only does no one pay B, but B itself pays both A and Z. Example 4 also illustrates a situation where nobody pays transit provider B.

We conclude that a valid path must have the following valid path pattern: zero or more c2p links, followed by zero or one p2p link, followed by zero or more p2c links. In addition, s2s links can appear in any number anywhere in the path.

History of inference algorithms

Service providers consider the policy details of their business relationships as proprietary information and do not generally make them public. Therefore, Internet researchers have to rely upon indirect AS relationship inference algorithms in order to build a picture of Internet business structure.

Gao's pioneering work [GAO] inspired many researchers to seek approaches to inferring ISP business relationships using information from publicly available BGP routing tables. Gao used the concept of valid paths as the basis for her inference heuristic and identified the top provider in a given path based on AS degree (the number of ASes connected to a given AS).

Subramanian et al. [SARK] slightly relaxed the problem by not inferring s2s links, and provided a more elegant mathematical formulation based on the concept of valid paths. Assuming maximization of the number of valid paths as a natural objective, they formulated the AS relationship inference problem as a combinatorial optimization problem: given an undirected graph G derived from a set of BGP paths P, assign the edge type (c2p or p2p) to every edge in G such that the total number of valid paths in P is maximized. [SARK] called the problem the type-of-relationship (ToR) problem, conjectured that it is NP-complete, and provided a heuristic solution.

Di Battista et al. [DPP] and independently Erlebach et al. [EHS] proved that the ToR problem is indeed NP-complete. EHS proved also that it is even harder, APX-complete. More importantly for practical purposes, both DPP and EHS demonstrated that p2p links cannot be inferred in the ToR problem formulation and developed mathematically rigorous approximate solutions to the ToR problem but inferring only c2p and p2c links. We note that neither [SARK] nor [GAO] technique offers a solution to the problem of reliable identification of p2p links due to their low accuracy as demonstrated by Xia et al. [XG].

In addition to its inability to infer p2p links, there are other issues with the ToR formulation that we identified in [DKH1]. In particular, for some links either relationship (c2p or p2c) results in the same number of invalid paths. As a result, ToR labels such links randomly, classifying them as c2p or p2c with 50%-50% probability. In some cases this approach leads to obviously incorrect inferences, e.g., well-known large providers are inferred as customers of small ASes. In [DKH1] we resolved this issue by using multiobjective optimization techniques incorporating both the notion of valid paths and AS importance as reflected in AS degree. In [DKH2] we introduce new improved algorithms to determine not only c2p but also p2p links. These improvements achieve high levels of accuracy of AS relationship inference as we demonstrate via direct validation with network administrators of a set of ASes.

ISP hierarchy

We implemented the following methodology of the relationship-based AS ranking.

  1. Build an AS-level graph of the Internet from publicly available BGP table data and annotate links in this graph with inferred AS relationships.
  2. Define the AS customer cone of an AS A as the AS A itself plus all the ASes that can be reached from A following only p2c and s2s links (but not c2p or p2p!). In other words, AS A's customer cone is A, plus A's customers, plus its customers' customers, and so on.
  3. Rank ASes by the following three customer cone size metrics: the number of ASes in the cone, the number of unique prefixes advertised by these ASes, and the number of /24 blocks in the union of these prefixes.

The size of the AS customer cone in terms of the number of ASes in the cone is a coarse measure since individual AS sizes can be drastically different. A more precise measure weighs each AS by the number of IP prefixes the AS advertises. The prefix customer cone of an AS is the union of its own prefix set and the prefix set of all the ASes in its AS customer cone. Note that small ASes with little IP address space allocated to them, as well as older ASes with large IP blocks, tend to advertise few prefixes. On the contrary, large ASes with relatively high IP address utilization and diversified routing policies tend to advertise many prefixes of various lengths.

Not only ASes but IP prefixes are of drastically different sizes as well. Advertised IPv4 prefixes range in size from a /8, with 224 IP addresses, down to a /32, with just one. The prefix length of 24 bits is generally the smallest IPv4 prefix size that is globally routed. This cut-off justifies our last metric of the customer cone size: the number of unique /24 prefixes found within the union of address space covered by an AS's prefix customer cone. The number of /24 prefixes in the cone better captures the contribution from smaller ASes that use and advertise smaller portions of IP address space. However, this metric is likely to overestimate the contribution of many older ASes (those that appeared in the Internet early--e.g., some military and academic networks) since they often have huge chunks of assigned IP address space that they utilize scarcely.

ASes with larger customer cones have an especially important role in the Internet's capital and governance structure. At the top of this hierarchy are ISPs commonly known as Tier-1 ISPs. They do not pay for any transit to any upstream providers at all; instead they peer with each other to guarantee their connectivity to all destinations in the Internet. At the bottom of the hierarchy are customer ASes who do not have their own customers and pay their providers to reach all destinations in the Internet.

Sample AS Ranking

We provide a CGI script that calculates AS ranking for different dates. AS relationship information for all pairs of AS neighbors is also available for download at http://as-rank.caida.org/data.

The table below illustrates our AS ranking results. In this example, we rank ASes by the four metrics in the following order: 1) the number of /24s in their customer cone; 2) the number of prefixes in their customer cone; 3) the number of ASes in their customer cone; and 4) AS degrees.

metric description
ASes number of ASes in the customer cone (ASes that can be reached from a given AS by following c2p links first through to its customers, then on to its customers' customers, and so on)
prefixes number of unique prefixes announced by all ASes in the customer cone
/24s number of unique /24 prefixes in the IP address space covered by the customer cone
degree number of unique ASes connected to this AS via any kind of links (p2c, c2p, p2p, or s2s)

 

rank AS
number
AS information customer cone degree
ISP's namecountry/24sprefixesASes
11239Sprint US5,273,498165,87018,8431,710
2701UUNET Technologies, Inc. US5,213,989163,41318,5882,346
37018AT&T WorldNet Services US5,081,471154,68817,9701,917
43356Level 3 Communications, LLC US4,829,166147,44216,9871,228
5209Qwest US4,602,598143,90516,6251,105
6174Cogent Communications US4,178,620137,96015,902840
73549Global Crossing US4,178,620137,96015,902634
83561Savvis US4,178,620137,96015,902568
9702MCI EMEA NL4,178,620137,96015,902548
107132SBC Internet Services US4,178,620137,96015,902545
103303Swisscom Enterprise Solutions LtdCH4,178,620137,96015,902545
ranking mode: relationship based
alpha parameter: 0.01
Whois: June 10, 2004 - APNIC, ARIN, and RIPE
AS links: BGP RIBs from RouteViews (rv2)
AS links: 5 days starting on Mar 1, 2005 (15 snapshots at 8 hour intervals)
prefix-to-AS mappings: RouteViews BGP snapshot on Mar 1, 2005

Caveats and limitations

Although we know of no more rigorous empirical analysis of macroscopic Internet topology enriched with AS relationships, we recognize that resource limitations constrain the quality of the science we can do.

  • AS relationships are more complex than allowed for in our approach. The semantics of routing relationships between the same two ASes can differ by peering location or even by prefix; our model oversimplifies these cases by assigning a single relationship to each pair of ASes.
  • A truly accurate picture of the Internet topology would require collection of data from every AS, while our automated ranking procedure is limited to the measurement points publicly available at RouteViews.
  • As in all analyses of massive datasets, our heuristics have a number of associated external parameters. We fine tune the values of these parameters based on our pre-existing (but limited) notion of the correct answer as well as experience with the algorithm that suggests auspicious ranges.
More monitor points, more probing, cross-correlative analysis in conjunction with other sources of data, and more powerful data processing techniques to support larger topology samples would improve the integrity and utility of the relationship-based AS ranking.


References

[GAO] L. Gao, On Inferring Autonomous System Relationships in the Internet, IEEE/ACM Transactions on Networking, December 2001.
[SARK] L. Subramanian, S. Agarwal, J. Rexford, and R. H. Katz, Characterizing the Internet Hierarchy from Multiple Vantage Points, IEEE INFOCOM, 2002.
[DPP] G. Di Battista, M. Patrignani, and M. Pizzonia, Computing the Types of the Relationships between Autonomous Systems, IEEE INFOCOM, 2003.
[EHS] T. Erlebach, A. Hall, and T. Schank, Classifying Customer-Provider Relationships in the Internet, Proceedings of the IASTED International Conference on Communications and Computer Networks (CCN), 2002.
[XG] J. Xia and L. Gao, On the Evaluation of AS Relationship Inferences, IEEE Globecom, 2004.
[DKH1] X. Dimitropoulos, D. Krioukov, B. Huffaker, kc claffy, and G. Riley, Inferring AS Relationships: Dead End or Lively Beginning, International Workshop on Efficient and Experimental Algorithms (WEA), 2005.
[DKH2] X. Dimitropoulos, D. Krioukov, M. Fomenkov, B. Huffaker, Y. Hyun, kc claffy, and G. Riley, AS Relationships: Inference and Validation, submitted to CCR, 2006.

Support for this work was provided by the Cisco University Research program and by NSF CNS-0434996, with previous support from DHS/NCS.

Cooperative Association for Internet Data Analysis (CAIDA)
  Last Modified: Tues Feb-26-2008 17:7:11 PDT
  Maintained by: Bradley Huffaker
  Page URL: http://www.caida.org/research/topology/rank_as/index.xml