Twelve Years in the Evolution of the Internet Ecosystem

This document provides supplemental material for the IEEE/ACM Transactions on Networking 2011 paper "Twelve Years in the Evolution of the Internet Ecosystem."

The Internet, as a network of Autonomous Systems (ASes), resembles in several ways a natural ecosystem. ASes of different sizes, functions, and business objectives form a number of AS species that interact to jointly form what we know as the global Internet. ASes engage in competitive transit (or customer-provider) relations, and also in symbiotic peering relations. These relations, which are represented as inter-AS logical links, transfer not only traffic but also economic value between ASes. The Internet AS ecosystem is highly dynamic, experiencing growth (birth of new ASes), rewiring (changes in the connectivity of existing ASes), as well as deaths (of existing ASes). The dynamics of the AS ecosystem are determined both by external "environmental" factors (such as the state of the global economy or the popularity of new Internet applications) and by complex incentives and objectives of each AS. Specifically, ASes attempt to optimize their utility or financial gains by dynamically changing, directly or indirectly, the ASes they interact with. For instance, the objective of a transit provider may be to maximize its profit, and it may approach this goal through competitive pricing and selective peering. The objective of a content provider, on the other hand, may be to have highly reliable Internet access and minimal transit expenses, and it may pursue these goals through aggressive multihoming and an open peering policy.

Our study is motivated by the desire to better understand this complex ecosystem, the behavior of entities that constitute it (ASes), and the nature of interactions between those entities (AS links). How has the Internet ecosystem been growing? Is growth more important than rewiring in terms of the formation of new links? Is the population of transit providers increasing (implying diversification in the transit market) or decreasing (consolidation)? Given that the Internet grows in size, does the average AS-path length also increase? Which ASes engage in aggressive multihoming? What is the preferred type of transit provider for AS customers? Which ASes tend to constantly adjust their set of providers? Are there regional differences in how the Internet evolves? These are some of the questions we ask in this paper. This document provides access to the processed data used in the paper. Access to the raw data that was used is available on request.


BGP data
 

The AS topology snapshots used in this paper were created from publicly available BGP dumps provided by the Routeviews and RIPE NCC collectors. To create a topology snapshot, we first collected five routing table dumps from all available Routeviews and RIPE collectors over the course of a month. We then applied a majority filtering algorithm to retain only those AS paths that were seen in a majority of the samples, and extracted AS links from the resulting set of persistent AS paths. Note that these topologies are not the most complete topologies available for each time period. Each snapshot represents the primary topology, meaning that backup and transient links that may appear during routing events are removed. Please refer to the paper for details of the majority filtering algorithm. The raw data (AS paths, BGP tables, etc.) is available on request. We provide the processed topology snapshots after annotating AS links with business relationships (see below).
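The filtering and link-extraction steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes "majority" means a path appears in strictly more than half of the sampled dumps, and the function names `majority_filter` and `extract_links` and the sample data are hypothetical.

```python
from collections import Counter

def majority_filter(samples):
    """Keep AS paths seen in a strict majority of the sampled dumps.

    `samples` is a list of sets; each set holds the AS paths (tuples of
    AS numbers) observed in one routing table dump.
    Assumption: "majority" means more than half of the samples.
    """
    counts = Counter(path for sample in samples for path in sample)
    threshold = len(samples) / 2
    return {path for path, n in counts.items() if n > threshold}

def extract_links(paths):
    """Return undirected AS links, as (low, high) pairs, from AS paths."""
    links = set()
    for path in paths:
        for a, b in zip(path, path[1:]):
            if a != b:  # ignore a repeated AS (path prepending)
                links.add((min(a, b), max(a, b)))
    return links

# Five made-up dumps, mirroring the five samples taken per month.
samples = [
    {(1, 2, 3), (1, 4)},
    {(1, 2, 3)},
    {(1, 2, 3), (1, 4)},
    {(1, 2, 3), (5, 6)},
    {(1, 2, 3), (1, 4)},
]
persistent = majority_filter(samples)
links = extract_links(persistent)
print(sorted(links))
```

In this toy input the path (5, 6) appears in only one of five dumps, so it is treated as transient and contributes no link.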

AS relationships
 

We used the BGP data to annotate each interdomain link with one of three simplified business relationships -- customer-provider (the customer pays the provider), settlement-free peer (typically no money is exchanged), and sibling (both ASes belong to the same organization) -- using the classification algorithm by Lixin Gao.

AS classification
 

We classify ASes according to their business functions. Our classification is based on the average customer and peer degrees of an AS over the entire lifetime of that AS. Please refer to the paper for details of the classification.
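As a rough illustration of the inputs to this classification, the sketch below computes the average customer and peer degree of each AS over the snapshots in which it appears. It is not the classifier itself (see the paper for that). It assumes the relationship codes listed under Datasets, that code 1 means the first AS is the customer of the second, and that an AS's lifetime is the set of snapshots in which it has at least one link; all function names are hypothetical.

```python
from collections import defaultdict

# Relationship codes as listed in the dataset description.
SIBLING, CUST_PROV, PROV_CUST, PEER = 0, 1, 2, 3

def degrees(snapshot):
    """Return (customers, peers, seen) for one snapshot of (as1, as2, rel)."""
    customers = defaultdict(set)
    peers = defaultdict(set)
    seen = set()
    for as1, as2, rel in snapshot:
        seen.update((as1, as2))
        if rel == CUST_PROV:      # assumed: as1 is the customer of as2
            customers[as2].add(as1)
        elif rel == PROV_CUST:    # assumed: as1 is the provider of as2
            customers[as1].add(as2)
        elif rel == PEER:
            peers[as1].add(as2)
            peers[as2].add(as1)
    return customers, peers, seen

def average_degrees(snapshots):
    """Map each AS to (avg customer degree, avg peer degree) over its lifetime."""
    totals = defaultdict(lambda: [0, 0, 0])  # customer sum, peer sum, snapshots
    for snapshot in snapshots:
        customers, peers, seen = degrees(snapshot)
        for asn in seen:
            entry = totals[asn]
            entry[0] += len(customers.get(asn, ()))
            entry[1] += len(peers.get(asn, ()))
            entry[2] += 1
    return {asn: (c / n, p / n) for asn, (c, p, n) in totals.items()}

# Two made-up snapshots.
snapshots = [
    [(10, 1, 1), (20, 1, 1), (1, 2, 3)],
    [(10, 1, 1), (1, 2, 3), (1, 3, 3)],
]
avg = average_degrees(snapshots)
print(avg[1])
```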

Additional AS information
 

We extract additional information about each AS using WHOIS queries. For this purpose, we use the WHOIS service provided by Team Cymru. Their WHOIS service provides information such as the registry where an AS is registered (arin/ripencc/apnic/lacnic/afrinic), the country code for that AS, and a brief description of the AS.
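A lookup of this kind might be sketched as below. The query string and the pipe-separated, five-column response layout (AS, country code, registry, allocation date, AS name) are assumptions about the Team Cymru service and should be checked against its current documentation; `query_cymru` and `parse_cymru_line` are hypothetical helper names, and the sample line is made up using an ASN reserved for documentation.

```python
import socket

def query_cymru(asn, host="whois.cymru.com", port=43, timeout=10):
    """Send a verbose AS query over the plain WHOIS protocol (TCP port 43).

    Assumption: the service accepts a query of the form " -v AS<number>".
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(f" -v AS{asn}\r\n".encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

def parse_cymru_line(line):
    """Parse one response row; return None for headers or unexpected rows.

    Assumed layout: AS | CC | Registry | Allocated | AS Name
    """
    fields = [field.strip() for field in line.split("|", 4)]
    if len(fields) != 5 or not fields[0].isdigit():
        return None
    asn, country, registry, _allocated, description = fields
    return {
        "asn": int(asn),
        "country": country,
        "registry": registry,
        "description": description,
    }

# Made-up response row (64496 is an ASN reserved for documentation).
sample_line = "64496   | ZZ | arin     | 2000-01-01 | EXAMPLE-AS - Example Network"
record = parse_cymru_line(sample_line)
print(record)
```

The network call is kept separate from the parsing so that the parser can be exercised offline; a live lookup would pass each line of `query_cymru(asn)` through `parse_cymru_line`.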

Datasets
 

We provide two versions of these datasets, corresponding to the Internet Measurement Conference (2008) and the IEEE/ACM Transactions on Networking (2011) versions of this paper.

  • imc08|ton11.rel_files.tar.gz (relationship snapshots) contains the relationship snapshots, one file per snapshot.
    File format:
          <AS1>   <AS2>   <rel>  
    where the relationships are:
    0 = sibling
    1 = customer-provider
    2 = provider-customer
    3 = peer

  • AS.train (AS classification training data) is the training data used for creating the decision tree classifier.
    File format:
          <AS_number>|<AS_description>|<AS_type>  

  • imc08|ton11.sn.3m.AS.class (AS classification data) is the output of the decision tree classifier, which assigns each AS a business type.
    File format:
          <AS_number>   <AS_type>  
    where the AS types are:
    1 = Enterprise Customer
    2 = Small Transit Provider
    3 = Large Transit Provider
    4 = Content/Access/Hosting Provider

  • imc08|ton11.sn.3m.AS.info (AS information data) gives, for each ASN, the country code, the registry (ARIN/RIPENCC/APNIC/LACNIC/AFRINIC) where the ASN is registered, and a brief description of that AS.
    File format:
          <AS_number>   <AS_type>  
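The relationship and training file formats above can be read with a few lines of Python. This is a sketch under stated assumptions: relationship files are taken to be whitespace-separated with one link per line, AS.train is taken to be pipe-separated with the description possibly containing further "|" characters, and the reader names and sample rows are hypothetical (the ASNs come from the range reserved for documentation).

```python
import io

REL_NAMES = {
    0: "sibling",
    1: "customer-provider",
    2: "provider-customer",
    3: "peer",
}

def read_relationships(lines):
    """Yield (as1, as2, relationship name) from '<AS1> <AS2> <rel>' lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        as1, as2, rel = line.split()
        yield int(as1), int(as2), REL_NAMES[int(rel)]

def read_training(lines):
    """Yield (asn, description, as_type) from '<AS_number>|<AS_description>|<AS_type>'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split off the first and last fields so that a '|' inside the
        # description (an assumed possibility) does not break parsing.
        asn, rest = line.split("|", 1)
        description, as_type = rest.rsplit("|", 1)
        yield int(asn), description, as_type

# Made-up sample content standing in for real files.
rel_sample = io.StringIO("64496\t64497\t1\n64497   64498   3\n")
train_sample = io.StringIO("64496|EXAMPLE-NET - Example Network|type-label\n")

rels = list(read_relationships(rel_sample))
training = list(read_training(train_sample))
print(rels)
print(training)
```

With real data, an open file object (for example a member extracted from the tar archive) can be passed in place of the `io.StringIO` samples.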

Related Objects

See https://catalog.caida.org/paper/2011_twelve_years_evolution/ to explore related objects to this document in the CAIDA Resource Catalog.