For various topology-related projects, we need a mapping from an IP
address to the Autonomous System (AS) to which that IP address is
assigned (some say the AS "owns" it). The most common approach to mapping
IP addresses to ASes is to use BGP table dumps from public sources such as
Routeviews and RIPE.
Amogh Dhamdhere did this analysis while at CAIDA in spring 2010.
General table statistics
We collected routing table dumps from each Routeviews and RIPE collector from March 05, 2010 at approximately the same time (00:00 hrs). We first calculate the following statistics separately for each table dump from Routeviews and RIPE.
1) The address space coverage provided by the table, in terms of the number of IP addresses.
2) The number of prefixes seen in the table.
3) The number of unique AS paths seen in the table.
4) The number of AS links seen in the table.
5) The number of origin ASes seen in the table.
6) The mean number of Addresses Per AS (APA), which is the mean of the distribution of the number of IP addresses owned by an AS.
7) The standard deviation of the number of Addresses Per AS, which is a measure of the diversity of the IP-AS mapping provided by the table.
The entries of the table below are color-coded, with red indicating the largest value in each column. If a particular row (corresponding to a routing table) has red entries all through, this routing table is best with respect to each metric analyzed.
| table | coverage | AS paths | prefixes | AS links | origin ASes | APA mean | APA std dev |
| table | coverage | AS paths | prefixes | AS links | origin ASes | APA mean | APA std dev |
For Routeviews, we found that table RV2 maximized the number of AS paths, unique prefixes, unique AS links and origin ASes, although the RV_linx table covered the most IPv4 address space. For RIPE, we found that no single table dump (from March 05, 2010) was best with respect to all metrics; RRC01 covered the most IPv4 address space.
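The per-table statistics listed above can be illustrated with a minimal Python sketch. It is not the code used for this analysis: the function name `table_stats` and the input format (pairs of prefix string and AS path) are assumptions made for illustration. It also simplifies the APA computation by counting, for each origin AS, the addresses in the union of the prefixes it originates, so an address covered by prefixes from two different origins is counted once for each.

```python
import ipaddress
from statistics import mean, pstdev


def table_stats(routes):
    """Summarize one routing table.

    routes: iterable of (prefix, as_path) pairs, where prefix is a string
    such as "10.0.0.0/8" and as_path is a tuple of AS numbers whose last
    element is the origin AS. This input format is hypothetical.
    """
    prefixes, paths, links = set(), set(), set()
    nets_by_origin = {}
    for prefix, as_path in routes:
        net = ipaddress.ip_network(prefix)
        prefixes.add(net)
        paths.add(tuple(as_path))
        # Consecutive distinct ASes on a path form an (undirected) AS link;
        # repeated ASes caused by path prepending are skipped.
        for a, b in zip(as_path, as_path[1:]):
            if a != b:
                links.add(frozenset((a, b)))
        nets_by_origin.setdefault(as_path[-1], set()).add(net)

    def address_count(nets):
        # Collapse overlapping prefixes so each address is counted once.
        return sum(n.num_addresses for n in ipaddress.collapse_addresses(nets))

    per_as = [address_count(nets) for nets in nets_by_origin.values()]
    return {
        "coverage": address_count(prefixes),
        "prefixes": len(prefixes),
        "as_paths": len(paths),
        "as_links": len(links),
        "origin_ases": len(nets_by_origin),
        "apa_mean": mean(per_as),
        "apa_std_dev": pstdev(per_as),
    }


example_routes = [
    ("10.0.0.0/8", (1, 2, 3)),
    ("10.1.0.0/16", (1, 2, 3)),
    ("192.0.2.0/24", (4, 3, 5)),
]
example_stats = table_stats(example_routes)
```

In this toy input the /16 lies inside the /8, so it adds a prefix but no address space coverage.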
The utility of adding additional tables
Next we studied the utility of adding additional routing tables. We started with a base table -- the table dump from March 05, 2010 that gave the largest coverage of IP address space. We then added a single table at a time in decreasing order of address space coverage, calculating changes in the following metrics due to the newly added table:
1) Additional address space coverage.
2) Address space allocation changed.
3) Unique ASes and origin ASes.
4) Unique AS links.
5) Unique prefixes.
6) Unique more specific prefixes.
7) More specific prefixes with different origin AS.
Tables ordered by overall address space coverage
We first performed this analysis starting with the table dump that covered the largest amount of IPv4 address space as the base table. For the set of table dumps we collected on March 05, 2010, RV_linx covered the largest amount of address space. We then added individual tables in decreasing order of address space coverage, measuring the change in the metrics listed above caused by each additional table. We found that each additional table led to a less than 1% increase in address space coverage, address allocation change, and numbers of unique ASes and unique origin ASes. As far as the number of ASes (origin as well as non-origin ASes) is concerned, this result confirms previous observations that most ASes are observable from even a few vantage points.
However, for other metrics additional tables matter: adding a table can yield up to 4.8% more AS links than seen in the base table, up to 4.6% more prefixes, and up to 4.7% more more-specific prefixes. For most of the added tables, between 10% and 70% of the more-specific prefixes actually give a different origin AS.
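As an illustration of how such per-table increments could be computed, the following Python sketch compares a base table with one added table. It is a sketch under assumptions, not the code used for this study: the names `covering_prefix` and `added_table_metrics` are hypothetical, each table is reduced to a dictionary from prefix string to a single origin AS, and a new prefix counts as "more specific" when some shorter prefix in the base table contains it.

```python
import ipaddress


def covering_prefix(net, base):
    """Return the longest prefix in `base` that strictly contains `net`, or None."""
    for length in range(net.prefixlen - 1, -1, -1):
        candidate = net.supernet(new_prefix=length)
        if candidate in base:
            return candidate
    return None


def coverage(nets):
    """Number of addresses covered, counting overlapping prefixes once."""
    return sum(n.num_addresses for n in ipaddress.collapse_addresses(nets))


def added_table_metrics(base, new):
    """Compare an added table with the base table.

    base, new: dicts mapping a prefix string to its origin AS number
    (a simplified, hypothetical representation of a table dump).
    """
    base = {ipaddress.ip_network(p): asn for p, asn in base.items()}
    new = {ipaddress.ip_network(p): asn for p, asn in new.items()}

    new_prefixes = [n for n in new if n not in base]
    more_specific = 0
    different_origin = 0
    for net in new_prefixes:
        parent = covering_prefix(net, base)
        if parent is not None:
            more_specific += 1
            # The more-specific prefix overrides the covering prefix's origin.
            if base[parent] != new[net]:
                different_origin += 1

    base_cov = coverage(base)
    extra = coverage(list(base) + list(new)) - base_cov
    return {
        "new_prefixes": len(new_prefixes),
        "more_specific": more_specific,
        "more_specific_different_origin": different_origin,
        "extra_addresses": extra,
        "extra_coverage_pct": 100.0 * extra / base_cov,
    }


base_table = {"10.0.0.0/8": 100, "192.0.2.0/24": 200}
added_table = {
    "10.0.0.0/8": 100,
    "10.1.0.0/16": 300,       # more specific, different origin
    "10.2.0.0/16": 100,       # more specific, same origin
    "198.51.100.0/24": 400,   # new address space
}
increments = added_table_metrics(base_table, added_table)
```

The example shows why more-specific prefixes matter even when coverage barely changes: the two /16s add no address space, yet one of them changes the origin AS for the addresses it covers.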
Choosing the best additional table
Starting from the base table RV_linx (which covers the largest amount of IPv4 address space) from the Routeviews and RIPE dumps collected on March 05, 2010, we evaluated which table was the best to add in order to optimize a certain metric in the aggregated table. We selected the metric of IP-AS mapping changes caused by the new table, which quantifies how much new IP-AS mapping information the new table provides. We let t_cuml be the cumulative prefix-AS mapping at a given point, i.e., after adding the Nth table. Of the remaining tables, we must determine which table, when added to t_cuml, causes the largest change in IP-AS mapping. We found that even adding the "best" table for this purpose at every step resulted in less than 1% change in the address space coverage and IP-AS mapping.
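The greedy selection described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the names `new_pairs` and `greedy_order` are hypothetical, each table is a dictionary from prefix to origin AS, and the change in IP-AS mapping is approximated by the number of (prefix, origin AS) pairs that the cumulative mapping has not yet seen, rather than by an address-level comparison.

```python
def new_pairs(cumulative, table):
    """Count (prefix, origin AS) pairs in `table` absent from `cumulative`."""
    return sum(1 for p, asn in table.items() if asn not in cumulative.get(p, ()))


def greedy_order(base, candidates):
    """Repeatedly add the candidate table that changes the mapping the most.

    base: dict prefix -> origin AS for the base table.
    candidates: dict table name -> (dict prefix -> origin AS).
    Returns a list of (table name, number of new pairs) in the order added.
    """
    # Cumulative mapping: prefix -> set of origin ASes seen so far.
    cumulative = {p: {asn} for p, asn in base.items()}
    remaining = dict(candidates)
    order = []
    while remaining:
        best = max(remaining, key=lambda name: new_pairs(cumulative, remaining[name]))
        table = remaining.pop(best)
        order.append((best, new_pairs(cumulative, table)))
        for p, asn in table.items():
            cumulative.setdefault(p, set()).add(asn)
    return order


base_table = {"10.0.0.0/8": 1, "192.0.2.0/24": 2}
candidate_tables = {
    "t1": {"10.0.0.0/8": 1, "198.51.100.0/24": 3},
    "t2": {"10.0.0.0/8": 9, "198.51.100.0/24": 3, "203.0.113.0/24": 4},
    "t3": {"192.0.2.0/24": 2},
}
selection = greedy_order(base_table, candidate_tables)
```

In this toy input, table `t2` is chosen first because it contributes three unseen pairs; after it is merged, the remaining tables contribute nothing new.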
Comparison with Team Cymru's WHOIS service
As a form of independent validation, or at least calibration, we compared the IP-AS mappings we obtained from Routeviews and RIPE BGP dumps with the IP-AS mapping service provided by Team Cymru on March 22, 2010. For this purpose, we constructed a sample list of 24M IP addresses collected from Ark traces seen during our IPv4 topology probing during January 2010. To limit the load on Cymru servers, we queried only one address per /24 (the *.1 address), thus reducing the set of queried addresses to 2.7M. For each address, we compared the AS returned by Cymru with the AS obtained from the prefix-AS mappings derived from Routeviews and RIPE table dumps collected on the same date. In the table below, Cym denotes the mapping from Cymru's service and Tab the mapping from the RV+RIPE tables.
| table | IP addresses | IP-AS mismatches | mismatch % | Single Cym undef | % | Single Tab undef | % | Single mismatch | % | MOAS missing in Tab | % | MOAS missing in Cym | % | MOAS missing in both | % |
- "Single Cym undef" refers to the number of mismatches that occurred because Cymru did not have a matching prefix for that IP address.
- "Single Tab undef" is for mismatches where our combination of routing tables did not have a match for the IP address.
- "Single mismatch" refers to the case where both the table and Cymru each found a different (but only one) matching AS.
- "MOAS missing in Tab" refers to the cases where the IP address mapped to multiple origin ASes, and one of the ASes was missing in our tables.
- "MOAS missing in Cym" refers to the cases where the IP address mapped to multiple origin ASes, and one of the ASes was missing in the Cymru lookup.
- "MOAS missing in both" refers to the cases where the IP address mapped to multiple origin ASes, and some AS from that set was missing in both the table dumps and Cymru lookups.
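The sampling step and the mismatch categories above can be sketched in Python. This is one possible reading of the column definitions, not the classification code used for this comparison: the function names are hypothetical, each lookup result is modelled as a set of origin ASes (empty when no prefix matches), and the "single" categories are assumed to apply when neither side returns more than one AS.

```python
import ipaddress


def one_per_slash24(addresses):
    """Reduce a list of IPv4 addresses to the *.1 address of each /24."""
    picks = {
        ipaddress.ip_network(f"{a}/24", strict=False).network_address + 1
        for a in addresses
    }
    return [str(a) for a in sorted(picks)]


def classify(cym, tab):
    """Classify one address given the origin AS sets from Cymru and the tables."""
    cym, tab = set(cym), set(tab)
    if cym == tab:
        return "match"
    if len(cym) <= 1 and len(tab) <= 1:
        if not cym:
            return "Single Cym undef"
        if not tab:
            return "Single Tab undef"
        return "Single mismatch"
    # At least one side reports multiple origin ASes (MOAS).
    missing_in_tab = cym - tab
    missing_in_cym = tab - cym
    if missing_in_tab and missing_in_cym:
        return "MOAS missing in both"
    if missing_in_tab:
        return "MOAS missing in Tab"
    return "MOAS missing in Cym"


sample = one_per_slash24(["192.0.2.77", "192.0.2.200", "198.51.100.5"])
labels = [
    classify({64500}, {64500}),
    classify(set(), {64500}),
    classify({64500}, set()),
    classify({64500}, {64501}),
    classify({64500, 64501}, {64500}),
    classify({64500}, {64500, 64501}),
    classify({64500, 64501}, {64500, 64502}),
]
```

Modelling both results as sets keeps the single-origin and MOAS cases in one code path, with the set differences indicating which side is missing an AS.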
We also analyzed how the previously defined metrics changed in routing
tables over time. We collected 10 routing table dumps in March 2010
from Routeviews' RV_linx and RIPE's RRC01, which are the largest
collectors in terms of IPv4 address space coverage. We labeled these
table dumps chronologically as table 0 to table 9. Note that
consecutive tables are 3 days apart. We then calculated our
list of metrics for the following cases:
1) Across immediately adjacent tables: For i = 0 .. 8, consider table i as the base table, and compute the effect of integrating table i+1 with table i on the list of metrics.
2) Compare the first table with each successive table: Consider table 0 as the base table, and study the effect of integrating each table i with table 0 on the previous metrics.
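The two comparison schemes can be illustrated with a short Python sketch. It assumes, for illustration only, that each table has already been reduced to a set of items for one metric (for example, its set of AS links), and that the change is reported as the percentage of items the second table adds relative to the base table; the function names are hypothetical.

```python
def percent_added(base, other):
    """Percentage of items in `other` not in `base`, relative to the size of `base`."""
    return 100.0 * len(other - base) / len(base)


def adjacent_changes(tables):
    """Case 1: table i is the base, table i+1 is integrated with it."""
    return [percent_added(tables[i], tables[i + 1]) for i in range(len(tables) - 1)]


def baseline_changes(tables):
    """Case 2: table 0 is always the base, each later table is integrated with it."""
    return [percent_added(tables[0], t) for t in tables[1:]]


# Toy example with three tables, each represented by a set of link identifiers.
toy_tables = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 5, 6}]
adjacent = adjacent_changes(toy_tables)
baseline = baseline_changes(toy_tables)
```

In the toy data each table differs from its predecessor by one item, but the last table differs from the first by two, so the fixed-baseline difference grows with time while the adjacent difference stays constant.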
Comparing adjacent tables
We first compared immediately adjacent tables from Routeviews and RIPE with respect to the difference in IP-AS mapping, number of unique ASes and origin ASes, and the number of unique AS links. We considered each pair of immediately adjacent tables, using the first of the pair as the base table, and computed the changes in our metrics if the second table were combined with the first. Overall, the metrics did not change significantly across the adjacent tables we analyzed. Over all pairs of adjacent tables, we saw a < 2% increase in address space coverage, a < 3% change in IP-AS mapping, and changes of < 1% in unique ASes, < 1% in unique origin ASes, and < 3% in unique AS links.
Comparing additional tables with the first table
We performed a slightly different temporal analysis, treating the first table (of the 10) collected in the month as the base table. We added successive individual tables collected over the month to this first table, and computed the change in IP-AS mapping, number of unique ASes and AS links over the duration of the month. Note that we still only consider pairs of tables, with the first table fixed.
As expected, the difference from comparing the first table with each subsequent table in the month is greater than when we compare consecutive table pairs. Also, the difference increases as we combine the first table with tables later in the month. We see the largest difference in the number of unique AS links, where the difference between the first and the last table of the month is about 5%.