Inspired by the promising foundation presented in Mehmet Gunes' APAR, we wrote a highly optimized implementation for production use on large-scale Internet topologies, fixed a few bugs, and experimented with our own improvements to the algorithm. We call the resulting tool "kapar".
The most significant optimization in kapar relative to APAR is that it avoids storing the complete set of paths in memory. Instead, kapar makes a single pass over the set of traces and extracts only the information it needs.
- First, kapar finds all unique 3-hop segments to use for the alias resolution phase.
- Second, it identifies common prefixes of length 24 or greater among addresses in the same trace to generate a list of subnets that cannot exist according to the subnet accuracy condition.
- Finally, it assigns a unique ID number to each trace, and stores a list of all observed addresses and a compressed bitmap of the IDs of the traces in which each address appeared. These trace ID sets contain sufficient information for checking the no-loop condition.
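The three extraction steps above can be illustrated with a minimal Python sketch. This is not kapar's actual code: the function names (`scan_traces`, `common_prefix_len`, `share_trace`) are hypothetical, each trace is assumed to be a list of responding IPv4 addresses, a plain Python set stands in for the compressed bitmap, and only the longest shared prefix of each address pair is recorded rather than the full subnet accuracy test. The helper `share_trace` also assumes that the no-loop check reduces to asking whether two addresses ever appeared in the same trace.

```python
import ipaddress
from collections import defaultdict


def common_prefix_len(a: str, b: str) -> int:
    """Return the number of leading bits shared by two IPv4 addresses."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def scan_traces(traces):
    """Single pass over traces, keeping only summary data (illustrative)."""
    segments = set()            # unique 3-hop segments
    excluded = set()            # networks shared by two addresses in one trace
    seen_in = defaultdict(set)  # address -> IDs of traces containing it

    for trace_id, hops in enumerate(traces):
        # Step 1: every window of three consecutive hops.
        for i in range(len(hops) - 2):
            segments.add(tuple(hops[i:i + 3]))

        uniq = sorted(set(hops))
        for i, a in enumerate(uniq):
            # Step 3: record which traces each address appeared in.
            seen_in[a].add(trace_id)
            # Step 2: address pairs in this trace sharing a /24 or longer prefix.
            for b in uniq[i + 1:]:
                n = common_prefix_len(a, b)
                if n >= 24:
                    excluded.add(ipaddress.ip_network(f"{a}/{n}", strict=False))

    return segments, excluded, seen_in


def share_trace(seen_in, a: str, b: str) -> bool:
    """True if a and b were observed together in at least one trace."""
    return bool(seen_in.get(a, set()) & seen_in.get(b, set()))


example = [
    ["10.0.0.1", "10.0.1.1", "10.0.1.2", "10.0.2.1"],
    ["10.0.0.1", "10.0.3.1"],
]
segments, excluded, seen_in = scan_traces(example)
```

After the pass, the original paths can be discarded: the three structures hold what the later phases need, which is the point of the optimization. A production implementation would replace the per-address sets with a compressed bitmap to keep memory bounded on large trace collections.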
Kapar also improves upon the APAR algorithm in several ways.
- First, it can load a set of aliases obtained from another source, e.g. results of a fingerprint technique or published "ground truth" topologies.
- Second, during the subnet formation phase, kapar optionally uses a stricter test for point-to-point subnet existence.
- Third, during the alias inference phase, kapar uses stricter tests for the common neighbor condition.
- Finally, the kapar implementation can make use of TTL data obtained from multiple vantage points, which constrains the subnet formation and alias resolution phases, further reducing the rate of false positives in each.
Additional details about the motivation, algorithms, and performance of kapar are available in Ken Keys' "Internet-Scale IP Alias Resolution Techniques", published in ACM SIGCOMM Computer Communication Review (CCR), January 2010.