MIDAR

Creating accurate maps of Internet topology requires not just discovering how interfaces are connected, but also which interfaces belong to the same router, a process called IP alias resolution. MIDAR (Monotonic ID-based Alias Resolution) is one tool for doing this.

Many routers generate IP ID values using a simple counter that is shared across interfaces on the router. If we send probe packets to two interface addresses belonging to the same router, we can detect that the IP ID values in the response packets were generated by a shared counter and thus that the two addresses are aliases. Ally, RadarGun, and MIDAR all use this basic idea to detect aliases, but differ in how they detect shared counters. MIDAR achieves higher accuracy than previous IP ID based techniques, and is the first to be practical for production use on large-scale Internet topologies.

At CAIDA, we use MIDAR in conjunction with iffinder and kapar to perform alias resolution for the Internet Topology Data Kit. The three tools complement each other, giving a more accurate and complete result than any one tool alone.

MIDAR Features

MIDAR introduces several major new contributions relative to previous IP ID based techniques:

The Monotonic Bounds Test (MBT) for accurate testing of shared counters
Like RadarGun, MIDAR compares time series of IP ID samples collected from multiple interfaces. But whereas RadarGun checks for a small difference in interpolated ID value, using arbitrary thresholds, MIDAR's MBT checks strictly for a necessary condition: if two time series are derived from a single shared counter, then the union of those time series must be monotonic. To avoid uncertainty in inferring the time that the response was generated by the router, MIDAR uses the accurately measurable time range between sending the probe and receiving the response. The MBT yields virtually zero false negatives for shared counters since a counter must be monotonic by definition, and a very low false positive rate for shared counters since it keeps the bounds as tight as possible.
Sliding window scheduling algorithm for scaling up probing
Probing addresses in parallel within a time period short enough to yield useful measurements becomes problematic when the number of addresses reaches a million or more. To avoid this, MIDAR relies on the fact that when addresses share a counter, their IP ID time series must have similar velocity (rate of ID change). After collecting velocity estimates, MIDAR probes a continuously changing window (subset) of targets such that all targets with similar velocity will be probed in parallel, but targets with dissimilar velocities may not be.
Four independent probing methods
During the velocity estimation stage, MIDAR probes addresses with four different methods, and then chooses the best method to use in later stages:
- TCP: send TCP ACK to port 80 on target; target replies with TCP RST
- UDP: send UDP to unused port on target; target replies with ICMP Port Unreachable
- ICMP: send ICMP Echo Request to target; target replies with ICMP Echo Reply
- Indirect: send ICMP echo request to a host past the target, with its TTL set to expire at the target; target replies with ICMP Time Exceeded
Using multiple methods allows MIDAR to collect time series from targets that may be unresponsive to one particular method. Our experiments show that it is common (though not universal) for routers to use a shared counter for the responses to all or most of these methods, so it is fruitful to compare time series collected with different methods.
Probing from multiple monitors
By probing from multiple vantage points, MIDAR can achieve a higher aggregate probing rate than would be practical from a single host, and can also assign targets to monitors in a way that optimizes responsiveness. This feature does introduce some uncertainty in time measurement since the monitors' clocks can never be perfectly synchronized, but we can account for that by widening the time range in the MBT without sacrificing the rigor of the test.
Formalized multiple testing stages to eliminate false positives
When testing N addresses for pairs that share a counter, the number of actual positives (pairs that share a counter) is proportional to N, but the number of false positives (address pairs that appear to share a counter but actually do not) is proportional to the total number of possible pairs, O(N²). Thus, when N is large, even a test with what intuitively seems like a very low false positive rate will generate a large number of false positives relative to the number of actual positives. MIDAR uses a series of probing stages carefully designed to efficiently eliminate the false positives and increase confidence in the true positives.

A full scale MIDAR run consists of four probing stages:

Estimation: for each address, determine velocity and best probe method for use in subsequent stages.
Discovery: probe all target addresses with a sliding window schedule to efficiently discover pairs of addresses that potentially share a counter.
Elimination: re-probe the discovered potential alias pairs to rule out most false positives.
Corroboration: probe each candidate alias set as a whole to increase confidence in true positives and rule out remaining false positives. The Corroboration stage can also be used standalone to test alias sets discovered by another alias resolution technique.

Requirements

Detailed requirements and installation instructions are available in the distribution package. The following list gives a general idea.

POSIX or unix-like operating system
a modern C++ compiler
perl
ruby version 1.9.x or 2.x
the mper probing engine
the rb-mperio mper client library
the arkutil library
gnuplot (optional)
additional requirements for the midar-full, distributed mode: Judy library; amalgalite and open4 third-party Ruby gems; and Ark Ruby gems eva and marinda bundled in the MIDAR distribution package.

Downloads

midar-0.8.0.tar.gz - 2020-12-17 (see README for a complete list of changes)

In the 0.8.0 release, we include fixes to various C++ compiler incompatibilities. In 0.7.0, we vastly improved the fault-tolerance of MIDAR to a variety of run-time problems, so that runs can complete with little to no manual intervention in most cases. The current implementation should be able to handle failures like the following (not an exhaustive list):

prober fails at startup for any reason (e.g., missing target file, missing required Ruby gem, Marinda not running, out of processes, out of memory, prober missing)
prober dies midway for any reason (e.g., Marinda dies, out of memory/swap, disk full, prober crashes due to bug)
elim/cor multi-src hangs or fails
monitor reboots/crashes at any time
network connection lost with monitor at any time

Failures are automatically detected by midar-full (actually, by auxillary commands started by midar-full), which then restarts a given MIDAR stage with failed monitor(s) excluded (that is, targets are regenerated to redistribute targets to non-failed monitors, probers are restarted with new targets, and the elim/cor multi-src driver is restarted).

This release provides three front-ends to MIDAR:

midar-cor
the MIDAR small-scale resolution tool, capable of testing a small (< 200) set of IP addresses with the MIDAR IP ID test, corroboration stage only, using a single probe method, from a single monitor host.
midar-full (local mode)
the medium-scale MIDAR resolution system, capable of testing a medium-size (< 40000) set of IP addresses with the MIDAR IP ID test, with all MIDAR stages and multiple probe methods, from a single monitor host.
midar-full (distributed mode)
the large-scale MIDAR system, capable of testing an Internet-scale (at least 2 million) set of IP addresses with the MIDAR IP ID test, with all MIDAR stages and probe methods, from multiple monitor hosts.

MIDAR is distributed under the GNU General Public License, version 2.

Authors

MIDAR is written and maintained by Ken Keys and Young Hyun.

Related Objects

See https://catalog.caida.org/software/midar/ to explore related objects to this document in the CAIDA Resource Catalog.