Learning to Extract and Use ASNs in Hostnames
M. Luckie, A. Marder, M. Fletcher, B. Huffaker, and k. claffy, "Learning to Extract and Use ASNs in Hostnames", in ACM Internet Measurement Conference (IMC), Oct 2020.


Matthew Luckie [2]
Alexander Marder [1]
Marianne Fletcher [2]
Bradley Huffaker [1]
kc claffy [1]

[1] CAIDA, San Diego Supercomputer Center, University of California San Diego
[2] University of Waikato

We present the design, implementation, evaluation, and validation of a system that learns regular expressions (regexes) to extract Autonomous System Numbers (ASNs) from hostnames associated with router interfaces. We train our system with ASNs inferred by RouterToAsAssignment and bdrmapIT using topological constraints from traceroute paths, as well as ASNs recorded by operators in PeeringDB, to learn regexes for 206 different suffixes. Because these methods for inferring router ownership can infer the wrong ASN, we modify bdrmapIT to integrate this new capability to extract ASNs from hostnames. Evaluating against ground truth, our modification correctly distinguished stale from correct hostnames for 92.5% of hostnames with an ASN different from bdrmapIT’s initial inference. This modification allowed bdrmapIT to increase the agreement between extracted and inferred ASNs for these routers in the January 2020 ITDK from 87.4% to 97.1% and reduce the error rate from 1/7.9 to 1/34.5. This work presents a new avenue for collecting validation data, opening a broader horizon of opportunity for evidence-based router ownership inference.
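To illustrate the idea of extracting an ASN from a router hostname and comparing it with a topologically inferred ASN, here is a minimal Python sketch. It is not the system described in the paper: the suffix `example.net`, the naming convention `as<ASN>.<site>.example.net`, the regex, and the function names `extract_asn` and `agrees` are all assumptions made up for illustration, and the regex is hand-written rather than learned.

```python
import re
from typing import Optional

# Hypothetical hand-written regex for an invented suffix "example.net".
# Assumed convention: as<ASN>.<site>.example.net
ASN_REGEX = re.compile(r"^as(\d+)\.[a-z0-9-]+\.example\.net$")


def extract_asn(hostname: str) -> Optional[int]:
    """Return the ASN embedded in the hostname, or None if the regex does not match."""
    match = ASN_REGEX.match(hostname.lower())
    return int(match.group(1)) if match else None


def agrees(hostname: str, inferred_asn: int) -> Optional[bool]:
    """Compare the extracted ASN with an inferred one.

    Returns None when the hostname carries no ASN under the assumed convention.
    """
    extracted = extract_asn(hostname)
    if extracted is None:
        return None
    return extracted == inferred_asn


if __name__ == "__main__":
    # ASNs 64496-64511 are reserved for documentation use.
    print(extract_asn("as64500.lax1.example.net"))
    print(agrees("as64500.lax1.example.net", 64501))
    print(extract_asn("core1.lax1.example.net"))
```

A disagreement between the extracted and inferred ASN does not by itself say which one is wrong: the hostname may be stale, or the inference may be incorrect. Distinguishing those two cases is the part the abstract attributes to the modified bdrmapIT, and this sketch does not attempt it.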

Keywords: measurement methodology, software/tools, topology
  Last Modified: Tue Nov-17-2020 04:47:43 UTC