FANTAIL Project Overview
Internet cartography has emerged as a new field spanning computer science and network science, with several global Internet measurement infrastructures running comprehensive topology mapping experiments continuously for years. The networking community now holds vast amounts of traceroute data. CAIDA has operated the longest-running of these measurement platforms (Archipelago), which has collected 90 billion traceroutes in 39 TB of files, growing by 16 billion traces and 7 TB annually (5-year doubling rate). RIPE Atlas collects ~700 million IPv4/IPv6 traceroutes each month, and Measurement Lab (M-Lab) collects millions of traceroutes per day. The biggest remaining obstacle to more productive scientific use of this wealth of information is infrastructural: the lack of an easy-to-use and analytically powerful exploratory interface to the data.
In response to community feedback, the goal of the FANTAIL system (Facilitating Advances in Network Topology Analysis) is to enable discovery of the full potential value of massive raw Internet end-to-end path measurement data sets. FANTAIL will let researchers search community traceroute data using high-level queries and perform data processing and analysis tasks on the matching traces, without owning or operating a cluster and without learning big data programming. FANTAIL is a four-component system:
- an interactive web interface;
- an API built on web standards;
- a full-text search system based on Elasticsearch; and
- a big data processing system based on Apache Spark.
Although our goal is to enhance the general accessibility and utility of this data, our project will be driven by specific compelling use cases, in response to research community needs for interactive exploratory capabilities. To this end, we will identify and implement reusable components, called analysis modules, that serve as primitives for constructing more complex data-processing pipelines. Users will specify, via the web interface or API, a sequence of analysis modules to execute on the set of traceroute paths matched by their queries. The FANTAIL system will then perform the queries, run the analysis modules, and provide the output for download so that researchers can analyze or process it further on their own systems. We will implement analysis modules that are useful for (1) performing data reduction (to minimize the amount of data users have to download and process), (2) enhancing raw traceroute data with various annotations available publicly or created by us, and (3) offloading commonly needed analysis and data processing tasks from users.
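To illustrate the idea of chaining analysis modules over query-matched traces, here is a minimal Python sketch. It is not FANTAIL's actual API or data model: the `Trace` record, the module names (`match_destination`, `annotate_hop_count`, `drop_hops`), and `run_pipeline` are all hypothetical, and the sketch runs in memory rather than on Elasticsearch or Apache Spark. The addresses come from documentation-reserved ranges.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Iterator, List


@dataclass
class Trace:
    """A simplified traceroute record (hypothetical schema, for illustration)."""
    src: str
    dst: str
    hops: List[str]
    annotations: Dict[str, object] = field(default_factory=dict)


# A module takes a stream of traces and returns a (possibly changed) stream.
Module = Callable[[Iterable[Trace]], Iterator[Trace]]


def match_destination(prefix: str) -> Module:
    """Stand-in for a query: keep traces whose destination starts with prefix."""
    def run(traces: Iterable[Trace]) -> Iterator[Trace]:
        return (t for t in traces if t.dst.startswith(prefix))
    return run


def annotate_hop_count(traces: Iterable[Trace]) -> Iterator[Trace]:
    """Annotation module: record the number of hops on each trace."""
    for t in traces:
        t.annotations["hop_count"] = len(t.hops)
        yield t


def drop_hops(traces: Iterable[Trace]) -> Iterator[Trace]:
    """Data-reduction module: discard per-hop detail, keep annotations."""
    for t in traces:
        yield Trace(t.src, t.dst, [], dict(t.annotations))


def run_pipeline(traces: Iterable[Trace], modules: List[Module]) -> List[Trace]:
    """Apply each module in order to the output of the previous one."""
    stream: Iterator[Trace] = iter(traces)
    for module in modules:
        stream = module(stream)
    return list(stream)


traces = [
    Trace("192.0.2.1", "198.51.100.7",
          ["192.0.2.254", "203.0.113.1", "198.51.100.7"]),
    Trace("192.0.2.1", "203.0.113.9", ["192.0.2.254", "203.0.113.9"]),
]
result = run_pipeline(
    traces,
    [match_destination("198.51.100."), annotate_hop_count, drop_hops],
)
```

Each module in this sketch shares one signature (a stream of traces in, a stream of traces out), so modules can be reordered or combined freely. Ordering still matters: `annotate_hop_count` must run before `drop_hops`, since the reduction step removes the data the annotation is computed from.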
- FANTAIL: Facilitating Advances in Network Topology Analysis, FANTAIL Advisory Committee, Oct 2020.
- FANTAIL: Facilitating Advances in Network Topology Analysis, Workshop on Active Internet Measurements: Knowledge of Internet Structure: Measurement, Epistemology, and Technology (AIMS-KISMET), Feb 2020.
Please send e-mail to email@example.com with any questions about the FANTAIL system, or if you are interested in very early (pre-alpha) access.