Internet Voyager for Gathering Cyber Threat Intelligence

This project deploys dual-stack IPv4/IPv6 telescopes and honeypots across distributed vantage points to collect datasets for analyzing malicious Internet activity and supporting ML/AI-driven cybersecurity research.

Sponsored by:
National Science Foundation (NSF)

Principal Investigators: Ka Pui Mok, kc claffy

Funding source: CNS-2450552

Period of performance: October 1, 2025 - September 30, 2028.


Overview

For years, researchers have deployed network telescopes on unused IPv4 address space and/or public cloud infrastructure to passively capture incoming unsolicited traffic, known as Internet Background Radiation (IBR), in order to identify victims of denial-of-service attacks and to detect other malicious Internet activity. With consistent U.S. government support, CAIDA has sustained operation of the world’s largest IPv4 network telescope for over two decades, helping hundreds of CISE (NSF Directorate for Computer and Information Science and Engineering) researchers study Internet-wide cybersecurity incidents and produce datasets for cybersecurity education.

However, this passive approach has limitations: it captures only a few types of security events, and malicious actors are evolving their tactics to evade detection. Researchers have also deployed honeypots, which respond to unsolicited traffic to lure attackers into further engagement, yielding attack fingerprints, victim identification, and malware samples. Both telescopes and honeypots face a daunting challenge with the growing use of IPv6. Exhaustively scanning the vast IPv6 address space, or even a small IPv6 network, is practically infeasible, so scanners must strategically target likely-active networks, and a sensor on otherwise unused address space may rarely be probed. Furthermore, most existing honeypot implementations support only one IPv4 address per instance; scaling them to monitor significant segments of IPv6 address space is resource-prohibitive.
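As a rough illustration of why exhaustive scanning breaks down in IPv6, the sketch below estimates how long it would take to send one probe to every address in a prefix at a fixed rate. The probe rate of one million packets per second is an assumption chosen for illustration, `exhaustive_scan_seconds` is a hypothetical helper, and `2001:db8::/64` is the reserved documentation prefix rather than a network used by this project.

```python
from ipaddress import ip_network

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_scan_seconds(prefix: str, probes_per_second: float) -> float:
    """Time needed to send one probe to every address in `prefix`."""
    return ip_network(prefix).num_addresses / probes_per_second


# Assumed probe rate (illustrative only, not a measured figure).
RATE = 1_000_000

# The entire IPv4 address space: 2**32 addresses.
ipv4_seconds = exhaustive_scan_seconds("0.0.0.0/0", RATE)

# A single IPv6 /64 subnet: 2**64 addresses.
ipv6_subnet_years = exhaustive_scan_seconds("2001:db8::/64", RATE) / SECONDS_PER_YEAR

print(f"Entire IPv4 space:   {ipv4_seconds / 3600:.1f} hours")
print(f"One IPv6 /64 subnet: {ipv6_subnet_years:,.0f} years")
```

Under this assumed rate, the whole IPv4 space takes on the order of an hour, while a single /64 takes hundreds of thousands of years, which is why IPv6 scanners rely on target lists and heuristics rather than brute force.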

We propose iVoyager, a transformative cyberinfrastructure (CI) designed to enable CISE researchers to effectively explore the landscape of Internet threats by scalably gathering cyber threat intelligence. Specifically, iVoyager will provide three capabilities:

  1. A flexible virtualized environment that lets researchers rapidly develop and scalably deploy distributed dual-stack (IPv4 and IPv6) telescopes and honeypots.
  2. A proactive telescope that applies novel active techniques to attract malicious IPv6 traffic.
  3. Lightweight telescope and honeypot vantage points deployed in public clouds and IXPs, in collaboration with Internet2 and DREN.

We will operationalize our reference design of iVoyager to collect longitudinal datasets that facilitate the use of machine learning/artificial intelligence (ML/AI) for cyber threat hunting, anomaly detection, and malware analysis. iVoyager will not replace existing network telescopes or honeypots; it will complement these CIs by providing additional datasets for more comprehensive cyber threat analysis.

Intellectual Merit

This project aligns with the CIRC (Community Infrastructure for Research in Computer and Information Science and Engineering) goal to enable CISE communities to pursue a focused research agenda, in this case to effectively explore the landscape of Internet threats by scalably gathering cyber threat intelligence. The infrastructure will collect datasets that facilitate use of ML/AI for cyber threat hunting, anomaly detection, and malware analysis. The high-quality datasets we collect will enable researchers to better characterize IP spoofing and other malicious Internet activities, as well as provide labeled data for use with ML/AI approaches to address cybersecurity challenges.

Broader Impacts

This project aligns with CIRC’s goal to provide new research opportunities for a broad-based CISE community with a focused research agenda. This project is structured to solve engineering challenges, overcome policy barriers, and foster a strong, sustainable community focused on operational network security research and able to navigate IPv6 security challenges. The PIs and collaborators include faculty who will integrate the data into their curriculum materials.

Acknowledgment of awarding agency’s support

National Science Foundation (NSF)

This material is based on research sponsored by the National Science Foundation (NSF) grant CNS-2450552. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of NSF.
