Funding source: NSF CNS-1059439. Period of performance: July 1, 2011 - June 30, 2014.
All proposed tasks were completed as scheduled.
Over the last decade, network telescopes have been used to observe unsolicited Internet traffic sent to unassigned address space ("darkspace"). Network telescopes are a unique type of instrumentation that provides global visibility into, and supports historical trend analysis of, a wide range of security-related Internet events, including scanning of address space for vulnerable targets, denial-of-service attacks, the automated spread of Internet worms or viruses, and miscellaneous misconfigurations. In this project we expanded network telescope instrumentation at UCSD so that researchers can use this global data source to improve our collective understanding of security-related events such as large-scale attacks and malware spread. Recently, we also demonstrated that analysis of darkspace traffic has broader potential: it enables inference of macroscopic phenomena that are not directly related to the malware generating such traffic (for example, large-scale Internet outages and usage of the IPv4 address space). Funding for the telescope measurement infrastructure is therefore vital to support many diverse active (and NSF-funded) projects.
Three pervasive challenges in network traffic research guided our expansion: collecting and storing, efficiently curating, and sharing large volumes of data. We deployed and evaluated an innovative shift in network monitoring that explicitly addressed all three challenges by enabling near-real-time sharing of traffic data in a way that maximizes data utility for research and analysis while protecting user privacy.
Specifically, we deployed a flexible, multi-level framework for archiving both data and metadata. The infrastructure is now capable of supporting (i) analysis, reporting, and visualization in near real time, and (ii) longitudinal analysis of historical data, in some cases with dramatically reduced processing times.
We also developed and released as open source the Corsaro package, a modular software framework for high-speed analysis of trace data on a per-packet basis and aggregation of results in customizable time intervals. The trace analysis logic is separated into a set of plugins for geolocation, IP-to-AS lookup, filtering by prefix, filtering by geographical area, anonymization, and other processing. The low overhead of creating a new plugin, coupled with the efficiency and reliability of Corsaro, allows users both to perform ad hoc exploratory investigations and to carry out large-scale near-real-time analyses. Although specifically designed for passive traces captured by telescopes, Corsaro is flexible enough to work with any type of passive network trace data.
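The plugin-and-interval design described above can be illustrated with a minimal Python sketch. This is not Corsaro's code or API: the `Packet` record, the two example plugins, and the `aggregate` function are hypothetical names introduced here only to show per-packet processing by independent plugins with results grouped into fixed time intervals.

```python
import ipaddress
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified packet record (not a Corsaro type)."""
    timestamp: float  # seconds since the epoch
    src_ip: str
    dst_port: int


# A plugin inspects one packet and returns a label to count, or None to skip it.
Plugin = Callable[[Packet], Optional[str]]


def by_dst_port(pkt: Packet) -> Optional[str]:
    """Example plugin: count packets per destination port."""
    return f"port:{pkt.dst_port}"


def make_prefix_filter(prefix: str) -> Plugin:
    """Example plugin factory: count only packets whose source is in `prefix`."""
    network = ipaddress.ip_network(prefix)

    def plugin(pkt: Packet) -> Optional[str]:
        if ipaddress.ip_address(pkt.src_ip) in network:
            return f"src:{prefix}"
        return None

    return plugin


def aggregate(
    packets: Iterable[Packet], plugins: List[Plugin], interval: int
) -> Dict[int, Counter]:
    """Run every plugin on every packet; group counts by interval start time."""
    results: Dict[int, Counter] = defaultdict(Counter)
    for pkt in packets:
        # Start of the interval (in seconds) that contains this packet.
        bucket = int(pkt.timestamp // interval) * interval
        for plugin in plugins:
            label = plugin(pkt)
            if label is not None:
                results[bucket][label] += 1
    return dict(results)
```

Calling `aggregate(packets, [by_dst_port, make_prefix_filter("10.0.0.0/8")], 60)` returns one counter per 60-second interval. In this sketch, adding a new kind of analysis means writing one more small function, which is the property the plugin design is meant to provide.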
Finally, we built infrastructure to allow vetted researchers to run analysis programs approximately one hour after data collection, significantly shortening the average time to access telescope data for research purposes. For safe and ethical data sharing, we used our Privacy-Sensitive Sharing Framework, which integrates privacy-enhancing technology with a policy framework using proven and standard privacy principles and obligations of data seekers and data providers.
After deployment, we started serving requests from universities and research institutions (both American and foreign) for access to our near-real-time telescope data. In the majority of cases we also gave guidance on improving and optimizing their data analysis. Access to our telescope traffic data sets and near-real-time traces has also been supported through DHS's Protected REpository for the Defense of Infrastructure against Cyber Threats (PREDICT) program, which is intended to promote empirical research into network infrastructure security.
The data collection and the tools we developed supported successful analyses of a number of macroscopic Internet events: large-scale probing from botnets (Sality and Carna), country-level censorship events (Egypt, Libya, Syria), AS-level outages caused by BGP misconfiguration, and global effects of Microsoft "Patch Tuesday" on the volume and characteristics of malicious traffic. These results were presented at major conferences and workshops (IMC, PAM, CoNEXT, TMA, SIGCOMM).
The intellectual merit of our project lies in our methodology and instrumentation enhancements, which increased the utility of network telescope instrumentation, transforming it into a more accessible, practically useful source of security-relevant data. The results of this project contributed to the development of efficient early detection, reaction, and mitigation strategies, thus enabling more scientific pursuit of cybersecurity research and critical advances in the global fight against pervasive malware.
The broader impacts of this project are diverse. We broadly disseminated the results of this project to the academic and security communities via conferences and invited talks (about 20), web sites, blogs (4 blog posts on the CAIDA web site), scientific publications (14), and a workshop that we organized in 2012. We also created an immediate link between research and education by curating and releasing an educational data kit built from samples of telescope data containing security event signatures, and by hosting and training 9 undergraduate students supported by the NSF Research Experiences for Undergraduates (REU) program. Most importantly, this project enabled convenient remote access to a wealth of valuable and insightful scientific data, high-level computing resources, and the expertise of CAIDA researchers, lowering barriers to engaging in network security research for institutions serving underrepresented minorities.