Collection, curation, and sharing of data for scientific analysis of Internet traffic, topology, routing, performance, and security-related events is one of CAIDA's core objectives. Our Overview of available CAIDA Data has links to data descriptions, request forms (restricted data), download locations (public data), real-time reports, and other metadata. Note that since April 2016 some CAIDA datasets are distributed exclusively through IMPACT (Information Marketplace for Policy and Analysis of Cyber-risk and Trust).
Dataset Curation and Access to Datasets
CAIDA curates datasets resulting from both active and passive measurement of the Internet. We provide access to these datasets for researchers in accordance with University of California, San Diego, policy. We maintain servers that allow researchers to download data via secure login and encrypted transfer protocols, and provide access for researchers to CAIDA computers to analyze data using CAIDA resources.
Active measurements: CAIDA's flagship Macroscopic Topology Project measures Internet connectivity and latency using active probing to a stratified cross-section of the commodity IPv4 and IPv6 Internet.
Passive measurements: CAIDA collaborates with organizations that operate network infrastructure in academic, non-profit, and commercial settings and in dark address space to passively monitor traffic on selected links, anonymize IP addresses to allow trace sharing, and in some cases publish (close to) real-time statistics of traffic captured from these links.
User activity: During the three years from 2015 through 2017, over 1.2 million unique visitors browsed our main website. During this time CAIDA granted 1,716 researchers from 82 countries access to restricted data. The countries with the most users were the U.S. (504), China (265), and India (134). Collectively these users downloaded 179.4 TB of data. Over the same period approximately 36,000 users downloaded 136.3 TB of public CAIDA data.
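As an illustration of what anonymizing IP addresses for trace sharing can involve, the following Python sketch shows one common idea, prefix-preserving anonymization: two addresses that share their first k bits before anonymization share exactly their first k bits afterwards, so subnet structure survives while the real addresses are hidden. This is not a description of CAIDA's actual procedure, which the text above does not specify. The function names `anonymize_ipv4` and `common_prefix_len`, the example key, and the choice of HMAC-SHA256 as the keyed pseudorandom function are all assumptions made for illustration.

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(addr: str, key: bytes) -> str:
    """Prefix-preserving anonymization of one IPv4 address (illustrative only)."""
    bits = int(ipaddress.IPv4Address(addr))
    out = 0
    for i in range(32):
        # The first i bits of the original address (0 when i == 0).
        prefix = bits >> (32 - i)
        # Keyed pseudorandom bit that depends only on that prefix and its length.
        msg = bytes([i]) + prefix.to_bytes(4, "big")
        flip = hmac.new(key, msg, hashlib.sha256).digest()[0] & 1
        # XOR the original bit at position i with the pseudorandom bit.
        orig_bit = (bits >> (31 - i)) & 1
        out = (out << 1) | (orig_bit ^ flip)
    return str(ipaddress.IPv4Address(out))


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits shared by two IPv4 addresses."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


if __name__ == "__main__":
    key = b"example-secret-key"  # hypothetical key; a real key must stay secret
    a, b = "192.0.2.1", "192.0.2.200"
    # The shared prefix length is the same before and after anonymization.
    print(common_prefix_len(a, b))
    print(common_prefix_len(anonymize_ipv4(a, key), anonymize_ipv4(b, key)))
```

Because each output bit depends only on the key and on the original bits that precede it, the mapping is deterministic for a given key, which lets the same host be recognized across a trace without revealing its real address.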
Research Publications using CAIDA Data
CAIDA data provide an empirical foundation for Internet research. Researchers worldwide have used these data to publish papers in the scientific literature. We maintain lists of publications by CAIDA researchers and collaborators, and of publications by external researchers who report their use of CAIDA data as part of our Acceptable Use Policy (AUP).
User activity: During the three years from 2015 through 2017 we found 529 papers by external authors using data provided by CAIDA. The most-used data are our AS-relationship data (190 papers) and our Anonymized Internet Traces (166 papers). The first authors' affiliated institutions were located in 53 different countries, including the U.S. (140 papers), China (63 papers), India (34 papers), and Germany (27 papers).
CAIDA has developed a privacy-sensitive data sharing framework that employs technical and policy means to balance individual privacy, security, and legal concerns against the needs of governments, researchers, and scientists for access to data, in an attempt to address the inevitable conflict between data privacy and science.