Our activities for this project include collection, curation, hosting, and distribution of active and passive Internet measurement data as well as providing advice on technical, legal, and practical aspects of PREDICT policies and procedures.
Funding source: DHS S&T cooperative agreement FA8750-12-2-0326. Period of performance: September 28, 2012 - September 27, 2013 (with optional extension through September 27, 2017).
Statement of Work
CAIDA provides fundamental research on a reasonable-efforts basis and in accordance with UC policy. We shall accomplish the following data provision, data hosting, and project support tasks:
I. Providing Data
CAIDA will provide the following data to PREDICT:
| Dataset Name | Dates | Content | Notes |
| IPv4 Routed /24 Topology | 2008 - ongoing | Forward IPv4 paths, reply time-to-live (TTL), round-trip time (RTT), and ICMP responses | Measured from Ark platform |
| IPv4 Routed /24 DNS Names | 2008 - ongoing | Fully-qualified domain names for IP addresses in the IPv4 Routed /24 dataset | Uses a custom-built bulk DNS lookup service |
| IPv6 Topology | 2008 - ongoing | Forward IPv6 paths, TTL, RTT, and ICMP responses | Measured from Ark platform |
| Internet Topology Data Kits (ITDK) | 2010 - ongoing | Router-level topology data, router-to-AS assignments, geographic location of each router, DNS lookups of all observed IP addresses | Derived from Ark measurements once every 3-6 months |
| Active Internet Topology Measurements with Skitter | 1998 - 2008 | IP paths, RTT | Legacy data |
| Near Real-time UCSD Network Telescope Data | 2001 - ongoing | IP packets in PCAP format | The most recent 2 months of data |
| Archived samples of Near Real-time UCSD Network Telescope Data | 2001 - ongoing | IP packets in PCAP format | Archived periodically, TBD |
| OC48 Peering Point IP Packet Headers | 2002 - 2003 | Three packet traces in PCAP format captured in 2002-2003 | Legacy data |
Tasks
| 1 | Describe the physical, logical and functional configuration of the collection mechanism for datasets provided to PREDICT |
| 2 | Document and implement a process for managing datasets listed above for dissemination via the PREDICT legal framework |
| 3 | Support the maintenance of the PREDICT data catalog (hosted by the PREDICT Coordinating Center, PCC) consistent with the data collection and data curation tasks |
| 4 | Describe any restrictions on the use of any datasets provided to PREDICT, including restrictions on international dissemination |
| 5 | Describe the efforts and methods employed to ensure that the data is legally collected and compliant with privacy laws, including any anonymization techniques applied and appropriate disclosure control processes |
| 6 | Describe and implement any Institutional Review Board (IRB) or ethics review processes related to dataset requests, including nominal timelines, the issues to consider, and the expected frequency of reviews |
| 7 | Provide a risk analysis of any dataset provided to PREDICT that addresses federal, state, local, and international laws relevant to the collection and dissemination of the dataset, as well as any ethical issues |
| 8 | Describe how data collections are planned to evolve as devices, architectures, and protocols evolve |
II. Data Hosting
Tasks
| 1 | Provide a data hosting infrastructure to support the PREDICT project. |
| 2 | Describe any expansion plans for hosts and bandwidth that will be needed as a result of traffic growth. |
| 3 | Describe scenarios and processes for the dissemination of data that may occur via media. |
| 4 | Provide a description of the data hosting infrastructure to be employed in the performance of this SOW (hardware, software, logical configuration, and mirroring or redundancy equivalence). |
| 5 | If willing to host external data, indicate the availability of infrastructure and provide a plan for curating data from external sources. |
III. Project Support
Implement and document PREDICT project support, as follows:
Tasks
| 1 | Attend PI Meetings (not to exceed 3 times per year) and provide an on-site venue for hosting PI Meetings (as requested). |
| 2 | Provide regular status briefings and participate in monthly teleconferences, project planning efforts, program reviews, and other technical interchange meetings. |
| 3 | Collaborate with other PREDICT project participants on establishing and monitoring project-level metrics that describe the utility of the provided and hosted datasets, evaluate dataset popularity, track the growth of the data collection, and capture similar measures. |
| 4 | Describe how UCSD will meet PREDICT goals for reviewing, negotiating, and executing PREDICT legal documents, including Data Provider and Data Host Memoranda of Agreement (MOAs) with PCC. Discuss the availability of decision-making legal support and any other entity required to respond to these documents consistent with UCSD policy and within seven (7) business days. |
| 5 | Support the PREDICT ARB activities to screen researcher organizations, referring organizations, and researchers for legitimacy of purpose and intent of PREDICT data use. |
| 6 | Describe and implement a plan to publicize the availability of the data provided via PREDICT. |
| 7 | Support project outreach via Government-approved workshops, conferences, and other technical forums. |
Deliverables
In the course of the project, CAIDA shall provide the following deliverables:
| # | Deliverable | Date | Status |
| 1 | Project Management Plan | November 15, 2012 | done |
| 2 | Hosting Infrastructure Description Document | November 15, 2012 | done |
| 3 | Technical Status Report | quarterly | |
| 4 | Financial Status Report | monthly | ongoing |
| 5 | Briefings and Research Papers | as available | |
| 6 | Final Report | upon completion of the project | |