The Cooperative Association for Internet Data Analysis
The CAIDA Anonymized 2007 Internet Traces Dataset

This page describes the CAIDA Anonymized 2007 Internet Traces Dataset. CAIDA no longer distributes this dataset, because we believe more relevant datasets are now available from CAIDA and elsewhere.

This dataset was retired on 11 January 2010 and is no longer available.

This dataset contains anonymized passive traffic traces from CAIDA's AMPATH monitor on an OC12 link at the AMPATH Internet Exchange during the DITL 2007 measurement event. The payload has been removed from all packets.

CAIDA operated only one passive Internet trace collection monitor in 2007. We expanded our monitoring capabilities in 2008 and strongly encourage people to use the data in the Anonymized 2008 Internet Traces Dataset, which we think contains traffic traces more representative of current Internet traffic. Information on other CAIDA datasets is available in the data section of our website and by searching DatCat, the Internet Measurement Data Catalog.

These traces can be read with any software that reads the pcap (tcpdump) format, including the CoralReef Software Suite, tcpdump, Wireshark, and many others.
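
For readers who want to see what "reads the pcap format" involves, the following is a minimal sketch of a reader for the classic pcap file layout, written with the Python standard library only. It assumes the classic (non-pcapng) format with microsecond timestamps; the function name `read_pcap_records` is hypothetical and is not part of CoralReef, tcpdump, or Wireshark. Because payload has been removed from these traces, the captured length of a record would normally be smaller than the original packet length, which is why both values are returned.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap with microsecond timestamps


def read_pcap_records(stream):
    """Yield (timestamp_seconds, captured_bytes, original_length) per packet.

    Illustrative sketch only: handles the classic pcap layout, not pcapng
    and not the nanosecond-timestamp variant.
    """
    header = stream.read(24)  # the global file header is 24 bytes
    if len(header) < 24:
        raise ValueError("truncated pcap global header")

    # The magic number tells us the byte order the file was written in.
    if struct.unpack("<I", header[:4])[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack(">I", header[:4])[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic microsecond pcap file")

    while True:
        record_header = stream.read(16)  # each record header is 16 bytes
        if len(record_header) < 16:
            return  # end of file
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack(
            endian + "IIII", record_header
        )
        data = stream.read(incl_len)
        if len(data) < incl_len:
            return  # truncated final record; stop quietly
        yield ts_sec + ts_usec / 1e6, data, orig_len
```

To use it on a file, open the trace in binary mode and iterate, for example `for ts, data, orig_len in read_pcap_records(open(path, "rb")): ...`, where `path` is whatever trace file you have. For real analysis, an established tool such as those listed above is the better choice.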

Data Use Restrictions

  1. The anonymized traffic traces will not be distributed beyond authorized users.
  2. I will notify CAIDA of the names and email addresses of any persons (and their respective affiliations) assisting me in research using the anonymized traffic traces. This includes graduate students and interns.
  3. The IP addresses in these traces are all anonymized to preserve the privacy of end users (hosts) and networks monitored in the collection of the data. The anonymization is prefix-preserving: if two original IP addresses share their first N bits, the anonymized addresses share their first N bits as well. The traces in a dataset are all anonymized with the same key, so an original IP address that appears in multiple traces in a dataset will appear as the same anonymized IP address across those traces. Insofar as possible, privacy of end users (hosts) and networks monitored in the creation of these traces will be respected by the researchers. Researchers will make no attempts to reverse engineer, decrypt, or otherwise identify the original IP addresses collected in the trace. Researchers will also not attempt to extract unanonymized IP addresses from encapsulated headers. Researchers will make no attempts to connect to, probe, or in any other way initiate contact with a machine or machine administrator identified via the anonymized traffic traces.
  4. Anyone who publishes a document (including web pages and papers) that uses data from this dataset must provide CAIDA with a copy of the publication and must cite:
    The CAIDA Anonymized 2007 Internet Traces - Jan 2007, Colleen Shannon, Emile Aben, kc claffy, Dan Andersen, http://www.caida.org/data/passive/passive_2007_dataset.xml
  5. All users are encouraged, but not required, to include the following attribution in their acknowledgments section:
    Support for CAIDA's Internet Traces is provided by the National Science Foundation, the US Department of Homeland Security, and CAIDA Members.
  6. All users who create a publicly available presentation using data from this dataset must provide CAIDA with a copy of the publication and must use the full name of the dataset ("The CAIDA Anonymized 2007 Internet Traces") in the presentation. Users are encouraged, but not required, to include the URL for the dataset (http://www.caida.org/data/passive/passive_2007_dataset.xml).
  7. At the end of the research, or semi-annually (whichever comes first), a summary of the research and any findings/conclusions will be reported to CAIDA. If any research is described on the WWW, a URL will be provided. This information is primarily used in reports to our funding agencies.
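
The prefix-preserving property described in item 3 can be illustrated with a toy example. The sketch below is not the anonymization CAIDA applied to these traces, which this page does not specify. It is an assumed construction in which each output bit is the input bit XORed with a keyed pseudorandom bit derived from the preceding bits, with HMAC-SHA256 standing in as the keyed function. The names `anonymize_ipv4` and `common_prefix_len` and the key are hypothetical, and the addresses come from documentation ranges.

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(address, key):
    """Toy prefix-preserving mapping for one IPv4 address (illustration only)."""
    original = int(ipaddress.IPv4Address(address))
    result = 0
    for i in range(32):
        # The first i bits of the original address (0 when i == 0).
        prefix = original >> (32 - i)
        # Keyed pseudorandom bit that depends only on those leading bits.
        msg = bytes([i]) + prefix.to_bytes(4, "big")
        flip = hmac.new(key, msg, hashlib.sha256).digest()[0] & 1
        bit = (original >> (31 - i)) & 1
        result = (result << 1) | (bit ^ flip)
    return str(ipaddress.IPv4Address(result))


def common_prefix_len(a, b):
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


key = b"example-key"  # hypothetical key, for demonstration only
a = anonymize_ipv4("192.0.2.10", key)
b = anonymize_ipv4("192.0.2.200", key)
# Both pairs share the same number of leading bits.
print(common_prefix_len("192.0.2.10", "192.0.2.200"), common_prefix_len(a, b))
```

Two addresses that agree on their first N bits feed identical inputs to the keyed function for those N positions, so their anonymized forms agree there too, and they diverge at the first position where the originals diverge. Using one key across traces gives the consistency the item describes: the same original address always maps to the same anonymized address.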

  Last Modified: Tues Feb-23-2010 11:32:17 PDT
  Page URL: http://www.caida.org/data/passive/passive_2007_dataset.xml