Procedure for locating publications using CAIDA data
We maintain our collection of papers that use CAIDA datasets with information gathered from several sources, aiming for a reasonably complete list of references.
Users of CAIDA datasets agree, as part of the Acceptable Use Agreement (AUA), to provide CAIDA with information about their publications that use CAIDA data. Our Data Publication Report Page provides instructions on how to report papers most easily. To make it as easy as possible for users to include these references in their papers, the AUA includes a cut-and-paste template for this purpose. As an added benefit, this template, when used, makes it easier for us to locate papers in literature searches, because it lets us use non-trivial search phrases based on the standard reference format. A typical example of this format is:
The CAIDA UCSD Anonymized Internet Traces 2012 - 2012/05/17 13:00:00 UTC
https://catalog.caida.org/dataset/passive_2012_pcap
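As a rough illustration of why a standard reference format helps with locating papers, the sketch below scans text for references shaped like the example above. The regular expression is an assumption generalized from that single example ("The CAIDA UCSD", a dataset name, " - ", then a date), not an official grammar from the AUA, and the function name is hypothetical.

```python
import re

# Assumed shape of a template-style reference title:
#   "The CAIDA UCSD <dataset name> - <date or date range>"
# This pattern is a guess based on one example, not an official specification.
CAIDA_REF = re.compile(
    r"The CAIDA UCSD (?P<dataset>.+?) - (?P<dates>[0-9][^\n]*)"
)


def find_caida_references(text):
    """Return (dataset, dates) pairs for each template-style reference in text."""
    return [
        (m.group("dataset").strip(), m.group("dates").strip())
        for m in CAIDA_REF.finditer(text)
    ]


sample = (
    "[12] The CAIDA UCSD Anonymized Internet Traces 2012 - "
    "2012/05/17 13:00:00 UTC\n"
    "https://catalog.caida.org/dataset/passive_2012_pcap"
)
print(find_caida_references(sample))
```

References that deviate from the template (for example, a reworded title) would not be matched by this pattern, which is one reason the other search methods remain necessary.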
In spite of our best efforts, we do not expect this list of publications to be complete. If you know of a publication that should be on our list but isn't yet, we would like to hear from you (e.g., via the Data Publication Report Page mentioned above).
The following list provides an outline of all factors that contribute to our list of papers that use CAIDA data:
- A few times each year we receive an email from a user reporting a publication the way they are supposed to according to the AUA (you know who you are, and thank you very much!).
- Twice a year (in April and October) we send out an email on the CAIDA
data-announce mailing list, reporting on new developments on CAIDA
datasets. In this email we also remind people to send information about
publications. This typically results in a few more responses.
These two factors combined contribute a few percent (< 5%) of the papers on our list.
- Also twice a year, we carry out a fairly extensive literature search to
locate relevant papers. The first cut is done using Google Scholar, with
phrases derived from the names of CAIDA datasets and guided by the
reference format specified in the AUA (fortunately, most users now do
use this format). Typical search phrases are "CAIDA anonymized internet
traces", "CAIDA passive traces", "CAIDA topology", "CAIDA skitter",
"CAIDA AS relationships", etc. The typical pattern: always include CAIDA,
then add a string that is expected to be in the reference and is specific
enough to narrow down the number of responses to a manageable amount.
This search gives the largest number of hits. A really successful search provides a few dozen new papers, or typically 80-90% of all hits.
- A couple of other, more explicitly (computer-)science oriented search
engines are used to complement the results from Google Scholar:
IEEE Xplore Digital Library,
ACM Digital Library,
ScienceDirect.com,
Springer, etc.
These more targeted searches do not provide many additional hits that
are not on Google Scholar already (about 5-15% of the total), but they do
turn up papers on topics that lie outside computer science proper.
An example in this category is
H. Wu and G. Kvizhinadze
Martingale limit theorems of divisible statistics in a multinomial scheme with mixed frequencies
Statistics and Probability Letters 81 (8), 1128-1135, March 2011
- We also search the sites of recent and upcoming computer science meetings and skim abstracts for CAIDA-related items. Candidate conferences are selected from online lists.
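The search-phrase pattern described in the list above (always include CAIDA, then add a dataset-specific string) and the check for hits that are not already known can be sketched as follows. This is only an illustration: the function names and the title-normalization rule are assumptions, the term list is taken from the example phrases above, and no search engine is actually queried.

```python
# Dataset-specific strings taken from the example search phrases in the list.
DATASET_TERMS = [
    "anonymized internet traces",
    "passive traces",
    "topology",
    "skitter",
    "AS relationships",
]


def build_search_phrases(terms, anchor="CAIDA"):
    """Prefix each dataset-specific term with the anchor word and quote it."""
    phrases = []
    for term in terms:
        term = term.strip()
        if term:  # skip empty entries
            phrases.append(f'"{anchor} {term}"')
    return phrases


def normalize_title(title):
    """Lowercase and collapse whitespace (an assumed, deliberately crude rule)."""
    return " ".join(title.lower().split())


def new_hits(candidate_titles, known_titles):
    """Return candidates whose normalized title is not already known."""
    known = {normalize_title(t) for t in known_titles}
    fresh = []
    for title in candidate_titles:
        key = normalize_title(title)
        if key not in known:
            known.add(key)  # also drops duplicates among the candidates
            fresh.append(title)
    return fresh


for phrase in build_search_phrases(DATASET_TERMS):
    print(phrase)
```

Comparing normalized titles is a simple way to see how many hits from a second search engine are genuinely new relative to an earlier search; in practice, titles that differ in punctuation or subtitle would still need a manual check.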