Internet measurement data management challenges
Effective Internet measurement raises daunting issues for the research community and funding agencies. Improved understanding of the structure and dynamics of Internet topology, routing, workload, performance, and vulnerabilities remains disturbingly elusive, in part because realistic and representative datasets are rarely available to scientific researchers. The dearth is understandable: measurement of operational Internet infrastructure involves managing more complex and interconnected dimensions (logistical, financial, methodological, technical, legal, and ethical) than measurement in most scientific disciplines. CAIDA has been navigating these challenges with modest success for fifteen years, collecting, coordinating, curating, and sharing datasets with the Internet research and operational community in support of Internet science. Our three biggest current challenges, which we hope to explore at the workshop, are: sustainable collection, curation, and storage of large volumes of data; privacy-respecting sharing; and long-term archiving for reproducibility.