Internet data acquisition and analysis: Status and next steps
Most large providers currently collect basic statistics on the performance of their own infrastructure, typically including measurements of utilization, availability, and possibly rudimentary assessments of delay and throughput. In today's commercial Internet, the only baseline against which these networks can evaluate performance is their own past performance. Neither data nor even standard formats are available for comparing performance with that of other networks or against a common baseline. Nor are there reliable performance data that users can use to assess providers. Data characterization and traffic flow analysis are also virtually non-existent at this time, yet they remain essential for understanding the internal dynamics of the Internet infrastructure.
Increasingly, both users and providers need information on end-to-end performance and traffic flows, beyond the realm of what individual networks or users can realistically control. Path performance measurement tools enable users and providers to better evaluate and compare providers and to monitor service quality. Many of these tools treat the Internet as a black box, measuring end-to-end characteristics, e.g., response time and packet loss (ping) and reachability (traceroute), between points that originate and terminate outside individual networks. Traffic flow characterization tools focus on the internal dynamics of individual networks and on cross-provider traffic flows, enabling network architects to engineer and operate networks more effectively, to understand global traffic trends and behavior, and to adopt and respond to new technologies and protocols as they are introduced into the infrastructure.
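
As a rough illustration of the black-box view described above, the sketch below reduces a series of ping-style probe results to a loss fraction and simple response-time statistics. It is only a minimal example under stated assumptions: the function name `summarize_probes`, the convention that `None` marks an unanswered probe, and the sample values are all hypothetical and are not taken from any tool discussed in this paper.

```python
from statistics import mean, median
from typing import Optional, Sequence


def summarize_probes(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize ping-style probes (hypothetical helper).

    Each entry is a round-trip time in milliseconds, or None when
    no reply was received for that probe (an assumed convention).
    """
    if not rtts_ms:
        raise ValueError("need at least one probe")

    replies = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)

    summary = {
        "sent": sent,
        "received": len(replies),
        # Fraction of probes that went unanswered.
        "loss_fraction": (sent - len(replies)) / sent,
    }
    # Response-time statistics are defined only if something came back.
    if replies:
        summary.update(
            min_ms=min(replies),
            median_ms=median(replies),
            mean_ms=mean(replies),
            max_ms=max(replies),
        )
    return summary


# Made-up sample: four probes, one of which received no reply.
result = summarize_probes([10.0, None, 30.0, 20.0])
```

Reporting the median alongside the mean is a design choice of this sketch: a single delayed reply can pull the mean noticeably, while the median is less sensitive to such outliers.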
This paper has three goals. We first provide background on the current Internet architecture and describe why measurements are a key element in the development of a robust and financially successful commercial Internet. We then discuss the current state of Internet metrics analysis and steps underway within various forums to encourage the development and deployment of Internet performance monitoring and workload characterization tools. Finally, we describe the rationale and near-term plans for the Cooperative Association for Internet Data Analysis (CAIDA).