Internet Statistics and Metrics Analysis (ISMA-97)
Workshop Report
May 1-2, 1997
San Diego, CA
The Internet Statistics and Metrics Analysis (ISMA-97) Workshop was an invitational meeting for individuals involved in developing or deploying Internet traffic measurement or analysis tools. Sixty-five (65) people attended, representing Internet service providers (ISPs), the research and education (R&E) community, vendors, and end-users. The meeting was held at the San Diego Supercomputer Center (SDSC) on the campus of the University of California, San Diego (UCSD). It was sponsored by SDSC and the National Laboratory for Applied Network Research (NLANR), with funding provided by the National Science Foundation's Division of Networking & Communications Research Infrastructure (NSF/NCRI).
The goals for the meeting included:
Randy Bush, Verio
Findings and Conclusions
Participants at ISMA-97 were critical of current understanding of Internet traffic behavior and the overall inability of providers and users alike to provide real-time monitoring of internetworking traffic performance. However, the overall sense of the meeting was that each segment of the Internet community has legitimate interests and needs for assessing and monitoring varying granularities of Internet performance data. Participants emerged from ISMA-97 with a greater appreciation of the unique requirements of these diverse communities (provider, researcher, vendor, and end-user) and of the inadequacies of current tools and methodologies. As explained by Steve Feldman (Worldcom),
"After a while most of us seemed to realize 'measuring the Internet' means different things to different people. It all depends on where you are watching from, and what you're trying to find out. ...Modem connection statistics won't be too useful for backbone engineers, and your typical AOL user couldn't care less about BGP packet traces. But that shouldn't mean that it's not important to collect them; each is useful to somebody."
The need for better tools to monitor and analyze routing behavior and to diagnose route flaps and related problems was viewed as the most important challenge facing internetworking today. Router vendors, according to participants, have been lax in upgrading the functionality of their products, particularly with respect to network management utilities. Of notable concern is the inability of tools to explain adequately the phenomena that result from topology changes, e.g., routing loops or black holes. The availability of real-time topology information and data on adjacency state changes (delivered asynchronously and in a vendor-independent manner) would provide valuable assistance to providers, enabling them to detect problems (e.g., brownouts) and react in a timely manner. Current tools, such as traceroute, provide some insights but are limited by protocol (and filtering) issues. The IP record-route option is more consistent, but has a short horizon (9 hops). Dumping routing tables is another option; however, it is time consuming and does not indicate when a particular route changed. In the near term, instrumenting routers to answer simple questions in a timely manner, e.g., via SNMP polling, would improve the quality and quantity of information available to network operation centers.
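The 9-hop horizon of the record-route option mentioned above follows from IPv4 header arithmetic. The short sketch below works through it; the constant and function names are illustrative choices, and the sizes are those defined for IPv4 in RFC 791.

```python
# Why the IPv4 record-route option stops at 9 hops (sizes per RFC 791).
MAX_HEADER_BYTES = 15 * 4   # IHL is a 4-bit count of 32-bit words: at most 60 bytes
FIXED_HEADER_BYTES = 20     # mandatory IPv4 header fields
RR_OVERHEAD_BYTES = 3       # option type, length, and pointer octets
ADDRESS_BYTES = 4           # one IPv4 address is recorded per hop


def record_route_capacity() -> int:
    """Return how many hop addresses fit in a record-route option."""
    option_space = MAX_HEADER_BYTES - FIXED_HEADER_BYTES  # 40 bytes for options
    return (option_space - RR_OVERHEAD_BYTES) // ADDRESS_BYTES


print(record_route_capacity())  # prints 9
```

Because the limit comes from the header format itself, no change in tooling can extend it; paths longer than nine hops are simply truncated in the recorded route.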
While many new measurement research initiatives will emerge over the coming months, ISMA-97 discussions highlighted the need for the R&E community to better articulate the goals and hypotheses of their network measurement research and to identify more precisely which problems are to be solved by these efforts. Such clarity will be critical to securing necessary partnerships with industry (ISPs and vendors).
Near-term attention to end-user requirements and tools is also imperative. There is growing demand for methodologically sound approaches to end-to-end performance measurement and monitoring/analysis of ISP services. Loss measurements being taken by DOE/SLAC on behalf of the high energy physics community and commercial ISP performance assessment conducted by Inverse Technologies were cited as examples of responsible, useful efforts at data acquisition and analysis by end-users. Approaches such as the recently released NetMedic tool were sharply criticized as lacking in scalability and technical accuracy.
As Daniel McRobb (ANS) explained:
"Users trust the tools (especially if there's nothing probably better), and that's often unwise. Some tools are inconclusive or lack fundamental rigors, and not all the users have the knowledge to interpret the tool's output. This is bad. What's worse is that I'm not certain of the rigor of any of the end-user tools I saw at ISMA, and some are even obviously lacking. I want to be able to use this data, as an end user and as a provider (esp. the latter). If it lacks the necessary rigor to draw anything but highly subjective conclusions, it's of little use to me."
re scaling, "I don't think new traffic was the real issue here (I still question how many users would use such tools, and how often. After all, most of us just use the net; only a handful of us actually poke around when things go awry). I've never heard complaints about capacity issues w/ measurement traffic of the traceroute or ping type (and it wouldn't be a valid complaint anyway, by the numbers I have). However, tools that do hammer the network could be dangerous. I just don't think I've seen a measurement tool yet that had a significant aggregate impact. Of course, pathchar isn't widely deployed yet."
"I think the complaint I heard about scaling that I took to heart was the NET.MEDIC-like issue, where ISPs might be spammed by users from tools that contain little rigor. I'm not sure how we deal with this kind of issue as technical folks, except to tell people what they're doing right and doing wrong in terms of measurement, and keep our networks in good shape. I'm sure there's some fear just from exposure. I don't think any of the good providers are so worried about that (the louder people complain about real problems, the sooner they're fixed)."
re usability, "As a provider, I'd love to have valid end-user data. The more, the better. But if it's as disparate as sitting on hard disks on everyone's PC (note I'm assuming it's useful data), that doesn't do me (as a provider) any good. I want the data, but how do I get a large aggregate in a reasonable fashion? (of course Jamshid and others talked about some ways of moving in this direction)."
Correctly interpreting measurement data is extremely difficult, particularly for individuals who lack a thorough understanding of the underlying network topologies. This point was reiterated repeatedly throughout the workshop and used to highlight weaknesses in end-user measurement activities. According to participants, direct communications with ISPs regarding anomalies and overall measurement results should be a component of any large-scale measurement endeavor. However, this approach is not scalable except for large institutional users and measurement service companies.
Accuracy and scalability are fundamental challenges that the end-user community will need to address. As Curtis Villamizar (ANS) explained,
..."there was support for institutional end users that were making measurements to assess what they were buying, that knew what they were measuring and were systematic about their measurements, and that didn't add significant traffic load in the process of making measurement."
Current public attention is focused on tasks associated with data acquisition and related tools, yet accurate storage/warehousing, analysis and correlation of these data will be as difficult a challenge as actual measurement. These technical requirements will necessitate near-term attention and resource investments by providers, researchers (government and industry), and end-users alike.
One of the features we're expecting in the next round of flowswitching/flowexport additions is the ability to include the BGP next hop of the source and destination IP address of a flow. In our backbone architecture (as is generally the case), routes for customer networks are injected into the backbone IBGP mesh at the aggregation router that the customers are homed off. Thus, such data can provide a nexthop to nexthop matrix (i.e. aggregation router to aggregation router matrix), or perhaps nexthop to foreign peer AS. Obviously this gets you the "city" granularity (better, in fact). ...from an ISMA-97 ISP
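The nexthop-to-nexthop matrix described in the quote above can be illustrated with a minimal sketch. It assumes each exported flow record already carries the BGP next hop for its source and destination addresses; the `FlowRecord` layout, field names, and sample addresses are hypothetical and do not correspond to any vendor's flow-export format.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class FlowRecord(NamedTuple):
    # Hypothetical record layout; real flow-export formats differ.
    src_next_hop: str   # BGP next hop for the flow's source address
    dst_next_hop: str   # BGP next hop for the flow's destination address
    octets: int         # bytes carried by the flow


def next_hop_matrix(flows: Iterable[FlowRecord]) -> Dict[Tuple[str, str], int]:
    """Sum octets per (source next hop, destination next hop) pair."""
    matrix: Dict[Tuple[str, str], int] = defaultdict(int)
    for flow in flows:
        matrix[(flow.src_next_hop, flow.dst_next_hop)] += flow.octets
    return dict(matrix)


# Made-up sample data using documentation addresses.
sample = [
    FlowRecord("192.0.2.1", "192.0.2.2", 1500),
    FlowRecord("192.0.2.1", "192.0.2.2", 500),
    FlowRecord("192.0.2.2", "192.0.2.3", 4000),
]

for (src, dst), total in sorted(next_hop_matrix(sample).items()):
    print(f"{src} -> {dst}: {total} octets")
```

Since each next hop identifies an aggregation router in the architecture the quote describes, the resulting dictionary is in effect a router-to-router traffic matrix. Keying on a peer AS number instead of the destination next hop would give the "nexthop to foreign peer AS" variant.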
Other areas identified by ISPs which hold promise for improving their ability to engineer and operate the Internet include:
As with many meetings of this type, the hallway discussions were among the most important features of ISMA-97. Several potential new collaborations were brought to the attention of workshop organizers during and following the workshop. Several individuals also indicated that they plan to make specific modifications to their tools and/or changes in their measurement methodologies as a result of insights gained at the workshop. Increased use of Cflowd and OC3mon/Coral flow tools by participating ISPs and large institutional participants also appears likely, as does testing of new research tools (particularly tcpanaly and pathchar).
A summary of the ISMA-97 sessions and discussions and links to specific presentations is available at http://www.caida.org/workshops/isma/9705/isma97_sessions.html.
ISMA-97 was an invitational meeting, targeting participation by select Internet engineers possessing hands-on experience in Internet traffic measurement and analysis. Sixty-five people representing ISPs, the R&E community, vendors, and end-users attended the meeting; see http://www.caida.org/workshops/isma/9705/isma97_participants.html. The high caliber of these individuals and the limited attendance were essential ingredients in ISMA's success.
Unfortunately, the workshop organizers had to turn away many capable and qualified individuals who expressed interest in attending the meeting. We would like to apologize to these individuals and urge the community to continue discussions of traffic measurement and analysis and related topics given their importance to the continuing evolution of the Internet.
Demos
Twenty (20) measurement / analysis tools were demonstrated during ISMA. ISMA's goal for these demos was to bring together tool developers and users in the hopes that constructive criticisms would foster broader understanding of existing capabilities and limitations and would identify potential areas for improvement. Feedback from participants suggests that the demos were beneficial, but that future meetings should include an opportunity for a brief summary overview of available tools prior to actual demonstrations.
In advance of ISMA-97, organizers distributed a survey questionnaire. The purpose of the questionnaire was to provide background information on the concerns and priorities of the participants and to assist in framing discussions during the workshop. The majority of the participants responded to the survey. Their inputs are grouped according to the community they represent: commercial vendors, ISP/exchange points, or R&E community. The questions asked in this survey are included below:
Survey responses are available at www.caida.org/workshops/isma/9705/isma97_survey.html and a codification of the responses in aggregate is at the bottom of the summary page: http://www.caida.org/workshops/isma/9705/isma97_sessions.html.
ISMA Mail List/Other
Several relevant mailing lists discuss traffic measurement and analysis tools. To subscribe to the ISMA mail list (isma@caida.org), send a request to isma-request@caida.org. The IETF's IPPM working group mail list is also a valuable source of information in this field; see www.advanced.org/IPPM.
A preliminary copy of the Cooperative Association for Internet Data Analysis' (CAIDA) Measurement Tool Taxonomy was distributed to ISMA-97 participants as background prior to the meeting. This living tool taxonomy provides an overview of Internet and TCP/IP performance measurement tools and efforts. Its development is sponsored by the National Science Foundation (NSF) and CISCO. The taxonomy is available at www.caida.org/tools/taxonomy.
Workshop Organization/Acknowledgements
ISMA-97 was organized by Tracie Monk and k claffy (UCSD/NLANR/CAIDA).
Many thanks to the staff at SDSC and UCSD who contributed to ISMA's success. In particular, we would like to thank Charlotte Smart (SDSC) for managing the logistical arrangements and Jay Dombrowski (SDSC) for managing the demonstrations of IP and ATM-based measurement and analysis tools and coordinating the MBONE arrangements.
Last Updated 14 June 1997
Questions or comments should be directed to Tracie Monk at tmonk@caida.org.