Report from the
ISMA Network Visualization Workshop
Introduction and Goals
On April 15-16, 1999 CAIDA hosted an Internet Statistics and Metrics
Analysis (ISMA) workshop on Network Visualization. The meeting engaged
researchers and practitioners experienced in both the visualization and
networking fields in discussions aimed at identifying:
Participation at the workshop was by invitation only, based on individuals' current involvement in these topics. Approximately 55% of attendees were from the research and education fields; the remainder represented Internet service providers (ISPs), network access points (NAPs), or vendors. The meeting was held at the San Diego Supercomputer Center (SDSC) on the campus of the University of California, San Diego (UCSD), and was sponsored by the Cooperative Association for Internet Data Analysis (CAIDA) and funded by the National Science Foundation (NSF) under grant number ANI-9996248.
For example, the Multi-Router Traffic Grapher (MRTG) tool was cited as a useful visualization tool for ISPs. Its simplicity, cross-platform functionality, web interface, and free availability are key factors in its popularity. Perhaps most attractive is its flexibility: it can be used to graph utilization of network segments, but it is just as useful for graphing NNTP transfer error occurrences on a news server. Powerful, narrowly-scoped tools of this sort, adaptable to many purposes, are what ISPs, and perhaps IS organizations in general, can most readily use. Commercial offerings, such as Hewlett Packard's OpenView, are generally expensive and often fail to address backbone ISP requirements for flexibility and comprehensive network/data coverage. According to ISP and NAP representatives at the workshop, the visualization tools that they require would include functionality to:
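MRTG's adaptability comes from its support for external data-collection commands: any script that prints two values, an uptime string, and a target name on four lines can feed a graph. A minimal sketch of the news-server example in Python (the log lines, pattern, and target name here are hypothetical illustrations, not drawn from the workshop):

```python
import re

def count_matches(lines, pattern=r"NNTP.*error"):
    """Count log lines matching a pattern (e.g. NNTP transfer errors)."""
    rx = re.compile(pattern)
    return sum(1 for line in lines if rx.search(line))

def mrtg_output(value_in, value_out, uptime, name):
    """Format the four-line output MRTG expects from an external command:
    first variable, second variable, uptime string, target name."""
    return "\n".join(str(x) for x in (value_in, value_out, uptime, name))

if __name__ == "__main__":
    sample = [
        "NNTP transfer error: peer reset",
        "article accepted",
        "NNTP error: timeout",
    ]
    errors = count_matches(sample)
    # MRTG would invoke this script and graph the first two lines.
    print(mrtg_output(errors, 0, "42 days", "news-server"))
```

In a real deployment the script would read the news server's actual log and be referenced from an MRTG `Target` entry as a backtick-quoted command.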
An example of a tool of this sort that does not yet exist is a pared-down C implementation of a (probably receive-only) BGP stub. This tool would run on a host, establish a peering session to each router for each outbound peer-group, and receive a real-time feed of the sets of routes that each of the ISP's routers was advertising to each of its peers. Such a tool could raise alarms on problematic changes in a simple way. The larger point is that ISPs do not need a polished package with a fixed and limited set of high-level features so much as building blocks that they can integrate into tools of their own (just like MRTG).
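A natural starting point for such a stub is the fixed 19-byte BGP message header defined in RFC 1771: a 16-byte all-ones marker (when no authentication is in use), a 2-byte length, and a 1-byte type. The sketch below is in Python rather than the C the participants envisioned, and covers only header parsing; session setup (TCP port 179, the OPEN/KEEPALIVE exchange) and UPDATE decoding are omitted:

```python
import struct

BGP_HEADER_LEN = 19  # 16-byte marker + 2-byte length + 1-byte type
BGP_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}

def parse_bgp_header(data):
    """Parse the fixed BGP message header; return (total_length, type_name).

    A receive-only stub would loop on its peering socket, call this on
    each header, then read `total_length - 19` further bytes of body.
    """
    if len(data) < BGP_HEADER_LEN:
        raise ValueError("short read: incomplete BGP header")
    marker, length, msg_type = struct.unpack("!16sHB", data[:BGP_HEADER_LEN])
    if marker != b"\xff" * 16:
        raise ValueError("bad marker (unauthenticated sessions use all ones)")
    return length, BGP_TYPES.get(msg_type, "UNKNOWN")
```

Logging the routes carried in each UPDATE body, then diffing successive snapshots, would provide the simple change-alarming described above.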
Participants concluded that different communities have different needs for visualization of networking data. Attendees identified several critical visual data types required by specific groups, including:
While logical depictions of networks are generally of the greatest use to engineers and researchers alike, geographic structure can be helpful. Several participants cited examples of logical depictions that were questioned by management and others based on simple layout features, such as placement of East Coast nodes on the left side of topology maps. Cartographic principles have long shaped human interpretation of visual mappings. Incorporating cartographic features, where appropriate, can help make Internet traffic more comprehensible to laypersons. The ability to navigate between logical and geographic depictions, as exemplified by the Plankton tool's depiction of the global cache hierarchy, can serve similar purposes.
Researchers and network planners are in particular need of tools that facilitate modeling and simulation of various real-world and theoretical data sets. Graph layout, fast rendering, and correlation/drill-down features are of great importance to these groups. Examples of these kinds of tools include nam and Make Systems' products. Several participants noted that they are licensing Tom Sawyer's graph layout code for their tools. Visual diagnostic tools, such as VisualRoute and MakeRoute, also show promise; however, the general unavailability of latitude and longitude data in the DNS records of Internet routers and hosts [RFC1876] undermines the accuracy and usefulness of these tools.
The adequacy of existing visual metaphors in describing Internet networks and traffic behavior merits continued evaluation by the community. However, participants agreed that the near-term priority will continue to be (1) developing better means of gathering and analyzing data and (2) better mapping of this information to existing visual metaphors and better linkage between alternative visualizations (maps, layouts, plots, histograms, 3-D images, etc.) and summary data. Key data sources of these visualizations currently include:
Several participants described the usefulness of the MBone for testing and validating both analytic methods and visualization techniques, due to its relatively small size (compared to the global Internet), the heterogeneous nature of its data sources, its multitude of administrative domains, and its experimental character. The user-defined topology of the MBone is also somewhat more conducive to data collection without requiring coordination or cooperation with ISPs.
Based on suggestions from participants, CAIDA staff agreed to host a website that will include various datasets for experimentation and visualization. Organizations using the data will be encouraged to share results of their visualizations with others in the community. Initial datasets will be posted at http://www.caida.org/analysis/ and will include:
Workshop participants identified the following topics as meriting attention by users, researchers and/or vendors:
Sessions and Presentations
The sections that follow describe highlights from the workshop and
individual presentations. Recurring themes discussed by these
individuals and other participants included: visualizing and scaling
large datasets, management of IP networks,
enhancing existing tools, and developing new tools/analysis
in specific areas. Participants discussed problems faced by ISPs, what
tools are in use now, what is currently working or not working, and
where they think visualization should go in the future.
Bernard Pailthorpe (SDSC) described examples of scientific visualization techniques developed at the University of Sydney Visualization Lab in Australia and at SDSC. These included various techniques for weather modeling, San Diego Bay modeling and mapping, HIV modeling, adaptive radar array imagery, and Visible Human visualizations. Challenges facing researchers in the National Partnership for Advanced Computational Infrastructure (NPACI), led by SDSC, include rendering extremely large datasets (terabytes in size), developing visualization toolkits and related interactive environments for researchers, and developing tools that help identify the relevant information within large datasets.
Mike Bailey (SDSC) explained that, in his opinion, we are "losing the data wars" -- faster computers are producing more data, creating immense databases for analysis, but visualization software cannot keep up with those data sizes. Filtering and culling of data are increasingly critical for all disciplines. Feature detection is critical to determining the importance of select datasets, since it assists in prioritizing regions of interest. According to Bailey, an adjacency pixel map scales well with large datasets and may provide better information and point out interesting results. Gaussian curvature analysis is another technique used to partition volumes (or find interesting aspects of the data) and can be further manipulated to include additional parameter dimensions.
Bill Cheswick (Lucent/Bell Labs) is actively monitoring and mapping the connectivity of the Internet. Every morning he scans 10% of a 90,000-node dataset; once a month he scans the entire dataset. His presentation included visualizations of outgoing traceroutes to these nodes (spanning roughly 160 top-level domains), laid out using gradient-descent techniques. These maps, oriented from the perspective of the source host, demonstrate concentrations of certain networks and changes in connectivity over time.
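The gradient-descent layouts Cheswick described can be illustrated with a toy spring model: each edge contributes a quadratic energy term, and every node is repeatedly nudged down the gradient toward its preferred edge length. This is a minimal sketch of the general technique, not Cheswick's actual code; the iteration count, step size, and rest length are arbitrary:

```python
import math
import random

def layout(nodes, edges, iters=200, step=0.02, rest=1.0):
    """Crude 2-D gradient-descent layout.

    Minimizes sum over edges of (distance - rest)^2 by moving both
    endpoints of each edge a small step along the distance gradient.
    """
    random.seed(1)  # deterministic starting positions for repeatability
    pos = {n: [random.random(), random.random()] for n in nodes}
    for _ in range(iters):
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            d = math.hypot(dx, dy) or 1e-9
            f = step * (d - rest) / d  # positive pulls endpoints together
            pos[a][0] += f * dx; pos[a][1] += f * dy
            pos[b][0] -= f * dx; pos[b][1] -= f * dy
    return pos
```

Real traceroute maps add refinements (repulsion between unrelated nodes, edge weighting by hop count), but the descent loop is the same idea at much larger scale.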
Stephen North (AT&T Research) described methods of graph layout utilized within AT&T. His group focuses on interactive 3D maps and developing novel visualization metaphors and algorithms for large graphs. Graph layouts using hierarchical trees, dags, orthogonal layouts, and circular layouts are effective because they control eye motion, avoid artifacts, emphasize regularity and symmetry, and use data pixels efficiently. According to North, overlays and crossovers are bad because they tend to suggest connections that may not exist. Portable, modular tools that do specific things well are important for the research field. Finally, he considers the priority research tasks to be: identifying what data needs to be shown, collecting that data efficiently, and maintaining a focus on tools rather than vertical applications, scale, and metaphors.
Evolving Visual Metaphors
Carl Malamud (Invisible Worlds) is exploring means of communicating visual information on the Internet to end users, including topology and performance information. People think in spatial terms, he explained, therefore we should develop means of portraying physical and behavioral aspects of the Internet in ways that are meaningful to users. ISPs in turn will benefit from a more informed customer community.
Martin Dodge (UCL) discussed the history of cartography and its applicability to the Internet. Dodge provided specific examples of maps used to describe physical infrastructures, and the application of old and new visual metaphors to descriptions of the Internet infrastructure. Examples were drawn from the Cybergeography website.
ISPs and NAPs
Steve Feldman (MCI Worldcom) elaborated on the need for concise, usable visualizations of traffic data in managing NAPs. First, developers and researchers must determine who needs the visualization. Engineers use tools to monitor network performance, analyze trends, create topologies, and run simulations; operators use them to monitor network status and isolate faults; and marketing departments use them to market the ISP. Different tools are needed for the different roles in running a network; there is no one-size-fits-all tool.
Second, what happens in a NOC helps determine which tools are useful. Technicians wait for something to break by watching status displays or waiting for customer complaints, then try to fix the problem using diagnostic tools; if they understand the problem, they fix it; otherwise they page the next level of support. Currently, useful tools for network management include commercial packages like Netcool (an alarm consolidation/display tool), freely available building blocks such as MRTG, and powerful scripting languages like scotty/tkined and perl. There was a discussion of pricing and availability for visualization tools (and network management tools in general), which included concern that commercial vendors don't build ISP-specific tools because the market is too small to recover development costs. One participant noted that ISPs already pay large sums for HP OpenView and similar tools, prices that are still in the noise compared to equipment and bandwidth costs.
In summary, Steve emphasized the two-sided challenge: researchers/developers and engineers/NOC technicians must communicate with each other about what is really needed before tools are built. One interesting conflict was that many ISPs actually still want Unix rather than PC/Windows platforms, in contrast to the emphasis of most tool development efforts represented in the room.
Linda Leibengood (UUNet Technologies, previously ANS) described what UUNet measures, how the data is presented and used, and the value of traffic visualizations. Packet loss, RTT measurements, and SNMP utilization statistics are the critical data fields. Real-time alerts for high-impact loss are reported as text; daily text reports aggregate metrics; loss and RTT are shown in matrices; and weekly plots of aggregate metrics are used for trending information. From an ISP perspective, tools must scale from 50 nodes to an upper bound of 1000-2000 nodes. Tools are needed both for real-time work and for recent trending (1-3 months) of RTT, latency, and packet loss. Visualization tools would be helpful for highlighting the sources of problems, revealing loss patterns due to topology, and seeing beyond individual data points. ISPs tend to avoid network visualization tools because of support, rather than installation, costs (or because the tools run on specialized high-end hardware platforms that are not typical in NOC infrastructure).
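The loss and RTT matrices Leibengood described are essentially site-by-site tables of pairwise measurements. A sketch of how such a text matrix might be rendered (the site names and loss values are invented for illustration, not UUNet data):

```python
def loss_matrix(sites, loss):
    """Render pairwise packet-loss percentages as a text matrix.

    `loss` maps (src, dst) -> percent loss; '-' marks the diagonal,
    and unmeasured pairs default to 0.0.
    """
    header = "      " + "".join(f"{s:>6}" for s in sites)
    rows = [header]
    for src in sites:
        cells = "".join(
            f"{'-':>6}" if src == dst else f"{loss.get((src, dst), 0.0):>6.1f}"
            for dst in sites)
        rows.append(f"{src:<6}" + cells)
    return "\n".join(rows)
```

An asymmetric matrix like this makes directionality visible at a glance, which single aggregate numbers hide.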
Shankar Rao (Qwest) described the need for a link between network visualization tools and their contribution to the conduct of business at an ISP. Higher-order services such as voice over IP and multicast illustrate services that will require more advanced tools to monitor metrics such as jitter and packet loss. ISPs need enhanced tools to support increasingly sophisticated networks. Visualization tools for peering and performance monitoring could also help ISPs communicate information to end users, particularly those engaged in Service Level Agreements (SLAs).

Bill Woodcock (Zocalo), who was not present at the workshop but was involved in discussions and the mailing list, noted a difference in expectations that ISPs encounter: users who are primarily serving data want to aggregate a large number of streams of user data into one pipe and keep it as full as possible, whereas users who are primarily consuming data want the full availability of their pipe on an instantaneous basis for a single stream. It is difficult for ISPs to determine which type of availability each customer is looking for at any given time, and under some circumstances it is difficult to distinguish between single-stream performance and aggregated-stream performance. This issue will eventually manifest itself in terms of QoS differentiation.
Arne Frick (Tom Sawyer) discussed graph layout tools and their layout styles. The network visualization tool offered by Tom Sawyer is an automatic graph layout package with good scaling and interactive graph navigation properties. The toolkit showcases four layout styles: circular networks, hierarchical dependencies, orthogonal modeling, and symmetric communications. There are no restrictions on graph topology, and the package supports flexible integration of layout techniques without hindering the speed of the program.
Stuart Levy (NCSA) gave an operational perspective on what tools would be useful for visualizing network parameters. His group focuses on 3-D interactive graphics tools and believes that some metaphors would work well in that environment. The data analyzed largely comprises bandwidth, capacity, delay, loss, derived data, and simulations over time.
Greg Staple (TeleGeography) is developing tools that are engaging to the user, but is still trying to identify the relevant applications for them. At a time when we are still deciding whether the Internet should be public or private, visual images could be an excellent resource lending support to specific arguments. In his opinion, we cannot get enough bandwidth, a majority of the bandwidth is yet to be laid, and the last-mile details are the areas on which we should be focusing our time and energy. His company is interested in access to tools that visualize bandwidth usage and flow analysis, since he considers flows to be what determines policy (e.g., money flows, e-commerce flows). There is also an interest in how traffic flows map onto commercial flows, and whether there is any correlation or influence between them.
Stephen Eick of Visual Insights gave a preview of the commercial Advizor visualization toolkit, featuring several components that can be linked to each other, including circular graph layouts, scatterplots, bar and line charts, histograms, and parallel coordinates. Rob Rice briefly described the network planning tool from Make Systems. Jerry Jongerius, developer of VisualRoute, described his tool, which uses pings to perform traceroutes. Bryan Christianson of IHUG presented WhatRoute, a traceroute tool that runs on Mac OS.
kc claffy (CAIDA) focused on four areas of Internet measurement: topology, workload characterization, performance evaluation, and routing. Many macroscopic infrastructure data sets can be visualized using CAIDA tools, such as skitter, otter, and manta. According to Claffy, the research priorities for topology visualization are being able to depict latency, identify key routers or networks, aggregate to AS granularity (e.g., graph inbound and outbound traffic volume as a function of source/destination AS and of entry/exit point from the local network), and render geographic representations of data. A key obstacle is the inability to map IP addresses with strong precision to any useful entity, e.g., router, geography, AS, or ISP. Claffy emphasized researchers' need for flexibility in both data collection tools and visualization tools.
Tamara Munzner (Stanford) gave an overview of H3, a tool she is developing for network visualization. Munzner chose the hyperbolic sphere to display real networking data for several reasons. The distortion layout and navigation create a hyperbolic metric space in which the entire volume of data is visible from an external viewpoint. There are a few trade-offs when using this format compared to other visualization metaphors. Possible future work includes incremental layout, alternate circle-packing schemes, and disk-based support for processing graphs too large for main memory.
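The everything-visible-at-once property of hyperbolic layouts follows from the geometry itself: equal steps of hyperbolic distance map to ever-smaller increments of screen radius, so an effectively unbounded space fits inside a unit disk or sphere. The 2-D Poincare disk version of this relation is a simple tanh; this is the standard textbook formula, not code taken from H3:

```python
import math

def poincare_radius(r_hyp):
    """Euclidean radius in the Poincare disk of a point lying at
    hyperbolic distance r_hyp from the origin (curvature -1 model)."""
    return math.tanh(r_hyp / 2.0)
```

Successive rings at hyperbolic radii 1, 2, 3, 4 land at Euclidean radii of roughly 0.46, 0.76, 0.91, 0.96: everything stays on screen, but detail compresses toward the boundary, which is why hyperbolic browsers re-center the region of interest under the user's focus.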
John Heidemann (ISI) presented images of packet traffic using the network simulator/animator package ns/nam. The ns/nam tools are useful for protocol debugging, visual recognition of patterns, and communication of transport protocol concepts. They are easy to use, support actions that are repeatable and therefore easy to analyze, and do not require much disk space. These tools are targeted at people designing new protocols. Future directions include providing converters from other packet trace formats and support for larger graphs.
Prashant Rajvaida (UCSB) showed the network visualization tool Mantra. This tool monitors multicast traffic and depicts global trends by collecting routing table information once an hour. In his opinion, the MBone is an ideal domain for applying visualization tools: it lies outside ISPs' internal infrastructure, offering easier access to data, and it involves comprehensive data confined to one environment, yet not so much data as to render geographic depictions prohibitive. The MBone is also a domain in desperate need of decent management tools, visually based or otherwise.
last edited 26 may 1999