Report of the NSF-sponsored workshop on
Internet statistics measurement and analysis

19-20 February 1996
San Diego Supercomputer Center


Contents
Background
Introductions (Mark Garrett, Hans-Werner Braun)
Meaningful Internet measurement
(Vern Paxson, Walter Willinger)
Measurement infrastructure
(Vern Paxson, Guy Almes, Nevil Brownlee, Robert Moskowitz)
Flow characterization and control (Craig Partridge, Peter Newman, Joel Apisdorf, K Claffy)
Scaling (Dennis Ferguson, Mark Garrett)
Quality of service/pricing (Roger Bohn)
Where to go from here (Walter Wiebe)
Privacy and security (Carter Bullard)
How and what statistics will help (Mark Garrett)
Federal perspective (Mark Luker)
Epilogue (post-mortem reflections)

ISMA '96 workshop home page
ISMA '96 workshop agenda
relevant materials


Background

On February 19-20, 1996, with the support of the National Science Foundation, NLANR held a workshop at the San Diego Supercomputer Center to discuss the current and future state of Internet measurement and analysis. The intent of the workshop was to facilitate discussion among communities of academia, equipment vendors, and service providers, who share an interest in and incentive to understand one another's interests and concerns with Internet statistics and analysis.

The existence of the NSFNET (1986-1995) as a `central network' for the research and education community facilitated research into aspects of aggregate network traffic patterns and the anomalies in those patterns caused by the introduction of new or unique applications. Decommissioning the NSFNET backbone has left the Internet community with no dependable public source of statistics on Internet workloads. And yet empirical investigation of the nature of current workloads and their resource requirements, as well as how they change over time, is vital to supporting Internet evolution. Workload profiles are changing more rapidly than ever before; keeping pace with them in an increasingly competitive, increasingly proprietary environment is even more important now than during the life of the NSFNET backbone.

The objective of the workshop was to identify, elicit, and evaluate the cross sections of interest, goals, willingness, and technical capability to facilitate the measurement and dissemination of workload and performance statistics in the new distributed environment. Critical to the discussions was the presence of those who can set policy for design and configuration of Internet components: vendors who can implement statistics collection mechanisms in network equipment, service providers who can ensure the effective deployment of those mechanisms to support their short and medium term engineering requirements, and researchers who can direct their investigations toward areas that serve more immediate and as yet unaddressed concerns of Internet stability.

Attendance at the workshop was limited, and attendees were expected to present or formulate position papers to participate. The workshop was organized by K. Claffy, Mark Garrett, and Hans-Werner Braun. This report is derived from minutes taken by various contributors at the workshop.

The material in this report is based in part on work sponsored by the National Science Foundation under NSF grant No. NCR-9530668. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or other participating organizations.

Session 1: Introductions
(Mark Garrett, Hans-Werner Braun)

In the first half hour attendees introduced themselves and described their role and interest in participating. The group included representatives from several interest groups: ISPs, NAPs, vendors of both IP and ATM equipment, researchers, and government. Hans-Werner Braun then presented briefly on the scope and purpose of the workshop. The agenda focused on both technical and policy aspects of traffic measurement: what Internet problems need solutions that require measurements, what kinds of measurements are needed and where they need to occur, and what the obstacles are to collecting the data and making it available to the research community on a continuing basis. He expressed an interest in seeing, as a result of this workshop, a specification of a minimal set of metrics conducive to use in a competitive environment where traffic traverses many service providers en route to its destination.

Braun focused on several pressing needs:

The Internet, not having been architected for the exponential growth that it now experiences, acutely needs a mechanism for more rational cost recovery, that is, more accurate accountability for resources consumed, than current technology supports. In particular, the Internet architecture is not prepared to deal with the large aggregation of flows it handles now if a significant number of those flows are several orders of magnitude higher volume than the rest. Braun presented data to illustrate one of the major concerns of workload profiles today: the disparity in size between most current Internet flows/transactions, at less than 10 packets, and newer multimedia applications with much higher volume and duration.

The disparity in workload profiles across the current cross-section of Internet applications necessitates revised metrics for describing network behavior. Simple mean or peak utilization figures cannot address a service provider's engineering needs without knowledge of the transaction profile constituting, and perhaps dominating, those figures. Keeping track of workload profiles requires measuring flow data at relevant network locations. (NLANR currently supports operational collection of such data at several points, including data from the FIX-West multiagency network interconnection facility.)

More accurate accounting of resource consumption, and concomitant pricing models, will allow progress on another severe need in the current infrastructure: a service architecture from the perspective of the end user. As we will discuss later, communities of users are now frustrated enough with the current quality of their Internet service that they are seeking or trying to develop independent third-party mechanisms for measuring and verifying service. ISPs, while perhaps not fond of an effort to develop metrics to rate their service, will likely find it in their best interests to participate in the process rather than ignore it.

Braun noted several factors leading to the current Internet environment, which, though at one time amenable to modeling as a circuit problem, may now be more reasonably treated as a biology problem. The current economic structure of the market, devoid of rational cost models, contributes to the overwhelming demand that ISPs race to meet, and thus to the growth we now see threatening global stability. Time for network analysis is scant, and is typically devoted to the first and third of three separate strains:

Although it seems clear that greater attention to the second category of study, operational workload profiling, could prevent many instances of the third category, ISPs typically cannot afford the luxury of spending time on the second category. Braun hopes the workshop provides a forum from which to begin greater integration of these activities, both within and across multiple ISPs, i.e., mechanisms for collaboration and cooperation in deriving workload and performance information, both with each other and with the user community.

Mark Luker, current NSFNET program manager, offered a description of what NSF would like to see come from the workshop. He hears increasing feedback from within the R&E community that their work requires more reasonable service than the Internet can now provide. Simply adding capacity is not a solution, since R&E connectivity now involves multiple service providers, none of whom can add capacity as fast as their customers can consume it. In addition, they need different kinds of service, not available before, for new applications. Luker also announced that NSF would soon release a New Connections program solicitation for high performance R&E Internet connections, where meritorious science and engineering projects would receive bandwidth resources above and beyond what is commercially available. (The NSF released the solicitation a few weeks after the workshop, in mid-March 1996.) Proposers would negotiate with their campus providers, ISPs, and the NSF vBNS, on how to secure high performance for such projects.

Session 2: Meaningful Internet measurement
(Vern Paxson, Walter Willinger)

Vern Paxson of LBL presented his view of why meaningful Internet measurement is painfully hard, starting with a list of several as yet unanswered questions that require Internet measurement:

As an example, Paxson showed a graph of Usenet (netnews) traffic from 1984 to 1994, indicating 75% per year growth. The graph ended in late 1994, when the measurement software broke. The sheer size and diversity of the Internet require measuring a given metric at multiple sites to capture realistic variety, and yet the workloads are highly variable and changing underneath the measurements. Researchers are desperate for invariants, representative measurements, and parsimonious models (i.e., those with few parameters, and some hope of scalability to a global realm).

Walter Willinger of Bellcore presented a complementary view on why there is hope for meaningful Internet traffic measurements. He gave the good news: although Internet traffic does not exhibit Poisson arrivals, the cornerstone of telephony modeling, a number of researchers have measured a consistent thread of self-similarity in studies of a variety of networks, including samples of SS7, ISDN, Ethernet and FDDI LANs, backbone access points, and ATM traffic. Self-similarity can derive from the aggregation of many individual, albeit highly variable, on-off components. Self-similarity implies that a given correlational structure is retained over a wide range of time scales. Although self-similarity is a parsimonious concept, it comes in many different colors, and we are only now beginning to understand what causes it.

The bad news about self-similarity is that it is a significantly different paradigm that requires new tools for dealing with traffic measurement and management. Load service curves (e.g., delay vs. utilization) of classical queueing theory are inadequate; indeed, for self-similar traffic even metrics of means and variances indicate little unless accompanied by details of the correlational structure of the traffic. (The technofolks in the room got off on a bit of a self-similarity tangent, basically trying to address anecdotal data from backbone providers.) Nonetheless, there was strong agreement that answering their questions and refining the models toward greater utility for network engineering would require traffic measurements from a wide variety of locations.
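
As a concrete illustration of both points (a sketch of ours, not material from the workshop; all parameters are invented), the following Python fragment aggregates many on-off sources whose on and off periods are heavy-tailed (Pareto) and estimates the Hurst parameter H with the aggregated-variance method, in which the variance of traffic aggregated over blocks of size m scales as m^(2H-2). Exponentially distributed periods are included for contrast.

    import numpy as np

    def on_off_source(n, heavy_tailed, rng, alpha=1.5, mean_period=10.0):
        """Return a 0/1 activity series of length n for one on-off source."""
        x = np.zeros(n)
        t, state = 0, int(rng.integers(0, 2))
        while t < n:
            if heavy_tailed:
                # Pareto period with shape alpha (infinite variance for alpha < 2),
                # scaled so that its mean is mean_period
                period = int(np.ceil(mean_period * (alpha - 1) / alpha * (rng.pareto(alpha) + 1)))
            else:
                period = int(np.ceil(rng.exponential(mean_period)))
            x[t:t + period] = state
            t += period
            state = 1 - state
        return x

    def hurst_aggregated_variance(x, scales=(1, 2, 4, 8, 16, 32, 64, 128)):
        """Estimate H from the slope of log Var(X^(m)) versus log m."""
        variances = []
        for m in scales:
            trimmed = x[: (len(x) // m) * m]
            blocks = trimmed.reshape(-1, m).mean(axis=1)   # aggregate in blocks of size m
            variances.append(blocks.var())
        slope = np.polyfit(np.log(scales), np.log(variances), 1)[0]
        return 1 + slope / 2.0                             # slope = 2H - 2

    rng = np.random.default_rng(0)
    n_sources, n = 200, 50_000
    for heavy in (True, False):
        aggregate = sum(on_off_source(n, heavy, rng) for _ in range(n_sources))
        label = "Pareto on-off periods" if heavy else "exponential on-off periods"
        print(f"{label}: estimated H ~ {hurst_aggregated_variance(aggregate):.2f}")
    # Expect H well above 0.5 for the Pareto case and near 0.5 for the exponential
    # case; mean and variance alone would not distinguish the two aggregates.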

Mark Garrett summarized several points of the discussion:

  1. engineering without measurement is dangerous,
  2. new models (theory) and data are needed,
  3. the federal government should encourage and consider funding projects that leverage implications of recent theoretical and empirical work, and
  4. we should seek out measurement infrastructure and sources of statistics in the commercially decentralized Internet.

In transition to the next topic the group discussed existing sources of Internet traffic measurements. Last year Paxson et al led the establishment of the Internet Traffic Archive (ITA), and the associated moderated discussion list. The ITA holds traces as well as software for manipulating them. Other sources of traffic information are few and far between, mostly due to ISP privacy concerns. The few ISPs that do maintain statistics are either unable to release them due to customer privacy concerns, or unwilling to share them since they reveal competitive information.

There are even ISPs who consider customer privacy so critical, and any statistics collection at all a violation of it, that they refuse to track customer data in any way. Bill Schrader of PSI eloquently articulated this position in his response to the invitation to the workshop. Although there is some sketchy legal framework for traffic monitoring for engineering purposes, this legislation does not extend to making any such information public. However, the integrity of the Internet may depend on balancing the needs for data analysis with this recognized sensitivity to privacy issues. To support this balance, researchers themselves have developed tools to encode address information while retaining the structure needed for realistic investigations.

Regarding the latter problem of disclosing data that could help competitors, Sean Doran of Sprint noted that he does not feel any competitive threat in the public Internet market, since there is more than enough demand to share for the foreseeable future. Most ISPs are barely managing to keep up with demand. Indeed, for an ISP that also leases private lines, the uncoordinated public Internet is a great business generator, since it requires all competitors to run a decent network to ensure connectivity. Unlike the public Internet, where demand is overflowing, the market for leased lines is competitive, and thus there is no incentive to cooperate with other ISPs with whom one also competes for leased line business. Since installing leased lines provides a higher return than cooperating with other ISPs, the non-cooperative strategy dominates, which hardly fosters a healthy global Internet, or even leased line customers with decent connectivity beyond their private trunks.

And yet there is little attention among many ISPs to the health of the global Internet beyond the extent that it affects their core business. If a customer of one ISP cannot reach a remote site due to a problem several ISPs away, he can tell his own ISP, but the chances of effecting resolution are small. Customer support is often only intra-ISP, not inter-ISP, a choice derived from limited resources and lack of supporting technology rather than disinterest. Cooperative forums such as NANOG or IEPG might be reasonable places to develop cross-ISP trouble reporting mechanisms, but most ISP engineers are too busy to actively participate in these forums beyond just attending a few meetings a year.


Session 3: Measurement infrastructure
(Vern Paxson, Guy Almes, Nevil Brownlee, Robert Moskowitz)

Paxson of LBL presented his work on black-box measurement of Internet clouds, using endpoint measurements to discern how an Internet cloud perturbs traffic going through it. He found 35 sites to participate, enabling him to procure data on a wide range of links. (Note that each additional site introduces N additional Internet paths to probe and analyze.) He then used his traceroute-based tool, npd (network probe daemon), to track route stability and failure patterns over time. His measurements taken over the course of a year indicate a significant service degradation in that year (e.g., a substantial increase in the number of outages of 30 seconds or more). He also found widespread asymmetry: the two directions of an Internet path visit at least one different city about half the time. The software has been ported to several unix platforms, allowing widespread deployment of such probes. Vern noted that npd was just a prototype he wrote for his thesis research, although he agreed that obtaining a reasonable picture of performance would be greatly facilitated by some measurement infrastructure, including deployment of performance evaluation tools and techniques for independent measuring and analyzing of available congestion levels, queueing delays, packet loss, etc.

Nevil Brownlee presented data from metering software used in New Zealand's Kawaihiko university network. The NeTraMet software consists of meters that watch packets and build tables of information about flows, meter readers to gather flow data, a manager to coordinate meters and collectors, and a customizable rule configuration file. An IETF working group on realtime traffic flow measurements (RTFM), with an associated mailing list, focuses on the use of such software. Nevil described the operational environment in which he uses NeTraMet: a university network with a full mesh of PVCs. There is a meter at each of seven universities in New Zealand, with about 600 rules each, split between commercial and non-commercial traffic. The software allows him to create a traffic matrix.
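
As a rough illustration of the meter/reader split described above, here is a minimal Python sketch (not NeTraMet itself; the packet fields and classification prefixes are invented) of an RTFM-style meter: packets are classified by a small rule set and accumulated into a per-flow table that a meter reader could later collect, for example to build a traffic matrix.

    from collections import defaultdict
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Packet:
        src: str      # source IP address
        dst: str      # destination IP address
        proto: str    # 'tcp' or 'udp'
        length: int   # bytes

    # Hypothetical rule: classify traffic by whether the source falls in
    # assumed non-commercial (university) prefixes.
    NONCOMMERCIAL_PREFIXES = ("130.216.", "132.181.")

    def flow_key(pkt):
        """Key flows by (class, src, dst, proto), as a simple rule set might."""
        cls = "noncommercial" if pkt.src.startswith(NONCOMMERCIAL_PREFIXES) else "commercial"
        return (cls, pkt.src, pkt.dst, pkt.proto)

    def meter(packets):
        """Accumulate packet and byte counters per flow."""
        flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
        for pkt in packets:
            entry = flows[flow_key(pkt)]
            entry["packets"] += 1
            entry["bytes"] += pkt.length
        return flows

    # A meter reader would periodically fetch and reset these tables; summing
    # bytes over (source site, destination site) pairs yields a traffic matrix.
    trace = [Packet("130.216.1.1", "192.0.2.9", "tcp", 512),
             Packet("192.0.2.9", "130.216.1.1", "tcp", 1400)]
    for key, counters in meter(trace).items():
        print(key, counters)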

Since he started producing weekly time-series and utilization plots, he has not seen a significant degree of aggregation; there are still heavy tails. He has noticed that the Auckland data is a lot `spikier' than that from most other sites. One explanation for the difference is that Auckland's access rate and frame relay CIRs are a lot higher than its average load, so there is no artificial `trimming'. The other universities tend to run their PVCs closer to their CIRs. Over the last month or two Canterbury has increased its CIR significantly, and their time-series plots seem to have become spikier.

Guy Almes of Advanced Network and Services spoke next, on a complementary IETF working group effort: Internet Provider Performance Metrics (IPPM). Their charter is to develop and deploy objective measurement tools for use in the post-NSFnet architecture. Customers have a problem in trying to select an ISP: their only metric is the price tag, perhaps in addition to rumors about performance. More rational selection requires performance metrics that are concrete and repeatable, exhibit no bias for identical technology or artificial performance goals, and which independent users can rely on to evaluate performance.

To clarify the distinction between the RTFM and BMWG/IPPM working groups: RTFM focuses on network-centric workload characterization from longitudinal traffic measurements (bit counting), while BMWG/IPPM focuses on measuring performance (delay, throughput, loss), treating background traffic as an exogenous variable. Both groups focus on objectives different from what the SNMP or RMON network management protocols cover. Fred Baker of Cisco offered to provide a half-page summary of RMON and RMON-II.

Guy discussed end-to-end path measurement methodologies. Two path performance metrics of interest are delay and flow capacity. Statistics of delay as measured by ping are somewhat useful (minimum, maximum, mean, variance, percentiles), and one-way vs. round trip path measurements are increasingly relevant in the face of asymmetric routes. Measuring flow capacity is problematic because determining how much capacity one can get from a given path ostensibly requires ramping up traffic flow across the path until degradation occurs. Matt Mathis has written treno, a tool to measure flow capacity in a somewhat controlled way.

Guy discussed the need to be able to estimate values in one area using combinations of measurements in another area, or using previous measurements of delay and flow capacity to determine current or predict imminent flow capacity. Such extrapolations are still an area of research, but probe computers, or transponders, at key sites (i.e., exchange points, regional provider gateways to backbones, campus gateways) are still needed to develop more reasonable models, in particular a model that allows us to ascertain performance numbers without saturating the measured networks.

ISPs are in general skeptical of performance measurements. Steve Corbato from the University of Washington compared the situation to the Mbone: ISPs began to introduce Mbone tunnel support once it became obvious that their customers would just set up tunnels anyway, and that without coordination by the ISP, the result would be much less efficient, with as much negative impact on the ISPs as on anyone else. Network flow capacity testing poses the same threat: users will develop and deploy tools and metrics themselves if no better methodology is available. Furthermore, without a vehicle for sharing the data there will no doubt be unnecessary redundancy of effort, not to mention inaccuracy, so we might as well codify and consolidate useful efforts, and develop benign tools and well-engineered architectures for probe machines. We might as well do it right. Bill Norton noted that NSF would be likely to fund a packaged turnkey system that ISPs could use to offer standard graphs (e.g., a la NSFNET) to the community.

The conversation came back to determining a simple and useful Consumer Reports style (though the phrase was not universally liked) service metric, e.g., periodic ftps and pings. Sean Doran warned that the last thing that high quality ISPs need is for Consumer Reports to endorse them and cause everyone to shift traffic over to them. Sincoskie and Braun felt this concern merely justified continually repeated measurements and credible criteria.


The next speaker was Robert Moskowitz of Chrysler, who is leading the Automotive Network Exchange (ANX), an advocacy organization on behalf of North American automakers in pursuit of a publicly endorsed mechanism for measuring the ability of various ISPs to meet their requirements. In particular, the industry wants to be able to dictate terms to ISPs and trading partners, and then monitor, or pay an independent agency to monitor, compliance. Performance metrics will include per time period measurements of total data transferred, and average, peak, and normalized utilizations. Reliability metrics, e.g., circuit availability, were even more important to the auto industry. Moskowitz noted that a 3-minute outage can bring down an assembly line; after 20 minutes the plant may shut down and the shift be sent home. Help desk metrics include: total calls/period, mean time to repair (MTTR), and 95th percentile worst time to repair. The ANX intends to publish statistics much as the FAA publishes airline on-time performance.

Sean Doran, apparently reflecting the opinion of a few other ISP representatives in the room, was perplexed by Robert's unrealistically ambitious objective: the current Internet, including underlying router technology (not just service quality, though one can criticize that too), is simply not amenable to the control that Moskowitz's agenda requires. Routing instability and convergence times, especially when crossing large ISP interchanges, inhibit that kind of integrity. If he wants a guaranteed 3-minute reachability repair time, Moskowitz needs a set of private interconnects, a single-ISP solution that avoids exchange points, so he has a single point of contact/liability for connectivity problems or violation of contractual agreements. Both Sprint and MCI voiced willingness to meet these needs with leased lines. Moskowitz assured the group that they had considered this option, but given that the ANX needs all upstream and downstream suppliers and dealers connected too, with the same service quality, and that one in seven working Americans has an automotive-related job, this approach would just constitute a reconstruction of the Internet. Why not fix the one we have?

Moskowitz was not arguing that the public Internet could provide what he needed today, or even this year. His goal was rather to develop mechanisms to achieve his vision as soon as possible, and the obvious place to start is developing standardized metrics from which to compare the ability of different ISPs to meet an industry's networking requirements. Several participants noted that Moskowitz was not alone; other customer communities, most notably higher education, had voiced precisely the same concerns, and would also soon begin to systematically measure ISP performance with or without participation of the ISPs themselves. It is thus quite likely in the best interest of ISPs to participate in defining such measurements and metrics in a neutral and productive forum. Without ISP and IP research community participation, bogus measurements and misleading metrics may result, leading to inaccurate evaluation of ISPs and unhappy customers.


Over lunch, kc showed a demonstration video (Planet Multicast) of a recent network visualization tool for examining a particularly problematic Internet component: the Mbone. Developed by Tamara Munzner of the Stanford graphics department, the tool lets you interactively examine and move various components of the Mbone topology. A brief description of the work and resulting VRML and Geomview-compatible images are on the CAIDA web server.


Session 4: Flow characterization and control
(Craig Partridge, Peter Newman, Joel Apisdorf, K Claffy)

After lunch, both Craig Partridge of BBN and Peter Newman of Ipsilon Networks presented statistics on the potential performance of flow caching for a September 1995 5MB, 30-minute packet trace from FIX-west. Both are working on high-speed IP router projects at their respective companies, and flow duration statistics that can assist in deciding which flows should be switched are critical to optimizing router performance. Partridge found that even with only modest hit rates, as low as 10 to 20%, caching can be effective in improving performance. Their analyses indicated that by recognizing traffic suitable for fast switching (caching) and treating it accordingly, one can greatly increase effective switching capacity, more than 5-fold in Newman's simulations on this specific trace. Newman and Partridge both emphasized that further developing and establishing the validity of such technology requires raw packet traces: headers and accurate time stamps, with sanitized addresses that retain the structure of CIDR prefix information.
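
The kind of question both speakers were asking, how often an arriving packet belongs to a recently seen flow, can be explored with a simple simulation. The following minimal sketch (not the BBN or Ipsilon analysis; the flow key, cache size, and timeout are invented) estimates flow-cache effectiveness over a header trace by keeping an LRU cache of recently seen flows and counting the fraction of packets that hit it.

    from collections import OrderedDict

    def cache_hit_rate(packets, cache_size=1024, timeout=60.0):
        """packets: iterable of (timestamp, flow_key) pairs in time order."""
        cache = OrderedDict()              # flow_key -> last-seen timestamp, in LRU order
        hits = total = 0
        for ts, key in packets:
            total += 1
            last_seen = cache.get(key)
            if last_seen is not None and ts - last_seen <= timeout:
                hits += 1                  # flow already being fast-switched
            cache[key] = ts
            cache.move_to_end(key)         # most recently used flow goes to the end
            while len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used flow
        return hits / total if total else 0.0

    # Toy trace of (timestamp, 5-tuple); a real run would read these from a header trace.
    toy = [(0.0, ("10.0.0.1", "10.0.0.2", 6, 1234, 80)),
           (0.1, ("10.0.0.1", "10.0.0.2", 6, 1234, 80)),
           (0.2, ("10.0.0.3", "10.0.0.4", 17, 2001, 53))]
    print(f"hit rate: {cache_hit_rate(toy):.2f}")

Sweeping the cache size and timeout over a real trace yields the hit-rate curves from which one can judge how much traffic is worth fast-switching.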

Apisdorf and Claffy presented a real-time traffic collector, a prototype of which Apisdorf has recently ported to an OC-3 ATM-based monitoring platform. The vBNS and other infrastructures could not make reasonable use of the original FDDI-capable NNStat-based software because they needed to monitor the fiber coming directly from the ATM switch at each backbone node. Apisdorf has ported the collection architecture and code to a 120 MHz, 64MB Pentium PC; both will be public domain. The monitor can capture all of both TCP/UDP and IP headers if IP options are not used, and they expect it to keep pace with full OC-3 rates. The monitor is not yet operational on the vBNS; they demonstrated it on the San Diego node, with the same graphical interface for customizable graphs as they use for the FIX-west statistics.

The Digital Alpha lent by MCI for this data collection project collects 5 minutes of complete packet data (modulo losses) every hour on the FIX-west FDDI, calculates flow-based statistics on each 5-minute trace, and transfers the results to the NLANR web site, which supports a real-time interactive statistics querying engine. Because the FIX-west FDDI monitor had been running for several months, Claffy interactively created a few interesting time-series graphs and compared current Internet traffic with that on the SDSC vBNS node. In particular, the tool allowed one to show the disparity in bandwidth and switching resources consumed versus customers served (i.e., flows) using web and CuSeeMe traffic profiles. For example, an interactively created graph of the fraction of total packets versus the fraction of total flows illustrated how CuSeeMe traffic served few customers while consuming significant resources for each one.
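
The packets-versus-flows comparison is straightforward to compute from any flow log. Below is a minimal sketch (the flow log is invented; real input would come from the collector described above) that tabulates, per application, its share of all flows against its share of all packets, the kind of disparity the CuSeeMe example highlighted.

    from collections import defaultdict

    # Each record is (application, packets in that flow); the values are invented.
    flow_log = [("web", 12), ("web", 8), ("web", 20), ("dns", 2),
                ("cuseeme", 45000), ("ftp", 300), ("web", 15)]

    by_app = defaultdict(lambda: {"flows": 0, "packets": 0})
    for app, pkts in flow_log:
        by_app[app]["flows"] += 1
        by_app[app]["packets"] += pkts

    total_flows = sum(v["flows"] for v in by_app.values())
    total_packets = sum(v["packets"] for v in by_app.values())

    print(f"{'application':<12}{'% of flows':>12}{'% of packets':>14}")
    for app, v in sorted(by_app.items(), key=lambda kv: -kv[1]["packets"]):
        print(f"{app:<12}{100 * v['flows'] / total_flows:>11.1f}%"
              f"{100 * v['packets'] / total_packets:>13.1f}%")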

Claffy warned that this source of statistics will disappear when FIX-West converts to a Gigaswitch platform, planned for June 1996, after which accomplishing the same task would require the prohibitive cost and complexity of separately monitoring each attached FDDI and collating the sets of statistics. Doran noted that SprintLink had 150 routers, 21 of which were running similar flow statistics gathering software on the Cisco 7513 platform. Partridge suggested that we assume that every counter implemented costs 3% of router performance, and Sincoskie emphasized the importance of being able to download analysis code to a router, so the user (ISP) can decide his own tradeoff between routing packets and gathering statistics. Joel Halpern still wanted a better idea of what to collect in the router, to ensure that the data reduction is useful later.

Even if we had such boxes, at which strategic points would we want to deploy them? An increasing number of direct interconnections among service providers has rendered the NAPs less significant anywhere but the R&E environments, although they are still the only place where NSF may have any leverage. Mark Luker of NSF acknowledged that they were considering such issues, and were open to input on where such probes are needed. Although NAPs, campuses, ISP backbones, and local providers are all important to measure, the NAPs may be the most difficult places to get consent for collection from the attached providers. Most of the group agreed that aggregated information was safe to distribute, and further anonymity could be secured through appropriate encryption of addresses. (Carter asked about the impact of the Telecommunications Act of 1996, but no one seemed to want to even discuss it.) Nonetheless, if NSF were to make anonymous monitoring a part of the NAP agreement, many providers would just disconnect and set up their own peering points. Braun did not have the same problem with the FIX-west data, since not only was it totally aggregated and sanitized, but there were no legal proprietary issues since all connections to the FIX are specifically approved and in the service of U.S. federal agencies.

Hon So of the Pac Bell NAP urged us to make clear to ISPs the benefit of allowing their traffic to be monitored. Luker agreed, suggesting the use of sample measurements to make a strong economic case that monitoring would improve the price-performance of their networks.


Session 5: Scaling
(Dennis Ferguson, Mark Garrett)

Dennis Ferguson presented reflections from a former backbone operator on the two most acute needs in the Internet right now: higher performance (`rrrreally big') routers, and a basic traffic engineering methodology.

He first described the suboptimality of trying to use many existing (i.e., mid-performance, since high performance ones do not exist yet) routers to achieve the route processing and convergence time requirements of high bandwidth environments. He took a dimmer view of route caching, confident that the backbones will always outgrow the existing route cache algorithms. His projection for IPv4 routing table space was 350,000 routes before IPv6 would establish itself, thus a minimal target for IPv4 router vendors.

Ferguson then described the other Achilles' heel for ISPs: in 10 years the Internet has yet to develop a basic traffic engineering methodology. If even half the attention devoted to rocket-science traffic modeling were devoted to how to estimate a reasonable ingress-egress traffic matrix from sampled measurements, network engineers, particularly of large clouds, would find their job substantially easier. And yet not even matrix estimates are available with most router equipment right now; link loads are available, but they just do not hold enough information to estimate a reasonable matrix. He wanted to know how to derive better estimates from samples and averages, the way civil engineers have derived approximations for highway architectures.
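
One common rough approximation, offered here only as an illustration and not as Ferguson's proposal, is a gravity model: given per-edge-router ingress and egress byte totals (much closer to what current equipment can report than a full matrix), the traffic from ingress i to egress j is estimated as proportional to the product of the two totals. The router names and byte counts below are invented.

    # Gravity-model estimate of an ingress-egress traffic matrix from edge totals:
    # T[i][j] ~ ingress[i] * egress[j] / total.

    ingress_bytes = {"east": 60e9, "west": 30e9, "south": 10e9}   # bytes entering at each edge router
    egress_bytes  = {"east": 40e9, "west": 45e9, "south": 15e9}   # bytes leaving at each edge router

    def gravity_estimate(ingress, egress):
        total = sum(ingress.values())
        return {i: {j: ingress[i] * egress[j] / total for j in egress} for i in ingress}

    matrix = gravity_estimate(ingress_bytes, egress_bytes)
    for i, row in matrix.items():
        cells = "  ".join(f"{j}: {v / 1e9:5.1f} GB" for j, v in row.items())
        print(f"{i:>5} -> {cells}")

Such an estimate ignores routing policy and any known point-to-point demands, but it is the kind of rough, sample-derived approximation that would already help with load balancing and topology engineering.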

Sincoskie responded that we lack the background to develop models with what we have; further study, e.g., of self-similarity, is required. Ferguson assured us that he does not need precise models right now; he needs realistic short-term engineering guidelines. His many observations of backbone traffic are mostly inconsistent with self-similarity models anyway; a rough traffic matrix would at least improve his ability to balance load and engineer topology.

Mark Garrett led a somewhat controversial discussion on the difference between the Internet and telephony. Sincoskie felt the difference was not one of technology but of growth. We are still building half the Internet every year. There are new ISPs every day, many of whom know relatively little about the Internet and the potential far-reaching effects of a local misconfiguration. (David Conrad mentioned in contrast the stringent licensing requirements in Singapore, at $8M per license, with QoS requirements on availability that grew even more stringent over time.)

In the U.S., at least, the ISP business agenda is typically to sell as many connections as they can for an unmeasured, unregulated and unquantified service. No engineer would claim that either this business agenda or the underlying technology is mature enough to support exponential growth much longer. Internet issues that are several years old are still not resolved: address space growth, routing table growth, routing convergence in big clouds, unfriendly or broken TCP implementations abusing the network. (Sometimes we forget how phenomenal it is that it works as well as it does, a testimony to its architects who were chartered to design a system to handle much less. Moment of silence in homage.)


Session 6: Quality of service/pricing
(Roger Bohn)

While we were on the topic of unsustainable growth, Roger Bohn, an economist at UCSD, presented his view that a huge part of the problem is that, along with router technology and engineering methodology, the economic model of the Internet has not kept pace either.

The fundamental goal of any economic system is to maximize the number of happy users and the value of what they do. Maximizing value in the Internet is difficult since the economic value model is quite randomized. In most markets, value is attributed according to quality of service. The Internet is no exception: rational pricing would provide the right feedback to providers and users to encourage more appropriate use. Quality signals are not now clear: users need operating signals accompanied by measurably distinguishable service qualities so they can declare the QoS for which they will pay. Otherwise high-value users may get degraded by high-requirement, low-value users. Regarding the earlier reference to the Internet as a biological system, pricing would form an essential `biofeedback' mechanism to keep the system balanced and effective.

What are the implications for traffic measurement? Measurement at boundaries is essential, and different providers will no doubt measure different things, e.g., source vs. destination-based billing. Furthermore, sample-based billing is inevitable; users need to accept it as just another revenue stream, not precise or optimal. Large customers who care about predictability of service will naturally pay more.

Many ISPs were opposed to usage-based pricing at all, and sample-based pricing in particular. Doran asked what he was supposed to do about customer disputes of charges, and which end of a TCP connection should be charged? Halpern suggested different pricing models for different problems. For best effort traffic, the value of usage pricing is questionable, but for RSVP-based traffic, it makes sense to use the reservation request information for usage-based billing purposes. He also noted that costs for adding capacity are not linear and such an assumption in the pricing model would be incorrect. Ferguson reminded us that ISPs spend more on salaries than on bandwidth; bandwidth is actually relatively cheap. Conrad noted that bandwidth may not be the showstopping cost in the U.S., but it is in quite a few places. Donald Neal from New Zealand (where a usage-based pricing infrastructure has been in place for some time) agreed that a single pricing model would not work.

Sincoskie noted that we are still missing a way to measure service quality. Also, why do we assume the network is congested? AT&T engineered their telephone network for zero loss. Bohn acknowledged that there were two possibilities: either we will expand bandwidth ahead of offered load, or traffic will expand to consume available bandwidth. Either way, cost recovery for resource consumption is imperative to maintaining the `Internet supply'. In a minimalist pricing proposal, we can have both usage and flat pricing, using precedence-based queueing. Users select their own precedence values, and ISPs set quotas or prices for their customers. ISPs count packets at user-ISP boundaries. Users who don't like their service performance can set a higher precedence to improve it; users who don't want to pay for a high quota can set low precedence and tolerate delays. Although precedence is an end-to-end matter, incremental deployment of precedence distinction and quotas can still provide benefits to those who use it.
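
To illustrate the accounting this minimalist proposal implies (an invented sketch, not Bohn's actual scheme; the prices and precedence levels are made up), a boundary router only has to count packets per customer per precedence value, and the ISP can then bill, or enforce quotas, from those counters.

    from collections import defaultdict

    # Hypothetical per-packet prices by IP precedence value (0 = best effort).
    PRICE_PER_PACKET = {0: 0.0, 1: 0.0001, 2: 0.0005, 3: 0.001}

    counters = defaultdict(lambda: defaultdict(int))   # customer -> precedence -> packet count

    def count_packet(customer, precedence):
        """Called for each packet crossing the user-ISP boundary."""
        counters[customer][precedence] += 1

    def monthly_bill(customer, flat_rate=200.0):
        """Flat rate plus usage charges for packets marked above best effort."""
        usage = sum(PRICE_PER_PACKET.get(prec, 0.0) * n
                    for prec, n in counters[customer].items())
        return flat_rate + usage

    # Toy month: a customer marks most traffic best effort and a little at precedence 2.
    for _ in range(90_000):
        count_packet("acme", 0)
    for _ in range(10_000):
        count_packet("acme", 2)
    print(f"acme owes ${monthly_bill('acme'):.2f}")   # 200.00 flat + 5.00 usage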

Luker encouraged consideration of a wider range of existing operational economic models: COD, US Mail, and the telephone system's standard, 800, and 900 numbers. Meritorious applications would congest almost any network; Moskowitz's ANX and the education community will need more reliability. Are we going to build the ability to handle these needs into the Internet of the future, or have others build it from the outside? Partridge noted that the IETF integrated services working group is already planning along these lines.


Session 7: Where to go from here
(Walter Wiebe)

Walter Wiebe, executive director of the Federal Networking Council, led summary discussions for the first day of the workshop. He began with a recap of the similar concerns mentioned by the automotive industry and higher education communities, both of which have previously appealed to the FNC regarding their frustration with degradation of Internet QoS (primarily availability). Partridge suggested that we separate the use of the term `QoS', which has a more specific technical meaning within the IETF, from some other term relating to ISP service criteria. Partridge explained that within the IETF, QoS is specific to a traffic's elasticity, i.e., guaranteed vs. best effort service, and relates mostly to queueing design in output ports of switches and routers. But we seek a term to reflect the integrity of the best effort service that we receive. The group settled on RAS: reliability, availability, serviceability. QoS, when we get it, adds another dimension to RAS, but they are separable issues.

A number of ISPs definitely did not buy into RAS, claiming that it is technically not feasible, certainly not with the time they have to address it, and furthermore it is not necessary. RAS is also significantly impeded by the current rapid growth of the Internet. Stability is prerequisite to reliability, and as most metrics of Internet growth continue to double every nine months, we can hardly consider the Internet stable. Some thought it would make more sense for a portion of the network to focus on the RAS levels that Chrysler needs, while other parts retain low-RAS as they continue to expand rapidly.

If we insist on RAS, we need a process to first specify IP service criteria, i.e., some way to describe what the customer is supposed to receive from the provider and to ascertain that they did indeed receive it. We can then expand this definition into a small set of RAS levels. Wiebe thought that it would not be hard to put forth an initial RAS metric set like those the telcos have had for years: e.g., 99.9% hardware availability.
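
For concreteness, here is a quick worked example (ours, not from the workshop) of what a telco-style availability figure implies in allowable downtime.

    def allowed_downtime_hours(availability, period_hours):
        """Hours of downtime permitted per period at a given availability."""
        return (1.0 - availability) * period_hours

    for a in (0.999, 0.9999):
        per_year = allowed_downtime_hours(a, 365 * 24)
        per_month_min = allowed_downtime_hours(a, 30 * 24) * 60
        print(f"{a:.2%} availability: {per_year:4.1f} h/year, {per_month_min:5.1f} min/month")
    # 99.90% availability permits roughly 8.8 hours of downtime per year
    # (about 43 minutes per month); 99.99% permits under an hour per year.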

Sincoskie asked how we would apply to industry any metrics we may define. Should we follow the Singapore model, referred to there as incremental engineering, where ISPs must all agree on such service criteria and then refine them as they gain experience? Sincoskie offered the phrase `incremental regulation', which seems inevitable if ISPs cannot regulate themselves. Industrial cooperation under threat of federal intervention? Are the ANX and related groups a big enough stick? Donald Neal of New Zealand reminded us that despite its best intentions, the U.S. federal government will not be regulating the New Zealand network, so we'd better prepare a working non-regulatory model for how to determine that ISPs deliver the QoS for which they are being paid.

Bob Simcoe of Digital observed that we have identified two classes of measurements: those used for design and engineering decisions (e.g., flow stats), and those that might identify which ISPs were providing better service. ISPs would naturally have some resistance to the latter category, a `report card' on their quality of service. Also, there is precedent that established metrics, even if they relate little to real system performance, tend to drive the industry. So we need to be careful and educated about what we offer to the community as a benchmark.

For the former category, workload statistics, needed among other places deep within ISP infrastructures, engineers and researchers will have to make a clear, written case to the business side (ISPs) on what data collection would do for them, e.g., solve connectivity problems, allow optimal router design. Ferguson agreed that neither category of statistics provides any current benefit to ISPs; in his own experience as an ISP engineer, he would not have had much use for either workload characterization or RAS. He always knew when something was screwed up and what he needed to do next, before customers complained. RAS would just be a methodology to inform everyone what the operators already knew. What operators need is a model that could, even roughly, predict the immediate next step. Partan agreed: his ISP does not need statistics, they need OC-3 and OC-12 routers to improve service now. And they need them to be more reliable than the current lower speed routers. Current routing technology just is not built for high performance or robustness, and it will take years to change the industry to support what Moskowitz's agenda presumes. Mathis put it concisely: the Internet is like a freight train roaring along while people are laying tracks in front of it; it's not just gaining on those laying tracks, it's gaining on the steel mills. Fred Baker (at a steel mill) agreed, and noted that the router vendors are great consumers of any realistic traffic traces, so that they can design equipment that will better serve ISPs.


Tuesday


Session 8: Measurements and security
(Carter Bullard)

Carter Bullard of Fore Systems presented briefly on security as it relates to measurement. Measuring for security has a more stringent set of requirements than measuring for performance: it involves measuring every single network event, with a high degree of semantic preservation, yet without compromising user privacy. The latter constraint limits investigations to at most the transport layer.

He described Argus, one measurement tool of interest, developed at CMU. Experience with Argus and other tools has indicated the need for more identifiers in flow measurements (e.g., MAC addresses) and for bidirectional flow records. The bidirectional nature of flows is particularly important in order to capture, for example, whether a TCP connection setup was successful, or whether UDP requests were serviced. Measurements also need to comply with theoretical security monitor models, which suggest the need for a physically independent monitor as opposed to an integrated monitor in a firewall or a router.
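
Here is a minimal sketch (not Argus; the packet fields and flag handling are simplified stand-ins) of why bidirectional records matter: by folding both directions of a TCP conversation into one flow record, a monitor can distinguish an established connection from a mere connection attempt.

    from collections import defaultdict

    SYN, ACK = 0x02, 0x10   # TCP flag bits

    def canonical_key(src, sport, dst, dport):
        """Order the endpoints so both directions map to the same flow record."""
        a, b = (src, sport), (dst, dport)
        return (a, b) if a <= b else (b, a)

    flows = defaultdict(lambda: {"syn": False, "synack": False, "packets": 0})

    def observe(src, sport, dst, dport, tcp_flags):
        rec = flows[canonical_key(src, sport, dst, dport)]
        rec["packets"] += 1
        if tcp_flags & SYN and not tcp_flags & ACK:
            rec["syn"] = True        # connection attempt
        if tcp_flags & SYN and tcp_flags & ACK:
            rec["synack"] = True     # attempt was answered

    # Toy traffic: one completed handshake and one unanswered probe.
    observe("10.0.0.1", 1025, "192.0.2.7", 80, SYN)
    observe("192.0.2.7", 80, "10.0.0.1", 1025, SYN | ACK)
    observe("10.0.0.9", 4444, "192.0.2.7", 23, SYN)

    for key, rec in flows.items():
        status = "established" if rec["synack"] else "attempt only"
        print(key, status, f"({rec['packets']} packets)")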

Security relevant measurements can also be used to support real-time firewall policy enforcement validation, traditional security features such as non-repudiation, and research in the area of analysis of attack strategies.


Session 9: How and what statistics will help

Before starting the next session, Mark Garrett captured some action items from yesterday:

We spent the next hour discussing examples of statistics that are essential to investigating and addressing current Internet problems.

Matt Mathis described a TCP (Reno) problem that he used his previously described tool, treno, to illuminate. Matt and colleagues have data illustrating the symptom: the inability of the Reno code to use more than 50% of a path's available bandwidth under certain conditions. They have submitted for publication a paper detailing their findings and proposed solution, selective ACK (SACK). Braun asked about possible dates of vendor releases of SACK, and Allyn Romanow responded that for Sun it would probably be at least a year.

Claffy presented a more general matrix of the problems we need statistics to solve, created last night with Matt and Bilal Chinoy.


problems we need statistics to solve

need: trace-driven experiments
why: cache management; optimize queueing; congestion and scaling dynamics; aggregate transport behavior
measurements needed: full header traces
where: few high aggregation points (near NAPs)
problems: great political aversion (if it becomes part of the NAP agreement, it will just increase private exchanges); switches can't sniff, need data on all ports or results are inconclusive

need: specifications for ISPs to give router vendors
why: router benchmarking
measurements needed: flow counts/parameters
where: NAPs; backbone core points; corporate campuses; large customers; private interconnects
problems: software/hardware doesn't support it

need: aggregate traffic flow
why: capacity/topology planning
measurements needed: traffic matrices
where: ISP infrastructures
problems: routers don't support it; privacy issues

need: specifications for users to give ISPs
why: service quality assessments; ISP shopping
measurements needed: multi-function transponder testing platforms (Mathis' treno, Paxson's probe daemon)
where: bridged at strategic interconnects; uncongested customers
problems: no validated standard metrics

need: routing system stability
why: RAS
measurements needed: route (BGP) logging, archiving, caching
where: route servers
problems: most ISPs not using the RA (so not great routing data)

and more general needs for


Session 10: Federal perspective
(Hans-Werner Braun, Phil Dykstra, Mark Luker)

Phil Dykstra, chair of the Federal Engineering Planning Group (FEPG), led the closing summary session. He began with a discussion of the interest of federal networking and government agencies in encouraging the Internet industry to work out its own mechanisms before regulation forces perhaps less palatable ones on it. He suggested leveraging the traditional role of many federal agencies to collect and publicize data of national interest.

The Federal Networking Council Advisory Committee (FNCAC) has recommended the development of

Alluding to the earlier talk on security and encryption, Phil noted that even in a secure network there are useful statistics to collect. Analyzing raw bitstreams is still important for planning bandwidth, topology, and services, for identifying failures, routing problems, and flow profiles, and for improving security itself. Other important statistics, e.g., routing stability, number of NSPs/users, user satisfaction, demographics, health/security, and crime, do require more, but encryption is only one of the obstacles to collecting them. Wise investment in a system requires lots of data, and we might want to develop Leading Internet Indicators (LIIs) that reflect industry status, analogous to economic estimates of new housing starts, consumer confidence, and excess capacity. Phil solicited feedback on several questions:

Bradner noted that telcos are required to report outages to the FCC; should the Internet have similar requirements? Right now Phil is asking for more than what the telcos are required to report. He's in the `There's data out there; let's get it.' mode. But market growth is so rapid and unyielding that no one knows the current Internet user population, and no one knows when it will saturate, despite the few attempts at systematic surveys and plots. Furthermore, the voice of NSF in the Internet and its ISPs has diminished since the R&E community has been dwarfed by the relentless commercial market. Indeed, the R&E and other communities want to use the Internet for their business, and NSF and the FNC hear from them and others increasing feedback about degradation of service quality (RAS). Scientists on NSF-funded projects are frustrated because they have money to buy network connectivity but not to procure private networks to all their collaborators. They are seeking ways to evaluate user satisfaction and translate it into purchasing decisions.

Based on these factors and the discussion thus far, Luker suggested that we should move ahead to measure RAS, and NSF solicits recommendations on how to foster this process. The New Connections program that NSF will announce next month (ed: it was announced as scheduled) focuses on supporting applications and campus networks with service not available through the commodity Internet, e.g., via the vBNS and other special portions of the Internet. The program presumes a new standard for campus networking, which will inevitably contribute to national network infrastructure. The NSF hopes the community will see the vBNS as a viable testbed for developing measurement platforms, since scientists at the vBNS-connected supercomputer centers are ready and willing to support such development. NSF is also willing to steer or reemphasize existing efforts, and to accept new proposals or more precise definitions.

Focusing on end-user-centric metrics, Braun presented some data taken from about a month of periodic probe activity from vBNS sites to selected Internet locations. The graphs indicated quite high variance in delay and throughput statistics, but are extremely difficult to codify into what we really need: end-user-centric performance metrics that network consumers can use to assess the services they receive from their ISPs.

NSF and others hope that the new NSF/NCRI Connections announcement will push in this direction, with an emphasis on users and their applications. The challenge is that end user service is currently non-trivial to measure in the Internet environment. For example, diagnostic tools to measure throughput are often inherently aggressive to the network; tools to measure delay often receive differential treatment by routers, so they will not reflect actual performance. The ability to assess a reasonable approximation of end-to-end network performance in a non-invasive fashion is essential.

The group did not want the workshop to end without specifying an appropriate forum to define, collect, and publish metrics, and to build baselines. Mathis felt IPPM was an appropriate place to develop the metrics, but what we needed was a measurement infrastructure to evaluate them and compare various parts of the Internet. This latter effort required the focus of a single organization to specify and deploy probe architectures at strategic exchange points. Unlike trace gathering or workload profiling agents, probe machines do not require promiscuous sniffing, so deployment should be less controversial. Bradner, as the IETF Operational Requirements area director for the IPPM working group, noted that the IPPM charter focused on developing and explaining performance criteria, terminology, and measurement methodologies (and perhaps to a lesser extent tools), but not on ad hoc consumer ratings studies. Doran felt that the appropriate medium was email...

Bob Moskowitz asked if there were a firm proposal on the table for probe machines and people to which he could contribute resources. He wanted to see 50 probe machines and 3 people to start immediately. Yakov Rekhter suggested starting with `RAS machines' at the three NAPs. Elise Gerich of Merit announced that they were already planning to deploy Vern Paxson's npd software at the route servers, and volunteered Merit to do the post-processing, with advice from Paxson on extracting meaningful information from the data.

Ferguson voiced his concern that we were still not addressing longer term technology issues that have already been ignored for several years. He could not fathom what fulfillment we would get from putting a bunch of probe machines out there and analyzing a bunch of data to come to the solid conclusion: "Yep, performance really sucks alright." What do we do once we've discovered the problems? Where's the engineering methodology that the measurement is supposed to support? Where's the router technology to support the growth we already sort of measure now? How are these measurements going to help us when the router industry is way more than 2 years away from developing the OC-192c routers that ISPs will need in 2 years?


Epilogue (compiled post-mortem reflections)

What was the most painful part?

This workshop attempted to gather policy, engineering, research, and user community representatives, who often found themselves frustrated at not being able to communicate with each other. The technical side seemed to find that the most useful aspect of the workshop was meeting with others who were doing measurement and analysis, and establishing a community interested in facilitating greater understanding of network statistics. But for them, unfortunately, many of the policy discussions, such as those on how to convince ISPs to provide or recognize useful data, were not useful.

Where was the split?

The two somewhat orthogonal categories of measurements did not contribute to common ground. Engineers were eager to find out what kind of monitoring and statistical research the Internet needs and how the ISPs can make it happen, but from the federal and user communities there was a stronger interest in how to validate or qualify service (RAS) from ISPs, and what sorts of statistics further that goal. Since many powerful user communities in the U.S. are fervently appealing to federal agencies (NSF) and advisory committees (FNCAC) for a higher level of Internet reliability, NSF wants to offer help coordinating or facilitating a path to get there, before other federal agencies start to take notice and action. Unfortunately, ISPs find RAS metrics at least unrealistic and more likely just plain silly, given that neither the current Internet architecture nor its underlying technology is conducive to a service that ISPs are in a position to control with `phone quality' robustness. (`Build us faster routers and smarter routing and management protocols, and then we can talk report cards. In the meantime just let us try to keep up with your demand.')

What was sort of agreed upon?

While the workshop provided a(nother) forum for finding consensus among Internet engineers and operators that imminent scalability and service quality problems loom ominously ahead, the consensus did not extend to how to handle the overload. Some on the technical side advocated the solution of just adding capacity as quickly as possible and continuing along the current rapid growth path. Others felt that a realistic economic model for the Internet, with monetary incentive to conserve resources, would eventually be essential, better sooner than later. Developing either capacity planning or pricing models would benefit from more sophisticated workload and performance measurements, and ISPs and equipment vendors would both benefit from an infrastructure for such measurements.

Why will it be hard?

There was skepticism that any measurements that might occur in the short to medium term at exchange points would lead to useful information for ISPs, either because useful data is too hard to collect and analyze, or because ISPs are averse to its collection due to customer privacy concerns. (The two factors are not unrelated: ISPs have been unsure about the legal implications as well as the benefit of data collection, and so have not put pressure on their equipment suppliers to support functionality that they now wish they had.)

For example, it seems clear that the router vendors are in no position to support the collection of traffic matrices, despite the fact that all ISPs vehemently agree that aggregate traffic matrices are crucial to backbone topology engineering. In addition to allowing the discovery of mistraffic, e.g., route leakage or a customer accidentally sending huge amounts of unintended traffic into the core, traffic matrices combined with route flap data are essential to an ISP's ability to communicate problems to peer ISPs when necessary. Backbone engineers consider traffic matrix data significantly more important than flow data for short to medium term engineering, and it may be essential to the investigation of Big Internet issues (routing, addressing) as well.

Notably, although the telcos have long measured traffic matrices for phone network engineering, ISPs have faced technical, legal, and resource limitations as obstacles to collecting, much less sharing, such measurements. Collecting packet headers, though essential for researchers to develop realistic models and analysis techniques, is even more technically and logistically problematic.

Areas to focus resources

The workshop did accomplish what NSF had intended: it provided a forum for a diverse set of people to express their opinions about this controversial topic, identify and stimulate thought on critical issues, and plan some next steps. In particular, the following infrastructural gaps were identified and discussed; all are conducive to concerted influence toward closing them, and both industry and federal funding agencies are in positions to exert complementary influence.

  • high end routers
    First, router vendors. All the large ISPs have mentioned their vital need for high end routers, and have in fact needed high end routers since the NSFNET backbone project. (One might consider the NSFNET impact as detrimental here: it kept the largest T3 provider at the time from needing a commercially available high end router, because they had an RS/6000-based router from project partner IBM, which unfortunately never followed through with it as a product line.) Yet the total market for such technology is small, the competition nonexistent, and the cost of entering the market prohibitive. Cisco far and away dominates all other router market segments for the foreseeable future. As the market continues to race ahead of the technology, NSF is no longer funding a project that involves router vendors, ISPs, and a realistic market. (The vBNS faces issues vastly different from those of the commercial ISP networks in crisis: huge routing tables, huge clouds, lack of strong cooperation among competitive ISPs, etc.)

  • traffic characterization and engineering
    Barely able to keep up with aggregate traffic loads, ISPs have little opportunity to investigate characteristics of their workloads that they might leverage to improve their network capacity. At the same time, much of the academic research in traffic characterization over the last decade is too far removed from the operational problems to benefit an ISP, often because the assumed models are not applicable to an ISP's traffic workloads, or because the analyses do not yield statistics that can guide the decisions ISPs currently need to make. More promising theoretical and empirical analyses have recently emerged, but they crucially rely on realistic and fine-grained traffic measurements to refine them toward utility for ISPs. NSF and other funding agencies can invest in medium to long term research projects that leverage the implications of this more recent theoretical and empirical work, as well as shorter term research into an approximate traffic engineering methodology that ISPs can make use of today. NSF could also fill a pressing need by developing a reliable and secure mechanism for researchers to obtain accurate traffic measurements without compromising the integrity of an ISP's operation.

  • sanction/sponsor a public measurement infrastructure
    A strong contingent of the workshop considered the ability to assess and verify the service one receives from one's ISP critical to healthy competition in the industry. (Although we note that representatives from some large ISPs disagreed, assuring us that there was already more demand than they all knew what to do with.) Representatives of large consumer groups, as well as some researchers, strongly felt the need for and utility of a public measurement infrastructure from which to do end-to-end service assessments as well as workload profiling from otherwise inaccessible strategic Internet locations (a minimal probing sketch follows this list). The recent and continued proliferation of private exchange points renders the cost of deploying such platforms high, but embedding them selectively in the infrastructure would enable a wide range of studies that are currently impossible. Remote management and intelligent analysis and use of the resulting information will cost more than initially deploying the equipment, but several researchers were eager for a chance to demonstrate to business the utility of the resulting statistics, thus justifying their high cost, i.e., how participation would benefit their infrastructure, and ultimately their profits. If architected soundly, a public measurement and statistics infrastructure for investigating workload and performance metrics in a commercially decentralized Internet would benefit all four communities: users, trying to ascertain the quality of their service; vendors, trying to improve their products; researchers, trying to develop more accurate traffic models; and ISPs, trying to diagnose and repair problems with their own and other infrastructures, while at the same time coping with the unabating demand for their service.

  • stronger mechanisms for ISP coordination and data exchange
    Although everyone felt that we should seek out measurement infrastructure and sources of statistics in the commercially decentralized Internet, there was definite dissonance as to which measurements would help, and who should have access to them. While the public measurement infrastructure described above would help researchers and end users, the ISPs would benefit more from the ability to collect statistics that are too sensitive to release publicly, and perhaps from comparing them, within a closed ISP consortium, to corresponding statistics from other ISPs. Examples of such statistics are matrices of traffic flow among autonomous systems, routing status and stability, unusually configured routing, more specific routes announced in the presence of less specific routes, and the unicast topology underneath the mbone tunnel topology. The consortium could also be a forum for reaching consensus on and developing tools to measure new metrics, e.g., the number of reachable destinations covered by a route (a sketch of two such metrics follows this list). It would have a stronger and more active, although complementary, focus than NANOG and the IEPG. ISPs will be more likely to participate in limited data exchange within the closed consortium if a neutral, independent, well-trusted body coordinates it, including offering services and tools to encrypt sensitive data, process log files, visualize large quantities of data, and provide interactive, access-controlled, customizable reports for consortium members. Most importantly, the consortium would serve as a vehicle for ISPs to repair some of the fragmentation they suffer from today, while retaining a still healthy competitive atmosphere. If achieved, a cooperative consortium is the most likely hope for forestalling regulation-minded agencies from taking greater notice of, and action in, the Internet.
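
    As an illustration of the kind of metric tools such a consortium might standardize, here is a minimal sketch (in Python, not from the workshop) of two of the measures mentioned above: the number of destinations covered by a route prefix, and detection of more specific routes announced alongside a covering less specific route. The toy prefix list is hypothetical; real input would be a routing table dump.

        # Minimal sketch of two routing metrics discussed above; the prefix list is
        # a toy example, and real input would come from a routing table dump.
        import ipaddress

        def addresses_covered(prefix):
            """Number of destination addresses covered by a route prefix."""
            return ipaddress.ip_network(prefix).num_addresses

        def more_specifics(prefixes):
            """Yield (specific, covering) pairs where a more specific route is
            announced alongside a less specific route that already covers it."""
            nets = [ipaddress.ip_network(p) for p in prefixes]
            for net in nets:
                for other in nets:
                    if net != other and net.subnet_of(other):
                        yield (str(net), str(other))

        if __name__ == "__main__":
            table = ["192.0.2.0/24", "192.0.2.128/25", "198.51.100.0/24"]
            for prefix in table:
                print("%s covers %d addresses" % (prefix, addresses_covered(prefix)))
            for specific, covering in more_specifics(table):
                print("%s is a more specific of %s" % (specific, covering))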
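
    Similarly, as a gesture toward the public measurement infrastructure discussed above, the following is a minimal sketch (again Python, not from the workshop) of an active end-to-end probe that records TCP connection setup time to a set of targets. The target names are placeholders; a real platform would probe agreed-upon hosts on a schedule and archive the results for later analysis.

        # Minimal sketch of an active end-to-end probe: measure TCP connection setup
        # time to a list of targets. The target names are placeholders.
        import socket
        import time

        def connect_time(host, port=80, timeout=5.0):
            """Return TCP connect latency in seconds, or None if the host is unreachable."""
            start = time.monotonic()
            try:
                with socket.create_connection((host, port), timeout=timeout):
                    return time.monotonic() - start
            except OSError:
                return None

        if __name__ == "__main__":
            for target in ["www.example.com", "www.example.net"]:
                latency = connect_time(target)
                status = "%.1f ms" % (latency * 1000) if latency is not None else "unreachable"
                print("%s: %s" % (target, status))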

    Infrastructural gaps: where to invest attention and resources
  • encourage development of more powerful routers for core Internet components, a prohibitively expensive endeavor with too small a potential market and thus too little return to motivate vendors to pursue independently
  • foster short term research into basic traffic engineering methodologies given limited data, and longer term research into the implications of realistic theoretical and empirical traffic characterization
  • endorse and sponsor a public measurement infrastructure
  • facilitate the development of an ISP consortium for coordination and limited, secure data sharing


    Acknowledgments

    This report benefitted greatly from several people who took minutes at the workshop, in particular George Clapp and Mark Garrett. The input and feedback from Daniel McRobb, Vern Paxson, Craig Partridge, and Tracie Monk also helped greatly. And of course thanks to all the participants, whose contributions are the substance of this report.
    (I think that covers everyone).


    Workshop followup via the isma@caida.org mailing list
    kc@caida.org