by Tracie Monk (DynCorp) and K C. Claffy (NLANR)
2. The Demand for Metrics
In this paper, we present a survey of the current activities in Internet performance and workload measurement. The environment is dynamic enough that we can only provide an indicator of the growing activity and interest surrounding this important topic; we apologize to anyone omitted. Through the hyperlinks and other references, we hope readers will be able to follow up directly with the various researchers and organizations who have (or plan) initiatives relating to the collection, analysis and/or presentation of Internet statistics.
We also outline in this paper a recommendation for the establishment of a `provider consortium' to facilitate cooperation. The National Laboratory for Applied Networking Research (NLANR) is working to develop a framework for such a collaboration, and believes that it could serve as an appropriate forum for...
Transition from the NSFnet Backbone Service:
Throughout the lifetime of the NSFnet Backbone Service, Federal agencies and various user communities were engaged in the development, operation and ultimate transition of the Internet to the commercial sector. The involvement of many of these groups was channeled through the Federal Networking Council (FNC) and its Advisory Committee (FNCAC), which consists of representatives from the higher education and research communities, service providers and industry. Following the April meeting of the FNCAC, its members distributed a set of recommendations calling upon government to...
These recommendations resulted from concerns expressed by the members' communities (such as the High Energy Physics community) about the current state of the Internet as well as briefings from researchers such as Hans-Werner Braun (Teledesic - previously NLANR), K Claffy (NLANR), and Mark Garrett (Bellcore). The most recent presentation before this body concerned results from statistics gathering efforts over the FIX-West facility and the conclusions of the Internet Statistics and Metrics Analysis workshop. The results of this workshop and other statistics / metrics activities are discussed below.
In February 1996, NLANR and Bellcore hosted an NSF-supported workshop entitled Internet Statistics and Metrics Analysis (ISMA). One outcome of this workshop was the articulated need for smoother coordination and information exchange among Internet service providers -- both providers of Internet access and traffic exchange services.
Although most participants at the workshop felt that the Internet development and provider community should seek out measurement infrastructure and sources of statistics in the commercially decentralized Internet, there was clear disagreement as to which measurements would help, and who should have access to them. While a public infrastructure for end-to-end performance measurements could help researchers and end users study the infrastructure, participants concurred that Internet service providers (ISPs) themselves would be the greatest beneficiaries of an enhanced statistics capability. Opinions varied as to the sensitivity of some data and how much could be released publicly. But there appeared to be consensus supporting the need for desensitized versions of such statistics to be made available in a neutral forum. Such a neutral forum could also facilitate more comprehensive collaboration, consensus-building, and the development of tools to measure new metrics that are important to stability, service, efficient resource allocation, and more economically viable network usage pricing policies.
The Internet is still relatively devoid of pricing models or other mechanisms to allocate and prioritize scarce resources -- particularly bandwidth. Enhanced network functionalities such as multiple qualities of service and bandwidth reservation systems such as Resource Reservation Protocol (RSVP) will require some mechanisms for accounting for network usage at a finer granularity than is generally deployed, or even possible, within many operational Internet components today. They will also require basic statistics to facilitate accounting and the development of economic models.
In his recent Internet Draft on Metrics for Internet Settlements, Brian Carpenter (CERN) asserts that financial settlements are a critical mechanism for exerting pressure on providers to strengthen their infrastructures. He writes:
He suggests that metrics used in Internet settlements
...should preferably be estimated, if necessary, by statistical sampling.
...should be symmetric in nature so that the resulting settlements can be associative and commutative, thereby allowing settlements to be aggregated by upstream service providers, and redistributed by transit carriers, in an equitable manner.
Carpenter further suggests that each metric used in the settlement agreement should define or refer to a measurement method and should specify a settlement rate and currency.
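As an illustration only, the elements listed above (a measurement method, a settlement rate, and a currency per metric) can be sketched as a small record type. The class name, field names, rate value, and the "sender pays" netting rule are assumptions made for this sketch; they are not taken from Carpenter's draft.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class SettlementMetric:
    """One metric in a hypothetical settlement agreement."""
    name: str                # e.g. "total_traffic_bytes"
    measurement_method: str  # how the value is measured, or a reference to a method
    rate: Decimal            # settlement rate per unit of the metric
    currency: str            # currency in which the rate is expressed


def amount_due(metric: SettlementMetric, measured_units: int) -> Decimal:
    """Amount owed for `measured_units` of the metric at the agreed rate."""
    return metric.rate * measured_units


def net_settlement(metric: SettlementMetric,
                   units_a_to_b: int, units_b_to_a: int) -> Decimal:
    """Net amount provider A owes provider B (negative: B owes A).

    Assumes, for illustration, that the sending provider pays. The same rate
    applies in both directions, so swapping the providers only flips the sign.
    """
    return amount_due(metric, units_a_to_b) - amount_due(metric, units_b_to_a)


def aggregate(amounts: list[Decimal]) -> Decimal:
    """Sum downstream settlements; exact Decimal addition is order-independent."""
    return sum(amounts, Decimal(0))
```

Using exact decimal arithmetic means the aggregated total does not depend on the order or grouping in which an upstream provider combines the individual settlements, which is the property the symmetry requirement is meant to preserve.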
Factors which continue to inhibit implementing settlements include the lack of a common understanding of the business mechanics of inter-ISP relations. Some suggest that the ITU / telco settlements model may have relevance to ISP settlements. Others suggest that the connectionless nature of the IP protocol demands that entirely new pricing models be developed. For a good introduction to discussions on Internet economics, see Hal Varian's web site on The Information Economy.
The higher education and research communities are among the more vocal Internet users. Being among the first to find the Internet mission critical, they are quite chagrined with the current state of congestion -- and with the fact that the Internet (given its design) offers only best-effort guarantees. They claim that service quality over recent months has been significantly degraded due to the growth in web activity, the proliferation of users (see Mark Kosters' May 1996 presentation) and the lack of cooperation among commercial providers / users that characterizes the current Internet environment.
In a series of meetings over the last year, representatives from EDUCOM, FARNET, and related institutions have discussed the communications requirements of their communities and shared their concerns about the ability of the Internet to meet their future needs. A paper by Doug Gale and the Monterey Futures Group concludes "that by the year 2000, higher education will require an advanced internetworking fabric with the capacity to:
Similar demands are beginning to be made by other user groups who now view the Internet as the appropriate medium for their future communication needs. The automotive industry, for example, adopted TCP/IP in 1995 as a standard for data communications among its thousands of trading partners. Through the Automotive Industry Action Group and its Telecommunications Working Group, the major auto manufacturers are working to set forth necessary requirements for the quality of service necessary to satisfy their industry's needs. Specific areas being examined include specifications related to...
The AIAG has identified several metrics which it views as critical to this initiative and to future monitoring efforts. These include Performance Metrics such as latency, packet/cell loss, link utilization, throughput, and BGP4 configuration and peering arrangements; as well as Reliability Metrics such as physical route diversity; routing protocol convergence times; disaster recovery plans; backbone, exchange point and access circuit availability; and speed of failed customer premise equipment replacement.
As more user groups (e.g., the financial sector, energy industry, and others) move toward the Internet as their preferred communications vehicle, we are likely to witness increasing pressure on providers and others to collect, collate, analyze, and share data related to Internet (and provider) performance.
Many Internet service providers currently collect basic statistics on their own network's performance and traffic flows. Typically this includes measurement of throughput, delay, and availability. In the era of the post-NSFnet Backbone Service, the only baseline against which networks evaluate performance is their own past performance. No data are available for national-level comparisons or for comparisons with other networks' performance. Increasingly, what both users and providers need is information on end-to-end performance -- information which is beyond the realm of what is controllable by individual networks.
Another example of statistics maintained in isolation within an individual ISP is trouble ticket tracking of problems that originate and are resolved within the context of that ISP itself. Throughout most of the lifetime of the NSFnet backbone, resolving route instabilities and other trouble tickets was the responsibility of Merit (under its cooperative agreement with NSF). In the current environment there is no such entity to claim or share responsibility for national-level management of the Internet. As a result, there are no scalable mechanisms available for resolving or tracking problems originating or extending beyond the control of an individual network.
Route instability is another area that can have a direct, sometimes profound, effect upon the performance of individual networks. Some networks are seeking to improve the stability of their routing by peering directly with the routing arbiter (RA) at network access points (e.g., SprintNAP and FIX-West/MAE-West). In the context of the routing arbiter project, Merit/ISI have also developed statistics, including those on route flapping and inappropriate announcement, that represent a macroscopic characterization of routing stability and identify trouble areas for the networks with which they peer. However, these efforts are still in their nascent stages and do not yet have sufficient buy-in or support from commercial players to make them a fundamental component of the Internet architecture.
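To make the notion of a route-flap statistic concrete, the following is a minimal sketch that counts flaps per prefix from a log of routing events. The event format (timestamp, prefix, "A" for announce or "W" for withdraw) and the definition of a flap as a withdrawal followed by a re-announcement are assumptions for illustration; they do not describe the routing arbiter's actual data or method.

```python
from collections import Counter


def count_route_flaps(events):
    """Count flaps per prefix.

    `events` is an iterable of (timestamp, prefix, kind) tuples, where kind is
    "A" (announce) or "W" (withdraw). A flap is counted each time a prefix
    that is currently withdrawn is announced again.
    """
    withdrawn = set()
    flaps = Counter()
    for _timestamp, prefix, kind in sorted(events):
        if kind == "W":
            withdrawn.add(prefix)
        elif kind == "A" and prefix in withdrawn:
            withdrawn.discard(prefix)
            flaps[prefix] += 1
    return flaps


if __name__ == "__main__":
    log = [
        (1, "192.0.2.0/24", "A"),
        (2, "192.0.2.0/24", "W"),
        (3, "192.0.2.0/24", "A"),
        (4, "198.51.100.0/24", "A"),
    ]
    for prefix, n in count_route_flaps(log).items():
        print(prefix, n)
```

A prefix that is announced once and never withdrawn contributes no flaps, so stable routes do not appear in the result.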
The vacuum created in national-level statistics/metrics collection which followed the transition to the commercial architecture has also complicated planning by national service providers and others. While detailed traffic and performance measurements are essential to identifying the causes of network problems and formulating corrective actions, it is trend analysis and accurate network/systems monitoring which permit network managers to identify "hot spots" (overloaded paths), predict problems before they occur and identify ways to avoid them by efficient deployment of resources and optimization of network configuration. As the nation and world become increasingly dependent on the National and Global Information Infrastructures (NII/GII), it is critical that mechanisms be established to re-enable infrastructure-wide planning and analysis.
The importance of measuring Internet metrics, particularly those related to performance, has been discussed most recently at the ISMA workshop (Feb. 1996); meetings of the FNCAC and Educom/Farnet (both in April 1996); and at a BOF following the May 1996 NANOG meeting. Future events, government-related activities, and various QoS-related efforts are discussed below.
IETF / INet & Related Sessions:
IPPM BOF (June 25, 1996) - The IP Provider Metrics (IPPM) group, derived from the Benchmark Methodology Working Group (BMWG) of the Operational Requirements (OR) Area of the IETF, is comprised of researchers and service providers interested in defining basic metric terms and a formal structure for defining new metrics and measurement methodologies in order to develop standardized performance evaluations across different Internet components, particularly ``IP clouds''. Such evaluations, according to Vern Paxson, can assist by:
PIARA: There will also be a BOF at the Montreal IETF on Pricing for Internet Addresses and Route Assignments, which will discuss Brian Carpenter's Internet Settlements RFC (discussed above), as well as Rekhter, Resnick and Bellovin's draft on charging for routing and addressing. Contact Allison Mankin (ISI) for details at firstname.lastname@example.org.
During INet '96's Network Measurement and Metrics Session on June 26, 1996, papers will be presented by Steve Corbato on Backbone Performance Analysis Techniques; Matt Mathis on Diagnosing Internet Congestion with a Transport Layer Performance Tool; and Vern Paxson on Towards a Framework for Defining Internet Performance Metrics. Guy Almes will chair this session.
Internet statistics / metrics are of great importance to FNC agencies, as well as to the broader Internet provider / user communities (see the FNC paper presented at the ISMA workshop). In May 1996, the FNC chartered an ad hoc working group with the explicit purpose of sharing information on existing and planned measurement activities and enhancing collaboration among various Federal / Federally-sponsored statistics and metrics initiatives. These initiatives include projects to develop and deploy statistics / metrics tools and analysis techniques and workshops and other efforts to improve the community's understanding of emerging technologies/applications and to disseminate relevant results. Current participants include: ARL, DARPA, DOE/SLAC, Kansas University, NCCOSC (Naval Command, Control and Ocean Surveillance Center), NLANR, and NSF. Immediate
International collaborations related to Internet statistics and metrics will also be discussed at the annual meeting of the Coordinating Committee for Intercontinental Research Networking (CCIRN) in Montreal, Canada on June 29, 1996.
Quality of Service (QoS) Efforts:
Demands for implementing QoS levels for Internet offerings are increasing. From the providers' standpoint, such offerings will increase revenue by allowing them to offer business-quality services to users. From the users' perspective, the ability to contract for higher QoS levels will enable many industries to switch from intranets and private networks to the Internet, or some variation thereof. The emerging QoS requirements of users -- most notably the higher education and research communities and automotive industry -- are addressed above. In addition to these user-driven demands for QoS, similar needs are being expressed by providers and (internationally) by regulators.
The Commercial Internet eXchange (CIX) has recently formed a QoS Working Group co-chaired by Bob Collet (Teleglobe & Chairman of CIX) and Barry Raveendran Greene (Singapore Telecom). The group is sponsoring a workshop entitled Quality of Service Metrics: Remaining Competitive on the afternoon of June 24, 1996, in conjunction with the CIX's annual meeting. In addition, the CIX is considering plans to initiate a survey of business-related metrics among its members this summer.
On the international front, user groups are demanding that current U.S.-centric pricing models be made more equitable. Current policies tend to saddle foreign Internet users with a disproportionate share of the cost of intercontinental circuitry. Recent meetings of Europe's DANTE organization have discussed this issue in some detail. Others involved in related discussions suggest that revision of intercontinental pricing models may occur concurrently with deployment of advanced services, such as RSVP, and improvements in traffic monitoring.
Foreign regulatory organizations are also increasingly interested in metrics and related measurements. Singapore Telecom, for example, requires Singapore's three ISPs to report quarterly QoS statistics. Primary business metrics include network availability and system accessibility (dial-up access, leased-line access, international connectivity). Secondary indicators associated with service activation time include dial-up access and leased-line access. Singapore Telecom also monitors customer support metrics such as the number of telephone inquiries, the number of inquiries via Internet e-mail, and the number of customer complaints per 1,000 subscribers. As measurement tools become more widely available, Singapore Telecom anticipates monitoring actual performance of these ISPs.
In a related area, MCI announced two new alliances in June 1996 which it says are aimed (in part) at improving its ability to offer QoS over the net. The Concert alliance with British Telecom is proclaimed to offer the first-ever Internet service performance guarantees on a global scale. Through a separate alliance with Intel entitled WebMaker, MCI plans to utilize emerging Internet standards such as IP Multicast, RSVP and Real-Time Transport Protocol (RTP) to develop new QoS levels that support multicasting and bandwidth reservation across the Internet.
New IETF RFCs are also emerging suggesting alternatives to improve QoS. B. Rajagopalan and R. Nair's RFC on issues related to Quality of Service (QoS) - Based Routing in the Internet presents some potential requirements on path computation, efficiency, robustness and scalability, and describes some issues in realizing a QoS-based routing architecture. Other QoS RFCs which have recently been released include one by Fred Baker (Cisco), Roch Guerin (IBM), and Dilip Kandlur (IBM) entitled Specification of Committed Rate Quality of Service and one by Peter Kim (Hewlett Packard Laboratories Bristol) on Link Level Resource Management Protocol (LLRMP) Protocol Specification. These RFCs and related QoS issues will be discussed at the IETF / INet '96 meetings.
IP & Routing:
The table below provides an overview of the types of metrics which are currently desired related to IP traffic and routing. The relevance of these metrics to future financial settlements and to analyzing network performance is included -- ranging from low (minimal) relevance to high relevance. The table also indicates the tools currently available -- or yet to be developed -- related to gathering statistics on each metric.
NLANR maintains a repository of links to operational statistics data from research sites, ISPs and the NAPs, at http://oceana.nlanr.net/INFO/oldindex.html (Expired Link). NLANR will continue to support workload characterization at the FIX-West exchange point (http://www.nlanr.net/NA/) until the gigaswitch is installed this summer.
Excellent sources of information on IP tools are also available from ESnet's Network Monitoring Task Force and from Merit. Increasingly tools like the network time protocol (NTP) daemon are being deployed by NSPs and by NAPs which should facilitate the collection of certain statistics. We discuss emerging tools below.
|Type||Applicable where||Relevance to Internet Settlements||Relevance to Analysis of Network Performance||Measurement Tools|
|- Access Capacity (bit/sec)||CC charge for bit rate; equipment cost depends on bit rate||a priori|
|- Connect Time ||CC charge for connect
|- Total Traffic (bytes)||transit traffic settlement between ISPs||router or access server stats; tcpdump sampling, RTFM meters; etc.|
|- Peak traffic (bit/sec sustained for n sec.)||ISP/NSP overbooks trunks||router or access server stats; tcpdump sampling, RTFM meters; etc.|
|- Announced Routes (#)||at peering/exchange points & connection of subscribers to multiple subnets||TBD, analysis of routing tables, i.e. netaxs|
|- Route Flaps (#)||at peering/exchange points & connection of multihomed networks||TBD (currently available if peering through RA)|
|- Stability, e.g. route uptime/downtime, route transitions||at peering/exchange points and connection of multihomed networks||(currently available if peering through RA)|
|- Presence of more specific routes with less specific routes||at peering/exchange points and connection of multihomed networks||TBD|
|- Number of reachable destinations (not just IP addresses) covered by a route||at peering/exchange points & connection of multihomed networks||TBD|
|- Delay (milliseconds)||Individual networks||
|- Flow Capacity (bits/sec)||everywhere (networks, routers, exchange points)||TBD, treno|
|- Mean Packet Loss Rate (%)||everywhere||TBD, ping,|
|- Mean RTT (sec)||everywhere||TBD, ping,|
|- HOP Counts/Congestion||everywhere||
|- Flow characteristics||exchange points, multihomed networks||Reporting by ISPs|
|- Network outage information (remote host unreachable)||Individual networks||Reporting by ISPs|
|- AS x AS matrices||Individual networks||Reporting by ISPs|
|- Information Source||connection of service provider (DNS or RR server); content provider (web server); info replicator (MBONE router & caches)||router or access server stats; tcpdump sampling, RTFM meters; etc.|
|- MBONE||Internet Infrastructure||TBD, some public tools|
|- Information caching hierarchy||Internet Infrastructure/individual caches||TBD, public tools|
Notes: CC - common carrier, ISP - internet service provider, NSP - national service provider, TBD - to be determined.
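Several of the metrics in the table reduce to simple arithmetic once raw samples are in hand. The sketch below shows one possible reading of three of them. The function names are hypothetical, and interpreting "bit/sec sustained for n sec." as the highest average rate over any window of n consecutive one-second samples is an assumption, not a definition taken from the table.

```python
from statistics import mean


def mean_packet_loss_rate(packets_sent, packets_received):
    """Packet loss as a percentage of packets sent."""
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")
    return 100.0 * (packets_sent - packets_received) / packets_sent


def mean_rtt(rtt_samples_sec):
    """Arithmetic mean of round-trip time samples, in seconds."""
    return mean(rtt_samples_sec)


def peak_sustained_rate(bits_per_second, n):
    """Highest average rate (bit/sec) over any n consecutive one-second samples."""
    if n <= 0 or n > len(bits_per_second):
        raise ValueError("n must be between 1 and the number of samples")
    window = sum(bits_per_second[:n])
    best = window
    # Slide the window one sample at a time, updating the running sum.
    for i in range(n, len(bits_per_second)):
        window += bits_per_second[i] - bits_per_second[i - n]
        best = max(best, window)
    return best / n
```

Loss and RTT samples of this kind could come from a tool such as ping, and the per-second rates from router or access server counters, as the Measurement Tools column suggests.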
Privacy and security considerations must be addressed during the measurement of any metrics related to Internet traffic flows. tcpdpriv is one program which addresses these issues; additional programs may need to be developed/deployed as measurements become more commonplace.
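One common way to reduce the sensitivity of traffic traces is to replace each address with a stable pseudonym before the data leave the collecting site. The sketch below uses a keyed hash for this purpose. It is an illustrative assumption and does not describe how tcpdpriv works; the function name and key handling are hypothetical.

```python
import hashlib
import hmac
import ipaddress


def pseudonymize_ipv4(address: str, key: bytes) -> str:
    """Map an IPv4 address to a stable pseudonym using a keyed hash (HMAC-SHA256).

    The same address and key always yield the same pseudonym, so flows can
    still be correlated within a trace. Network prefix structure is NOT
    preserved, and distinct addresses can occasionally collide because the
    digest is truncated to 32 bits.
    """
    packed = ipaddress.IPv4Address(address).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    return str(ipaddress.IPv4Address(digest[:4]))


if __name__ == "__main__":
    secret = b"site-local secret"  # hypothetical key, kept private by the collector
    print(pseudonymize_ipv4("192.0.2.1", secret))
```

Keeping the key private to the collecting site is what prevents a recipient from recomputing the mapping for addresses of interest.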
Many of the metrics above are inherently problematic for the Internet infrastructure, and still require critical research. At INet '96, Vern Paxson's paper on Towards a Framework for Defining Internet Performance Metrics will provide a cogent discussion of the need for new tools for passive and active measurement of network traffic. As these tools are developed, particular attention needs to be devoted to privacy considerations (particularly with passive measurement tools) and to designing tools which are minimally invasive (particularly on active measurements where the tools themselves can potentially disrupt or distort traffic patterns). Additional attention should also be devoted to the problem of routers dropping packets and how to measure this phenomenon.
A list of some emerging tools / initiatives which should facilitate IP analysis is provided below:
NetSCARF (Bill Norton, Merit) New project to develop a public domain network statistics collection and reporting facility that will evolve to meet the needs of ISPs. Funded by the Resource Allocation Committee, the first software release is expected in July 1996. Software will collect statistics necessary for individual networks to display graphs on total number of bits, system up time, and interfaces links. Current plans are to collect at 15-minute intervals, and display results by individual network. No near-term aggregation of data from multiple networks is expected. For information, contact Bill Norton (email@example.com) at Merit.
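As a rough illustration of what collecting total bits at 15-minute intervals involves, the sketch below converts successive readings of a cumulative interface octet counter into per-interval bit counts and average rates. This is not NetSCARF code; the 32-bit counter width, the function names, and the assumption that the counter wraps at most once per interval are all hypothetical.

```python
COUNTER_MODULUS = 2 ** 32  # assumed width of the cumulative octet counter
INTERVAL_SECONDS = 15 * 60  # 15-minute collection interval


def bits_per_interval(octet_counter_readings):
    """Bits transferred between consecutive readings of a cumulative counter.

    Modular subtraction handles a single counter wrap between two readings;
    more than one wrap per interval cannot be detected this way.
    """
    deltas = []
    pairs = zip(octet_counter_readings, octet_counter_readings[1:])
    for previous, current in pairs:
        octets = (current - previous) % COUNTER_MODULUS
        deltas.append(octets * 8)
    return deltas


def average_rate_bps(bits, interval_seconds=INTERVAL_SECONDS):
    """Average rate in bit/sec over one collection interval."""
    return bits / interval_seconds
```

On a fast link a 32-bit octet counter can wrap more than once in 15 minutes, in which case shorter polling intervals or wider counters would be needed.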
ATM performance is receiving increasing attention. Kansas University is planning a workshop on this topic on June 19-20, 1996. The event is hosted by Kansas University (contact Victor Foster at firstname.lastname@example.org), with sponsorship by DARPA and NCCOSC. For an excellent overview of the subject, see Don Endicott's slides (Expired Link) from this meeting.
Kansas University has also worked with the Department of Defense and Sprint to develop the Netspec tool for characterizing a range of interacting and independent loads on the ACTS ATM Internetwork (AAI) project's network. The Netspec web site also discusses the performance of this tool compared to other commonly used network performance testing tools, such as NetPerf, nettest, and ttcp.
A new RFC by Steven Berson (ISI) has just been released which provides guidelines for using ATM virtual circuits (VCs) with QoS as part of an Integrated Services Internet. Other groups working on ATM tools include the ATM Forum and Bellcore.
A limited exchange of statistics should have direct payback for ISPs in achieving their goals such as configuring and managing their networks, and even greater payback to the Internet community by enabling analyses which can strengthen the overall information infrastructure.
Market Pressures upon ISPs to participate in such a consortium concept include:
Business Constraints hindering such cooperation include:
Technology Constraints hindering the collection and analysis of Internet metrics include the facts that:
Despite these challenges, the requirements for data collection, analysis, and distribution exist and will have to be addressed at some point in the near future. Collaboration toward this end is critical.
Developing an effective provider consortium will require (minimally):