Cooperation in Internet Data Acquisition and Analysis


Presented at Coordination and Administration of the Internet
Cambridge, MA - September 8-10, 1996

by Tracie Monk (DynCorp)* and k claffy (nlanr)*

http://www.tomco.net/~tmonk/cooperation.html (Expired Link)

* The opinions in this paper are those of the authors and do not necessarily reflect the views of
any organizations with which the authors are affiliated.

Introduction

The Internet is emerging from a sheltered adolescence, growing exponentially, full of potential and promise, but still relatively ignorant of the real world. It now faces a crossroads. Citizens, corporations and governments are waking to the opportunities presented by a truly connected global economy, and reexamining fundamental principles of intellectual property law and communications in light of the realities of cyberspace. Organizational behavior and boundaries, business practices and financial systems are also adapting to the new medium. Society is at the forefront of the information revolution.

The number of North American Internet service providers pioneering this revolution now stands at over 3,000, approximately a dozen of which qualify as national backbone providers. Internationally, the number of Internet hosts has almost doubled over the year ending July 1996, reaching 12,881,000. Domains quadrupled over this period to 488,000.1/ Competition is fierce among the builders and operators of this nascent infrastructure, driven by demands for additional capacity and new customers. However, neither the industry nor the research community that developed and nurtured the early Internet is devoting significant attention to assessing current robustness or future capacity needs.

This paper has three goals. We first provide background on the current Internet architecture and describe why measurements are a key element in the development of a robust and financially successful commercial Internet. We then discuss the current state of Internet metrics analysis and steps underway within the Internet Engineering Task Force (IETF) as well as other arenas to encourage the development and deployment of Internet performance monitoring and workload characterization tools. Finally, we offer a model for a cooperative association for Internet data analysis among Internet competitors.

The Current Internet:
Requirements for Cooperation

No centralized authority oversees the development of the current commercial architecture. Providers, including traditional telcos, RBOCs, cable companies, and utilities, view one another as competitors and are therefore reluctant to coordinate their efforts. Yakov Rekhter, Internet researcher at Cisco Systems, notes that:

Despite all the diversity among the providers, the Internet-wide IP connectivity is realized
via Internet-wide distributed routing, which involves multiple providers, and thus implies
certain degree of cooperation and coordination. Therefore, we need to balance the
provider goals and objectives against the public interest of Internet-wide connectivity
and subscriber choices. Further work is needed to understand how to reach the balance.

--Yakov Rekhter, Routing in a Multi-Provider Internet

Most large providers currently collect basic statistics on the performance of their own infrastructure, typically including measurements of utilization, availability, and possibly rudimentary assessments of delay and throughput. In the era of the post-NSFnet backbone service, the only baseline against which these networks can evaluate performance is their past performance metrics. No data or even standard formats are available against which to compare performance with other networks or against some baseline. Increasingly, both users and providers need information on end-to-end performance, which is beyond the realm of what is controllable by individual networks.

The Transition

From 1986 to 1995 the National Science Foundation's NSFnet backbone served as the core of the Internet. Its decommissioning in April 1995 heralded a new era, with commercial providers assuming responsibility for extending Internet services to millions of existing and new users. At the same time, this change left the Internet community with no dependable public source of statistics on Internet traffic flows. 2/

The post-April 1995 architecture involved four new NSF-sponsored projects:

general purpose Network Access Points (NAPs) to which commercial backbone networks would connect to avert network partitioning as the NSFnet went away 3/

a routing arbiter, charged with providing routing coordination in the new NSFnet architecture and promoting stability of Internet routing in a significantly fluctuating environment, including: maintaining routing policy databases and servers at the four priority NAPs (and later at the FIX West/MAE-West facility); developing advanced routing technologies, strategies, and management tools; and working with NSP/ISPs and NAP providers to resolve routing problems at the NAPs

financial support for interconnectivity among regional networks, with declining NSF funding to support the transition and commercialization of providers serving the U.S. higher education community

continued leading edge network research, development, and services through a cooperative agreement between NSF and MCI for the very High Speed Backbone Network Services (vBNS)

In August 1996, NSF announced the next step toward full commercialization of the existing Internet, declaring the NAPs and operational services of the RA successful and commercially viable. NSF will phase out its support for what it designated as priority NAPs and for the Routing Arbiter service, and remove its stipulation that regional providers procure transit from national service providers (NSPs) who peer at each priority NAP.

This announcement comes at a time when the number of metropolitan and regional peering points is increasing, as is the number of networks that peer or want to peer. NSPs have also begun to favor direct peering arrangements with one another outside of the NAP architecture as a more economic and technically efficient means of sharing traffic between large networks. The higher education and research sectors are also shifting their attention toward a second generation architecture supported by high performance connections, and expanded use of the NSF/MCI vBNS and other federally sponsored research and education networks.

The transition to the new commercial environment, with its privately operated services and cross service provider NAP switching points, has significantly complicated statistics collection and left the Internet community without a dependable public source of statistics on Internet workloads. This most recent step in the transition, however, removes most of the government's remaining influence and increases the community's dependence on commercial providers to cooperatively manage this still fragile infrastructure. Empirical investigations of the nature of current workloads and their resource requirements, as well as how they change over time, remain a vital element in supporting the continued evolution of the Internet.

Importance of Workload Profiles for
Internet Pricing and Service Qualities

The Internet still strongly needs realistic pricing models and other mechanisms to allocate and prioritize its scarce resources, particularly bandwidth. Existing pricing practices center around leased line tariffs. As providers begin to employ multiple business models (various usage-based and fixed-price schemes), they face questions of which aspects of their service to charge for and how to measure them. One can measure bandwidth and traffic volumes in several ways, including average port utilization, router throughput, SNMP counters, and quotas of certain levels of priority traffic.4/

For example, Australia's Telstra imposes tariffs only on incoming traffic in order to encourage `Australia's content provision to a global market'.5/ Such usage-based tariffs are less common in the U.S. market, where many believe that they would stifle the utility of a growing, thriving Internet. However, this view receives increasing scrutiny, and usage-based pricing is already the norm for ISDN and other phone-based Internet services. Even some backbone ISPs now offer usage-based charging (albeit at a rough granularity), which often can decrease a customer's bill if they typically underutilize their prescribed bandwidth. Within a year pricing models in the U.S. will likely evolve into more refined and coherent methods. These models may include sample-based billing and measurement of traffic at the network provider boundaries, where individual providers apply alternative accounting techniques, e.g. measuring the source vs. destination of flows. 6/

U.S. providers are among the first to acknowledge the need for mechanisms to support more rational cost recovery, e.g., accurate accountability for resources consumed. The absence of these economic measures is troublesome given the ill preparedness of the U.S. Internet architecture and providers to deal with a large aggregation of flows, particularly if a significant number of those flows are several orders of magnitude higher volume than the rest, e.g., videoconferencing.

This disparity in size between most current Internet flows/transactions and newer multimedia applications with much higher volume and duration necessitates revised metrics of network behavior. Analysis of traffic flows at the federally-sponsored FIX West facility, for example, has demonstrated averages of 15 packets per flow. As illustrated in the table and graphics below, typical cuseeme and mbone flows are orders of magnitude larger.

Illustrative Sample Flows
from FIX-West
application	flows	packets	bytes	seconds	type
web	96482	1443763	821696977	1091907	absolute
-	0.302	0.330	0.595	0.193	fraction of total
-	569	14	8516	11	per-flow means: bytes/pkt, pkts/flow, bytes/flow, seconds/flow
ftp-data	850	124586	73647232	28717	absolute
-	0.003	0.028	0.053	0.005	fraction of total
-	591	146	86643	33	per-flow means: bytes/pkt, pkts/flow, bytes/flow, seconds/flow
Mbone [tunnel traffic]	35	202636	51292766	9226	absolute
-	0.000	0.046	0.037	0.002	fraction of total
-	253	5789	1465507	263	per-flow means: bytes/pkt, pkts/flow, bytes/flow, seconds/flow
cuseeme	15	16288	6385996	3812	absolute
-	0.000	0.004	0.005	0.001	fraction of total
-	392	1085	425733	254	per-flow means: bytes/pkt, pkts/flow, bytes/flow, seconds/flow
Source: NLANR, Aug. 30, 1996.
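The per-flow figures in the third row for each application follow directly from the absolute totals in the first row. The short Python sketch below recomputes them; the dictionary layout and the function name are illustrative assumptions, and integer truncation is assumed because it reproduces the published values.

```python
# Absolute totals copied from the FIX-West sample table (NLANR, Aug. 30, 1996).
SAMPLE_TOTALS = {
    "web":      {"flows": 96482, "packets": 1443763, "bytes": 821696977, "seconds": 1091907},
    "ftp-data": {"flows": 850,   "packets": 124586,  "bytes": 73647232,  "seconds": 28717},
    "mbone":    {"flows": 35,    "packets": 202636,  "bytes": 51292766,  "seconds": 9226},
    "cuseeme":  {"flows": 15,    "packets": 16288,   "bytes": 6385996,   "seconds": 3812},
}


def per_flow_means(totals):
    """Return (bytes/packet, packets/flow, bytes/flow, seconds/flow).

    Values are truncated to integers, which is an assumption made here
    because it matches the figures printed in the table.
    """
    return (
        totals["bytes"] // totals["packets"],
        totals["packets"] // totals["flows"],
        totals["bytes"] // totals["flows"],
        totals["seconds"] // totals["flows"],
    )


for app, totals in SAMPLE_TOTALS.items():
    print(app, per_flow_means(totals))
```

On these totals the mean mbone flow carries several hundred times as many packets as the mean web flow (5789 versus 14).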

It is also important to note that average or mean flow statistics may be misleading, since some flows are orders of magnitude larger than the mean (a `heavy-tailed distribution'). Given this caveat, we note that the smaller mean volume of the cuseeme flows relative to that of the mbone flows is consistent with the characteristic usage of the applications. Cuseeme end users typically connect to each other for brief point-to-point conversations (often several times trying to get it working), resulting in many flows that are short by multimedia standards. Mbone flows, in contrast, tend to represent meetings, workshops, conferences, and concerts that last for hours if not days at a time. In addition, current tools only measure mbone tunnel traffic, resulting in multiple mbone sessions appearing as a single flow.
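To make the caveat about means concrete, the following Python sketch uses a small synthetic sample (invented for illustration, not measured data) in which one very long flow sits among many short ones. The function name `summarize` is an assumption of this sketch.

```python
from statistics import mean, median


def summarize(packets_per_flow):
    """Return the mean, the median, and the share of packets in the largest flow."""
    return {
        "mean": mean(packets_per_flow),
        "median": median(packets_per_flow),
        "largest_flow_share": max(packets_per_flow) / sum(packets_per_flow),
    }


# Synthetic, made-up sample: 99 short flows of 10 packets and one of 10,000.
sample = [10] * 99 + [10_000]
print(summarize(sample))
```

Here the mean is roughly eleven times the median, and the single large flow accounts for about 91% of all packets, so the mean describes neither the typical flow nor the flow that dominates the load.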

Simple mean or peak utilization figures are therefore ineffective in addressing ISP engineering needs, without also knowing the transaction profile constituting and perhaps dominating those figures. Tracking workload profiles requires measuring flow data at relevant network locations. Currently, the only significant public sources of multipoint workload characterization are the FIX-West facility and several NSF supercomputing sites.

Below we provide graphical depictions of three types of traffic across FIX-West. Note that the average (mean) of general Internet traffic fluctuates at around 50-80 packets per flow. Cuseeme and mbone traffic, on the other hand, illustrate significant unpredictability and variability in their averages, ranging around 500 packets per flow and 10,000 packets per flow, respectively. (Readers can take their own samplings using tools at: http://www.nlanr.net/NA.)

General traffic flows at FIX-West

Cuseeme traffic flows at FIX-West

Mbone traffic flows at FIX-West

Demands by Internet users for providers to implement multiple service levels are increasing. From an ISP standpoint, such offerings will allow increased revenue through alternative business quality services. From the user perspective, their ability to contract for higher priority service will enable many industries to switch from intranets and private networks to a lower cost, more ubiquitous Internet-based infrastructure. The ability to specify or reserve these services, however, requires development and implementation of mechanisms for accounting and pricing, which inevitably depend on reliable traffic flow data. Factors inhibiting such financial measures in the U.S. include the unclear dynamics of inter-ISP business mechanics. While some suggest that the ITU/telco settlements model may have relevance to ISP settlements, most industry analysts agree that the connectionless nature of the IP protocol demands entirely new pricing and settlement models.

Rational pricing of multiple service qualities will also provide clear feedback to providers and users on the value of Internet resources. As equipment vendors develop, and ISPs deploy, technologies that support quality signals, services should evolve to permit users to designate a service quality for which they are willing to pay, with higher demand services such as videoconferencing priced according to their value to the user.

Threat of Government Intervention

With the simultaneous diversification and usage explosion of the infrastructure, Internet service providers have not been in a position to provide accurate statistics models. Given the narrow profit margins and dearth of qualified personnel in the industry, providers are reluctant to dedicate manpower or other resources to statistics collection and analysis, allocating them instead to the monumental tasks associated with accommodating service demand. Eventually, larger telecommunication companies will devote attention to this area. Unfortunately, there may not be sufficient pressure upon the NSPs until problems such as congestion and outages worsen, and either billed customers demand (and are willing to pay for) better guarantees and data integrity, or the government intercedes to dictate acceptable policies and practices for the Internet.

The telecommunications industry is a classic example of the government inserting itself and dictating cooperation among competing companies. In 1992 the Federal Communications Commission (FCC) mandated the formation of the Network Reliability Council (now the Network Reliability and Interoperability Council)7/ following a major communications outage on the East Coast. Since the Internet has only recently received attention as a critical element of the national infrastructure, it has escaped such intense regulatory scrutiny. As the Internet becomes the backbone of the national and global information infrastructures (NII/GII), however, this relative obscurity may be a passing luxury.

In Executive Order 13010 dated July 15, 1996, President Clinton established a Commission on Critical Infrastructure Protection, to develop `a strategy for protecting and assuring the continued operation of this nation's critical infrastructures' including telecommunications, electrical power systems, gas and oil transportation, banking and finance, transportation, water supply systems, emergency services, and continuity of government. Its mission objectives refer directly to the importance of cyberthreats; its output (due July 1997) includes proposals for `statutory or regulatory changes necessary to effect its recommendations'.

Most recently, attention has focused on RBOC claims that the Internet has detrimental effects on their telephony infrastructure. While many Internet analysts contest these allegations, the FCC and the NRIC are reviewing a study sponsored by the baby Bell companies on the impacts of the Internet's growth during Spring 1996. The FCC's interest in this subject is closely tied to its review of the America's Carriers Telecommunication Association (ACTA) petition, requesting that Internet telephony be banned, and to renewed discussions surrounding the so-called modem tax.

While the FCC, National Telecommunications and Information Administration (NTIA), and Congress have thus far been relatively mute with respect to regulation of the Internet, it is idealistic to assume that they will remain so. Failure of industry participants to respond to requirements for more formal mechanisms of cooperation will slow the current pace of the Internet's evolution, particularly for electronic commerce and higher grade services. It will also increase the pressure upon governments, both U.S. and foreign, to intercede to protect these critical information resources.

Steps Toward
Improving Internet Measurements

The new commercial Internet is characterized by hundreds of ISPs, many on shoestring budgets in low margin competition, who generally view statistics collection as a luxury that has never proven its operational utility. Note the last publicly available source of Internet workload and performance data, for the NSFNET backbone, was basically a gift from the NSF, an investment of U.S. tax dollars with the hope that tools, methodologies, theories of traffic, refinements, feedback would emerge from the efforts in the IETF and other arenas. But there was never any fiscal pressure that the statistics collection activity to justify the resources it required within the cost structure of providing Internet service. It was never forced to prove itself worthwhile. And (surprise...) it didn't.
--k claffy, `but some data is worse than others': measurement of the global Internet, in Telegeography, August 1996

Implications of the Transition for
Data Acquisition/Analysis

The Internet architecture remains in a state of transition. The large number of commercial providers and the proliferation of cross service provider exchange points render statistics collection a much more difficult task. In addition, the challenges inherent in Internet operation, particularly given its still `best effort' underlying protocol technology, fully consume the attention of ISPs. Given its absence from the list of their top priorities, data collection and analysis continue to languish.

Yet it is detailed traffic and performance measurement that has heretofore been essential to identifying the causes of network problems and ameliorating them. Trend analysis and accurate network/systems monitoring permit network managers to identify hot spots (overloaded paths), predict problems before they occur, and avoid them by efficient deployment of resources and optimization of network configurations. As the nation and world become increasingly dependent on the NII/GII, mechanisms to enable infrastructure-wide planning and analysis will be critical.

Efforts Stimulating Action by Providers

Since the advent of the world wide web, the proliferation of users and the lack of cooperation among commercial ISPs have resulted in significant degradation of service quality for some of the higher end user communities. The higher education and research communities were among the first groups to depend on the Internet, and also among the most vocal in publicly acknowledging its recent inability to meet their growing expectations. Since the Educom-sponsored `Monterey Conference' in October 1995, representatives of the higher ed community have met repeatedly, in forums sponsored by the Federation of American Research Networks (FARNET), Educom's National Telecommunications Task Force (NTTF) and others, to assess their internetworking needs and develop plans to address them.

At the most recent meeting in Colorado Springs, Colorado, a post-Monterey white paper concluded that `the commodity Internet may not evolve rapidly enough to meet higher education's imminent and foreseeable high performance enterprise networking and internetworking capacity and service needs.' Participants at the meeting set goals for a second generation Internet to support the higher education and research communities, aimed at interconnecting enterprise networks in various stages of migration to higher performance technologies, and at controlling cost and pricing/allocation models from within the enterprise. General requirements for this Internet II include: 8/

improved information security
authorization and authentication capabilities
network management capabilities including performance audits

Internet II would also have definable and measurable qualities of service, including:

latency and jitter specifications
bandwidth interrogation and reservation capabilities
packet delivery guarantees

The business community is also reassessing its goals for the Internet. In October 1996, the industry-driven Cross Industry Working Team (XIWT) plans to hold two invitational workshops on Internet evolution. The first meeting will focus on defining the near term requirements of major Internet user groups; the second will address implications of these and other requirements for Internet service providers and vendors. The latter event will concentrate on developing mechanisms for Internet industry cooperation, including those related to `facilitating the technical and business practices and procedures needed to improve the performance of Internet applications and services by enhancing cooperation among providers of Internet services, equipment and applications'.

Other user groups that view the Internet as mission critical are moving independently to address their industries' service requirements. Most notable is the Automotive Industry Action Group (AIAG), which will announce in September its selection of an `overseer' to support the major automobile manufacturers and their thousands of suppliers, by:

certifying a small number of highly competent Internet service providers to interconnect automotive trading partners' private networks;

monitoring providers' ongoing compliance with performance standards;

enforcing strict security mechanisms to authenticate users and protect data, thereby creating a virtual private network for the auto industry.

The AIAG has identified several metrics it views as critical to this initiative and future monitoring efforts. These include performance metrics such as latency, packet/cell loss, link utilization, throughput, and BGP4 configuration and peering arrangements; as well as reliability metrics such as physical route diversity; routing protocol convergence times; disaster recovery plans; backbone, exchange point and access circuit availability; and replacement speed of failed customer premise equipment.9/

Within the federal community, NSF has been the most proactive in supporting Internet measurement research and traffic analyses.10/ However, other federal agencies grow increasingly active in this critical arena. The Department of Energy (DOE), for example, has established a measurement working group and has tasked teams of high energy physics community researchers with monitoring global Internet traffic patterns and performance metrics for ESnet. 11/ The Department of Defense (through DARPA and the HPCMO) hosted an ATM performance measurement workshop in June and is currently developing and deploying measurement tools such as NetSpec across its ATM networks.12/ The Federal Networking Council (FNC) is also forming a statistics/metrics cross-agency working group and has expressed support for the creation of a North American CCIRN statistics/metrics working group.13/

Technical Challenges

The sections below describe some of the current and future technical challenges associated with Internet metrics and collection of statistics.

Lack of Common Definitions: Several modest efforts are currently underway to collect statistics across specific networks or at select peering points; see: http://oceana.nlanr.net/INFO/Nsfnet/ubiquity.html (Expired Link). Unfortunately, these efforts lack a common framework and common definitions of Internet metrics, limiting the comparability of results. The IETF's Internet Provider Performance Metrics (IPPM) working group is addressing this situation via development of an IP performance metrics framework. In advance of the December IETF meeting, IPPM members are also developing draft RFCs defining metrics for: roundtrip and one-way delay; flow capacity; packet loss; modulus of elasticity; connectivity/availability; and route persistence and route prevalence. Such steps toward a more strongly specified common framework will facilitate future metrics discussions.
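The following Python sketch illustrates why common definitions matter for comparability. The same set of probe results yields different loss and delay figures depending on how long a measurer waits before declaring a packet lost. The probe values, the function name, and the timeout thresholds are all invented for illustration; this is not the IPPM working group's definition of either metric.

```python
from statistics import median


def summarize_probes(rtts_ms, timeout_ms):
    """Return (loss fraction, median RTT of delivered probes) for one timeout choice.

    rtts_ms holds one entry per probe: a round-trip time in milliseconds,
    or None if no reply was ever seen. A reply slower than timeout_ms is
    counted as lost, which is the definitional choice being illustrated.
    """
    delivered = [r for r in rtts_ms if r is not None and r <= timeout_ms]
    lost = len(rtts_ms) - len(delivered)
    return lost / len(rtts_ms), (median(delivered) if delivered else None)


# Hypothetical probe results (milliseconds); None means no reply arrived.
probes = [20, 25, None, 30, 900, 22, 1500, None, 27, 24]

print(summarize_probes(probes, timeout_ms=1000))  # one measurer's definition
print(summarize_probes(probes, timeout_ms=500))   # another measurer's definition
```

With a 1000 ms timeout the sample shows 30% loss; with a 500 ms timeout it shows 40% loss and a slightly lower median delay. Two providers reporting "packet loss" under these two conventions would not be reporting comparable numbers.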

Lack of Consensus on Traffic Modeling: There is as yet no consensus on how statistics can support research in IP traffic modeling. There is also skepticism within the Internet community regarding the utility of empirical studies: critics claim that because the environment changes so quickly, within weeks any collected data is only of historical interest. They argue that research is better served by working on mathematical models rather than by empirical surveys that capture, at most, only one stage in network traffic evolution.

However, prediction of performance metrics, e.g., queue lengths or network delays, using traditional closed-form mathematical modeling techniques such as queuing theory, has met with little success in today's Internet environment. The assumption of Poisson arrivals was acceptable years ago for the purposes of characterizing small local area networks (LANs). As a theory of wide area internetworking behavior, however, Poisson arrivals -- in terms of packet arrivals within a connection, connection arrivals within an aggregated stream of traffic, and packet arrivals across multiple connections -- have demonstrated significant inconsistency with collected data.14/
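One simple way to see how a trace can depart from the Poisson assumption is the coefficient of variation (standard deviation divided by mean) of its inter-arrival times. A Poisson process has exponentially distributed inter-arrival times, for which this ratio is 1; bursty traffic gives values well above 1. The Python sketch below uses two tiny made-up arrival sequences and is only a rough diagnostic, not the statistical methodology of the studies cited above.

```python
from statistics import mean, pstdev


def interarrival_cv(arrival_times):
    """Coefficient of variation of the gaps between consecutive arrivals."""
    gaps = [later - earlier for earlier, later in zip(arrival_times, arrival_times[1:])]
    return pstdev(gaps) / mean(gaps)


# Made-up arrival timestamps in seconds, for illustration only.
regular = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]                          # evenly spaced
bursty = [0.0, 0.01, 0.02, 0.03, 5.0, 5.01, 5.02, 5.03, 10.0]     # packet trains

print(interarrival_cv(regular))  # evenly spaced arrivals: no variation at all
print(interarrival_cv(bursty))   # bursts separated by long gaps: well above 1
```

A value near 1 does not by itself establish that arrivals are Poisson, but a value far from 1 on a long trace is a quick indication that Poisson-based queuing results should not be trusted for that traffic.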

A further contributing factor to the lag of Internet traffic modeling is the early financial structure of the Internet. A few U.S. government agencies assumed the financial burden of building and maintaining the early transit network infrastructure, leaving little need to trace network usage for the purposes of cost recovery. As a result, since the transition Internet customers have had little leverage with their service providers regarding service quality.

Lack of Adequate Tools: Many applications in today's Internet architecture inherently depend on the availability of the infrastructure on a fairly pervasive scale. Yet wide area networking (WAN) technologies and applications have advanced much faster than has the analytical and theoretical understanding of Internet traffic behavior. Devices connected to WANs are increasing at 30-50% per year, network traffic is doubling every 10-18 months by some estimates, and vendors such as Netscape aim to release new products every six months. Overall, Internet-related companies continue to pour money into hardware, pipes, and multimedia-capable tools, with little attention to the technical limitations of the underlying infrastructure or tools to monitor this increasingly complex system.

Even the basic application level protocols of the TCP/IP suite, e.g., ftp and telnet, are becoming less reliable in the face of network congestion and related infrastructure problems. Yet, network managers today, both ISPs and end users, have few tools available to effectively monitor networks (end-to-end) so as to avoid potential problems. Performance of such applications depends on many inter-related factors, including: packet loss, network end-to-end response time, number and quality of intermediate hops and the route used, link bandwidth and utilization, and end node capability. There is no suite of tools available for remotely monitoring, assessing or directly intervening to affect these conditions, particularly if the problem arises beyond an individual network's border. 15/

As traditional phone companies enter the Internet marketplace, armed with years of experience with analytic tools for modeling telephony workloads and performance, it is tempting to assume imminent remedy of the shortage of measurement tools. Unfortunately, models of telephony traffic developed by Bell Labs and others are not readily transferable to the Internet industry. Internet traffic is not only fundamentally inconsistent with traditional queuing theory, being framed with best-effort rather than deterministic service protocols, but also exists on an infrastructural mesh of competing service providers with slim profit margins. As a result, telephony tables of acceptable blocking probability (e.g., inability to get a dial tone when you pick up the phone) suggest standards that are far in excess of those achievable in today's marketplace.

Emerging Technologies: Currently available tools for monitoring IP traffic (layer 3) focus on metrics such as response time and packet loss (ping), reachability (traceroute), throughput (e.g., ftp transfer rates), and some workload profiling tools (e.g., oc3mon and Cisco's flow stats). These metrics are significantly more complex when applied to IP "clouds". For example, to date there is no standard methodology for aggregating metrics such as loss measurement over various paths through a cloud and communicating it as a single metric for the provider.
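The difficulty of reducing loss across a cloud to a single number can be illustrated with a short Python sketch. The three aggregation rules and the per-path counts below are hypothetical choices made for this example; none of them is an established standard, which is precisely the problem described above.

```python
# Hypothetical per-path measurements through one provider's cloud:
# (packets sent, packets lost). The numbers are invented for illustration.
paths = [
    (100_000, 100),   # busy path, 0.1% loss
    (10_000, 500),    # moderate path, 5% loss
    (1_000, 200),     # lightly used path, 20% loss
]


def unweighted_mean_loss(paths):
    """Average the per-path loss rates, treating every path equally."""
    return sum(lost / sent for sent, lost in paths) / len(paths)


def traffic_weighted_loss(paths):
    """Total packets lost divided by total packets sent across all paths."""
    return sum(lost for _, lost in paths) / sum(sent for sent, _ in paths)


def worst_path_loss(paths):
    """Loss rate of the single worst path."""
    return max(lost / sent for sent, lost in paths)


for rule in (unweighted_mean_loss, traffic_weighted_loss, worst_path_loss):
    print(rule.__name__, round(rule(paths), 4))
```

The same measurements yield roughly 8.4%, 0.7%, or 20% loss depending on the rule chosen, so a provider-level figure is not meaningful unless the aggregation method is stated along with it.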

With the transition to ATM and high speed switches, IP layer analysis may no longer be technically feasible. Most commercial ATM equipment, for example, is not capable of accessing IP headers. In addition, the primary network access and exchange points in the U.S. are chartered as layer 2 entities, providing services at the physical or link layer without regard for the higher layers. Because most of the NSFnet statistics reflected information at and above layer 3, the exchange points cannot use the NSFnet statistics collection architecture as a model upon which to base their own operational collection. Many newer layer 2 switches, e.g., DEC gigaswitch and ATM switches, have little if any capability for performing layer 3 statistics collection, or even for looking at traffic in the manner allowed on a broadcast medium (e.g., FDDI and Ethernet), where a dedicated machine can collect statistics without interfering with packet forwarding. Statistics collection functionality in newer switches takes resources directly away from forwarding, driving customers toward switches from competing vendors who sacrifice such functionality in exchange for speed.

The table which follows identifies key uses for Internet statistics and metrics -- both within individual networks and infrastructure-wide. The characteristics of requisite tools, their likely deployment options, and the status/problems associated with their use are also identified.

Internet Measurement Tools:
illustrative requirements, uses and current status

Requirements	Description	Tool(s) characteristics	Deployment	Status / Problems
Internet-wide Planning (end-to-end)	dynamic web cache management; optimize queuing; congestion & scaling dynamics; aggregate transport behavior; capacity / topology planning	trace-driven experiments (full headers); flow characterization; AS traffic matrices, e.g., measures of aggreg. traffic flow, route leakage, etc.; visualization tools	few high aggregation exchange points; core backbone routers & multihomed networks	tools not generally available; ISPs reluctant to report; routers don't support; complicated by trend toward private peering; switches can't "sniff" data, req. tools on every port
Network Planning (ISP-specific)	(same as above)	(same as above)	beacons strategically located through backbone	requires availability of better tools, education as to their importance and use, & standard interpretation of their use
Internet-wide Management (end-to-end)	outage reporting (host unreachable); trouble ticket tracking	various database implementations & automated probe measurement & reporting	individual networks reporting to centralized authority	low economic incentives; ISPs reluctant to report
Testing New Applications/Protocols	such as IP v.6, bandwidth reservation, QoS, caching, multicast, directory services	interoperability tracking across ISPs	few high aggregation peering points; internal & border routers	low economic incentives
Benchmarking routers / switches	performance, reliability, interoperability	traffic generators, monitors, analyzers	peering points; internal and backbone routers	no current mechanisms for standardized field testing
Performance Monitoring -- reliability, availability & serviceability	evaluating ISP performance, e.g., latency, loss, throughput; route stability / flapping / reachability	Mathis' treno; Paxson's probe daemon; ping; route (BGP) tables	exchange points; connection of multihomed networks; border routers; uncongested customer sites; network beacons; RA server	no validated standard metrics; no comparability among provider stats; current tools too invasive; limited statistics capabilities in routers -- none in switches; most ISPs not using RADB
Performance Monitoring -- QoS / Settlements	required for billing user / customers; req. for transit & related agreements	measures of usage, i.e., router or access server stats, header sampling, flow meters	exchange points & routers	no validated standard metrics or agreements; no auditing capabilities for users

In viewing this table and similar Internet measurement materials, it is helpful to distinguish tools designed for Internet performance and reliability measurement from those meant for traffic flow characterization. While there is a dearth of tools in both areas, equipment and software designed to measure performance and network reliability are generally more readily available and easier for users and providers alike to deploy. Many of these tools treat the Internet as a black box, measuring end-to-end features such as packet loss, latency, and jitter from points originating and terminating outside individual networks. These performance and reliability tools are fundamental to evaluating and comparing alternative providers and to monitoring service quality.
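As a rough illustration of the black-box measurements described above, the following sketch summarizes a series of end-to-end probes into loss, latency, and jitter figures. The function name, the input format (one round-trip time per probe, with `None` for an unanswered probe), and the choice of jitter definition are assumptions made for this example; none of them is taken from a tool named in this paper.

```python
from statistics import mean


def summarize_probes(rtts_ms):
    """Summarize end-to-end probe results (hypothetical input format).

    rtts_ms holds one entry per probe sent: the round-trip time in
    milliseconds, or None if no reply was received.
    """
    if not rtts_ms:
        raise ValueError("no probes to summarize")

    replies = [r for r in rtts_ms if r is not None]

    # Packet loss: fraction of probes that received no reply.
    loss = 1.0 - len(replies) / len(rtts_ms)

    # Latency: mean round-trip time over the probes that were answered.
    latency = mean(replies) if replies else None

    # Jitter (one of several possible definitions): mean absolute
    # difference between consecutive answered probes. Lost probes are
    # skipped, so neighbors here may not be adjacent in time.
    diffs = [abs(b - a) for a, b in zip(replies, replies[1:])]
    jitter = mean(diffs) if diffs else None

    return {"loss": loss, "mean_rtt_ms": latency, "jitter_ms": jitter}


if __name__ == "__main__":
    print(summarize_probes([10.0, 12.0, None, 11.0]))
```

Because the inputs are collected entirely at the edges, a summary of this kind needs no cooperation from the networks being measured, which is part of why such tools are easier to deploy.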

Traffic flow characterization tools, on the other hand, can yield a wealth of data on the internal dynamics of individual networks and cross-provider traffic flows. They enable network architects to better engineer and operate their networks and to better understand global traffic trends and behavior -- particularly as new technologies and protocols are introduced into the global Internet infrastructure. Deployment of this type of tool must be within the networks -- particularly at border routers and at peering points. Traffic flow characterization tools therefore require a much higher degree of cooperation and involvement by service providers than do performance oriented tools.
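One product of traffic flow characterization is the AS-to-AS traffic matrix listed in the table above. The sketch below shows, under simplifying assumptions, how such a matrix could be aggregated from flow records gathered at a border router or peering point. The record layout (source AS, destination AS, byte count) and the function name are hypothetical; real flow export formats carry many more fields.

```python
from collections import defaultdict


def traffic_matrix(flows):
    """Aggregate flow records into a nested {src_as: {dst_as: bytes}} dict.

    flows is an iterable of (src_as, dst_as, byte_count) tuples, a
    simplified record layout assumed for this example.
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for src_as, dst_as, nbytes in flows:
        matrix[src_as][dst_as] += nbytes
    # Convert to plain dicts so callers do not create entries by accident.
    return {src: dict(row) for src, row in matrix.items()}


if __name__ == "__main__":
    # AS numbers below come from the range reserved for documentation.
    sample = [
        (64500, 64501, 1500),
        (64500, 64501, 500),
        (64501, 64500, 40),
    ]
    print(traffic_matrix(sample))
```

The aggregation itself is trivial; the difficulty the paper describes lies in obtaining the flow records, which exist only inside provider networks.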

Both types of measurement tools are critical to enhancing the overall quality and robustness of the Internet. They also contribute to our ability to ensure the continued evolution of the Internet infrastructure, both in strengthening its ability to meet the needs of its diverse user communities and maintaining its flexibility to accommodate opportunities presented by future technologies and applications.

Cooperative Association
for Internet Data Analysis

The authors believe that the best means for addressing the cooperation requirements outlined in this paper is through formation of a provider consortium. Market pressures upon ISPs to participate in such a forum include the increasing dependence of users (customers of providers) on the Internet for mission critical applications, resulting in demands for higher qualities of service and evidence of performance compliance by providers. Economic models of the Internet are also evolving and will soon include settlements based on authenticated, likely confidential provider statistics. Lastly, the meshed nature of the global Internet dictates that no single company can do it alone. Systemic improvements to the Internet infrastructure and to the operational practices of its providers will necessitate collaboration and cooperation among the competitive telecommunications firms.

An industry-driven consortium could spearhead efforts to: develop cross-network outage and trouble ticket tracking; monitor congestion and relevant traffic patterns, including routing; promote studies of peering relationships and testbeds for examining emerging Internet technologies and protocols such as IPv.6, dynamic caching, bandwidth reservation protocols and QoS routing; and provide a forum for discussion and eventual implementation of charging policies. 16/

From the standpoint of statistics collection and analysis, such a forum could:

facilitate the identification, development and deployment of measurement tools across the Internet;
provide commercial providers with a neutral, confidential vehicle for data sharing and analysis;
provide networking researchers and the general Internet community with reliable data on Internet traffic flow patterns;
enhance communications among commercial Internet service providers, exchange/peering point providers and the broader Internet community.

Business Constraints to Cooperation

The business constraints hindering such cooperation relate to the competitive nature of the Internet business environment, as well as the appearance of industry collusion by major providers. However, a charter with principles of openness and inclusion can readily address these concerns, along with constraints arising from the lack of adequate pricing models and other mechanisms for economic rationality in Internet business practices.

Probably the most relevant constraint to cooperation is that of data privacy, which has always been a serious issue in network traffic analysis. Many ISPs have service agreements prohibiting them from revealing information about individual customer traffic. Collecting and using more than aggregate traffic counts often requires customer cooperation regarding what to collect and how to use it. However, provisions of the Omnibus Crime Control and Safe Streets Act of 1968, Section 2511.(2)(a)(i) accord communications providers considerable protection from litigation:

It shall not be unlawful under this chapter for an operator of a switchboard, or an officer, employee, or agent of a provider of wire or electronic communication service, whose facilities are used in the transmission of a wire communication, to intercept, disclose, or use that communication in the normal course of his employment while engaged in any activity which is a necessary incident to the rendition of his service or to the protection of the rights of property of the provider of that service, except that a provider of wire communication service to the public shall not utilize service observing or random monitoring except for mechanical or service quality control checks.

Responsible providers could go further than the law and anonymize monitored traffic with tools such as tcpdpriv, virtually eliminating any accusations of breach of privacy.17/
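To make the idea of anonymizing monitored traffic concrete, here is a minimal sketch that replaces the addresses in a flow record with keyed pseudonyms. It is not tcpdpriv's algorithm, and it does not reproduce that tool's options; it is a generic keyed-hash approach chosen for brevity. The function names, the record layout, and the key are all assumptions made for this example.

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(addr, key):
    """Map an IPv4 address string to a pseudonymous IPv4 address string.

    The same address and key always yield the same pseudonym, so flows
    can still be grouped by host without revealing the real address.
    """
    packed = ipaddress.IPv4Address(addr).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    # Use the first four bytes of the keyed digest as the new address.
    return str(ipaddress.IPv4Address(digest[:4]))


def anonymize_record(record, key):
    """Anonymize a (src, dst, byte_count) record (hypothetical layout)."""
    src, dst, nbytes = record
    return (anonymize_ipv4(src, key), anonymize_ipv4(dst, key), nbytes)


if __name__ == "__main__":
    secret = b"example-key-kept-by-the-collector"  # assumed; keep private
    print(anonymize_record(("192.0.2.1", "198.51.100.7", 1500), secret))
```

This sketch has limits worth stating: truncating the digest means two distinct addresses can occasionally map to the same pseudonym, and the mapping does not preserve prefix relationships between addresses, which some analyses need.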

Technical Constraints to Cooperation

Technology constraints hindering the collection and analysis of data on Internet metrics center on the nascent development stage of IP and ATM measurement tools and supporting analysis technologies, and on complications arising from adoption of new and emerging technologies, e.g. gigaswitches and ATM. Generally, we view these and other technical constraints as solvable given sufficient technical attention and market pressure.

Next Steps

Despite the business and technical challenges, requirements for cooperation among Internet providers will continue to grow, as will demands for enhanced data collection, analysis, and dissemination. Development of an effective provider consortium to address these needs would require, minimally:

participation by 3 or more of the major service providers, e.g., ANS, AT&T, BBN Planet, MCI, Netcom, PSI, Sprint, or UUNet

participation by a neutral third party with sufficient technical skills to provide the core data collection and analysis capabilities required by the consortium

appropriate privacy agreements to protect the interests of members

agreement on which basic metrics to collect, collate, analyze, and present (assuming differences in the granularities of data available to consortium members vs. approved researchers vs. the general public)

agreement on which tools to develop, particularly those related to emerging infrastructures using new technologies

A consortium organization could also make available a solid, consistent library of tools that would appeal to both users and providers. Data collection by the consortium should strictly focus on engineering and evolution of the overall Internet environment, e.g. accurate data on traffic patterns that could enhance engineers' ability to design efficient architectures, conserving manpower and other resources currently devoted to this task. The right statistics collection and cross-ISP dissemination mechanisms would also facilitate faster problem resolution, saving the time and money now devoted to tracking problems, e.g., route leakage, link saturation and route flapping. Finally, experience with data will foster the development of more effective usage-based economic models, which, in turn, will allow ISPs to upgrade their infrastructure in accordance with evolving customer demands.

Developing the appropriate metrics and tools to measure traffic phenomena, as well as end-to-end performance and workflow characteristics, remains a daunting task. Other areas where resources are needed to improve the Internet infrastructure include:

development of more powerful routers for core Internet components, a prohibitively expensive endeavor with too small a potential market and thus too little return to motivate vendors to pursue independently

sponsorship of short-term research into basic traffic engineering methodologies given limited data, and longer term research into the implications of realistic theoretical and empirical traffic characterization

development of a public measurement infrastructure

For additional information on these topics, see:
A Survey of Internet Statistics / Metrics Activities, T. Monk and k claffy
`but some data is worse than others': measurement of the global Internet , k claffy

Footnotes

Information on Network Wizard's Internet Domain Name Survey, July 1996 is available at http://www.nw.com/zone/WWW/report.html.

The NSFnet Backbone Service: Chronicling the End of an Era by Susan R. Harris, Ph.D., and Elise Gerich, in ConneXions, Vol. 10, No. 4, April 1996, provides a good overview of the NSFnet 1989-1995, see:
Mitigating the coming Internet crunch: multiple service levels via Precedence by R. Bohn, H-W Braun, K Claffy, and S. Wolff (1994), proposes three components of a short-term solution. First, network routers would queue incoming packets by IP Precedence value instead of the customary single-threaded FIFO. Second, users and their applications would use different and appropriate precedence values in their outgoing transmissions according to some defined criteria. Third, network service providers may monitor the precedence levels of traffic entering their network, and use some mechanism such as a quota system to discourage users from setting high precedence values on all their traffic. All three elements can be implemented gradually and selectively across the Internet infrastructure, providing a smooth transition path from the present system. The experience we gain from an implementation will furthermore provide a valuable knowledge base from which to develop sound accounting and billing mechanisms and policies in the future. The paper is available at: http://www.nlanr.net/Papers/mcic.html.
An excellent summary of pricing practices and policies in OECD countries is in the OECD report: Information Infrastructure Convergence and Pricing: the Internet, January 1996, at
Hal Varian's web site at http://www.sims.berkeley.edu/resources/infoecon provides a useful introduction to Internet economics.

In his recent Internet Draft, on Metrics for Internet Settlements, Brian Carpenter (CERN) asserts that financial settlements are a `critical mechanism for exerting pressure on providers to strengthen their infrastructures'. He suggests that metrics used in Internet settlements should not rely on expensive instrumentation such as detailed flow analysis, but rather simple measurements, estimated, if necessary, by statistical sampling. Internet draft ftp://ds.internic.net/internet-drafts/draft-carpenter-metrics-00.txt. (Expired Link)

Additional information on the Network Reliability and Interoperability Council is available online. The electric industry's power grid network has many similarities to the connectionless IP network. For information on the North American Electric Reliability Council (NERC), see: http://www.nerc.com/.
On August 8-9, 1996, FARNET, with support from the Resource Allocation Committee, Educom's NTTF, The Coalition for Networked Information, NYSERNet, Advanced Network and Services (ANS) and NSF, convened a workshop at Cheyenne Mountain in Colorado Springs, CO. Workshop attendees included leaders in networking from the higher education community, industry and government. The workshop sought to continue previous steps to articulate higher education's networking needs and requirements for the rest of this century and beyond. Details on this meeting are available at: http://www.educause.edu/netatedu/reports/index_farnet.html.

Information on the automotive industry's telecommunications initiatives is at the Automotive Industry Action Group's (AIAG) Telecommunications working group page: http://www.aiag.org/project/telecom/telecomm.html (Expired Link). Robert Moskowitz (Chrysler) also described some of these activities at NLANR's ISMA workshop in February 1996.

NSF has supported several recent projects and events related to Internet traffic analysis. NLANR efforts include:
- The Internet Statistics and Metrics Analysis (ISMA) workshop (February 1996): http://moat.nlanr.net/ISMA/
- Traffic measurements at the FIX West facility and across the vBNS: http://www.nlanr.net/NA
- Visualizations of caching traffic and MBone topologies, as well as BGP peering relationships among autonomous systems: http://www.nlanr.net/INFO (Expired Link).
- Development of software using modified pings to assess end-to-end performance: http://www.nlanr.net/Viz/End2end
- Summaries of available provider and NAP statistics: http://www.nlanr.net/INFO (Expired Link) and other sources of relevant statistics/metrics information.
The NSF-supported Routing Arbiter project collects network statistics at the Ameritech, MAE-East, MAE-West, PacBell, and Sprint interconnection points. The Merit/ISI RA web page (http://www.ra.net) is a launching point from which to view both graphical and text representations of routing instabilities, NAP statistics, trends, etc.

DOE's ESnet established a "State of the Internet" working group in May 1996. The working group and ESnet's Network Monitoring Task Force (NMTF) are working with other organizations to implement enhanced WAN statistics collection / analysis throughout its network and the global high energy physics community.

In June 1996, Kansas University hosted an ATM performance workshop. The workshop sought to exchange ideas on ATM WAN measurement tools and techniques, discuss ATM WAN experiments to establish today's performance limits, identify the factors that affect performance, and identify first-order opportunities for improving network performance. Its focus was on "solid engineering techniques for measurements, concise analysis of measurements that have been taken, improvements made, and proposals for what should be done in the near future to further enhance performance." Papers presented at this meeting are available at: http://www.tisl.ukans.edu/workshops/ATM_Performance/

The Coordinating Committee for Intercontinental Research Networking (CCIRN) is setting up a working group on Internet statistics and metrics. Europe's TERENA, Asia's Asia-Pacific Networking Group (APNG), and the U.S.'s Federal Networking Council (FNC) have the responsibility to establish similar working groups for their continents. CCIRN anticipates that participation in these groups will include representatives from commercial, research, and government sectors.

Although Internet traffic does not exhibit Poisson arrivals, the cornerstone of telephony modeling, a number of researchers have measured a consistent thread of self-similarity in Internet traffic. Several metrics of network traffic have heavy-tailed distributions:
- telephone call holding times (CCSN/SS7)
- telnet packet interarrivals
- FTP burst size upper tail
- transmission times of WWW files
Recent theorems have shown that aggregating traffic sources with heavy-tailed distributions leads directly to (asymptotic) self-similarity. In an article for Statistical Science (1994), W. Willinger identified three minimal parameters for a self-similar model:
- the Hurst (H) parameter, which reflects how the time correlations scale with the measurement interval
- variance of the arrival process
- mean of the arrival process
Although self-similarity is a parsimonious concept, it comes in many different colors, and we only now are beginning to understand what causes it. Self-similarity implies that a given correlational structure is retained over a wide range of time scales. It can derive from the aggregation of many individual, albeit highly variable, on-off components. The bad news about self-similarity is that it is a significantly different paradigm that requires new tools for dealing with traffic measurement and management. Load service curves (e.g., delay vs. utilization) of classical queueing theory are inadequate; indeed for self-similar traffic even metrics of means and variances indicate little unless accompanied by details of the correlational structure of the traffic. In particular, self-similarity typically predicts queue lengths much higher than do classical Poisson models. Researchers have analyzed samples and found fractal components of behavior in a wide variety of network traffic (SS7, ISDN, Ethernet and FDDI LANs, backbone access points, and ATM).
Still unexplored is the underlying physics that could give rise to self-similarity at different time scales. That is, at millisecond time scales, link layer characteristics (i.e., transmission time on media) would dominate the arrival process profile, while at the 1-10 second time scales the effects of the transport layer would likely dominate. Queueing characteristics might dominate a range of time scales in between, but in any case the implication that several different physical networking phenomena manifest themselves with self-similar characteristics merits further investigation into these components.
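For readers who want to experiment with the Hurst parameter mentioned in this note, the sketch below estimates H with the aggregated-variance method, one of several standard estimators and not necessarily the one used in the work cited here. It relies on the property that, for a self-similar process, the variance of block means of size m scales as m^(2H-2), so H can be read from the slope of log variance against log m. The function name and the choice of block sizes are assumptions made for this example.

```python
import math
import random
from statistics import pvariance


def hurst_aggregated_variance(series, block_sizes):
    """Estimate the Hurst parameter H by the aggregated-variance method.

    For each block size m, the series is cut into non-overlapping blocks,
    each block is replaced by its mean, and the variance of those means
    is computed. A least-squares line through (log m, log variance) has
    slope 2H - 2, so H = 1 + slope / 2.
    """
    xs, ys = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        if n_blocks < 2:
            continue  # too few blocks to estimate a variance
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        var = pvariance(means)
        if var > 0:
            xs.append(math.log(m))
            ys.append(math.log(var))
    if len(xs) < 2:
        raise ValueError("need at least two usable block sizes")

    # Ordinary least-squares slope of ys against xs.
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return 1.0 + slope / 2.0


if __name__ == "__main__":
    random.seed(0)
    # Independent samples have no long-range dependence, so the estimate
    # should land near 0.5; self-similar traffic would give a higher H.
    noise = [random.gauss(0.0, 1.0) for _ in range(20000)]
    print(hurst_aggregated_variance(noise, [1, 2, 4, 8, 16, 32, 64]))
```

The estimator is simple but biased on short series, so it is best treated as a first look at a trace and not as a definitive measurement.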
See the paper by Les Cottrell and Connie Logg entitled "Network Monitoring for the LAN and WAN", presented at ORNL, June 24, 1996 at http://www.slac.stanford.edu/grp/scs/net/talk/ornl-96/ornl.htm.

Traffic Matrices - router vendors are in no position to support the collection of traffic matrices, despite the general perception that aggregate traffic matrices are crucial to backbone topology engineering. In addition to allowing the discovery of mistraffic, e.g., route leakage or a customer accidentally sending huge amounts of unintended traffic into the core, traffic matrices combined with route flap data are essential to an ISP's ability to communicate problems to peer ISPs when necessary. Backbone engineers consider traffic matrix data significantly more important than flow data for short to medium term engineering, and it may be essential to the investigation of Big Internet issues (e.g., routing, addressing) as well.
While telcos have long measured traffic matrices for phone network engineering, ISPs have faced technical, legal, and resource limitations as obstacles to collecting -- not to mention sharing -- such measurements. Collecting packet headers, though essential for researchers to develop realistic models and analysis techniques, is even more technically and logistically problematic.

The number of forums representing the Internet industry -- service providers, vendors, content developers and others -- is growing rapidly. Several of the established groups are listed below. We do not imply that any of these organizations possess the mandate or capacity to facilitate the types of technical and engineering collaborations we describe in this paper.
Developed by Greg Minshall (Ipsilon), tcpdpriv takes packets captured by tcpdump and removes or scrambles data e.g., source and destination hosts and ports, within the packet to protect privacy.

last updated September 24, 1996
please direct questions or comments to tmonk@ixiacom.com

Related Objects

See https://catalog.caida.org/paper/1996_cooperation/ to explore related objects to this document in the CAIDA Resource Catalog.
