Cooperation in Internet Data Acquisition and Analysis
Presented at Coordination and Administration of the Internet
Cambridge, MA - September 8-10, 1996
by Tracie Monk (DynCorp)* and k claffy (nlanr)*
http://www.tomco.net/~tmonk/cooperation.html (Expired Link)
* The opinions in this paper are those of the authors and do not necessarily reflect the views of
any organizations with which the authors are affiliated.
Introduction
The Internet is emerging from a sheltered adolescence, growing at exponential proportions, full of potential and promise, but still relatively ignorant of the real world. It now faces a crossroads. Citizens, corporations and governments are waking to the opportunities presented by a truly connected global economy, and reexamining fundamental principles of intellectual property law and communications in light of the realities of cyberspace. Organizational behavior and boundaries, business practices and financial systems are also adapting to the new medium. Society is at the forefront of the information revolution.
The number of North American Internet service providers pioneering this revolution now stands at over 3,000, approximately a dozen of which qualify as national backbone providers. Internationally, the number of Internet hosts has almost doubled over the year ending July 1996, reaching 12,881,000. Domains quadrupled over this period to 488,000.1/ Competition is fierce among the builders and operators of this nascent infrastructure, driven by demands for additional capacity and new customers. However, neither the industry nor the research community that developed and nurtured the early Internet is devoting significant attention to assessing current robustness or future capacity needs.
This paper has three goals. We first provide background on the current Internet architecture and describe why measurements are a key element in the development of a robust and financially successful commercial Internet. We then discuss the current state of Internet metrics analysis and steps underway within the Internet Engineering Task Force (IETF) as well as other arenas to encourage the development and deployment of Internet performance monitoring and workload characterization tools. Finally, we offer a model for a cooperative association for Internet data analysis among Internet competitors.
The Current Internet:
Requirements for Cooperation
No centralized authority oversees the development of the current commercial architecture. Providers, including traditional telcos, RBOCs, cable companies, and utilities, view one another as competitors and are therefore reluctant to coordinate their efforts. Yakov Rekhter, Internet researcher at Cisco Systems, notes that:

Despite all the diversity among the providers, the Internet-wide IP connectivity is realized via Internet-wide distributed routing, which involves multiple providers, and thus implies a certain degree of cooperation and coordination. Therefore, we need to balance the provider goals and objectives against the public interest of Internet-wide connectivity and subscriber choices. Further work is needed to understand how to reach the balance.
--Yakov Rekhter, Routing in a Multi-Provider Internet

Most large providers currently collect basic statistics on the performance of their own infrastructure, typically including measurements of utilization, availability, and possibly rudimentary assessments of delay and throughput. In the era of the post-NSFnet backbone service, the only baseline against which these networks can evaluate performance is their own past performance metrics. No data or even standard formats are available against which to compare performance with other networks or against some baseline. Increasingly, both users and providers need information on end-to-end performance, which is beyond the realm of what is controllable by individual networks.
The Transition
From 1986 to 1995 the National Science Foundation's NSFnet backbone served as the core of the Internet. Its decommissioning in April 1995 heralded a new era, with commercial providers assuming responsibility for extending Internet services to millions of existing and new users. At the same time, this change left the Internet community with no dependable public source of statistics on Internet traffic flows. 2/
The post-April 1995 architecture involved four new NSF-sponsored projects:
- general purpose Network Access Points (NAPs) to which commercial backbone networks would connect, averting network partitioning as the NSFnet went away 3/
- a routing arbiter, charged with providing routing coordination in the new architecture and promoting stability of Internet routing in a significantly fluctuating environment -- including: maintaining routing policy databases and servers at the four priority NAPs (and later at the FIX West/MAE-West facility); developing advanced routing technologies, strategies, and management tools; and working with NSP/ISPs and NAP providers to resolve routing problems at the NAPs
- financial support for regional network interconnectivity, with declining NSF funding to support the transition and commercialization of providers serving the U.S. higher education community
- continued leading-edge network research, development, and services through a cooperative agreement between NSF and MCI for the very high speed Backbone Network Service (vBNS)
This announcement comes at a time when the number of metropolitan and regional peering points is increasing, as is the number of networks that peer or want to peer. NSPs have also begun to favor direct peering arrangements with one another outside of the NAP architecture as a more economic and technically efficient means of exchanging traffic between large networks. The higher education and research sectors are also shifting their attention toward a second generation architecture supported by high performance connections, and expanded use of the NSF/MCI vBNS and other federally sponsored research and education networks.
The transition to the new commercial environment, with its privately operated services and cross service provider NAP switching points, has significantly complicated statistics collection and left the Internet community without a dependable public source of statistics on Internet workloads. This most recent step in the transition, however, removes most of the government's remaining influence and increases the community's dependence on commercial providers to cooperatively manage this still fragile infrastructure. Empirical investigations of the nature of current workloads and their resource requirements, as well as how they change over time, remain a vital element in supporting the continued evolution of the Internet.
Importance of Workload Profiles for
Internet Pricing and Service Qualities
The Internet still strongly needs realistic pricing models and other mechanisms to allocate and prioritize its scarce resources, particularly bandwidth. Existing pricing practices center around leased line tariffs. As providers begin to employ multiple business models (various usage-based and fixed-price schemes), they face questions of which aspects of their service to charge for and how to measure them. For example, one can measure bandwidth and traffic volumes in several ways, including average port utilization, router throughput, SNMP counters, and quotas of certain levels of priority traffic.4/
For example, Australia's Telstra imposes tariffs only on incoming traffic in order to encourage `Australia's content provision to a global market'.5/ Such usage-based tariffs are less common in the U.S. market, where many believe that they would stifle the utility of a growing, thriving Internet. However, this view receives increasing scrutiny, and usage-based pricing is already the norm for ISDN and other phone-based Internet services. Even some backbone ISPs now offer usage-based charging (albeit at a rough granularity), which can often decrease a customer's bill if the customer typically underutilizes its prescribed bandwidth. Within a year, pricing models in the U.S. will likely evolve into more refined and coherent forms. These models may include sample-based billing and measurement of traffic at the network provider boundaries, where individual providers apply alternative accounting techniques, e.g. measuring the source vs. destination of flows. 6/
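To make the sample-based approach concrete, the sketch below (Python, purely illustrative; the polling interval, the rates, and the incoming-only tariff are assumptions loosely echoing the Telstra example, not any provider's actual scheme) computes a usage-based charge from periodically sampled interface octet counters.

```python
# Hypothetical sample-based billing sketch. The rates, sampling interval, and
# counter source are illustrative assumptions, not any provider's real scheme.

SAMPLE_INTERVAL_SECS = 300          # e.g., poll SNMP octet counters every 5 minutes
RATE_PER_MBYTE_IN = 0.02            # charge only incoming traffic (cf. the Telstra example)
RATE_PER_MBYTE_OUT = 0.0

def bill_from_samples(octet_samples):
    """octet_samples: list of (in_octets, out_octets) cumulative counter readings."""
    bytes_in = bytes_out = 0
    for (prev_in, prev_out), (cur_in, cur_out) in zip(octet_samples, octet_samples[1:]):
        # counters are cumulative; successive differences give per-interval volume
        bytes_in += max(cur_in - prev_in, 0)     # guard against counter wrap or reset
        bytes_out += max(cur_out - prev_out, 0)
    mb_in = bytes_in / 1e6
    mb_out = bytes_out / 1e6
    return mb_in * RATE_PER_MBYTE_IN + mb_out * RATE_PER_MBYTE_OUT

# usage (hypothetical readings):
# print(bill_from_samples([(0, 0), (52_000_000, 9_000_000), (130_000_000, 21_000_000)]))
```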
U.S. providers are among the first to acknowledge the need for mechanisms to support more rational cost recovery, e.g., accurate accountability for resources consumed. The absence of these economic measures is troublesome given the ill preparedness of the U.S. Internet architecture and providers to deal with a large aggregation of flows, particularly if a significant number of those flows are several orders of magnitude higher volume than the rest, e.g., videoconferencing.
This disparity in size between most current Internet flows/transactions and newer multimedia applications with much higher volume and duration necessitates revised metrics of network behavior. Analysis of traffic flows at the federally-sponsored FIX West facility, for example, has demonstrated averages of 15 packets per flow. As illustrated in the table and graphics below, typical cuseeme and mbone flows are orders of magnitude larger.
application | flows | packets | bytes | seconds | type
---|---|---|---|---|---
web | 96482 | 1443763 | 821696977 | 1091907 | absolute
- | 0.302 | 0.330 | 0.595 | 0.193 | fraction of total
- | 569 | 14 | 8516 | 11 | avg pkt size, pkts/flow, bytes/flow, duration/flow
ftp-data | 850 | 124586 | 73647232 | 28717 | absolute
- | 0.003 | 0.028 | 0.053 | 0.005 | fraction of total
- | 591 | 146 | 86643 | 33 | avg pkt size, pkts/flow, bytes/flow, duration/flow
mbone [tunnel traffic] | 35 | 202636 | 51292766 | 9226 | absolute
- | 0.000 | 0.046 | 0.037 | 0.002 | fraction of total
- | 253 | 5789 | 1465507 | 263 | avg pkt size, pkts/flow, bytes/flow, duration/flow
cuseeme | 15 | 16288 | 6385996 | 3812 | absolute
- | 0.000 | 0.004 | 0.005 | 0.001 | fraction of total
- | 392 | 1085 | 425733 | 254 | avg pkt size, pkts/flow, bytes/flow, duration/flow

Source: NLANR, Aug. 30, 1996.
It is also important to note that average or mean flow statistics may be misleading, since some flows are orders of magnitude larger than the mean (a `heavy-tailed distribution'). Given this caveat, we note that the smaller mean volume of the cuseeme flows relative to that of the mbone flows is consistent with the characteristic usage of the applications. Cuseeme end users typically connect to each other for brief point-to-point conversations (often trying several times to get it working), resulting in many flows that are short by multimedia standards. Mbone flows, in contrast, tend to represent meetings, workshops, conferences, and concerts that last for hours if not days at a time. In addition, current tools only measure mbone tunnel traffic, so multiple mbone sessions appear as a single flow.
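As a small illustration of that caveat, the sketch below uses hypothetical packets-per-flow numbers, loosely echoing the table above rather than reproducing NLANR data, to show how a handful of large flows pulls the mean well away from the typical flow.

```python
# Minimal sketch (illustrative numbers, not NLANR's data or method): why mean
# flow statistics can mislead when a few flows dwarf the rest.
from statistics import mean, median

# many short web-like flows plus a few long bulk and multimedia flows
flows = [14] * 9600 + [146] * 85 + [1085] * 15 + [5789] * 4

print("mean pkts/flow:  ", round(mean(flows), 1))
print("median pkts/flow:", median(flows))
# the handful of large flows pulls the mean well above the "typical" flow, so
# engineering to the mean misrepresents both the many small transactions and
# the few large ones
```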
Simple mean or peak utilization figures are therefore ineffective in addressing ISP engineering needs without also knowing the transaction profile constituting, and perhaps dominating, those figures. Tracking workload profiles requires measuring flow data at relevant network locations. Currently, the only significant public source of multipoint workload characterization comes from the FIX-West facility and several NSF supercomputing sites.
Below we provide graphical depictions of three types of traffic across FIX-West. Note that the average (mean) of general Internet traffic fluctuates around 50-80 packets per flow. Cuseeme and mbone traffic, on the other hand, show significant unpredictability and variability in their averages, ranging around 500 packets per flow and 10,000 packets per flow, respectively. (Readers can take their own samplings using tools at http://www.nlanr.net/NA.)
General traffic flows at FIX-West
Cuseeme traffic flows at FIX-West
Mbone traffic flows at FIX-West
Rational pricing of multiple service qualities will also provide clear feedback to providers and users on the value of Internet resources. As equipment vendors develop, and ISPs deploy, technologies that support quality signals, services should evolve to permit users to designate a service quality for which they are willing to pay, with higher demand services such as videoconferencing priced according to their value to the user.
Threat of Government Intervention
With the simultaneous diversification and usage explosion of the infrastructure, Internet service providers have not been in a position to provide accurate statistics or traffic models. Given the narrow profit margins and dearth of qualified personnel in the industry, providers are reluctant to dedicate manpower or other resources to statistics collection and analysis, allocating them instead to the monumental tasks associated with accommodating service demand. Eventually, larger telecommunication companies will devote attention to this area. Unfortunately, there may not be sufficient pressure upon the NSPs until problems such as congestion and outages worsen, and either billed customers demand (and are willing to pay for) better guarantees and data integrity, or the government intercedes to dictate acceptable policies and practices for the Internet.
The telecommunications industry is a classic example of the government inserting itself and dictating cooperation among competing companies. In 1992 the Federal Communications Commission (FCC) mandated the formation of the Network Reliability Council (now the Network Reliability and Interoperability Council)7/ following a major communications outage on the East Coast. Since the Internet has only recently received attention as a critical element of the national infrastructure, it has escaped such intense regulatory scrutiny. As the emerging backbone of the national and global information infrastructures (NII/GII) however, this relative obscurity may be a passing luxury.
In Executive Order 13010 dated July 15, 1996, President Clinton established a Commission on Critical Infrastructure Protection, to develop `a strategy for protecting and assuring the continued operation of this nation's critical infrastructures' including telecommunications, electrical power systems, gas and oil transportation, banking and finance, transportation, water supply systems, emergency services, and continuity of government. Its mission objectives refer directly to the importance of cyberthreats; its output (due July 1997) includes proposals for `statutory or regulatory changes necessary to effect its recommendations'.
Most recently, attention has focused on RBOC claims that the Internet has detrimental effects on their telephony infrastructure. While many Internet analysts contest these allegations, the FCC and the NRIC are reviewing a study sponsored by the baby Bell companies on the impacts of the Internet's growth during spring 1996. The FCC's interest in this subject is closely tied to its review of the America's Carriers Telecommunication Association (ACTA) petition, requesting that Internet telephony be banned, and to renewed discussions surrounding the so-called modem tax.
While the FCC, National Telecommunications Information Agency (NTIA), and Congress have thus far been relatively mute with respect to regulation of the Internet, it is idealistic to assume that they will remain so. Failure of industry participants to respond to requirements for more formal mechanisms of cooperation will slow the current pace of the Internet's evolution, particularly for electronic commerce and higher grade services. It will also increase the pressure upon governments, both U.S. and foreign, to intercede to protect these critical information resources.
Steps Toward
Improving Internet Measurements
The new commercial Internet is characterized by hundreds of ISPs, many on shoestring budgets in low margin competition, who generally view statistics collection as a luxury that has never proven its operational utility. Note the last publicly available source of Internet workload and performance data, for the NSFNET backbone, was basically a gift from the NSF: an investment of U.S. tax dollars made in the hope that tools, methodologies, theories of traffic, refinements, and feedback would emerge from the efforts in the IETF and other arenas. But there was never any fiscal pressure that the statistics collection activity justify the resources it required within the cost structure of providing Internet service. It was never forced to prove itself worthwhile. And (surprise...) it didn't.
(`but some data is worse than others': measurement of the global Internet, in Telegeography, k claffy, August 1996)

Implications of the Transition for
Data Acquisition/Analysis
The Internet architecture remains in a state of transition. The large number of commercial providers and the proliferation of cross service provider exchange points, render statistics collection a much more difficult task. In addition, the challenges inherent in Internet operation, particularly given its still `best effort' underlying protocol technology, fully consume the attention of ISPs. Given its absence from the list of their top priorities, data collection and analysis continue to languish.
Yet it is detailed traffic and performance measurement that has heretofore been essential to identifying the causes of network problems and ameliorating them. Trend analysis and accurate network/systems monitoring permit network managers to identify hot spots (overloaded paths), predict problems before they occur, and avoid them by efficient deployment of resources and optimization of network configurations. As the nation and world become increasingly dependent on the NII/GII, mechanisms to enable infrastructure-wide planning and analysis will be critical.
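As a toy example of the trend analysis mentioned above, the sketch below (Python; the weekly sampling, the 80% planning threshold, and the linear model are all simplifying assumptions) extrapolates a link's utilization history to estimate when it will become a hot spot.

```python
# Illustrative sketch, not an operational tool: fit a linear trend to weekly
# mean utilization of a link and estimate when it crosses a planning threshold.
def weeks_until_threshold(utilization_history, threshold=0.8):
    """utilization_history: weekly mean utilization values in [0, 1], oldest first."""
    n = len(utilization_history)
    if n < 2:
        return None                      # need at least two samples for a trend
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(utilization_history) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, utilization_history)) \
            / sum((x - x_mean) ** 2 for x in xs)
    intercept = y_mean - slope * x_mean
    if slope <= 0:
        return None                      # no growth trend; no predicted saturation
    crossing = (threshold - intercept) / slope
    return max(crossing - (n - 1), 0)    # weeks from "now" until the threshold

# usage: weeks_until_threshold([0.35, 0.41, 0.44, 0.52, 0.57]) -> roughly 4 weeks
```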
Efforts Stimulating Action by Providers
Since the advent of the world wide web, the proliferation of users and the lack of cooperation among commercial ISPs have resulted in significant degradation of service quality for some of the higher-end user communities. The higher education and research communities were among the first groups to depend on the Internet, and also among the most vocal in publicly acknowledging its recent inability to meet their growing expectations. Since the Educom-sponsored `Monterey Conference' in October 1995, representatives of the higher ed community have met repeatedly, in forums sponsored by the Federation of American Research Networks (FARNET), Educom's National Telecommunications Task Force (NTTF) and others, to assess their internetworking needs and develop plans to address them.
At the most recent meeting in Colorado Springs, Colorado, a post-Monterey white paper concluded that `the commodity Internet may not evolve rapidly enough to meet higher education's imminent and foreseeable high performance enterprise networking and internetworking capacity and service needs.' Participants at the meeting set goals for a second generation Internet to support the higher education and research communities, aimed at interconnecting enterprise networks in various stages of migration to higher performance technologies, and at controlling cost and pricing/allocation models from within the enterprise. General requirements for this Internet II include: 8/
- improved information security
- authorization and authentication capabilities
- network management capabilities including performance audits
- latency and jitter specifications
- bandwidth interrogation and reservation capabilities
- packet delivery guarantees
Other user groups that view the Internet as mission critical are moving independently to address their industries' service requirements. Most notable is the Automotive Industry Action Group (AIAG), which will announce in September its selection of an `overseer' to support the major automobile manufacturers and their thousands of suppliers by:
- certifying a small number of highly competent Internet service providers to interconnect automotive trading partners' private networks;
- monitoring providers' ongoing compliance with performance standards;
- enforcing strict security mechanisms to authenticate users and protect data, thereby creating a virtual private network for the auto industry.
Within the federal community, NSF has been the most proactive in supporting Internet measurement research and traffic analyses.10/ However, other federal agencies grow increasingly active in this critical arena. The Department of Energy (DOE), for example, has established a measurement working group and has tasked teams of high energy physics community researchers with monitoring global Internet traffic patterns and performance metrics for ESnet. 11/ The Department of Defense (through DARPA and the HPCMO) hosted an ATM performance measurement workshop in June and is currently developing and deploying measurement tools such as NetSpec across its ATM networks.12/ The Federal Networking Council (FNC) is also forming a statistics/metrics cross-agency working group and has expressed support for the creation of a North American CCIRN statistics/metrics working group.13/
Technical Challenges
The sections below describe some of the current and future technical challenges associated with Internet metrics and collection of statistics.
Lack of Common Definitions: Several modest efforts are currently underway to collect statistics across specific networks or at select peering points; see http://oceana.nlanr.net/INFO/Nsfnet/ubiquity.html (Expired Link). Unfortunately, these efforts lack a common framework and common definitions of Internet metrics, limiting the comparability of results. The IETF's Internet Provider Performance Metrics (IPPM) working group is addressing this situation via development of an IP performance metrics framework. In advance of the December IETF meeting, IPPM members are also developing draft RFCs defining metrics for: round-trip and one-way delay; flow capacity; packet loss; modulus of elasticity; connectivity/availability; and route persistence and route prevalence. Such steps toward a more strongly specified common framework will facilitate future metrics discussions.
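As a rough illustration of the kind of measurement such drafts formalize, the sketch below is an assumption-laden stand-in, not the IPPM methodology: it estimates round-trip delay and loss by timing TCP connection setup to a target host and port rather than by sending ICMP echoes.

```python
# Crude round-trip delay and loss probe (a sketch only). It times TCP connection
# setup, so it measures connect latency to host:port, not ICMP echo round trips.
import socket
import time

def probe(host, port=80, count=10, timeout=2.0):
    rtts, lost = [], 0
    for _ in range(count):
        start = time.time()
        try:
            sock = socket.create_connection((host, port), timeout=timeout)
            rtts.append((time.time() - start) * 1000.0)   # milliseconds
            sock.close()
        except OSError:
            lost += 1                                      # timeout or refusal counts as loss
        time.sleep(0.5)                                    # pace the probes
    return rtts, lost / count

# usage (hypothetical target): rtts, loss_fraction = probe("www.nlanr.net")
```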
Lack of Consensus on Traffic Modeling: There is as yet no consensus on how statistics can support research in IP traffic modeling. There is also skepticism within the Internet community regarding the utility of empirical studies: critics claim that because the environment changes so quickly, within weeks any collected data is only of historical interest. They argue that research is better served by working on mathematical models rather than by empirical surveys that capture, at most, only one stage in network traffic evolution.
However, prediction of performance metrics, e.g., queue lengths or network delays, using traditional closed-form mathematical modeling techniques such as queueing theory has met with little success in today's Internet environment. The assumption of Poisson arrivals was acceptable years ago for the purposes of characterizing small local area networks (LANs). As a theory of wide area internetworking behavior, however, Poisson arrivals -- in terms of packet arrivals within a connection, connection arrivals within an aggregated stream of traffic, and packet arrivals across multiple connections -- have demonstrated significant inconsistency with collected data.14/
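As a simple empirical check of the kind that exposes this inconsistency, the sketch below computes the coefficient of variation of packet interarrival times from a hypothetical trace; exponential (i.e., Poisson) arrivals give a value near one, while measured wide-area traffic is typically far burstier.

```python
# Sketch of a basic sanity check on the Poisson-arrival assumption: for
# exponential interarrival times the coefficient of variation (std/mean) is 1,
# while bursty traces (long silences punctuated by packet trains) give values
# well above 1. Timestamps here are hypothetical packet arrival times in seconds.
from statistics import mean, pstdev

def interarrival_cv(timestamps):
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return pstdev(gaps) / mean(gaps)

# usage: interarrival_cv([0.00, 0.01, 0.02, 0.03, 5.00, 5.01, 5.02, 9.90])
```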
A further contributing factor to the lag of Internet traffic modeling is the early financial structure of the Internet. A few U.S. government agencies assumed the financial burden of building and maintaining the early transit network infrastructure, leaving little need to trace network usage for the purposes of cost recovery. As a result, since the transition Internet customers have had little leverage with their service providers regarding service quality.
Lack of Adequate Tools: Many applications in today's Internet architecture inherently depend on the availability of the infrastructure on a fairly pervasive scale. Yet wide area networking (WAN) technologies and applications have advanced much faster than has the analytical and theoretical understanding of Internet traffic behavior. The number of devices connected to WANs is increasing at 30-50% per year, network traffic is doubling every 10-18 months by some estimates, and vendors such as Netscape aim to release new products every six months. Overall, Internet-related companies continue to pour money into hardware, pipes, and multimedia-capable tools, with little attention to the technical limitations of the underlying infrastructure or to tools for monitoring this increasingly complex system.
Even the basic application level protocols of the TCP/IP suite, e.g., ftp and telnet, are becoming less reliable in the face of network congestion and related infrastructure problems. Yet, network managers today, both ISPs and end users, have few tools available to effectively monitor networks (end-to-end) so as to avoid potential problems. Performance of such applications depends on many inter-related factors, including: packet loss, network end-to-end response time, number and quality of intermediate hops and the route used, link bandwidth and utilization, end node capability. There is no suite of tools available for remotely monitoring, assessing or directly intervening to affect these conditions, particularly if the problem arises beyond an individual network's border. 15/
As traditional phone companies enter the Internet marketplace, armed with years of experience with analytic tools for modeling telephony workloads and performance, it is tempting to assume imminent remedy of the shortage of measurement tools. Unfortunately, the models of telephony traffic developed by Bell Labs and others do not transfer readily to the Internet industry. Internet traffic is not only fundamentally inconsistent with traditional queueing theory, since it is carried over best-effort rather than deterministic service protocols, but it also rides on an infrastructural mesh of competing service providers with slim profit margins. As a result, telephony tables of acceptable blocking probability (e.g., the inability to get a dial tone when you pick up the phone) suggest standards far in excess of what is achievable in today's Internet marketplace.
Emerging Technologies: Currently available tools for monitoring IP traffic (layer 3) focus on metrics such as response time and packet loss (ping), reachability (traceroute), throughput (e.g., ftp transfer rates), and some workload profiling tools (e.g., oc3mon and Cisco's flow stats). These metrics are significantly more complex when applied to IP "clouds". For example, to date there is no standard methodology for aggregating metrics such as loss measurement over various paths through a cloud and communicating it as a single metric for the provider.
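One plausible, though by no means standard, way to aggregate such measurements is sketched below: weight each measured path through the cloud by the traffic it carries and report the volume-weighted loss as the provider-level figure.

```python
# Sketch of one possible aggregation (an assumption, not a standard): roll
# per-path loss measurements through a provider "cloud" into a single figure by
# weighting each measured path by the traffic volume it carries.
def weighted_cloud_loss(path_measurements):
    """path_measurements: list of (loss_fraction, packets_carried) per measured path."""
    total = sum(pkts for _, pkts in path_measurements)
    if total == 0:
        return 0.0
    return sum(loss * pkts for loss, pkts in path_measurements) / total

# usage: weighted_cloud_loss([(0.001, 2_000_000), (0.030, 150_000), (0.004, 800_000)])
```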
With the transition to ATM and high speed switches, IP layer analysis may no longer be technically feasible. Most commercial ATM equipment, for example, is not capable of accessing IP headers. In addition, the primary network access and exchange points in the U.S. are chartered as layer 2 entities, providing services at the physical or link layer without regard for the higher layers. Because most of the NSFnet statistics reflected information at and above layer 3, the exchange points cannot use the NSFnet statistics collection architecture as a model upon which to base their own operational collection. Many newer layer 2 switches, e.g., DEC gigaswitch and ATM switches, have little if any capability for performing layer 3 statistics collection, or even for looking at traffic in the manner allowed on a broadcast medium (e.g., FDDI and Ethernet), where a dedicated machine can collect statistics without interfering with packet forwarding. Statistics collection functionality in newer switches takes resources directly away from forwarding, driving customers toward switches from competing vendors who sacrifice such functionality in exchange for speed.
The table which follows identifies key uses for Internet statistics and metrics -- both within individual networks and infrastructure-wide. The characteristics of requisite tools, their likely deployment options, and the status/problems associated with their use are also identified.
[Table: illustrative requirements, uses, and current status. Columns: Requirements; Description; Tool(s) Characteristics; Deployment; Status/Problems. Rows: Internet-wide Planning (end-to-end); Network Planning (ISP-specific); Internet-wide Management (end-to-end); Testing New Applications/Protocols; Benchmarking Routers/Switches; Performance Monitoring (reliability, availability & serviceability); Performance Monitoring (QoS/Settlements).]
In viewing this table and similar Internet measurement materials, it is helpful to distinguish tools designed for Internet performance and reliability measurement from those meant for traffic flow characterization. While there is a dearth of tools in both areas, equipment and software designed to measure performance and network reliability is generally more readily available and easier for users and providers alike to deploy.
Many of these tools treat the Internet
as a black box, measuring end-to-end features such as packet loss, latency, and jitter from
points originating and terminating outside individual networks. These performance and reliability
tools are fundamental to evaluating and comparing alternative providers and to monitoring service
qualities.
Traffic flow characterization tools, on the other hand, can yield a wealth of data on the internal
dynamics of individual networks and cross-provider traffic flows. They enable network architects
to better engineer and operate their networks and to better understand global traffic trends and
behavior -- particularly as new technologies and protocols are introduced into the global Internet
infrastructure. Deployment of this type of tool must be within the networks -- particularly at
border routers and at peering points. Traffic flow characterization tools therefore require a much
higher degree of cooperation and involvement by service providers than do performance oriented
tools.
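To make the deployment requirement concrete, the sketch below is a minimal, hypothetical illustration of the bookkeeping a flow characterization tool performs at a border router or peering point; the record fields and the 64-second idle timeout are assumptions, not a description of any particular tool.

```python
# Minimal flow-table sketch: aggregate packets into flows keyed by the 5-tuple,
# expiring a flow after an idle timeout. Field names and the timeout value are
# illustrative assumptions.
FLOW_TIMEOUT = 64.0   # seconds of idle time before a flow is considered finished

active, finished = {}, []

def observe_packet(ts, src, dst, proto, sport, dport, length):
    key = (src, dst, proto, sport, dport)
    # expire idle flows first
    for k, f in list(active.items()):
        if ts - f["last"] > FLOW_TIMEOUT:
            finished.append(active.pop(k))
    f = active.setdefault(key, {"key": key, "first": ts, "last": ts,
                                "packets": 0, "bytes": 0})
    f["last"] = ts
    f["packets"] += 1
    f["bytes"] += length

# after a trace is processed, finished + list(active.values()) give per-flow
# packet, byte, and duration counts like those in the table above
```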
Both types of measurement tools are critical to enhancing the overall quality and robustness of the
Internet. They also contribute to our ability to ensure the continued evolution of the Internet
infrastructure, both in strengthening its ability to meet the needs of its diverse user communities
and maintaining its flexibility to accommodate opportunities presented by future technologies and
applications.
Cooperative Association
for Internet Data Analysis

The authors believe that the best means for addressing the cooperation
requirements outlined in this paper is through formation of a provider
consortium. Market pressures upon ISPs to participate in such a forum
include the increasing dependence of users (customers of providers) on the
Internet for mission critical applications, resulting in demands
for higher qualities of service and evidence of performance compliance
by providers. Economic models of the Internet are also evolving and will
soon include settlements based on authenticated, likely confidential
provider statistics. Lastly, the meshed nature of the global Internet
dictates that no single company can do it alone.
Systemic improvements to the Internet infrastructure and to the
operational practices of its providers will necessitate collaboration
and cooperation among the competitive telecommunications firms.
An industry-driven consortium could spearhead efforts to: develop
cross-network outage and trouble ticket tracking; monitor congestion
and relevant traffic patterns, including routing; promote studies of
peering relationships and testbeds for examining emerging Internet
technologies and protocols such as IPv6, dynamic caching, bandwidth
reservation protocols and QoS routing; and provide a forum for discussion
and eventual implementation of charging policies. 16/
From the standpoint of statistics collection and analysis, such a forum
could:
A consortium organization could also make available a solid, consistent library of tools that would appeal to both users and providers. Data collection by the consortium should strictly focus on engineering and evolution of the overall Internet environment, e.g., accurate data on traffic patterns that could enhance engineers' ability to design efficient architectures, conserving manpower and other resources currently devoted to this task. The right statistics collection and cross-ISP dissemination mechanisms would also facilitate faster problem resolution, saving the time and money now devoted to tracking problems, e.g., route leakage, link saturation and route flapping. Finally, experience with data will foster the development of more effective usage-based economic models, which, in turn, will allow ISPs to upgrade their infrastructure in accordance with evolving customer demands.

Business Constraints to Cooperation

The business constraints hindering such cooperation relate to the competitive nature of the Internet business environment, as well as the appearance of industry collusion by major providers. However, a charter with principles of openness and inclusion can readily address these concerns, as well as constraints arising from the lack of adequate pricing models and other mechanisms for economic rationality in Internet business practices.
Probably the most relevant constraint to cooperation
is that of data privacy, which has always been a serious
issue in network traffic analysis. Many ISPs have service
agreements prohibiting them from revealing information about
individual customer traffic. Collecting and using more
than aggregate traffic counts often requires customer cooperation
regarding what to collect and how to use it.
However, provisions of the Omnibus Crime Control and Safe
Streets Act of 1968, Section 2511.(2)(a)(i)
accord communications providers considerable
protection from litigation:

It shall not be unlawful under this chapter for an operator of a switchboard, or an officer, employee, or agent of a provider of wire or electronic communication service, whose facilities are used in the transmission of a wire communication, to intercept, disclose, or use that communication in the normal course of his employment while engaged in any activity which is a necessary incident to the rendition of his service or to the protection of the rights or property of the provider of that service, except that a provider of wire communication service to the public shall not utilize service observing or random monitoring except for mechanical or service quality control checks.
Responsible providers could go further than the law
and anonymize monitored traffic with tools such as
tcpdpriv, virtually eliminating any accusations
of breach of privacy.17/
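As an illustration only (this is not tcpdpriv's actual algorithm), the sketch below shows one way a provider could anonymize addresses in a trace before sharing it: replace each IP address with a keyed one-way hash, so flows remain distinguishable without being identifiable.

```python
# Sketch of traffic anonymization in the spirit of, but not the algorithm used
# by, tcpdpriv: map each IP address to a keyed one-way hash before a trace
# leaves the provider, so flows stay distinguishable but not identifiable.
import hashlib
import hmac

SECRET_KEY = b"local-secret-never-shared"   # illustrative; keep per-site and private

def anonymize_address(ip_string):
    digest = hmac.new(SECRET_KEY, ip_string.encode(), hashlib.sha256).digest()
    # map the hash back into dotted-quad form purely for readability
    return ".".join(str(b) for b in digest[:4])

# usage: anonymize_address("192.0.2.17") -> a stable but unlinkable pseudo-address
```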
Technical Constraints to Cooperation
Technology constraints hindering the collection and analysis of data
on Internet metrics center on the nascent development stage of IP
and ATM measurement tools and supporting analysis technologies, and on
complications arising from adoption of new and emerging technologies,
e.g. gigaswitches and ATM. Generally, we view these and other technical
constraints as solvable given sufficient technical
attention and market pressure.
Next Steps
Despite the business and technical challenges, requirements for
cooperation among Internet providers will continue to grow, as will
demands for enhanced data collection, analysis, and dissemination.
Development of an effective provider consortium to address these needs
would require, minimally:
Developing the appropriate metrics and tools to measure traffic phenomena,
as well as end-to-end performance and workflow characteristics, remains
a daunting task. Other areas where resources are needed to improve the
Internet infrastructure include:
Hal Varian's web site at
http://www.sims.berkeley.edu/resources/infoecon provides a
useful introduction to Internet economics.
The NSF-supported Routing Arbiter project collects network
statistics at the Ameritech, MAE-East, MAE-West, PacBell,
and Sprint interconnection points. The Merit/ISI RA web page (http://www.ra.net) is
a launching point from which to view both graphical and text
representations of routing instabilities, NAP statistics,
trends, etc.
Recent theorems have shown that aggregating traffic sources with
heavy-tailed distributions leads directly to (asymptotic) self-similarity.
In an article for Statistical Science (1994), W. Willinger identified
three minimal parameters for a self-similar model:
Although self-similarity is a parsimonious concept, it comes in many
different colors, and we only now are beginning to understand what
causes it. Self-similarity implies that a given correlational structure
is retained over a wide range of time scales. It can derive from the
aggregation of many individual, albeit highly variable, on-off components.
The bad news about self-similarity is that it is a significantly different
paradigm that requires new tools for dealing with traffic measurement
and management. Load service curves (e.g., delay vs. utilization) of
classical queueing theory are inadequate; indeed for self-similar traffic
even metrics of means and variances indicate little unless accompanied by
details of the correlational structure of the traffic. In particular,
self-similarity typically predicts queue lengths much higher than do
classical Poisson models. Researchers have analyzed samples and
found fractal components of behavior in a wide variety of network traffic
(SS7, ISDN, Ethernet and FDDI LANs, backbone access points, and ATM).
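For readers who want to probe their own traces, the sketch below implements the standard aggregated-variance check described above; the per-interval packet counts are assumed to come from a reasonably long trace, and the simple regression is illustrative rather than a rigorous Hurst estimator.

```python
# Aggregated-variance sketch: for Poisson-like traffic the variance of
# m-aggregated counts falls off roughly as 1/m (log-log slope near -1), while
# long-range dependent traffic decays much more slowly. The slope maps to the
# Hurst parameter via H = 1 + slope/2.
import math
from statistics import pvariance

def aggregated_variances(counts, levels=(1, 2, 4, 8, 16, 32)):
    out = {}
    for m in levels:
        blocks = [sum(counts[i:i + m]) / m for i in range(0, len(counts) - m + 1, m)]
        if len(blocks) >= 2:              # need several blocks for a variance estimate
            out[m] = pvariance(blocks)
    return out

def hurst_estimate(counts):
    var = aggregated_variances(counts)
    xs = [math.log(m) for m in var]
    ys = [math.log(v) for v in var.values()]
    n = len(xs)
    slope = (n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)) / \
            (n * sum(x * x for x in xs) - sum(xs) ** 2)
    return 1 + slope / 2   # H near 0.5: short-range; H near 1: strongly self-similar
```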
Still unexplored is the underlying physics that could give rise to
self-similarity at different time scales. That is, at millisecond time
scales, link layer characteristics (i.e., transmission time on media)
would dominate the arrival process profile, while at the 1-10 second
time scales the effects of the transport layer would likely dominate.
Queueing characteristics might dominate a range of time scales in
between, but in any case the implication that several different
physical networking phenomena manifest themselves with self-similar
characteristics merits further investigations into these components.
While telcos have long measured traffic matrices for phone network
engineering, ISPs have faced technical, legal, and resource limitations
as obstacles to collecting -- not to mention sharing -- such measurements.
Collecting packet headers, though essential for researchers to develop
realistic models and analysis techniques, is even more technically and
logistically problematic.
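As a hypothetical illustration of what such a traffic matrix looks like in practice, the sketch below aggregates flow records into origin-destination byte counts; the record format and the prefix granularity are assumptions.

```python
# Sketch of building a simple origin-destination traffic matrix from flow
# records, the kind of measurement telcos routinely have but ISPs rarely
# collect. Record fields and prefix granularity are illustrative assumptions.
from collections import defaultdict

def traffic_matrix(flow_records, prefix_len=2):
    """flow_records: iterable of (src_ip, dst_ip, bytes). Aggregates by the first
    `prefix_len` dotted-quad octets (a crude /16-style grouping) for illustration."""
    matrix = defaultdict(int)
    for src, dst, nbytes in flow_records:
        src_net = ".".join(src.split(".")[:prefix_len])
        dst_net = ".".join(dst.split(".")[:prefix_len])
        matrix[(src_net, dst_net)] += nbytes
    return matrix

# usage: traffic_matrix([("192.0.2.5", "198.51.100.9", 12000),
#                        ("192.0.2.7", "198.51.100.1", 800)])
```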
Canadian Association of Internet Providers (CAIP) - http://www.caip.ca
Commercial Internet eXchange (CIX) - http://www.cix.org
CommerceNet - http://www.commerce.net
Cross Industry Working Team (XIWT) - http://www.xiwt.org
Internet Society (ISOC) - http://www.isoc.org
ISP Consortium - http://www.ispc.org
North American Network Operators Group (NANOG) - http://www.nanog.org/
Réseaux IP Européens (RIPE) - http://www.ripe.net/
World Internet Alliance (WIA) - http://www.wia.org/
For additional information on these topics, see:
A Survey of Internet Statistics / Metrics Activities, T. Monk and k claffy
`but some data is worse than others': measurement of the global Internet, k claffy
last updated September 24, 1996
please direct questions or comments to tmonk@ixiacom.com