ANR has devoted much attention over the last two years to investigating the usefulness, relevance, and practicality of a wide variety of operationally collected statistics for wide-area backbone networks. In particular, we have undertaken several studies of the extent to which the statistics that the NSFNET project has collected over the life of the NSFNET backbone are useful for a variety of workload characterization efforts. We have also undertaken several studies that collect more comprehensive Internet traffic flow statistics, and we have developed a methodology for describing those flows in terms of their impact on the aggregate Internet workload. This methodology for profiling Internet traffic flows draws on previous flow models [1,2,3,4,5,6,7,8,9,10,11].
The model of flows we are using depends on a flow specification (e.g., host pairs) which we then apply to packets traversing a specific network location. We create flows as packets between two entities appear, and time them out after periods of inactivity.
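As an illustration of this flow model, the following minimal Python sketch groups a time-ordered packet trace into flows and closes a flow once it has been idle for longer than a timeout. It is a sketch under stated assumptions, not the measurement code used in these studies: the `Packet` record, the directed `(src, dst)` host-pair key, the `profile_flows` function, and the 64-second default timeout are all hypothetical choices made for the example.

```python
from collections import namedtuple

# Hypothetical packet record: timestamp (seconds), source host,
# destination host, and packet length in bytes.
Packet = namedtuple("Packet", "ts src dst length")


def host_pair(pkt):
    """Example flow specification: a directed host pair (an assumption)."""
    return (pkt.src, pkt.dst)


def profile_flows(packets, flow_key=host_pair, timeout=64.0):
    """Group time-ordered packets into flows.

    A flow is created when a packet matching a new key appears, and it is
    closed when the gap since its last packet exceeds `timeout` seconds.
    The timeout value is an assumed parameter, not one taken from the text.
    Returns a list of (key, first_ts, last_ts, packets, bytes) tuples.
    """
    active = {}    # key -> [first_ts, last_ts, packet_count, byte_count]
    finished = []
    for pkt in packets:
        key = flow_key(pkt)
        state = active.get(key)
        # Expire the existing flow if it has been idle for too long.
        if state is not None and pkt.ts - state[1] > timeout:
            finished.append((key, *state))
            state = None
        # Start a new flow for an unseen (or just expired) key.
        if state is None:
            state = [pkt.ts, pkt.ts, 0, 0]
            active[key] = state
        state[1] = pkt.ts
        state[2] += 1
        state[3] += pkt.length
    # Flush flows that are still open at the end of the trace.
    for key, state in active.items():
        finished.append((key, *state))
    return finished
```

Because expiry is checked only when the next packet for the same key arrives, this version suits offline trace analysis; an online monitor would also need a periodic sweep to close idle flows.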
Our methodology for modeling flows differs from that of many previous studies, which have concentrated on end-point definitions of flows, mainly focusing on TCP flows delimited by SYN and FIN packets.
Instead, we focus on the IP layer and define flows based on traffic satisfying various temporal and spatial locality conditions, as observed at internal points of the network. This approach to the definition and characterization of network flows has significant applications to several central problems for networking based on the Internet model. Among them are optimization of caches for routing, feasibility studies and optimization of routing based on quality-of-service considerations, usage-based accounting, and optimization of the transport of IP traffic over an ATM fabric. In this section we concentrate on metrics characterizing individual flows, including volume in packets and bytes per flow and flow duration, at various granularities of the definition of a flow, such as by destination network, host pair, or host and port quadruple.
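To make the notion of flow granularity concrete, the sketch below computes the per-flow metrics named above (packets, bytes, and duration) for the same trace under three different flow definitions. It is illustrative only: the `Packet` fields, the function names, and the fixed /24 prefix used to approximate a "destination network" are assumptions of this example, and inactivity timeouts are ignored for brevity.

```python
import ipaddress
from collections import namedtuple

# Hypothetical packet record with IP-layer addresses and transport ports.
Packet = namedtuple("Packet", "ts src dst sport dport length")


def by_dest_network(pkt, prefix_len=24):
    """Coarsest key: destination network (assumed here to be a /24)."""
    net = ipaddress.ip_network(f"{pkt.dst}/{prefix_len}", strict=False)
    return str(net)


def by_host_pair(pkt):
    """Intermediate key: source and destination host."""
    return (pkt.src, pkt.dst)


def by_quadruple(pkt):
    """Finest key: host and port quadruple."""
    return (pkt.src, pkt.sport, pkt.dst, pkt.dport)


def flow_metrics(packets, flow_key):
    """Return {key: (packets, bytes, duration_seconds)} for one granularity."""
    stats = {}
    for pkt in packets:
        key = flow_key(pkt)
        first, last, count, nbytes = stats.get(key, (pkt.ts, pkt.ts, 0, 0))
        stats[key] = (first, max(last, pkt.ts), count + 1, nbytes + pkt.length)
    return {
        key: (count, nbytes, last - first)
        for key, (first, last, count, nbytes) in stats.items()
    }
```

Calling `flow_metrics(trace, by_dest_network)` and then `flow_metrics(trace, by_quadruple)` on the same trace shows how a coarser definition merges traffic into fewer, larger, longer-lived flows, while a finer one yields more numerous and shorter flows.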
Our measurements have implications for: performance requirements of routers at Internet ``hotspots''; general and specialized flow-based routing algorithms; future usage-based accounting requirements; and traffic prioritization. We have presented the results of our work in several forums, including presentations to industry and NSF as well as journal publications.