Rick Wilder, MCI/vBNS

Position Statement for
NSF Workshop on Internet Statistics Measurement and Analysis

The Internet is rapidly growing in number of users, traffic levels, and topological complexity. At the same time it is increasingly driven by economic competition. All these developments render it more difficult, and yet more critical, to characterize network usage and workload trends. These factors point to the need for a high-performance monitoring system that can provide workload data to Internet users and service providers. To ensure the practicality of using the monitor at all needed locations, implementation on low-cost, standardized hardware is a necessity.

As part of NSF's vBNS project, MCI has undertaken the development of an OC-3 based monitor to meet these needs and is soliciting design input at this workshop. We will describe and demonstrate our current prototype. The goal of the project is to specifically narrow an increasing gap:

The specific design goals that led to the current prototype center on a flexible data collection and analysis implementation that can be modified as we codify and refine our understanding of the desired statistics. The project schedule calls for deploying the monitor in the vBNS in mid '96. As soon as it is found to be stable, we will make the software freely available to others for use elsewhere.


Description of the OC-3 monitor

OC3MON is an IBM personal computer clone with 64 MB of main memory, a 120 MHz Intel Pentium processor, an Ethernet card, and two ATM network interface cards made by Texas Instruments. The prototype is intended to be low cost, less than $5K per site, using generic PCI cards as ATM NICs (MCI chose the Texas Instruments SAR chips due to availability and low cost at the time; the cards generally run from $1K to $1.5K).

The receive port of each ATM card is connected to the monitor port of an optical splitter, such that 5% of the light goes to the ATM card. (The dual splitter cost about $800.) One card receives cells that the single-mode fiber brings to the site and the other receives cells leaving the site.

Software running on the PC directs each card to perform AAL5 reassembly on the first 1024 VCs in VPI 0, though it provides buffers only large enough to hold one cell. Because of this buffer size limitation, the SAR (segmentation and reassembly) engine in the card must discard all but the first cell of each AAL5 PDU (protocol data unit, ATM-speak for a frame). Fortunately, the 48-byte payload of that first cell is exactly enough to hold the LLC/SNAP header (8 bytes), IP header (20 bytes), and TCP header (20 bytes), provided there are no IP options. The ATM header is discarded.
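The byte budget described above can be checked with a short sketch. This is not the OC3MON code; it is a minimal Python illustration, assuming an IPv4 packet carried with LLC/SNAP encapsulation and a 48-byte cell payload already available as a byte string. The function name `parse_first_cell` and the returned field names are hypothetical.

```python
import socket
import struct

ATM_CELL_PAYLOAD = 48   # payload bytes in one ATM cell
LLC_SNAP_LEN = 8        # LLC (3) + SNAP OUI (3) + protocol type (2)
IP_HEADER_LEN = 20      # IPv4 header without options
TCP_HEADER_LEN = 20     # TCP header without options

# The three headers fill the first cell exactly.
assert LLC_SNAP_LEN + IP_HEADER_LEN + TCP_HEADER_LEN == ATM_CELL_PAYLOAD


def parse_first_cell(payload: bytes) -> dict:
    """Decode the IP header (and ports, if present) from a first-cell payload.

    Hypothetical helper for illustration only.
    """
    if len(payload) != ATM_CELL_PAYLOAD:
        raise ValueError("expected a 48-byte cell payload")

    ip = payload[LLC_SNAP_LEN:LLC_SNAP_LEN + IP_HEADER_LEN]
    (ver_ihl, _tos, total_len, _ident, _frag,
     _ttl, protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", ip)

    header_len = (ver_ihl & 0x0F) * 4   # IHL is counted in 32-bit words
    result = {
        "protocol": protocol,
        "total_length": total_len,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        "src_port": None,
        "dst_port": None,
    }

    # Ports are only read when there are no IP options, so that the
    # transport header starts right after the 20-byte IP header.
    if header_len == IP_HEADER_LEN and protocol in (6, 17):  # TCP or UDP
        offset = LLC_SNAP_LEN + IP_HEADER_LEN
        result["src_port"], result["dst_port"] = struct.unpack(
            "!HH", payload[offset:offset + 4])
    return result
```

With IP options present, the transport header is pushed toward or past the end of the cell, which is why the sketch leaves the ports unset in that case.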

The card places IP header data in a queue directly in host memory, obviating the need for host CPU intervention except when the card runs out of buffers. Since the card can be given up to 256 buffers at a time, and it takes some time to do so, the monitor has a hardware timer chip that interrupts the CPU in the time it takes to receive 128 IP frames, in order to service the cards in a timely fashion. We can shorten this interval to get finer timestamp granularity at the expense of more CPU overhead to get the cells into memory. IP headers are managed in blocks of 128, though the value can be tuned to an integral divisor of 128. The 8.3-nanosecond-granularity timestamp applies to the whole block of headers and is taken from a 64-bit counter that runs from power-on. The clock resolution will improve when we get faster CPUs for actual production; 133, 150, and 166 MHz Pentiums are already available for the same price we paid for the original 120 MHz. (Note: resolution = 1/clock rate.) The 128-header block size was chosen for minimal CPU impact of queue management in host memory (256-entry queues of free and filled buffers).
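The relation between clock rate and timestamp resolution noted above can be worked through in a few lines. The sketch below is illustrative only. It assumes the 64-bit counter advances once per CPU clock cycle, as the note "resolution = 1/clock rate" implies, and the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def timestamp_resolution_ns(clock_hz: float) -> float:
    """Resolution of one counter tick in nanoseconds (1 / clock rate)."""
    return 1e9 / clock_hz


def counter_wrap_years(clock_hz: float, bits: int = 64) -> float:
    """Years until a counter of the given width wraps at this clock rate."""
    return (2 ** bits / clock_hz) / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Clock rates mentioned in the text, in MHz.
    for mhz in (120, 133, 150, 166):
        hz = mhz * 1e6
        print(f"{mhz} MHz: {timestamp_resolution_ns(hz):.2f} ns per tick, "
              f"wraps after about {counter_wrap_years(hz):.0f} years")
```

At 120 MHz this gives roughly 8.3 ns per tick, matching the figure quoted in the text, and a 64-bit counter at these rates will not wrap within any realistic uptime.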

On the resulting trace one can run any analysis desired. Concurrently with the interrupt-driven header capture, software runs on the host CPU to convert the packet headers to flows. A flow is considered unique based on its protocol, source IP address, destination IP address, source port, and destination port, although more flexible definitions of flows are supported. A packet is considered to belong to an existing flow if no more than 64 seconds have passed since the last packet with the same flow attributes.
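The flow definition above can be sketched as a small table keyed on the five attributes, with a 64-second idle timeout. This is a minimal Python illustration under stated assumptions, not the monitor's actual flow code: the class names `FlowKey`, `Flow`, and `FlowTable`, the per-flow counters, and the use of floating-point seconds for timestamps are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, NamedTuple

FLOW_TIMEOUT_S = 64.0   # idle time after which a flow is considered ended


class FlowKey(NamedTuple):
    protocol: int
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


@dataclass
class Flow:
    key: FlowKey
    first_seen: float
    last_seen: float
    packets: int = 0
    octets: int = 0


class FlowTable:
    """Groups packets into flows and retires flows that have gone idle."""

    def __init__(self, timeout: float = FLOW_TIMEOUT_S) -> None:
        self.timeout = timeout
        self.active: Dict[FlowKey, Flow] = {}
        self.expired: List[Flow] = []   # flows ready for statistics routines

    def add_packet(self, key: FlowKey, timestamp: float, length: int) -> None:
        flow = self.active.get(key)
        # More than `timeout` seconds since the last matching packet:
        # retire the old flow and start a new one with the same key.
        if flow is not None and timestamp - flow.last_seen > self.timeout:
            self.expired.append(self.active.pop(key))
            flow = None
        if flow is None:
            flow = Flow(key, first_seen=timestamp, last_seen=timestamp)
            self.active[key] = flow
        flow.last_seen = timestamp
        flow.packets += 1
        flow.octets += length

    def expire(self, now: float) -> None:
        """Retire every active flow that has been idle longer than the timeout."""
        idle = [k for k, f in self.active.items()
                if now - f.last_seen > self.timeout]
        for key in idle:
            self.expired.append(self.active.pop(key))
```

In this sketch a caller would invoke `add_packet` for each captured header and call `expire` periodically, handing the contents of `expired` to whatever accumulates statistics.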

When flows time out, they are passed to statistics routines that update accumulators, which can be queried remotely via the Ethernet interface at regular intervals. The flow analysis code and monitor architecture will be public domain. Placing packets into memory consumes only 1/8 of the CPU, so we do not anticipate a problem keeping up with OC-3 rates. However, the TI SAR chip ignores the first full frame immediately following a single-cell frame. An improved SAR would fix this, but TI is as yet uninterested in addressing this issue. In fact, Joel is already using an undocumented mode to force the SAR to wait for the end of frame after seeing the first cell, because the SAR does not do that when the buffer fills.

Another drawback of the TI SAR is that its VC table can only handle VC numbers from 0 to 1023 absolute in VP 0. There are fewer than 1,000 VCs on the vBNS, but other environments would likely need a more advanced SAR chip. The monitor can capture all of both the TCP/UDP and IP headers if IP options are not used. The monitor is not operational yet on the vBNS,