Center for Applied Internet Data Analysis
Characterizing Traffic Workload
The tables below show the output of our Coral analysis code from a typical packet trace. This particular trace was collected by Hans-Werner Braun from one of the four OC-3 links interconnecting FIX West and MAE West in San Jose, CA. The trace was captured on May 18, 1998 at 11:06 PDT, and ran for just over 11 minutes.
Trace contents: 22 million IP packets, 10.2 billion bytes.

  7656 optioned packets (0.035%)
  67,573 IP fragments (0.308%)
  15,436,310 IP packets with DF set (70.4%)
  16,794 packets with bogus source addresses
  4272 packets with bogus destination addresses
  13,644 non-IP packets (0.0622% of total frames)
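The report does not define what the Coral code counts as a `bogus' address. A minimal sketch of one common heuristic, flagging sources that should never appear as unicast senders on a public backbone link (the prefix list here is an assumption, not CAIDA's actual criteria):

```python
import ipaddress

# Prefixes treated as "bogus" sources on a public link.
# NOTE: this list is an assumption; the Coral analysis code's
# actual definition of "bogus" is not given in the report.
BOGUS_NETS = [
    ipaddress.ip_network("0.0.0.0/8"),       # unspecified / "this" network
    ipaddress.ip_network("127.0.0.0/8"),     # loopback
    ipaddress.ip_network("10.0.0.0/8"),      # RFC 1918 private
    ipaddress.ip_network("172.16.0.0/12"),   # RFC 1918 private
    ipaddress.ip_network("192.168.0.0/16"),  # RFC 1918 private
    ipaddress.ip_network("224.0.0.0/3"),     # multicast and reserved
]

def is_bogus_source(addr: str) -> bool:
    """Return True if addr falls in a prefix that should not appear
    as a unicast source address on a public backbone link."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BOGUS_NETS)
```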

Statistics for Interface 0:
11,273,620 Packets, 5,110,516,063 Total Bytes

Statistics for Interface 1:
10,664,224 Packets, 5,064,727,393 Total Bytes

Traffic breakdown by protocol:
            proto   pcount      pct.     bytes        pct.     avg. size
TCP         6       18567412    84.6364  9421455182   92.5919  507
UDP         17      2687922     12.2524  489832295    4.8140   182
IP-ENCAP    4       227259      1.0359   93621624     0.9201   411
ICMP        1       248114      1.1310   87415912     0.8591   352
GRE         47      198546      0.9050   81467165     0.8006   410
IGMP        2       5427        0.0247   954828       0.0094   175
IPIP        94      839         0.0038   149789       0.0015   178
IPSEC-ESP   50      514         0.0023   118644       0.0012   230
MOBILE      55      580         0.0026   77340        0.0008   133
VINES       83      634         0.0029   63913        0.0006   100
IPSEC-AH    51      207         0.0009   37924        0.0004   183
AX.25       93      208         0.0009   23548        0.0002   113
IPv6        41      98          0.0004   11406        0.0001   116
OSPFIGP     89      31          0.0001   7604         0.0001   245
SCC-SP      96      18          0.0001   2552         0.0000   141
SKIP        57      9           0.0000   1870         0.0000   207
NHRP        54      18          0.0001   1224         0.0000   68
RSVP        46      3           0.0000   424          0.0000   141
ISO-TP4     29      4           0.0000   146          0.0000   36
BNA         49      1           0.0000   66           0.0000   66
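A breakdown like the one above can be reproduced from any trace by accumulating packet and byte counts keyed on the IP protocol number. A minimal sketch, assuming the trace has already been parsed into (protocol, packet length) pairs (real Coral traces would first need to be decoded from the capture format):

```python
from collections import defaultdict

def protocol_breakdown(packets):
    """packets: iterable of (ip_proto, length_in_bytes) pairs.
    Returns {proto: (pcount, pkt_pct, bytes, byte_pct, avg_size)},
    the columns of the traffic-breakdown table."""
    counts = defaultdict(lambda: [0, 0])  # proto -> [packets, bytes]
    total_pkts = total_bytes = 0
    for proto, length in packets:
        counts[proto][0] += 1
        counts[proto][1] += length
        total_pkts += 1
        total_bytes += length
    table = {}
    for proto, (pc, bc) in counts.items():
        table[proto] = (
            pc,
            100.0 * pc / total_pkts,
            bc,
            100.0 * bc / total_bytes,
            bc // pc,  # average packet size, truncated as in the table
        )
    return table
```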

In the next two tables, we have consolidated the TCP and UDP port address pairs into aggregate flows of the same protocol category. In most cases, we have assumed that flows between a port number above 1023 and a well-known port number below 1024 consist exclusively of the protocol associated with the well-known port (e.g. HTTP on port 80). We denote the condensed set of high port numbers in the tables with the label `0'.

For some of the protocols, we have condensed ranges of port numbers in both the source and destination fields. For example, the Real Audio category in the UDP table includes all flows with destination ports between 6970 and 7170 inclusive, represented with the label `7070'. Unfortunately, this range also includes the ports used by AFS, and so we are potentially including an unknown amount of AFS traffic as Real Audio. We are currently investigating techniques for differentiating Real Audio and AFS flow profiles according to packet size distribution or other traffic characteristics, and we hope to effectively differentiate between the two in the future.

We have also condensed `unknown' flows into a single category denoted by `0' for both source and destination port numbers. These flows typically use random port numbers larger than 1023, and may actually be due to well-known protocols such as passive FTP. Since our analysis is restricted to IP and TCP/UDP headers of captured traffic, at this point we have no easy way to determine the actual protocol used in these flows.
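The aggregation rules described above can be expressed as a small classification function. The following is a sketch of the heuristic, not CAIDA's actual Coral code: the well-known-port table is abbreviated, and the RealAudio rule uses the 6970-7170 port range mentioned above.

```python
# Abbreviated name table; the real analysis covers many more services.
WELL_KNOWN = {20: "FTP data", 21: "FTP Control", 23: "Telnet", 25: "SMTP",
              53: "DNS", 80: "HTTP", 110: "POP", 119: "NNTP", 443: "HTTPS"}

def classify(src_port, dst_port):
    """Collapse a (src, dst) port pair into an aggregate flow category.
    Returns (name, src_label, dst_label); ephemeral ports are
    condensed to the label 0, as in the tables."""
    # RealAudio: any port in 6970-7170 is represented by the label 7070.
    # (This range also captures AFS traffic -- see the caveat above.)
    if 6970 <= dst_port <= 7170:
        return ("RealAudio", 0, 7070)
    if 6970 <= src_port <= 7170:
        return ("RealAudio", 7070, 0)
    # A well-known port on one side names the protocol; the
    # ephemeral (>1023) side collapses to 0.
    if dst_port in WELL_KNOWN and src_port > 1023:
        return (WELL_KNOWN[dst_port], 0, dst_port)
    if src_port in WELL_KNOWN and dst_port > 1023:
        return (WELL_KNOWN[src_port], src_port, 0)
    # Well-known ports on both sides (e.g. DNS server-to-server traffic).
    if src_port in WELL_KNOWN and dst_port in WELL_KNOWN:
        return (WELL_KNOWN[src_port], src_port, dst_port)
    # Everything else: unknown high-port flows condensed to (0, 0).
    return ("Unclassified", 0, 0)
```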

TCP Traffic matrix:
                src     dst     packets pct.    bytes           pct.  avg. size
HTTP            80      0       8077648 43.5044 6394955586      67.8765 791
Unclassified    0       0       2067412 11.1346 1001294468      10.6278 484
FTP data        20      0       458958  2.4718  428520333       4.5483  933
HTTP            0       80      5112504 27.5348 415910441       4.4145  81
SMTP            0       25      407037  2.1922  252793347       2.6832  621
NNTP            119     0       383335  2.0646  246872167       2.6203  644
NNTP            0       119     376577  2.0282  222235658       2.3588  590
FTP data        0       20      290478  1.5645  82919641        0.8801  285
HTTPS           443     0       129600  0.6980  70788857        0.7514  546
RealAudio       7070    0       96493   0.5197  70308063        0.7463  728
Hotline         5501    0       40892   0.2202  53938185        0.5725  1319
POP             110     0       65665   0.3537  26076006        0.2768  397
Web Cache       3128    0       26277   0.1415  19516569        0.2072  742
SMTP            25      0       340171  1.8321  18826150        0.1998  55
NetBIOS SSN     139     0       19351   0.1042  15448453        0.1640  798
IMAP            143     0       9182    0.0495  11707969        0.1243  1275
Telnet          0       23      71443   0.3848  9718808         0.1032  136
HTTPS           0       443     90246   0.4860  9659233         0.1025  107
                0       510     10317   0.0556  8774607         0.0931  850
Telnet          23      0       56204   0.3027  7727852         0.0820  137
FTP Control     21      0       46456   0.2502  5405741         0.0574  116
RealAudio       0       7070    90287   0.4863  4757291         0.0505  52
                81      0       3950    0.0213  2812653         0.0299  712
LDAP            389     0       4955    0.0267  2773392         0.0294  559
                889     0       2158    0.0116  2240762         0.0238  1038
SSH             22      0       14744   0.0794  2227076         0.0236  151
POP             0       110     50115   0.2699  2210691         0.0235  44
                6       0       1422    0.0077  2080484         0.0221  1463

UDP Traffic matrix:
                src     dst     packets  pct.     bytes      pct.     avg. size
RealAudio       0       7070    638552   23.7563  256365362  52.3374  401
DNS             53      53      610111   22.6982  69113251   14.1096  113
Unclassified    0       0       540624   20.1131  66387962   13.5532  122
CU-SeeMe        7648    7648    38401    1.4287   13793045   2.8159   359
DNS             53      0       65017    2.4189   12069004   2.4639   185
Quake 2         27901   27910   162691   6.0527   10074004   2.0566   61
Starcraft       6112    6112    184655   6.8698   9028296    1.8431   48
Quake 2         27910   27901   47353    1.7617   7604178    1.5524   160
QuakeWorld      27500   27001   62209    2.3144   7412654    1.5133   119
                8828    2058    4857     0.1807   7161204    1.4620   1474
DNS             0       53      83337    3.1004   5340825    1.0903   64
QuakeWorld      27001   27500   56888    2.1164   3382653    0.6906   59
SNMP            0       161     32713    1.2170   2387666    0.4874   72
NetBIOS NS      137     137     21092    0.7847   1911737    0.3903   90
Quake 2         27910   0       10353    0.3852   1653182    0.3375   159
Quake 2         0       27901   9594     0.3569   1646967    0.3362   171
                0       371     1594     0.0593   1279607    0.2612   802
Quake 2         27901   0       20790    0.7735   1265826    0.2584   60
QuakeWorld      0       27001   12803    0.4763   1216071    0.2483   94
NetBIOS DGM     138     138     2985     0.1111   905452     0.1848   303
QuakeWorld      27001   0       15357    0.5713   878083     0.1793   57
NTP             123     123     11290    0.4200   859896     0.1755   76
QuakeProxy      0       27910   12450    0.4632   719013     0.1468   57
DBase           217     217     936      0.0348   456880     0.0933   488
CU-SeeMe        7648    0       3098     0.1153   340409     0.0695   109
Quake           27500   0       1298     0.0483   261060     0.0533   201
                910     910     2930     0.1090   251794     0.0514   85

Flow Length Distributions

In the graph below we demonstrate one metric of traffic consumption, using a trace from a busy 5-minute interval in April 1998 on an MCI Internet backbone trunk. The figure shows the number of packets per flow for specific TCP and UDP applications. The vertical lines are box-and-whisker plots, with the `x' marking the mean number of packets per flow over a 24-hour period. The top and bottom of each vertical line indicate the maximum and minimum 5-minute averages over that 24-hour period, respectively. The figure shows, on a log-log scale, how small most transaction-style flows (e.g., HTTP, SMTP, DNS) are in contrast to bulk data transfer-style flows (e.g., FTP-data, NNTP). Note that although telnet flows are sometimes composed of a large number of packets, they are typically much smaller in byte payload (not shown on the graph).

figure 1: distribution of packets per flow by protocol as a function of the number of flows. A flow roughly corresponds to a conversation with a single application. A 64 second timeout was used to define flow boundaries. (log-log scale)
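The 64-second timeout used to define flow boundaries can be implemented by tracking, per flow key, the timestamp of the last packet seen: a packet arriving more than 64 seconds after its predecessor on the same key starts a new flow. A minimal sketch, assuming packets arrive as (timestamp, flow_key) pairs in time order:

```python
def count_flows(packets, timeout=64.0):
    """packets: time-ordered iterable of (timestamp, flow_key) pairs,
    where flow_key identifies a conversation (e.g., the 5-tuple of
    src/dst address, src/dst port, and protocol).
    Returns the total number of flows under an idle-timeout model."""
    last_seen = {}  # flow_key -> timestamp of most recent packet
    flows = 0
    for ts, key in packets:
        prev = last_seen.get(key)
        if prev is None or ts - prev > timeout:
            flows += 1  # first packet, or idle gap exceeded: new flow
        last_seen[key] = ts
    return flows
```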

The graph below uses an earlier five minute trace from September 1995 from the same measurement point, and illustrates a similar metric to indicate disparity in resource consumption. Specifically, we use a two-dimensional parameterization: a scatterplot of (fraction of total bytes, fraction of total flows) tuples.
figure 2: Relative proportion of resources consumed by popular services. For this 1995 five minute trace, the average flow size for the 48 IP4 (IP protocol 4) flows was 10,088 packets and 6,344,202 bytes. There was not enough cuseeme traffic to appear on the graph (which is thresholded by traffic volume), but there were 34 cuseeme flows with an average flow size of 705 packets and 318,676 bytes. Cuseeme is unicast rather than multicast, so each flow translates into a single request, i.e., it satisfies only one end user.
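The scatterplot coordinates in figure 2 can be computed directly from per-flow records. A sketch, assuming each flow record carries a service name and a byte count (both hypothetical field choices for illustration):

```python
def resource_fractions(flows):
    """flows: iterable of (service_name, byte_count) per-flow records.
    Returns {service: (fraction_of_total_flows, fraction_of_total_bytes)},
    the coordinates plotted in the scatterplot."""
    per_service = {}  # service -> (flow count, byte count)
    total_flows = 0
    total_bytes = 0
    for service, nbytes in flows:
        fc, bc = per_service.get(service, (0, 0))
        per_service[service] = (fc + 1, bc + nbytes)
        total_flows += 1
        total_bytes += nbytes
    return {s: (fc / total_flows, bc / total_bytes)
            for s, (fc, bc) in per_service.items()}
```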

Web flows from clients to servers, which typically consist of very few packets (e.g., just enough to send an http GET request), comprise 23.3% of the flows in this trace but only 3.3% of the bytes. The average flow to an http server was 12.5 packets and 710 bytes; the average flow from an http server was 16.5 packets and 8270 bytes. Unlike IP protocol 4 flows, where many users can benefit from a single (multicast) flow, the current standard implementation of http is rather the opposite: often many flows are required for a single page (equivalent to a single mouse click) if that page includes icons or other images. A large number of http flows may thus translate into only a few satisfied requests from the user's perspective. The latest version of http, now under development and deployment, addresses this issue with `persistent' connections, so that a single TCP connection can be used to retrieve many different objects from a web server. However, persistent http will only mitigate this situation effectively if web browsers are modified to open a single TCP connection and use it for multiple requests. Current browsers open multiple simultaneous connections to a single server to retrieve objects in parallel (e.g., the images on a page). This practice can short-circuit TCP's congestion avoidance, since these short-lived connections may not last long enough to adapt to current network conditions.

Unfortunately, browser vendors are likely to resist modifying their software in any way that increases perceived latency when downloading pages. The current protocol lends itself most easily to downloading each object serially, which means filling in only one image on a page at a time. Even if the complete page takes longer to load, users may still perceive higher performance when multiple TCP sessions are open in parallel. [Nielsen97] describes a technique for multiplexing the transmission of multiple objects over a single TCP connection that merits further development.

DNS (domain name service) flows comprised 31% of the flows and 3.2% of the packets. As DNS is extended to serve more types of data (e.g. PGP keys and other authentication information) it may account for an increasing fraction of the total flows. Unfortunately, short transactions appear to be a basic part of the DNS system, and consequently congestion avoidance depends on effective backoff schemes in the individual end hosts.


references

[Nielsen97] H. Nielsen, J. Gettys, A. Baird-Smith, E. Prud'hommeaux, H. Lie, and C. Lilley, "Network Performance Effects of HTTP/1.1, CSS1, and PNG", 24 June 1997.

  Last Modified: Mon Mar-31-2008 11:54:49 PDT
  Page URL: http://www.caida.org/research/traffic-analysis/fix-west-1998/trafficworkload/index.xml