Application of sampling methodologies to wide-area network traffic characterization
The relative performance of different data collection methods in assessing various traffic parameters matters when a complete trace of a traffic interval generates a computationally overwhelming amount of data, and even capturing summary statistics for all traffic is impractical. This paper presents a study of how well various sampling methods answer questions related to wide-area network traffic characterization. Using a packet trace from a network environment that aggregates traffic from a large number of sources, we simulate various sampling approaches, including time-driven and event-driven methods, with both random and deterministic selection patterns, at a variety of granularities. We then compare the sampled traces to the parent population using several metrics that indicate the similarity between two distributions. Our results reveal that the time-triggered techniques do not perform as well as the packet-triggered ones. Furthermore, the performance differences within each class (packet-based or time-based techniques) are small.
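As a rough illustration of the kind of simulation described above, the following Python sketch applies packet-triggered and time-triggered sampling to a trace and scores each sample against the parent population. Everything in it is an assumption made for illustration: the synthetic trace, the function names, the choice of packet size as the traffic parameter, and the use of total variation distance as a stand-in similarity measure. It does not reproduce the data, the sampling granularities, or the metrics used in the study.

```python
import bisect
import random
from collections import Counter


def synthetic_trace(n, seed=0):
    """Build n (timestamp, size) pairs. Illustrative data only, not a real trace."""
    rng = random.Random(seed)
    t = 0.0
    trace = []
    for _ in range(n):
        t += rng.expovariate(1000.0)  # assumed mean interarrival time of 1 ms
        size = rng.choice([40, 40, 40, 576, 576, 1500])  # assumed size mix
        trace.append((t, size))
    return trace


def packet_systematic(trace, k):
    """Packet-triggered, deterministic: keep every k-th packet."""
    return trace[::k]


def packet_stratified_random(trace, k, rng):
    """Packet-triggered, random: one uniformly chosen packet per bucket of k."""
    return [rng.choice(trace[i:i + k]) for i in range(0, len(trace), k)]


def time_systematic(trace, interval):
    """Time-triggered, deterministic: a timer fires every `interval` seconds
    and the next packet to arrive at or after each firing is selected."""
    times = [t for t, _ in trace]
    sample = []
    last_idx = -1
    fire = times[0]
    while fire <= times[-1]:
        idx = bisect.bisect_left(times, fire)
        if idx != last_idx:  # skip firings during which no new packet arrived
            sample.append(trace[idx])
            last_idx = idx
        fire += interval
    return sample


def size_proportions(sample):
    """Empirical distribution of packet sizes in a sample."""
    counts = Counter(size for _, size in sample)
    total = sum(counts.values())
    return {size: c / total for size, c in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


if __name__ == "__main__":
    trace = synthetic_trace(10_000)
    parent = size_proportions(trace)
    samples = {
        "packet systematic": packet_systematic(trace, 50),
        "packet stratified random": packet_stratified_random(trace, 50, random.Random(1)),
        "time systematic": time_systematic(trace, 0.05),
    }
    for name, sample in samples.items():
        dist = total_variation(parent, size_proportions(sample))
        print(f"{name}: n={len(sample)}, distance={dist:.4f}")
```

Because the synthetic packet sizes here are drawn independently of arrival times, this sketch should not be expected to reproduce the gap between time-triggered and packet-triggered methods reported in the abstract; it only shows how the two classes of sampling differ mechanically and how a sample can be scored against its parent population.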