Bibliography Details

N. Williams, S. Zander, and G. Armitage, "A Preliminary Performance Comparison of Five Machine Learning Algorithms for Practical IP Traffic Flow Classification", in ACM SIGCOMM 2006, Aug 2006.

A Preliminary Performance Comparison of Five Machine Learning Algorithms for Practical IP Traffic Flow Classification
Authors: N. Williams, S. Zander, G. Armitage
Published: ACM SIGCOMM, 2006
URL: http://ccr.sigcomm.org/online/files/p7-williams.pdf
Entry Dates: 2009-02-13
Abstract: The identification of network applications through observation of associated packet traffic flows is vital to the areas of network management and surveillance. Currently popular methods such as port number and payload-based identification exhibit a number of shortfalls. An alternative is to use machine learning (ML) techniques and identify network applications based on per-flow statistics, derived from payload-independent features such as packet length and inter-arrival time distributions. The performance impact of feature set reduction, using Consistency-based and Correlation-based feature selection, is demonstrated on Naive Bayes, C4.5, Bayesian Network and Naive Bayes Tree algorithms. We then show that it is useful to differentiate algorithms based on computational performance rather than classification accuracy alone, as although classification accuracy between the algorithms is similar, computational performance can differ significantly.
Results:
  • datasets: four 24-hour periods, one from each of the following traces: auckland-vi-20010611, auckland-vi-20010612, leipzig-ii-20030221, nzix-ii-20000706
  • main finding: classification accuracy is similar across the algorithms, but computational performance can differ significantly
  • five machine learning algorithms compared: Naive Bayes with discretisation (NBD) and with kernel density estimation (NBK); C4.5; Bayesian Network; Naive Bayes Tree
  • analysis tools: WEKA
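
The abstract refers to per-flow statistics derived from payload-independent features such as packet length and inter-arrival time. As a rough illustration of what such features might look like, the sketch below groups packets into flows and computes a few summary statistics per flow. It is not the paper's feature set or tooling: the function name `flow_features`, the packet tuple layout, the unidirectional 5-tuple flow key, and the particular statistics chosen are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flow_features(packets):
    """Compute simple per-flow statistics.

    packets: iterable of tuples
        (timestamp_s, src, dst, sport, dport, proto, length_bytes)
    This layout is hypothetical; a real trace parser would supply it.
    """
    # Group packets by a unidirectional 5-tuple (an assumption of this sketch).
    flows = defaultdict(list)
    for ts, src, dst, sport, dport, proto, length in packets:
        flows[(src, dst, sport, dport, proto)].append((ts, length))

    features = {}
    for key, pkts in flows.items():
        pkts.sort()  # order by timestamp
        lengths = [length for _, length in pkts]
        # Inter-arrival times between consecutive packets of the same flow.
        gaps = [b[0] - a[0] for a, b in zip(pkts, pkts[1:])]
        features[key] = {
            "packets": len(pkts),
            "len_min": min(lengths),
            "len_mean": mean(lengths),
            "len_max": max(lengths),
            "len_std": pstdev(lengths),
            # A single-packet flow has no inter-arrival times; use 0.0.
            "iat_mean": mean(gaps) if gaps else 0.0,
            "iat_std": pstdev(gaps) if gaps else 0.0,
        }
    return features
```

Each resulting dictionary could serve as one row of a feature table for a classifier. None of these features depends on packet payload or on interpreting port numbers, which is the property the abstract emphasises.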