The contents of this legacy page are no longer maintained nor supported, and are made available only for historical purposes.

Bibliography Details

L. Bernaille, A. Soule, M. Jeannin, and K. Salamatian, "Blind Application Recognition Through Behavioral Classification", Tech. rep., Paris 2005, 2005.

Blind Application Recognition Through Behavioral Classification
Authors: L. Bernaille
A. Soule
M. Jeannin
K. Salamatian
Published: Paris, 2005
Entry Dates: 2009-02-06
Abstract: Application recognition appears to be an important task for a large number of applications in security and traffic engineering. Well-known port numbers can no longer be used to reliably identify network applications. A variety of new Internet applications either do not use well-known port numbers or use other protocols, such as HTTP, as wrappers in order to pass through firewalls without being blocked. One consequence is that a simple inspection of the port numbers used by flows may lead to inaccurate classification of network traffic. Moreover, because of privacy concerns, or more simply because of the encryption mechanisms in use, it is frequently impossible to access the full payload of packets. This means that classification should be based only on the behaviour of the packet flow in terms of size, inter-arrival time and interaction. In this paper we develop a blind application flow recognition method based on behavioral classification. The approach is based on very simple sequences of quantized packet size and packet direction. These sequences are clustered with a powerful spectral clustering algorithm. We then develop a recognition algorithm based on a mixture of HMMs representative of the obtained clusters. The presented method appears to be very powerful, as it reaches a recognition performance of 90% after observing only seven packets of a flow. This work is a first step toward an operational flow recognition system that will be robust to flow morphing (tunnelling a flow inside another protocol) and payload encryption.
  • datasets: collected at the exit of the UPMC university network; DAG card used to capture data; 300 bytes per packet
  • based on size, inter-arrival time and interaction; recognition algorithm based on a mixture of HMMs representative of the obtained clusters
  • reaches recognition performance of 90%
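
The abstract describes flows as short sequences of quantized packet size and packet direction. As a rough illustration only, the sketch below shows one way such a sequence might be built from the first seven packets of a flow. The bin edges, the sign convention for direction, and the function name encode_flow are all assumptions made for this example; the entry does not state the quantization the authors used, and the clustering and HMM stages are not shown.

```python
from bisect import bisect_right

# Hypothetical size bin edges in bytes. The abstract does not give the
# quantization actually used, so these values are purely illustrative.
SIZE_BIN_EDGES = [64, 128, 256, 512, 1024]

# The abstract reports results after observing seven packets of a flow.
MAX_PACKETS = 7


def encode_flow(packets, edges=SIZE_BIN_EDGES, max_packets=MAX_PACKETS):
    """Turn (size_bytes, direction) pairs into signed integer symbols.

    direction is +1 for client-to-server and -1 for server-to-client
    (an assumed convention). The magnitude of each symbol is the 1-based
    index of the size bin; the sign carries the direction.
    """
    symbols = []
    for size, direction in packets[:max_packets]:
        if direction not in (1, -1):
            raise ValueError("direction must be +1 or -1")
        bin_index = bisect_right(edges, size) + 1
        symbols.append(direction * bin_index)
    return symbols


# Made-up example flow: eight packets, of which only the first seven are used.
flow = [(60, 1), (1460, -1), (52, 1), (300, 1),
        (1460, -1), (1460, -1), (52, 1), (700, -1)]
print(encode_flow(flow))  # [1, -6, 1, 4, -6, -6, 1]
```

Sequences of this kind could then be passed to a clustering step and, later, to per-cluster sequence models, which is the role the abstract assigns to spectral clustering and the mixture of HMMs.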