Bibliography Details

C. Wright, F. Monrose, and G. Masson, "On Inferring Application Protocol Behaviors in Encrypted Network Traffic", Journal of Machine Learning Research, Dec 2006.

On Inferring Application Protocol Behaviors in Encrypted Network Traffic
Authors:	C. Wright, F. Monrose, G. Masson
Published:	Journal of Machine Learning Research, 2006
URL:	http://jmlr.csail.mit.edu/papers/volume7/wright06a/wright06a.pdf
Entry Dates:	2009-02-09
Abstract:	Several fundamental security mechanisms for restricting access to network resources rely on the ability of a reference monitor to inspect the contents of traffic as it traverses the network. However, with the increasing popularity of cryptographic protocols, the traditional means of inspecting packet contents to enforce security policies is no longer a viable approach as message contents are concealed by encryption. In this paper, we investigate the extent to which common application protocols can be identified using only the features that remain intact after encryption, namely packet size, timing, and direction. We first present what we believe to be the first exploratory look at protocol identification in encrypted tunnels which carry traffic from many TCP connections simultaneously, using only post-encryption observable features. We then explore the problem of protocol identification in individual encrypted TCP connections, using much less data than in other recent approaches. The results of our evaluation show that our classifiers achieve accuracy greater than 90% for several protocols in aggregate traffic, and, for most protocols, greater than 80% when making fine-grained classifications on single connections. Moreover, perhaps most surprisingly, we show that one can even estimate the number of live connections in certain classes of encrypted tunnels to within, on average, better than 20%.
Results:	Datasets: real traffic traces collected by the Statistics Group at George Mason University in 2003, containing headers for IP packets on GMU's Internet link from the first 10 minutes of every quarter hour over a two-month period. Features: only packet size, timing, and direction. Accuracy: greater than 90% for several protocols in aggregate traffic and, for most protocols, greater than 80% when making fine-grained classifications on single connections.