<?xml version="1.0" standalone="no"?>
                    <!DOCTYPE div SYSTEM "/www/backend/www-xml-443/dtd/caidaML.dtd">
                    <!-- do NOT ERASE the DOCTYPE declaration! --><div>


<tr bgcolor="#f4f4f4">
  <td>
<font face="helvetica,arial" size="2">
<b>URL:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
<a href="http://www.cs.ucsd.edu/~klevchen/mlksv-imc2006.pdf">http://www.cs.ucsd.edu/~klevchen/mlksv-imc2006.pdf</a>
</font>
  </td>
</tr>


<tr bgcolor="#e9e9e9">
  <td>
<font face="helvetica,arial" size="2">
<b>Entry Dates:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
2009-02-09


</font>
  </td>
</tr>


<tr bgcolor="#f4f4f4">
  <td>
<font face="helvetica,arial" size="2">
<b>Abstract:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
Network managers are inevitably called upon to associate network traffic with particular applications. Indeed, this operation is critical for a wide range of management functions ranging from debugging and security to analytics and policy support. Traditionally, managers have relied on application adherence to a well-established global port mapping: Web traffic on port 80, mail traffic on port 25, and so on. However, a range of factors (including firewall port blocking, tunneling, dynamic port allocation, and a bloom of new distributed applications) has weakened the value of this approach. We analyze three alternative mechanisms using statistical and structural content models for automatically identifying traffic that uses the same application-layer protocol, relying solely on flow content. In this manner, known applications may be identified regardless of port number, while traffic from one unknown application will be identified as distinct from another. We evaluate each mechanism's classification performance using real-world traffic traces from multiple sites.


</font>
  </td>
</tr>


<tr bgcolor="#e9e9e9">
  <td>
<font face="helvetica,arial" size="2">
<b>Results:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
<ul>
<li>
datasets: three traces.
(1) The Cambridge trace includes all traffic of the Computer Laboratory at the University of Cambridge, UK, over a 24-hour period on November 23, 2003.
(2) The Wireless trace is a five-day trace of all traffic on the wireless network in the UCSD Computer Science and Engineering building, starting on April 17, 2006.
(3) The Departmental trace contains more than an hour of traffic from a UCSD department backbone switch, collected at noon on May 23, 2006.
</li>
<li>
methods: relying solely on flow content, the authors analyze three classification techniques that capture statistical and structural aspects of the messages exchanged in a protocol: product distributions of byte offsets, Markov models of byte transitions, and common substring graphs of message strings.
</li>
</ul>


</font>
  </td>
</tr>
</div>

