<?xml version="1.0" standalone="no"?>
                    <!DOCTYPE div SYSTEM "/www/backend/www-xml-443/dtd/caidaML.dtd">
                    <!-- do NOT ERASE the DOCTYPE declaration! --><div>


<tr bgcolor="#f4f4f4">
  <td>
<font face="helvetica,arial" size="2">
<b>URL:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
<a href="http://www2007.org/papers/paper511.pdf">http://www2007.org/papers/paper511.pdf</a>
</font>
  </td>
</tr>


<tr bgcolor="#e9e9e9">
  <td>
<font face="helvetica,arial" size="2">
<b>Entry Dates:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
2009-02-11


</font>
  </td>
</tr>


<tr bgcolor="#f4f4f4">
  <td>
<font face="helvetica,arial" size="2">
<b>Abstract:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
Traffic classification is the ability to identify and categorize network traffic by application type. In this paper, we consider the problem of traffic classification in the network core. Classification at the core is challenging because only partial information about the flows and their contributors is available. We address this problem by developing a framework that can classify a flow using only unidirectional flow information. We evaluated this approach using recent packet traces that we collected and pre-classified to establish "base truth". From our evaluation, we find that flow statistics for the server-to-client direction of a TCP connection provide greater classification accuracy than the flow statistics for the client-to-server direction. Because collection of the server-to-client flow statistics may not always be feasible, we developed and validated an algorithm that can estimate the missing statistics from a unidirectional packet trace. 


</font>
  </td>
</tr>


<tr bgcolor="#e9e9e9">
  <td>
<font face="helvetica,arial" size="2">
<b>Results:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
<ul>
<li>
datasets: collect full packet traces; eight 1-hour traces between April 6-9, 2006;
</li>
<li>
machine learning approach (K-Means);
</li>
<li>
four metrics in evaluation: flow accuracy,byte accuracy, precision and recall;
</li>
</ul>


</font>
  </td>
</tr>
</div>

