<?xml version="1.0" standalone="no"?>
                    <!DOCTYPE div SYSTEM "/www/backend/www-xml-443/dtd/caidaML.dtd">
                    <!-- do NOT ERASE the DOCTYPE declaration! --><div>


<tr bgcolor="#f4f4f4">
  <td>
<font face="helvetica,arial" size="2">
<b>URL:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
<a href="http://pages.cpsc.ucalgary.ca/~mahanti/papers/globecom06.pdf">http://pages.cpsc.ucalgary.ca/~mahanti/papers/globecom06.pdf</a>
</font>
  </td>
</tr>


<tr bgcolor="#e9e9e9">
  <td>
<font face="helvetica,arial" size="2">
<b>Entry Dates:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
2009-02-13


</font>
  </td>
</tr>


<tr bgcolor="#f4f4f4">
  <td>
<font face="helvetica,arial" size="2">
<b>Abstract:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
We apply an unsupervised machine learning approach for Internet traffic identification and compare the results with that of a previously applied supervised machine learning approach. Our unsupervised approach uses an Expectation Maximization (EM) based clustering algorithm and the supervised approach uses the Naive Bayes classifier. We find the unsupervised clustering technique has an accuracy up to 91% and outperform the supervised technique by up to 9%. We also find that the unsupervised technique can be used to discover traffic from previously unknown applications and has the potential to become an excellent tool for exploring Internet traffic. 


</font>
  </td>
</tr>


<tr bgcolor="#e9e9e9">
  <td>
<font face="helvetica,arial" size="2">
<b>Results:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
<ul>
<li>
datasets: auckland-iv(20010314-20010319) and auckland-vi(20010608-20010609), part of each trace;
</li>
<li>
present an unsupervised machine learning approach (AutoClass) for Internet traffic classification;
</li>
<li>
AutoClass can achieve an average accuracy greater than 90%, outperforms Naive Bayes by up to 9%;
</li>
</ul>





</font>
  </td>
</tr>
</div>

