<?xml version="1.0" standalone="no"?>
                    <!DOCTYPE div SYSTEM "/www/backend/www-xml-443/dtd/caidaML.dtd">
                    <!-- do NOT ERASE the DOCTYPE declaration! --><div>


<tr bgcolor="#f4f4f4">
  <td>
<font face="helvetica,arial" size="2">
<b>URL:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
<a href="http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5062243">http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5062243</a>
</font>
  </td>
</tr>


<tr bgcolor="#e9e9e9">
  <td>
<font face="helvetica,arial" size="2">
<b>Entry Date:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
2010-10-22


</font>
  </td>
</tr>


<tr bgcolor="#f4f4f4">
  <td>
<font face="helvetica,arial" size="2">
<b>Abstract:</b>
</font>
</td>
  <td>
<font face="helvetica,arial" size="2">
In this paper, we present Structon, a novel approach that uses Web mining together with inference and IP traceroute to
geolocate IP addresses with significantly better accuracy than existing automated approaches. Structon is composed of three
ideas which we realize in three corresponding steps. First, we extract geolocation information of Web server IP addresses from
Web pages. Second, we devise heuristic algorithms to improve both the accuracy and the coverage of the IP geolocation database
using these Web server IP addresses and their geolocations as input. Third, for those segments that are not covered in the
first two steps, we use IP traceroute to identify the access routers of those segments. When the location of the access router
is known, we can deduce the location of the associated segment, since the segment is co-located with the access router. By
mining 500 million Web pages collected in China in 2006 (11 percent of the total Web pages in China at that time), we are able
to identify the geolocations for 103 million IP addresses. This represents nearly 88 percent of the IP addresses allocated to China in
March 2008. Structon is 87.4 percent accurate at city granularity and up to 93.5 percent accurate at province level. We also
used a 10-day Windows Live client log to evaluate our coverage of client IP addresses: Structon identified the geolocations of 98.9
percent of client IP addresses.


</font>
  </td>
</tr>
</div>

