As of July 2018, the database contains 13 Internet Topology Data Kit (ITDK) datasets from 2011 onward (the three ITDKs from 2010 are not included). There are 1.5 million alias sets involving 5.4 million unique addresses. The data model consists of datasets, alias sets, and addresses: addresses belong to alias sets, which in turn belong to datasets.
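The containment hierarchy described above can be pictured with a small in-memory sketch. This is only an illustration of the datasets / alias sets / addresses relationship, not the database's actual schema; the class names, the `add_address` and `find_set` helpers, and the 192.0.2.x addresses are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class AliasSet:
    """A group of addresses inferred to belong to the same router."""
    set_id: int
    addresses: Set[str] = field(default_factory=set)


@dataclass
class Dataset:
    """One ITDK snapshot; it owns alias sets, which own addresses."""
    dataset_id: int
    name: str
    timestamp: int
    alias_sets: Dict[int, AliasSet] = field(default_factory=dict)

    def add_address(self, set_id: int, address: str) -> None:
        # Create the alias set on first use, then attach the address to it.
        self.alias_sets.setdefault(set_id, AliasSet(set_id)).addresses.add(address)

    def find_set(self, address: str) -> Optional[AliasSet]:
        # Linear scan is enough for an illustration; a real store would index this.
        for alias_set in self.alias_sets.values():
            if address in alias_set.addresses:
                return alias_set
        return None


# Hypothetical addresses from the documentation range, not real ITDK data.
example = Dataset(12, "itdk-20170828-midar", 1503878400)
example.add_address(3316, "192.0.2.1")
example.add_address(3316, "192.0.2.2")
```

Looking up either address with `example.find_set(...)` returns the same alias set, which is the relationship the 'find' and 'track' operations expose.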
Development is still in its early stages, so we welcome feedback at email@example.com.
There are six main operations: status and get -- which operate on a result ID -- and list, track, find, and group, which return a result ID at completion. A user can select which datasets to apply these operations to by (1) time range, (2) dataset ID or name, (3) all available, or (4) the latest available dataset.
Use 'status' to check on a run by result ID, which is given after a query/fetch operation is started:

    $ ./aliasq-api status 1
    status: finished
    submission date: Wed Jul 11 11:40:21 2018
    completion date: Wed Jul 11 11:40:21 2018

Valid 'status' codes are 'inprogress', 'finished', and 'error'. You should periodically re-run the status command until the status becomes 'finished'.
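Re-running the status command by hand can be automated. The following is a minimal sketch, assuming that `./aliasq-api status <id>` writes one `key: value` pair per line to standard output, as in the example above. The function names, the polling interval, and the retry limit are hypothetical choices, not part of aliasq-api.

```python
import subprocess
import time

VALID_STATUSES = {"inprogress", "finished", "error"}


def parse_status(output: str) -> str:
    """Extract the value of the 'status:' line from the command output."""
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "status":
            status = value.strip()
            if status not in VALID_STATUSES:
                raise ValueError(f"unexpected status code: {status!r}")
            return status
    raise ValueError("no 'status:' line found")


def run_status(result_id: int) -> str:
    """Run the status command shown above and return its standard output."""
    completed = subprocess.run(
        ["./aliasq-api", "status", str(result_id)],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout


def wait_until_done(result_id: int, fetch=run_status,
                    interval: float = 2.0, max_polls: int = 30) -> str:
    """Poll until the status is no longer 'inprogress'; return the final status."""
    for _ in range(max_polls):
        status = parse_status(fetch(result_id))
        if status != "inprogress":
            return status  # either 'finished' or 'error'
        time.sleep(interval)
    raise TimeoutError(f"result {result_id} still in progress after {max_polls} polls")
```

The `fetch` parameter exists only so the loop can be exercised without the tool installed; by default it shells out to the command exactly as shown in the example session.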
Use the 'get' operation to retrieve successful results into a local file named after the requested result ID (the following example assumes we successfully executed an operation with result ID 1):

    $ ./aliasq-api get 1
    HTTP response code: 200
    $ cat aliasq-1.out
    # dataset_id, dataset_name, timestamp*, datetime
    13 itdk-20180301-midar 1519862400 2018-03-01T00:00:00
Use the 'list' command to list available datasets:

    $ ./aliasq-api list --start 2017-01-01
    result ID: 2
    $ ./aliasq-api get 2
    $ cat aliasq-2.out
    # dataset_id, dataset_name, timestamp*, datetime
    11 itdk-20170207-midar 1486425600 2017-02-07T00:00:00
    12 itdk-20170828-midar 1503878400 2017-08-28T00:00:00
    13 itdk-20180301-midar 1519862400 2018-03-01T00:00:00
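For scripted use, the dataset listing can be loaded into records. The sketch below assumes the result file has one whitespace-separated row per dataset and that lines starting with `#` are header comments, as the example output suggests. `DatasetRow` and `parse_list_output` are hypothetical names.

```python
from typing import List, NamedTuple


class DatasetRow(NamedTuple):
    dataset_id: int
    dataset_name: str
    timestamp: int   # Unix epoch seconds
    datetime: str    # ISO 8601 string, kept as text


def parse_list_output(text: str) -> List[DatasetRow]:
    """Parse the contents of an aliasq-N.out file produced by 'list'."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the column header
        dataset_id, name, timestamp, dt = line.split()
        rows.append(DatasetRow(int(dataset_id), name, int(timestamp), dt))
    return rows


# The rows below are copied from the example listing above.
listing = """\
# dataset_id, dataset_name, timestamp*, datetime
11 itdk-20170207-midar 1486425600 2017-02-07T00:00:00
12 itdk-20170828-midar 1503878400 2017-08-28T00:00:00
13 itdk-20180301-midar 1519862400 2018-03-01T00:00:00
"""
datasets = parse_list_output(listing)
latest = max(datasets, key=lambda row: row.timestamp)
```

Here `latest` is the row with the largest timestamp, which is one way to pick a dataset ID or name to pass to a later 'find' or 'group' call.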
Use the 'track' command to list all datasets/alias sets that contain a target address (that is, "track a target across datasets over time"):

    $ ./aliasq-api track --all 126.96.36.199
    result ID: 3
    $ ./aliasq-api get 3
    $ cat aliasq-3.out
    # dataset_id, set_id, dataset_name, timestamp*, datetime
    12 3316 itdk-20170828-midar 1503878400 2017-08-28T00:00:00
    13 1000 itdk-20180301-midar 1519862400 2018-03-01T00:00:00
Use the 'find' command to find and print all aliases of the target address:

    $ ./aliasq-api find --dataset=12 188.8.131.52
    result ID: 4
    $ ./aliasq-api get 4
    $ cat aliasq-4.out
    # dataset_id, set_id, dataset_name, timestamp*, datetime, addr_count, addresses
    12 3316 itdk-20170828-midar 1503878400 2017-08-28T00:00:00 15 184.108.40.206 220.127.116.11 18.104.22.168 22.214.171.124 126.96.36.199 188.8.131.52 184.108.40.206 220.127.116.11 18.104.22.168 22.214.171.124 126.96.36.199 188.8.131.52 184.108.40.206 220.127.116.11 18.104.22.168
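Each 'find' result row carries a variable-length address list after the `addr_count` column. The sketch below parses such rows and checks that the count matches the number of addresses that follow. It assumes whitespace-separated columns as in the example output; `AliasSetRow`, `parse_find_output`, and the 192.0.2.x addresses are hypothetical.

```python
from typing import List, NamedTuple


class AliasSetRow(NamedTuple):
    dataset_id: int
    set_id: int
    dataset_name: str
    timestamp: int
    datetime: str
    addresses: List[str]


def parse_find_output(text: str) -> List[AliasSetRow]:
    """Parse the contents of an aliasq-N.out file produced by 'find'."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and the column header
        dataset_id, set_id, name, timestamp, dt, addr_count = fields[:6]
        addresses = fields[6:]
        # addr_count is redundant with the list length, so use it as a sanity check.
        if len(addresses) != int(addr_count):
            raise ValueError(
                f"set {set_id}: expected {addr_count} addresses, got {len(addresses)}"
            )
        rows.append(AliasSetRow(int(dataset_id), int(set_id), name,
                                int(timestamp), dt, addresses))
    return rows


# Hypothetical row in the same layout as the example, with placeholder addresses.
find_text = """\
# dataset_id, set_id, dataset_name, timestamp*, datetime, addr_count, addresses
12 3316 itdk-20170828-midar 1503878400 2017-08-28T00:00:00 3 192.0.2.1 192.0.2.2 192.0.2.3
"""
find_rows = parse_find_output(find_text)
```

Rejecting a row whose count and address list disagree is a design choice of this sketch; it guards against a truncated download rather than reflecting any documented behaviour of the tool.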
The 'group' command takes a list of addresses and groups them into aliases based on known alias sets:

    $ ./aliasq-api group --dataset=itdk-20170828-midar 22.214.171.124 126.96.36.199 188.8.131.52 184.108.40.206 220.127.116.11 18.104.22.168 22.214.171.124
    result ID: 5
    $ ./aliasq-api get 5
    $ cat aliasq-5.out
    # dataset_id, set_id, dataset_name, addr_count, addresses
    12 3315 itdk-20170828-midar 2 126.96.36.199 188.8.131.52
    12 3316 itdk-20170828-midar 4 184.108.40.206 220.127.116.11 18.104.22.168 22.214.171.124
The 'group' command only prints out addresses that have at least one other alias among the target addresses.
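The filtering rule can be illustrated with a short sketch. This is not the tool's implementation, only one way to express the documented behaviour, and it assumes each address belongs to at most one alias set within a dataset. The function name `group_targets` and all addresses and set contents below are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def group_targets(alias_sets: Dict[int, Set[str]],
                  targets: Iterable[str]) -> Dict[int, List[str]]:
    """Group target addresses by alias set, keeping only groups of two or more."""
    # Reverse index: address -> ID of the alias set that contains it.
    owner = {addr: set_id for set_id, addrs in alias_sets.items() for addr in addrs}

    groups: Dict[int, List[str]] = defaultdict(list)
    for addr in dict.fromkeys(targets):  # drop duplicate targets, keep input order
        set_id = owner.get(addr)
        if set_id is not None:
            groups[set_id].append(addr)

    # A target with no other alias among the targets is omitted from the output.
    return {set_id: addrs for set_id, addrs in groups.items() if len(addrs) >= 2}


# Hypothetical alias sets and targets using documentation-range addresses.
known_sets = {
    3315: {"192.0.2.1", "192.0.2.2"},
    3316: {"198.51.100.1", "198.51.100.2", "198.51.100.3"},
    3317: {"203.0.113.1", "203.0.113.2"},
}
wanted = ["192.0.2.1", "192.0.2.2", "198.51.100.1",
          "203.0.113.1", "198.51.100.3", "203.0.113.99"]
grouped = group_targets(known_sets, wanted)
```

In this sketch `203.0.113.1` is dropped because its only known alias is not among the targets, and `203.0.113.99` is dropped because it appears in no alias set. That mirrors the example session, where seven targets yield only six grouped addresses.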
aliasq-api is written in Python 3, and queries execute quickly (the above 'find' and 'group' commands take <1 sec).
Getting access to aliasq-api
We currently limit access to academic researchers. Please send access requests, questions, or feedback to firstname.lastname@example.org using an email address from your institution (we don't accept access requests sent from general email providers such as Gmail or Yahoo).