DNS Damage - Measurements at a Root Server
Nevil Brownlee, CAIDA and Univ. of Auckland
kc claffy, CAIDA
Evi Nemeth, CAIDA and Univ. of Colorado
CAIDA is the Cooperative Association for Internet Data Analysis at the
San Diego Supercomputer Center on the UC San Diego campus.
DNS Background
Hierarchical namespace
13 Root servers
Have nameserver data for TLDs
Have EDU and some IN-ADDR.ARPA data
No longer have COM, NET, ORG
Not recursive, answers are mostly referrals
Query rates at the roots
5,000/sec at F, 12,000/sec at A
38,000/sec at A during recent DoS attack (saturated incoming bandwidth)
Locations of Root and gTLD Servers
Figure 1: Locations of the root nameservers and gTLD servers. The (x,y)
notation near the city names indicates the number of root servers (x)
followed by the number of gTLD servers (y) in that area. Notice the
large number of both types of servers around Washington D.C. and in
California.
Query Process
Client asks local nameserver
Local nameserver looks in its cache for answer
If it has the answer cached, just returns cached value (non-authoritative)
Else asks a root server for the answer
Root answers with a referral to the correct TLD server
Local nameserver asks the TLD server
TLD server answers with either a referral or an authoritative answer
Might continue several more steps
The machine asking the questions might be a site forwarder rather than the local nameserver
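The query process above can be sketched as a toy simulation. This is a minimal sketch, not real resolver code: the zone data, the server names (`root`, `tld-com`, `auth-example`) and the `ask` and `resolve` functions are all hypothetical, and nothing here touches the network.

```python
# Toy model of the query process. All server names and zone data are
# made up for illustration.

# Each "server" maps a domain suffix to either a referral or an answer.
SERVERS = {
    "root": {"com.": ("referral", "tld-com")},
    "tld-com": {"example.com.": ("referral", "auth-example")},
    "auth-example": {"www.example.com.": ("answer", "192.0.2.10")},
}


def ask(server, qname):
    """Return the entry for the longest suffix of qname that the server knows."""
    zone = SERVERS[server]
    matches = [s for s in zone if qname == s or qname.endswith("." + s)]
    if not matches:
        return ("nxdomain", None)
    return zone[max(matches, key=len)]


def resolve(qname, cache, max_steps=10):
    """Resolve qname as a local nameserver would: cache first, then the root."""
    if qname in cache:
        return cache[qname], ["cache"]      # non-authoritative cached value
    server, path = "root", []
    for _ in range(max_steps):
        path.append(server)
        kind, value = ask(server, qname)
        if kind == "referral":
            server = value                  # follow the referral one level down
        elif kind == "answer":
            cache[qname] = value            # remember the answer for next time
            return value, path
        else:
            return None, path               # the name does not exist
    return None, path
```

With an empty cache, `resolve("www.example.com.", cache)` walks root, TLD server and authoritative server in turn; a second call with the same cache never reaches the root. A name such as `loghost.` stops at the root, which is exactly the traffic the later slides count as bogus.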
Measurements
Measurements done on the F root server
F is two DEC Alphas, load-balanced by a Cisco router (CEF)
Each with dual processors, 4GB memory, lots of disk
Used netstat for 2 weeks to get query load
Used tcpdump on each copy of F to gather traces (1 hour = 6GB)
tcpdump -w froot1.date -n -s 600 udp port 53
tcpdump -w froot2.date -n -s 600 udp port 53
-w writes output to a file
-n says not to do IP address to hostname translations
-s 600 says get 600 bytes (includes whole query)
udp port 53 grabs only dns request/response packets
Query Rate at F Root Servers
Figure 7: Query load at the two F root servers F0 and F1; F1 is
plotted with negative values to display it on the same plot. Black is
the input packet rate and grey is the output packet rate (6-16 Jan 2001);
5-minute bins.
Query Rate at F Root Servers
Curves show both F roots, one plotted as negative numbers
Difference between input and output is bogus queries
From RFC 1918 private address space, response not routable
About RFC 1918 addresses, no answer possible
F root's filters were removed for the measurements
F responds almost immediately to most queries
93% of the queries with no filtering, 97% with filtering
Remaining queries were unanswerable
Used source port 0 (an error)
Had byte-order error claiming 256 queries per packet
Tried to dynamically update the root's zones
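The "256 queries per packet" symptom is what a byte-swapped question count looks like. The DNS header carries QDCOUNT as a 16-bit field in network (big-endian) byte order; a minimal sketch, assuming the broken sender wrote the value 1 in little-endian order instead (the function name is hypothetical):

```python
import struct


def qdcount_seen_by_server(sender_format):
    """Pack QDCOUNT = 1 as the sender would, then read it as the server does."""
    wire = struct.pack(sender_format, 1)    # two bytes on the wire
    return struct.unpack("!H", wire)[0]     # server reads network byte order


correct = qdcount_seen_by_server("!H")      # sender used network byte order
swapped = qdcount_seen_by_server("<H")      # sender used little-endian order
print(correct, swapped)
```

The correct sender's count reads back as 1; the little-endian sender's bytes `01 00` read back as 0x0100, which is 256.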
Query Types
Distribution of query types from 1 hour sample trace file
type class #queries %queries
-------------------------------
A IN 2752516 56.8
PTR IN 1467887 30.2
MX IN 257810 5.3
NS IN 117803 2.4
SOA IN 113449 2.3
ANY IN 63361 1.3
SRV IN 34033 .7
AAAA IN 12439 .3 (about 100 A6 queries)
CNAME IN 12333 .3
...
882 29793 1192 .02
1379 26729 1088 .02
...
.2% of the queries were for unknown types or classes
180 different variations, from many different servers
F Root Server Data Sets (tcpdump)
Sample               Size     # queries  # distinct queries (%)  Date/time captured
-------------------------------------------------------------------------------------------------
1 weekend hour       3.6 GB   10.3 M     2.7 M (26.2%)           Sunday, Jan 7, 11am
1 weekday hour       5.9 GB   18.0 M     4.8 M (26.7%)           Tuesday, Jan 9, 3pm
2 weekday hours      10.4 GB  29.1 M     4.5 M (15.5%)           Monday, Jan 8, 1pm
2M packets (~4 min)  338 MB   1 M        380,000 (37.9%)         Wed. Jan 10, hourly 10am-9pm
4M packets (~8 min)  690 MB   2 M        622,000 (31.2%)         Jan 12, 17, 18, 19, 24, 2-4 times/day
Full day would need 30GB on each F, had only 35GB
Processing requires both tracefiles on the same machine
Scary to fill a root server's disks
Tried to copy files to another machine; NFS performance killed me, so gave up
Shorter samples are usually representative
Super Perl Script
Merged the two tcpdump outputs
Sorted output by frequency of query-querier pairs
Top of the list are totally broken machines
Asking the same question hundreds of times a second
Don't take NO for an answer, don't understand referrals or SERVFAIL
Sample output
365326 202.204.32.111.53 A fs.dai.net.
227516 209.88.184.77.53 A axiom.com.
193886 216.76.46.22.53 A www.miamimetrozoo.com.
192118 202.17.127.70.53 PTR 66.64.194.17.211.in-addr.arpa.
131286 129.78.64.1.32842 PTR 254.149.201.211.in-addr.arpa.
97278 63.82.193.252.53 PTR 74.192.117.63.in-addr.arpa.
95778 63.82.193.253.53 PTR 74.192.117.63.in-addr.arpa.
81288 208.155.20.3.1466 MX aol.com.
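The counting step of such a script is simple to sketch. The following is not the original Perl; it is a minimal Python sketch that assumes the merged traces have already been decoded into one line per query of the form `querier.port type name` (that input format and both function names are assumptions).

```python
from collections import Counter


def rank_pairs(lines):
    """Count identical (querier, type, name) triples, most frequent first."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue                        # skip lines that do not parse
        counts[tuple(fields)] += 1
    return counts.most_common()


def format_report(ranked):
    """Render rows in the 'count querier type name' layout of the sample output."""
    return [f"{n} {querier} {qtype} {qname}"
            for (querier, qtype, qname), n in ranked]
```

For example, feeding it two copies of `192.0.2.1.53 A fs.dai.net.` and one of `192.0.2.7.1024 MX aol.com.` puts the repeated pair first with a count of 2.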
Repeated Queries
Script shows top offenders at front of output file
564028 IP1.mil PTR 65.224.102.166.in-addr.arpa.
374679 IP2.mil PTR 65.224.102.166.in-addr.arpa.
Dynamic duo of .mil nameservers pummeling the root (9.1% of the queries)
Got a referral to a nameserver that returns SERVFAIL
Who is doing it? Running nmap against the top 37 repeat offenders yields:
12 down at the time of the nmap run
3 could not be identified, 2 unknown
1 AIX v4.2, Cobalt Linux 4.0
5 Solaris 2.6 - 2.7, Solaris 7
13 Windows NT4 / Win95 / Win98 / Win2k
What percentage of queries are repeats (same query from the same querier)?
4-minute trace 1M queries 62% repeats
8-minute 2M 69%
1 hour 10-18M 74%
2 hours 29M 85%
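These percentages are consistent with the distinct-query counts in the data sets table, assuming a "repeat" is any query beyond the first occurrence of a querier/query pair (that reading is an assumption, and the function name is hypothetical):

```python
def percent_repeats(total_queries, distinct_queries):
    """Share of queries that duplicate an earlier querier/query pair."""
    return 100.0 * (total_queries - distinct_queries) / total_queries


# Totals and distinct counts as listed for the F root data sets.
for label, total, distinct in [
    ("4-minute trace", 1_000_000, 380_000),
    ("8-minute trace", 2_000_000, 622_000),
    ("2 hours", 29_100_000, 4_500_000),
]:
    print(f"{label}: {percent_repeats(total, distinct):.1f}% repeats")
```

The longer the trace, the larger the repeated share, since a broken querier keeps adding to the total without adding new distinct pairs.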
Error Taxonomy, Bogus A Queries
Malformed A queries were 14% of the load
A queries should ask for the IP address of a hostname
These ask for the IP address of an IP address
Hard to track down, nameservers just relay clients' queries
Can't see back to the actual client that asked the question
11396 209.64.54.1.2028 A 209.64.54.1.
2838 207.104.128.40.53 A 207.104.128.4.
2798 216.77.6.152.53 A 172.16
2754 207.104.128.40.53 A 207.104.128.40.
2338 206.170.46.10.53 A 206.170.46.10.
2334 206.170.46.10.53 A 206.13.28.11.
2268 204.180.41.2.1067 A 204.180.41.2.
1998 192.53.35.140.53 A 127.0.0.1.
1970 216.146.96.29.2037 A 216.220.0.1.
1968 38.9.202.2.39131 A 38.200.192.70.
Culprits: Microsoft (Win2k resolver), several viruses, and Mac OS X too
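Spotting this class of bogus query mechanically is straightforward: the query name parses as an IPv4 address. A minimal sketch using Python's standard `ipaddress` module (the function name is hypothetical):

```python
import ipaddress


def is_ip_literal(qname):
    """True if the query name is a dotted-quad IPv4 address, not a hostname."""
    try:
        ipaddress.IPv4Address(qname.rstrip("."))
    except ipaddress.AddressValueError:
        return False
    return True
```

This catches names like `209.64.54.1.` and `127.0.0.1.`. Truncated forms such as `172.16` are not full addresses and would need a looser test.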
Error Taxonomy, Bogus TLDs
20% of the queries asked for a nonexistent TLD
Includes the A queries above
Includes lots of internal Microsoft names (active directory)
Lots ending in .local, .localhost, .workgroup, .msft, .domain, etc.
Example
21408 194.18.224.226.2 A loghost.
16322 195.127.121.14.1330 A localhost.
6100 206.141.239.142.52760 A HP_NETWORK_PRINTERS.
5548 10.70.1.1.53 A localhost.
4690 192.168.29.6.1071 Type33 _ldap._tcp.gc._msdcs.AD01.
4430 192.168.29.6.1071 Type33 _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.AD01.
2142 192.168.29.6.1071 Type33 _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.SHFEX01.
2134 192.168.29.6.1071 Type33 _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.SHFEX02.
2040 192.168.29.6.1071 Type33 _ldap._tcp.gc._msdcs.SHFEX01.
2038 192.168.29.6.1071 Type33 _ldap._tcp.gc._msdcs.SHFEX02.
1362 212.144.192.21.64697 Type33 _ldap._tcp.karlshorst-net-noc._sites.gc._msdcs.FAST.
920 192.192.2.4.1109 Type33 _ldap._tcp.Default-First-Site-Name._sites.dc._msdcs.brainiac.
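A bogus-TLD check only needs the last label of the query name and a list of delegated TLDs. A minimal sketch; `VALID_TLDS` here is a small illustrative subset, not the real root zone contents, and the function names are hypothetical:

```python
# Illustrative subset only; a real check would load the full list of TLDs.
VALID_TLDS = {"com", "net", "org", "edu", "gov", "mil", "int", "arpa", "nz", "uk"}


def tld_of(qname):
    """Return the rightmost label of a query name, lowercased ('' for the root)."""
    labels = [label for label in qname.rstrip(".").split(".") if label]
    return labels[-1].lower() if labels else ""


def has_bogus_tld(qname, valid=VALID_TLDS):
    """True if the name ends in a label that is not a known TLD."""
    tld = tld_of(qname)
    return bool(tld) and tld not in valid
```

Single-label names such as `loghost.`, Active Directory names ending in `AD01.`, and IP-address names ending in a numeric label all fail this check, which is why the 20% figure includes the bogus A queries.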
Error Taxonomy, Private Address Space
Private address space is for internal use only
10.0.0.0/8
172.16.0.0/12 (172.16.0.0 - 172.31.255.255)
192.168.0.0/16
Should never sneak out onto the Internet
But they do, as source addresses and as query targets:
4690 192.168.29.6.1071 Type33 _ldap._tcp.gc._msdcs.AD01.
4430 192.168.29.6.1071 Type33 _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.AD01.
4228 192.168.10.4.53 A A.ROOT-SERVERS.NET.
4224 192.168.10.4.53 A M.ROOT-SERVERS.NET.
34856 148.233.129.53.53 PTR 19.1.1.10.in-addr.arpa.
10068 204.71.61.185.767 PTR 1.5.170.10.in-addr.arpa.
9694 195.224.132.250.53 PTR 82.3.168.192.in-addr.arpa.
8952 208.35.134.126.55058 PTR 15.1.16.172.in-addr.arpa.
8610 206.64.46.17.53 PTR 12.0.10.10.in-addr.arpa.
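Both leaks can be detected from the trace fields alone. A minimal sketch using Python's standard `ipaddress` module, assuming the `address.port` querier format of the sample output; the helper names are hypothetical. The middle RFC 1918 block is written as 172.16.0.0/12, which covers 172.16.0.0 through 172.31.255.255.

```python
import ipaddress

RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_rfc1918(addr):
    """True if addr falls in one of the three private IPv4 blocks."""
    ip = ipaddress.IPv4Address(addr)
    return any(ip in net for net in RFC1918)


def source_address(querier):
    """Strip the trailing port from a '192.168.29.6.1071' style field."""
    return querier.rsplit(".", 1)[0]


def ptr_target(qname):
    """Turn '19.1.1.10.in-addr.arpa.' into '10.1.1.19'; None if malformed."""
    name = qname.rstrip(".").lower()
    suffix = ".in-addr.arpa"
    if not name.endswith(suffix):
        return None
    octets = name[: -len(suffix)].split(".")
    if len(octets) != 4:
        return None
    return ".".join(reversed(octets))
```

`is_rfc1918(source_address(...))` flags private source addresses, and `is_rfc1918(ptr_target(...))` flags PTR queries about private addresses. Names with the wrong number of octets return `None` from `ptr_target` and need separate handling.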
Syslog Errors
Two kinds of errors logged most frequently
Source port 0
Jan 6 13:39:19 drew named[128838]: dropping source port zero packet from [216.161.67.226].0
Jan 6 13:39:23 drew named[128838]: dropping source port zero packet from [63.224.229.252].0
Jan 6 13:39:25 drew named[128838]: dropping source port zero packet from [63.227.214.187].0
Update denied
Jan 6 13:40:28 drew named[128838]: denied update from [24.64.63.195].41151 for in-addr.arpa
Jan 6 13:40:47 drew named[128838]: denied update from [24.64.63.195].41858 for in-addr.arpa
Dynamic updates let DHCP server update DNS database
OK for local networks using automatic configuration
Never OK for the PC on your desk to try to update the root zone
Win2k shipped with default configuration trying to update roots
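Tallying these two error kinds from a syslog file is a matter of matching the message text. A minimal sketch; the regular expressions are written against the sample lines shown on this slide and may not cover every message variant, and the names are hypothetical:

```python
import re
from collections import Counter

PORT_ZERO = re.compile(r"dropping source port zero packet from \[([\d.]+)\]\.0")
UPDATE_DENIED = re.compile(r"denied update from \[([\d.]+)\]\.(\d+) for (\S+)")


def classify(lines):
    """Count source-port-0 drops, denied updates, and everything else."""
    counts = Counter()
    for line in lines:
        if PORT_ZERO.search(line):
            counts["port0"] += 1
        elif UPDATE_DENIED.search(line):
            counts["update"] += 1
        else:
            counts["other"] += 1
    return counts
```

Binning the same counts by day or week would give a time series of the two error kinds.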
Graph of Syslog Errors Over a Year
Figure 8: Errors logged at the F root server, showing the number
of packets with source port 0 (dropped) and the number of attempts to
dynamically update the F root's zone data.
Summary of Errors Seen
Crib sheet for column headings of giant table
rfc1918 queries from hosts in RFC 1918 private address space
rfc1918? queries for the hostname of an RFC 1918 address
A+IP queries with IP address target instead of a hostname
TLD queries for a record in an invalid top level domain
windows queries about Microsoft document system names (msdcs)
top10 top 10 src/query pairs in trace, repeated query bugs
top100 top 100 src/query pairs in trace, repeated query bugs
>1/min queries repeated more than once a minute
Giant table itself
trace rfc1918 rfc1918? A+IP TLD windows top10 top100 >1/min
----------------------------------------------------------------------
jan7.11a 2.5 2.0 12.0/11.7 19.6/37.1 1.79 15.7 23.7 51.7
jan8.1p 2.6 6.1 14.9/13.1 23.1/37.4 1.36 5.3 14.1 44.0
jan9.3p 3.0 6.2 12.2/12.7 20.0/36.9 1.38 18.0 23.9 50.0
----------------------------------------------------------------------
jan10.10a 3.4 7.7 12.6/13.3 22.3/31.7 1.58 5.9 14.9 24.8
jan10.11a 3.3 8.1 13.3/13.6 23.5/31.8 1.56 6.6 13.8 23.6
jan10.12p 3.6 7.4 13.4/13.8 23.7/31.9 1.91 7.9 15.4 25.1
jan10.1p 3.1 6.9 14.0/13.9 24.6/32.1 1.42 7.9 16.9 26.6
jan10.2p 3.5 7.2 14.5/14.0 25.2/32.6 1.48 6.0 13.9 24.0
jan10.3p 3.4 6.8 14.5/14.0 25.6/33.3 1.55 6.3 15.0 25.9
jan10.4p 3.1 9.7 14.6/14.1 26.1/33.7 1.55 6.2 13.1 22.5
jan10.5p 3.3 10.0 15.5/14.2 25.8/33.4 1.59 6.9 13.6 22.0
jan10.6p 3.1 9.2 17.9/14.7 28.0/34.5 1.80 5.7 12.5 22.9
jan10.7p 3.4 5.7 18.5/14.9 29.1/35.0 1.70 5.2 12.6 23.9
jan10.8p 3.5 6.7 18.7/15.1 29.4/35.2 1.65 5.7 13.3 24.9
jan10.9p 3.3 8.2 18.7/14.8 29.7/35.1 1.74 5.6 13.3 24.8
----------------------------------------------------------------------
jan12.9a 2.8 8.4 13.5/13.2 23.4/33.2 1.45 3.1 9.4 26.1
jan12.5p 2.5 7.0 16.3/13.3 25.7/34.5 1.64 6.8 15.6 30.3
jan17.11 3.1 7.9 13.9/13.0 22.5/32.2 1.44 3.9 10.7 24.9
jan17.4p 3.4 11.5 16.6/13.3 25.7/33.4 1.73 2.9 8.5 23.5
jan18.6p 3.0 8.1 13.6/13.9 20.5/34.4 1.28 12.1 21.5 38.4
jan18.10p 0.03 6.3 14.1/13.6 21.4/37.0 1.34 12.4 22.3 40.1
jan19.2p 0.0 4.3 13.0/13.6 20.3/33.2 1.36 11.9 20.2 36.2
----------------------------------------------------------------------
jan24.10a 0.0 4.2 14.3/13.9 25.2/37.9 1.87 3.0 8.3 22.7
jan24.2p 0.0 4.8 12.4/12.6 20.6/34.2 1.46 3.9 8.2 25.3
jan24.5p 0.0 10.7 14.5/12.6 22.9/34.1 1.60 3.6 10.3 27.7
jan24.9p 0.0 5.5 14.6/12.3 23.1/35.4 1.85 4.4 13.0 34.2
----------------------------------------------------------------------
E.jan25 7.7 5.8 6.9/10.5 14.3/29.0 0.97 6.1 12.7 26.0
M.jan24 1.3 3.3 12.7/21.4 22.3/51.5 1.09 12.6 27.3 49.3
Table 3: Taxonomy of bogus queries received at the F root nameserver
during the month of January 2001 and at the E and M root servers on
a single day in January.
Attacks
Denial of service attacks often use the DNS as reflectors
Two attacks were ongoing during our measurements
One on a register.com client, using spoofed source IPs
One trolling for hostnames but confused about PTR records
Examples
1826 209.67.50.220.1865 A M.ROOT-SERVERS.NET.
1826 209.67.50.220.1865 A K.ROOT-SERVERS.NET.
1826 209.67.50.220.1865 A I.ROOT-SERVERS.NET.
1826 209.67.50.220.1865 A F.ROOT-SERVERS.NET.
...
1824 209.67.50.220.1865 A A.ROOT-SERVERS.NET.
6 199.170.0.2.1024 PTR 54.11.193.155.in-addr.arpa.
5 199.170.0.2.1024 PTR 54.8.235.158.in-addr.arpa.
5 199.170.0.2.1024 PTR 54.3.188.143.in-addr.arpa.
5 199.170.0.2.1024 PTR 54.19.211.129.in-addr.arpa.
5 199.170.0.2.1024 PTR 54.19.109.149.in-addr.arpa.
5 199.170.0.2.1024 PTR 54.13.198.137.in-addr.arpa.
Microsoft's DNS Woes
Microsoft's 4 authoritative nameservers visible to the outside world were on one subnet
They misconfigured the router upstream of that subnet
TTL for their names set to 2 hours
Started timing out of people's caches
Query load at the roots started climbing
Microsoft properties usually about 6k queries/hour, 0% of the load
Increased to 25% of the load at F
High visibility site with DNS problems affects the whole Internet infrastructure
Conclusions
Performance of the root servers amazing given the bogus query load
Non-stop repeated queries
Bogus A queries (14%)
Bogus TLDs (20%), internal names leaking out to the Internet
Negative caching could help
BIND8 and 9 include it
Win2k the first Microsoft product to sort of implement it
Services like Akamai changing how DNS is used
Developers need to test how their products behave on failures, not just on successful lookups
Overlays like new.net will have an impact if they succeed
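Negative caching, listed above as a possible remedy, means remembering that a lookup failed so the same question is not sent upstream again until a timer expires. A minimal sketch, assuming a fixed TTL and an explicit clock argument; the class and function names are hypothetical, and this is not how BIND implements it.

```python
class NegativeCache:
    """Remember failed (name, type) lookups for ttl_seconds."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._expires = {}

    def remember_failure(self, qname, qtype, now):
        self._expires[(qname.lower(), qtype)] = now + self.ttl

    def is_known_bad(self, qname, qtype, now):
        key = (qname.lower(), qtype)
        expiry = self._expires.get(key)
        if expiry is None:
            return False
        if now >= expiry:
            del self._expires[key]          # entry timed out, ask again
            return False
        return True


def queries_sent_upstream(events, cache):
    """events: (time, qname, qtype) tuples for lookups that always fail."""
    sent = 0
    for now, qname, qtype in events:
        if cache.is_known_bad(qname, qtype, now):
            continue                        # answered from the negative cache
        sent += 1                           # this one would reach the root
        cache.remember_failure(qname, qtype, now)
    return sent
```

A client asking for `loghost.` once a second for 100 seconds sends 100 queries upstream with no negative cache, but only 2 with a 60-second negative TTL.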