DNS Measurements





DNS Damage - Measurements at a Root Server





Nevil Brownlee, CAIDA and Univ. of Auckland


kc claffy, CAIDA


Evi Nemeth, CAIDA and Univ. of Colorado













CAIDA is the Cooperative Association for Internet Data Analysis at the San Diego Supercomputer Center on the UC San Diego campus.






DNS Background

  • Hierarchical namespace

  • 13 root servers

    • Have nameserver data for TLDs

    • Have EDU and some IN-ADDR.ARPA data

    • No longer have COM, NET, ORG

    • Not recursive, answers are mostly referrals

  • Query rates at the roots

    • 5,000/sec at F, 12,000/sec at A

    • 38,000/sec at A during recent DoS attack (saturated incoming bandwidth)











Locations of Root and gTLD Servers




Figure 1: Locations of the root nameservers and gTLD servers. The (x,y) notation near the city names indicates the number of root servers (x) followed by the number of gTLD servers (y) in that area. Notice the large number of both types of servers around Washington D.C. and in California.








Query Process

  • Client asks local nameserver

  • Local nameserver looks in its cache for answer

    • If it has the answer cached, just returns cached value (non-authoritative)

    • Else asks a root server for the answer

    • Root answers with a referral to the correct TLD server

    • Local nameserver asks the TLD server

    • TLD server answers with either a referral or an authoritative answer

  • Might continue several more steps

  • Might be a site forwarder, not the local nameserver, asking the questions
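
The lookup path above can be sketched as a toy model. This is only an illustration of the cache, referral, and answer steps: the zone contents, the server labels (`tld-com`, `ns-example`), and the `resolve` helper are all invented for the example, and a real resolver speaks the DNS wire protocol rather than reading dictionaries.

```python
# Toy model of the lookup path. All zone data below is hypothetical.

ROOT = {"com": "tld-com"}                        # root only hands out TLD referrals
SERVERS = {
    "tld-com": {"example.com": "ns-example"},    # TLD refers to the site's nameserver
    "ns-example": {"www.example.com": "192.0.2.10"},  # authoritative data
}

cache = {}

def resolve(name):
    """Return (answer, trail), where trail lists the servers consulted."""
    if name in cache:
        return cache[name], ["cache"]            # non-authoritative cached answer
    trail = ["root"]
    tld = name.rsplit(".", 1)[-1]
    server = ROOT.get(tld)                       # referral from the root, if any
    while server is not None:
        trail.append(server)
        zone = SERVERS[server]
        if name in zone:                         # authoritative answer
            cache[name] = zone[name]
            return zone[name], trail
        # otherwise follow the first referral whose suffix matches the name
        server = next((ns for suffix, ns in zone.items()
                       if name.endswith("." + suffix)), None)
    return None, trail
```

The first call to `resolve("www.example.com")` walks root, TLD, and site nameserver; a second call is served from the cache. A name under a TLD the root does not know stops at the root, which is the case the later slides on bogus TLDs are about.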











Measurements

  • Measurements done on the F root server

    • F is two DEC Alphas, load balanced with a Cisco router (CEF)

    • Each with dual processors, 4GB memory, lots of disk

  • Used netstat for 2 weeks to get query load

  • Used tcpdump on each copy of F to gather traces (1 hour = 6GB)

    • tcpdump -w froot1.date -n -s 600 udp port 53

    • tcpdump -w froot2.date -n -s 600 udp port 53

      • -w writes output to a file

      • -n says not to do IP address to hostname translations

      • -s 600 captures the first 600 bytes of each packet (enough for the whole query)

      • udp port 53 grabs only DNS request/response packets







Query Rate at F Root Servers




Figure 7: Query load at the two F root servers F0 and F1; F1 is plotted with negative values to display it on the same plot. Black is the input packet rate and grey is the output packet rate (6-16 Jan 2001); 5-minute bins.








Query Rate at F Root Servers

  • Curves show both F roots, one plotted as negative numbers

  • Difference between input and output is bogus queries

    • Queries from RFC 1918 private address space: response not routable

    • Queries about RFC 1918 addresses: no answer possible

    • F root's filters were removed for measurements

  • F responds almost immediately to most queries

    • 93% of the queries with no filtering, 97% with filtering

  • Remaining queries were unanswerable

    • Used source port 0 (an error)

    • Had byte-order error claiming 256 queries per packet

    • Tried to dynamically update the root's zones
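
The "256 queries per packet" symptom is what a question count of 1 looks like when its two bytes are read or written in the wrong order. A minimal sketch, assuming the standard DNS header layout of six 16-bit fields in network byte order; the `qdcount` helper and the sample header values are made up for the illustration, and the slides do not say which software produced the error.

```python
import struct

# A DNS header is six 16-bit fields in network (big-endian) byte order:
# ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
header = struct.pack("!6H", 0x1234, 0x0100, 1, 0, 0, 0)   # one question

def qdcount(packet, byte_order="!"):
    """Read QDCOUNT (bytes 4-5 of the header) with the given byte order."""
    return struct.unpack_from(byte_order + "H", packet, 4)[0]

correct = qdcount(header)        # network order, as the protocol requires
swapped = qdcount(header, "<")   # little-endian misreading of the same bytes
```

Here `correct` is 1 and `swapped` is 256, because the bytes `00 01` become `0x0100` when the order is reversed.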










Query Types

  • Distribution of query types from 1 hour sample trace file

    	type  class  #queries  %queries
    	-------------------------------
    	  A     IN    2752516    56.8
    	  PTR   IN    1467887    30.2
    	  MX    IN     257810     5.3
    	  NS    IN     117803     2.4
    	  SOA   IN     113449     2.3
    	  ANY   IN      63361     1.3
    	  SRV   IN      34033      .7
    	  AAAA  IN      12439      .3  (about 100 A6 queries)
    	  CNAME IN      12333      .3
    	  ...
    	  882   29793    1192      .02
    	  1379  26729    1088      .02
    	  ...
    	
  • 0.2% of the queries were for unknown types or classes

  • 180 different variations, from many different servers











F Root Server Data Sets (tcpdump)

Sample               Size     # queries  # distinct queries (%)  Date/time captured
----------------------------------------------------------------------------------------------------
1 weekend hour       3.6 GB   10.3 M     2.7 M (26.2%)           Sunday, Jan 7, 11am
1 weekday hour       5.9 GB   18.0 M     4.8 M (26.7%)           Tuesday, Jan 9, 3pm
2 weekday hours      10.4 GB  29.1 M     4.5 M (15.5%)           Monday, Jan 8, 1pm
2M packets (~4 min)  338 MB   1 M        380,000 (37.9%)         Wed. Jan 10, hourly 10am-9pm
4M packets (~8 min)  690 MB   2 M        622,000 (31.2%)         Jan 12, 17, 18, 19, 24, 2-4 times/day

Table 1: Root Nameserver Data Collection Regime
  • Full day would need 30GB on each F, had only 35GB

    • Processing requires both tracefiles on the same machine

    • Scary filling a root server's disks

    • Tried to copy files to another machine, NFS performance killed me, gave up

  • Shorter samples are usually representative










Super Perl Script

  • Merged the two tcpdump outputs

  • Sorted output by frequency of query-querier pairs

    • Top of the list are totally broken machines

    • Asking the same question hundreds of times a second

    • Don't take NO for an answer, don't understand referrals or SERVFAIL

  • Sample output

    	365326  202.204.32.111.53 A fs.dai.net.
    	227516  209.88.184.77.53 A axiom.com.
    	193886  216.76.46.22.53 A www.miamimetrozoo.com.
    	192118  202.17.127.70.53 PTR 66.64.194.17.211.in-addr.arpa.
    	131286  129.78.64.1.32842 PTR 254.149.201.211.in-addr.arpa.
    	97278   63.82.193.252.53 PTR 74.192.117.63.in-addr.arpa.
    	95778   63.82.193.253.53 PTR 74.192.117.63.in-addr.arpa.
    	81288   208.155.20.3.1466 MX aol.com.
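
The merge-and-rank step can be sketched in a few lines. This is not the original Perl script: it is a minimal Python illustration that assumes each trace has already been reduced to text lines of the form `querier.port qtype qname`, and the names `count_pairs` and `report` are invented here.

```python
from collections import Counter

def count_pairs(*traces):
    """Count (querier, qtype, qname) triples across any number of traces.

    Each trace is an iterable of text lines in the assumed form
    'querier.port qtype qname', e.g. '192.0.2.1.53 A example.com.'.
    """
    counts = Counter()
    for trace in traces:
        for line in trace:
            fields = line.split()
            if len(fields) == 3:          # skip anything malformed
                counts[tuple(fields)] += 1
    return counts

def report(counts, top=10):
    """Format the most frequent pairs, highest count first."""
    return [f"{n}  {src} {qtype} {qname}"
            for (src, qtype, qname), n in counts.most_common(top)]
```

Passing both F traces to `count_pairs` merges them implicitly, since the counter does not care which machine saw a given query.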
    	







Repeated Queries

  • Script shows top offenders at front of output file

    564028  IP1.mil PTR 65.224.102.166.in-addr.arpa.
    374679  IP2.mil PTR 65.224.102.166.in-addr.arpa. 
    • Dynamic duo of .mil nameservers pummeling the root (9.1% of the queries)

    • Got a referral to a nameserver that returns SERVFAIL

  • Who is doing it? Running nmap against the top 37 repeat offenders yields:

    12  down at the time of the nmap run
     3  could not be identified, 2 unknown
     1  AIX v4.2, Cobalt Linux 4.0
     5  Solaris 2.6 - 2.7, Solaris 7
    13  Windows NT4 / Win95 / Win98 / Win2k
    
  • What percentage of queries are repeats (same query from the same querier)?

    4-minute trace     1M queries     62% repeats
    8-minute           2M             69%
    1 hour             10-18M         74%
    2 hours            29M            85% 
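
These percentages appear to be the complement of the distinct-query share in Table 1: for the weekend hour, 100 - 26.2 = 73.8, which rounds to the 74% shown, and for the two-hour trace 100 - 15.5 = 84.5, which rounds to 85%. A minimal sketch of that calculation, assuming a repeat means any query beyond the first occurrence of a (querier, question) pair; the function name is invented for the example.

```python
def repeat_percentage(queries):
    """Percentage of queries that repeat an earlier (querier, question) pair."""
    queries = list(queries)
    if not queries:
        return 0.0
    distinct = len(set(queries))
    return 100.0 * (len(queries) - distinct) / len(queries)
```

Longer traces give repeat offenders more time to repeat themselves, which is consistent with the percentage rising with trace length.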









Error Taxonomy, Bogus A Queries

    • Malformed A queries were 14% of the load

      • A queries should ask for the IP address of a hostname

      • These ask for the IP address of an IP address

      • Hard to track down, nameservers just relay clients' queries

      • Can't see back to the actual client that asked the question

      	11396   209.64.54.1.2028 A 209.64.54.1.
      	2838    207.104.128.40.53 A 207.104.128.4.
      	2798    216.77.6.152.53 A 172.16
      	2754    207.104.128.40.53 A 207.104.128.40.
      	2338    206.170.46.10.53 A 206.170.46.10.
      	2334    206.170.46.10.53 A 206.13.28.11.
      	2268    204.180.41.2.1067 A 204.180.41.2.
      	1998    192.53.35.140.53 A 127.0.0.1.
      	1970    216.146.96.29.2037 A 216.220.0.1.
      	1968    38.9.202.2.39131 A 38.200.192.70. 
    • Microsoft guilty (Win2k resolver), several viruses, Mac OS X too
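
Spotting these queries in a trace is simple, since the query name consists entirely of numeric labels. A minimal sketch of such a check; the function name is invented here, and the test is deliberately loose so that partial addresses like the `172.16` entry in the sample are caught as well.

```python
def looks_like_ip_literal(qname):
    """True if every label of the query name is a decimal number."""
    labels = qname.rstrip(".").split(".")
    return all(label.isascii() and label.isdigit() for label in labels)
```

For example, `looks_like_ip_literal("209.64.54.1.")` is true while `looks_like_ip_literal("axiom.com.")` is false.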










Error Taxonomy, Bogus TLDs

    • 20% of the queries asked for a nonexistent TLD

      • Includes the A queries above

      • Includes lots of internal Microsoft names (Active Directory)

      • Lots ending in .local, .localhost, .workgroup, .msft, .domain, etc.

    • Example

      21408   194.18.224.226.2 A loghost.
      16322   195.127.121.14.1330 A localhost.
      6100    206.141.239.142.52760 A HP_NETWORK_PRINTERS.
      5548    10.70.1.1.53 A localhost.
      4690    192.168.29.6.1071 Type33 _ldap._tcp.gc._msdcs.AD01.
      4430    192.168.29.6.1071 Type33 _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.AD01.
      2142    192.168.29.6.1071 Type33 _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.SHFEX01.
      2134    192.168.29.6.1071 Type33 _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.SHFEX02.
      2040    192.168.29.6.1071 Type33 _ldap._tcp.gc._msdcs.SHFEX01.
      2038    192.168.29.6.1071 Type33 _ldap._tcp.gc._msdcs.SHFEX02.
      1362    212.144.192.21.64697 Type33 _ldap._tcp.karlshorst-net-noc._sites.gc._msdcs.FAST.
      920     192.192.2.4.1109 Type33 _ldap._tcp.Default-First-Site-Name._sites.dc._msdcs.brainiac.  
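
Classifying a query as a bogus-TLD query only needs the rightmost label of the name. A minimal sketch: `KNOWN_TLDS` is a small illustrative subset chosen for the example, not the real list, and a real check would load the current root zone instead.

```python
# Illustrative subset only; a real check would load the current root zone.
KNOWN_TLDS = {"com", "net", "org", "edu", "gov", "mil", "int", "arpa"}

def tld_of(qname):
    """Return the rightmost label of a query name, lower-cased."""
    return qname.rstrip(".").rsplit(".", 1)[-1].lower()

def has_bogus_tld(qname, known=KNOWN_TLDS):
    """True if the name's rightmost label is not a known TLD."""
    return tld_of(qname) not in known
```

Single-label names such as `loghost.` are their own TLD under this rule, which is why unqualified hostnames show up in this category.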







Error Taxonomy, Private Address Space

    • Private address space is for internal use only

      • 10.0.0.0/8

      • 172.16.0.0/12

      • 192.168.0.0/16

      • Should never sneak out onto the Internet

    • But they do, as source addresses and as query targets:

      4690    192.168.29.6.1071 Type33 _ldap._tcp.gc._msdcs.AD01.
      4430    192.168.29.6.1071 Type33 _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.AD01.
      4228    192.168.10.4.53 A A.ROOT-SERVERS.NET.
      4224    192.168.10.4.53 A M.ROOT-SERVERS.NET.
      
      34856   148.233.129.53.53 PTR 19.1.1.10.in-addr.arpa.
      10068   204.71.61.185.767 PTR 1.5.170.10.in-addr.arpa.
      9694    195.224.132.250.53 PTR 82.3.168.192.in-addr.arpa.
      8952    208.35.134.126.55058 PTR 15.1.16.172.in-addr.arpa.
      8610    206.64.46.17.53 PTR 12.0.10.10.in-addr.arpa.
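
Both cases, private source addresses and PTR queries about private addresses, can be checked with the standard `ipaddress` module. A minimal sketch, assuming the tcpdump-style `a.b.c.d.port` querier field shown in the samples; the helper names are invented here, and the three networks are the RFC 1918 blocks.

```python
import ipaddress

RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def is_rfc1918(addr):
    """True if a dotted-quad string falls inside an RFC 1918 block."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)

def source_address(querier):
    """Strip the trailing '.port' from a tcpdump-style 'a.b.c.d.port' field."""
    return querier.rsplit(".", 1)[0]

def ptr_target(qname):
    """Turn 'd.c.b.a.in-addr.arpa.' back into 'a.b.c.d', or None if malformed."""
    name = qname.rstrip(".").lower()
    suffix = ".in-addr.arpa"
    if not name.endswith(suffix):
        return None
    octets = name[: -len(suffix)].split(".")
    if len(octets) != 4:
        return None
    return ".".join(reversed(octets))
```

A PTR name reverses the octets, so `19.1.1.10.in-addr.arpa.` is a question about 10.1.1.19.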
      










Syslog Errors

    • Two kinds of errors logged most frequently

      • Source port 0

        Jan  6 13:39:19 drew named[128838]: dropping source port zero packet from [216.161.67.226].0
        Jan  6 13:39:23 drew named[128838]: dropping source port zero packet from [63.224.229.252].0
        Jan  6 13:39:25 drew named[128838]: dropping source port zero packet from [63.227.214.187].0
        
      • Update denied

        Jan  6 13:40:28 drew named[128838]: denied update from [24.64.63.195].41151 for in-addr.arpa
        Jan  6 13:40:47 drew named[128838]: denied update from [24.64.63.195].41858 for in-addr.arpa
        
    • Dynamic updates let DHCP server update DNS database

      • OK for local networks using automatic configuration

      • Never OK for the PC on your desk to try to update the root zone

      • Win2k shipped with default configuration trying to update roots
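
Counting the two error kinds from the log is a small pattern-matching job. A minimal sketch, with regular expressions written against the sample lines above; the `tally` helper and the pattern names are invented for the example, and other named versions may word these messages differently.

```python
import re
from collections import Counter

PATTERNS = {
    "source_port_zero": re.compile(
        r"dropping source port zero packet from \[([\d.]+)\]"),
    "update_denied": re.compile(
        r"denied update from \[([\d.]+)\]\.\d+ for (\S+)"),
}

def tally(lines):
    """Count log lines per error kind and per (kind, source address)."""
    kinds, sources = Counter(), Counter()
    for line in lines:
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                kinds[kind] += 1
                sources[(kind, match.group(1))] += 1
                break
    return kinds, sources
```

Keeping a per-source count alongside the per-kind count makes it easy to see whether a few hosts account for most of the noise.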







Graph of Syslog Errors Over a Year


    Figure 8: Errors logged at the F root server, showing the number of packets with source port 0 (dropped) and the number of attempts to dynamically update the F root's zone data.








Summary of Errors Seen

    Crib sheet for column headings of giant table

      rfc1918	queries from hosts in RFC 1918 private address space
      rfc1918?	queries for the hostname of an RFC 1918 address
      A+IP		queries with IP address target instead of a hostname
      TLD		queries for a record in an invalid top level domain
      windows	queries for Microsoft Active Directory service names (_msdcs)
      top10		top 10 src/query pairs in trace, repeated query bugs
      top100	top 100 src/query pairs in trace, repeated query bugs
      >1/min	queries repeated more than once a minute
    

    Giant table itself

    trace  rfc1918 rfc1918?  A+IP      TLD     windows top10 top100 >1/min
    ----------------------------------------------------------------------
    jan7.11a   2.5   2.0   12.0/11.7  19.6/37.1  1.79  15.7   23.7   51.7
    jan8.1p    2.6   6.1   14.9/13.1  23.1/37.4  1.36   5.3   14.1   44.0
    jan9.3p    3.0   6.2   12.2/12.7  20.0/36.9  1.38  18.0   23.9   50.0
    ----------------------------------------------------------------------
    jan10.10a  3.4   7.7   12.6/13.3  22.3/31.7  1.58   5.9   14.9   24.8
    jan10.11a  3.3   8.1   13.3/13.6  23.5/31.8  1.56   6.6   13.8   23.6
    jan10.12p  3.6   7.4   13.4/13.8  23.7/31.9  1.91   7.9   15.4   25.1
    jan10.1p   3.1   6.9   14.0/13.9  24.6/32.1  1.42   7.9   16.9   26.6
    jan10.2p   3.5   7.2   14.5/14.0  25.2/32.6  1.48   6.0   13.9   24.0
    jan10.3p   3.4   6.8   14.5/14.0  25.6/33.3  1.55   6.3   15.0   25.9
    jan10.4p   3.1   9.7   14.6/14.1  26.1/33.7  1.55   6.2   13.1   22.5
    jan10.5p   3.3  10.0   15.5/14.2  25.8/33.4  1.59   6.9   13.6   22.0
    jan10.6p   3.1   9.2   17.9/14.7  28.0/34.5  1.80   5.7   12.5   22.9
    jan10.7p   3.4   5.7   18.5/14.9  29.1/35.0  1.70   5.2   12.6   23.9
    jan10.8p   3.5   6.7   18.7/15.1  29.4/35.2  1.65   5.7   13.3   24.9
    jan10.9p   3.3   8.2   18.7/14.8  29.7/35.1  1.74   5.6   13.3   24.8
    ----------------------------------------------------------------------
    jan12.9a   2.8   8.4   13.5/13.2  23.4/33.2  1.45   3.1    9.4   26.1
    jan12.5p   2.5   7.0   16.3/13.3  25.7/34.5  1.64   6.8   15.6   30.3
    jan17.11   3.1   7.9   13.9/13.0  22.5/32.2  1.44   3.9   10.7   24.9
    jan17.4p   3.4  11.5   16.6/13.3  25.7/33.4  1.73   2.9    8.5   23.5
    jan18.6p   3.0   8.1   13.6/13.9  20.5/34.4  1.28  12.1   21.5   38.4
    jan18.10p  0.03  6.3   14.1/13.6  21.4/37.0  1.34  12.4   22.3   40.1
    jan19.2p   0.0   4.3   13.0/13.6  20.3/33.2  1.36  11.9   20.2   36.2
    ----------------------------------------------------------------------
    jan24.10a  0.0   4.2   14.3/13.9  25.2/37.9  1.87   3.0    8.3   22.7
    jan24.2p   0.0   4.8   12.4/12.6  20.6/34.2  1.46   3.9    8.2   25.3
    jan24.5p   0.0  10.7   14.5/12.6  22.9/34.1  1.60   3.6   10.3   27.7
    jan24.9p   0.0   5.5   14.6/12.3  23.1/35.4  1.85   4.4   13.0   34.2
    ----------------------------------------------------------------------
    E.jan25    7.7   5.8    6.9/10.5  14.3/29.0  0.97   6.1   12.7   26.0
    M.jan24    1.3   3.3   12.7/21.4  22.3/51.5  1.09  12.6   27.3   49.3
    

    Table 3: Taxonomy of bogus queries received at the F root nameserver during the month of January 2001 and at the E and M root servers on a single day in January.









Attacks

    • Denial of service attacks often use DNS servers as reflectors

      • Two attacks were ongoing during our measurements

      • One on a register.com client, using spoofed source IPs

      • One trolling for hostnames but confused about PTR records

    • Examples

          1826    209.67.50.220.1865 A M.ROOT-SERVERS.NET.
          1826    209.67.50.220.1865 A K.ROOT-SERVERS.NET.
          1826    209.67.50.220.1865 A I.ROOT-SERVERS.NET.
          1826    209.67.50.220.1865 A F.ROOT-SERVERS.NET.
          ...
          1824    209.67.50.220.1865 A A.ROOT-SERVERS.NET.
      
          6       199.170.0.2.1024 PTR 54.11.193.155.in-addr.arpa.
          5       199.170.0.2.1024 PTR 54.8.235.158.in-addr.arpa.
          5       199.170.0.2.1024 PTR 54.3.188.143.in-addr.arpa.
          5       199.170.0.2.1024 PTR 54.19.211.129.in-addr.arpa.
          5       199.170.0.2.1024 PTR 54.19.109.149.in-addr.arpa.
          5       199.170.0.2.1024 PTR 54.13.198.137.in-addr.arpa.
      











Microsoft's DNS Woes

    • Microsoft's 4 authoritative nameservers visible to the outside world were on one subnet

      • They misconfigured the router upstream of that subnet

      • TTL for their names set to 2 hours

      • Started timing out of people's caches

      • Query load at the roots started climbing

    • Queries for Microsoft properties are usually about 6k/hour, effectively 0% of the load

    • Increased to 25% of the load at F

    • High visibility site with DNS problems affects the whole Internet infrastructure








Conclusions





    • Performance of the root servers amazing given the bogus query load

      • Non-stop repeated queries

      • Bogus A queries (14%)

      • Bogus TLDs (20%), internal names leaking out to the Internet

    • Negative caching could help

      • BIND8 and 9 include it

      • Win2k the first Microsoft product to sort of implement it

    • Services like Akamai changing how DNS is used

    • Developers need to test how their products behave on failures and negative answers, not just on success

    • Overlays like new.net will have an impact if they succeed
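
To make the negative caching point concrete: a resolver that remembers "no such name" answers for a while stops re-asking the roots about `loghost.` or `AD01.` on every lookup. A minimal sketch of the idea only; the class and method names are invented, and the fixed `ttl_seconds` is a simplification, since in real DNS the lifetime of a negative answer is supplied by the zone.

```python
import time

class NegativeCache:
    """Remember 'no such name' answers for a limited time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.expires = {}              # (qname, qtype) -> expiry time

    def remember_failure(self, qname, qtype):
        """Record that this question was answered with a failure."""
        self.expires[(qname.lower(), qtype)] = self.clock() + self.ttl

    def is_known_failure(self, qname, qtype):
        """True while a remembered failure for this question is still fresh."""
        key = (qname.lower(), qtype)
        expiry = self.expires.get(key)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self.expires[key]      # entry aged out; ask upstream again
            return False
        return True
```

A resolver would consult `is_known_failure` before sending a query upstream and call `remember_failure` when the upstream answer is negative.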
