Sorted String Tables: ISC mtbl and ISC dnstable

Sorted String Tables: ISC mtbl and ISC dnstable

Robert Edmonds ()

22 October 2012

Introduction

  • "Sorted String Table" data structure:

The Google SSTable file format is used internally to store Bigtable data. An SSTable provides a persistent, ordered immutable map from keys to values, where both keys and values are arbitrary byte strings. Operations are provided to look up the value associated with a specified key, and to iterate over all key/value pairs in a specified key range.

  • Advantages:
    • Scales fairly well
    • Construction done with sequential I/O
    • Compact encoding
  • Drawback:
    • Not updateable

Implementations

Problem

No standalone, reusable implementation for C and Python programmers

mtbl

  • ISC mtbl: "immutable string table library"

  • Heavily inspired by SSTable implementation in LevelDB

  • Primitives for creating, searching, merging "MTBL format" SSTables

  • Data block compression (choice of zlib or snappy)

  • libmtbl: C shared library

  • pymtbl: Python extension module

mtbl

  • Key/value entries are arbitrary byte arrays

  • No duplicate keys permitted

  • User provides custom entry merging function

  • Intended to be a component in custom sort-and-merge based workflows

mtbl file layout

mtbl interfaces

  • writer
  • reader
  • sorter
  • merger

mtbl: writer

  • Creates an "MTBL" file on disk
  • Caller provides key/value entries in sorted order
  • Entries serialized into data blocks
  • On close, index block and trailer appended to file

mtbl: reader

  • Reads previously recorded "MTBL" file
  • Iterate over every entry in the file
  • Retrieve an entry by specifying exact key
  • Retrieve entries by specifying key range or key prefix

mtbl: sorter

  • Used when key/value entries are not in sorted order
  • Caller supplies custom "merge function" to resolve conflicts
  • Buffers entries in memory or on disk until closed

mtbl: merger

  • Caller supplies custom "merge function"
  • Provides merged view of entries across multiple data files for iteration, searching

mtbl interface summary

  • Use sorter + writer to convert raw data to MTBL files
  • Use merger + reader to query MTBL files
  • Optionally use merger + writer to periodically aggregate data, if possible or desired

dnstable

  • ISC dnstable: "encoding format, library, and utilities for passive DNS data"

  • Concrete application built on top of libmtbl

  • Used in ISC DNSDB service

dnstable

  • Search by name:

    • Exact match
    • "Left-hand" wildcard: *.example
    • "Right-hand" wildcard: example.*
  • Inverse search:

    • Record data contains a domain name: exact, wildcard searches
    • Record data is an IP address: exact, prefix/range searches

dnstable

  • Each passive DNS payload contains multiple fields:

    • rrname
    • rrtype
    • rdata[]
    • bailiwick
    • time_first
    • time_last
    • count
  • (See ISC Passive DNS Architecture for details.)

dnstable

  • Serialize payloads into multiple key/value pairs

  • First byte of key specifies the entry type (1, 2, etc.):

    • ENTRY_TYPE_RRSET
    • ENTRY_TYPE_RRSET_NAME_FWD
    • ENTRY_TYPE_RDATA
    • ENTRY_TYPE_RDATA_NAME_REV
  • Keys packed for efficient lookups

  • (See dnstable-encoding(5) manpage for details.)

dnstable

  • RRtype handling:

    • All RRtypes are stored by dnstable
    • MX and SRV handled specially for ENTRY_TYPE_RDATA
    • NS, CNAME, DNAME, PTR, MX, SRV generate ENTRY_TYPE_RDATA_NAME_REV entries
    • "DNSSEC" records (DS, RRSIG, NSEC, DNSKEY, NSEC3, NSEC3PARAM, DLV) stored in separate MTBL files

dnstable: command-line tools

  • dnstable_convert

    • Converts payload objects (NMSG) to key/value entries (MTBL)
  • dnstable_merge

    • Combines multiple dnstable-encoded MTBL files together
  • dnstable_dump

    • Converts dnstable-encoded MTBL file to text or JSON
  • dnstable_lookup

    • Performs searches

dnstable: Python extension module

>>> import dnstable
>>> d = dnstable.reader('dns.fileset')
>>> q = dnstable.query(dnstable.RDATA_IP, '149.20.64.42')
>>> for res in d.query(q):
...     print res.to_json()
...
{"count": 1, "time_first": 1326465963, "rrtype": "A", "rrname": "mydots.net.", "rdata": ["149.20.64.42"], "time_last": 1326465963}
{"count": 6400537, "time_first": 1277382144, "rrtype": "A", "rrname": "isc.org.", "rdata": ["149.20.64.42"], "time_last": 1339929008}
{"count": 586095, "time_first": 1277353744, "rrtype": "A", "rrname": "www.isc.org.", "rdata": ["149.20.64.42"], "time_last": 1339928783}
{"count": 30, "time_first": 1288046214, "rrtype": "A", "rrname": "blog.isc.org.", "rdata": ["149.20.64.42"], "time_last": 1338907006}
{"count": 2508, "time_first": 1277492207, "rrtype": "A", "rrname": "f.root-servers.org.", "rdata": ["149.20.64.42"], "time_last": 1339920341}
>>>

dnstable

Inspecting a single MTBL file with the mtbl_info utility:

sql1c3:/srv/dnstable/mtbl# mtbl_info dns.201209.M.mtbl
file name:             dns.201209.M.mtbl
file size:             29,439,550,003
index bytes:           196,620,025 (0.67%)
data block bytes       29,242,929,466 (99.33%)
data block size:       8,192
data block count       8,775,560
entry count:           1,570,389,949
key bytes:             95,944,795,146
value bytes:           13,276,862,800
compression algorithm: zlib
compactness:           26.95%

sql1c3:/srv/dnstable/mtbl# 

dnstable

A single formatted key-value entry, from the dnstable_dump utility:

sql1c3:/srv/dnstable/mtbl# dnstable_dump -j -r dns.201209.M.mtbl | head -1 | fmt
{"bailiwick": ".", "rrname": ".", "time_last": 1349031299,
"time_first": 1346432688, "count": 19087899, "rrtype": 2, "rdata":
["a.root-servers.net.", "b.root-servers.net.", "c.root-servers.net.",
"d.root-servers.net.", "e.root-servers.net.", "f.root-servers.net.",
"g.root-servers.net.", "h.root-servers.net.", "i.root-servers.net.",
"j.root-servers.net.", "k.root-servers.net.", "l.root-servers.net.",
"m.root-servers.net."]}

Same entry, raw byte values, from the mtbl_dump utility:

sql1c3:/srv/dnstable/mtbl# mtbl_dump dns.201209.M.mtbl | head -1 | fmt
"\x00\x00\x02\x00\x14\x01a\x0croot-servers\x03net\x00\x14\x01b\x0croot-servers\x03net\x00\x14\x01c\x0croot-servers\x03net\x00\x14\x01d\x0croot-servers\x03net\x00\x14\x01e\x0croot-servers\x03net\x00\x14\x01f\x0croot-servers\x03net\x00\x14\x01g\x0croot-servers\x03net\x00\x14\x01h\x0croot-servers\x03net\x00\x14\x01i\x0croot-servers\x03net\x00\x14\x01j\x0croot-servers\x03net\x00\x14\x01k\x0croot-servers\x03net\x00\x14\x01l\x0croot-servers\x03net\x00\x14\x01m\x0croot-servers\x03net\x00"
"\xb0\xdd\x83\x82\x05\x83\xab\xa2\x83\x05\x9b\x84\x8d\x09"

Summary

  • No standalone, reuseable "SSTable" implementation available

  • Built "mtbl? based on open source Google code

  • Used mtbl as the file format for "dnstable?to store and index passive DNS data