NAME

file_format - File formats used by CAIDA::Tables for storage.

Text format

The text format used by CAIDA::Tables is simple: all fields of a key/counter pair are on a single newline-terminated line, separated by tab characters. The end-of-table token is a line beginning with '#'. For example, a Proto_Table with one entry might be saved as:

    17  2       80      1       0       0
    # End of Proto_Table

In this case, the key is the protocol (17) and the counter consists of the number of packets (2), bytes (80), and flows (1) for that protocol, followed by the first and latest timestamps seen (both 0). This format currently assumes that the counter is a FlowCounter. The first and latest fields are optional and can be omitted from the line.
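As a rough illustration of the layout described above, the following Python sketch reads a saved Proto_Table from an iterable of lines. It is not part of CAIDA::Tables: the names FlowCounter (as a Python class), parse_proto_table, packets, byte_count and flows are hypothetical. It also assumes that the key and the three counts are integers, that timestamps parse as floating-point numbers, and that first and latest are either both present or both absent.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class FlowCounter(NamedTuple):
    """Hypothetical stand-in for the counter fields named in the text."""
    packets: int
    byte_count: int
    flows: int
    first: Optional[float] = None   # omitted when the line has no timestamps
    latest: Optional[float] = None


def parse_proto_table(lines: Iterable[str]) -> Dict[int, FlowCounter]:
    """Parse tab-separated key/counter lines up to the end-of-table token."""
    table: Dict[int, FlowCounter] = {}
    for line in lines:
        if line.startswith("#"):
            break  # a line beginning with '#' ends the table
        fields = line.rstrip("\n").split("\t")
        if len(fields) not in (4, 6):
            # Assumption: first and latest are omitted together or not at all.
            raise ValueError(f"unexpected field count: {line!r}")
        key = int(fields[0])
        counts = [int(f) for f in fields[1:4]]
        stamps = [float(f) for f in fields[4:]]
        table[key] = FlowCounter(*counts, *stamps)
    return table
```

Calling parse_proto_table on the two example lines shown earlier (with real tab characters between the fields) would produce a single entry keyed by protocol 17. Anything after the end-of-table line is ignored.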

Binary format

The binary format used by CAIDA::Tables is slightly more complicated. The internal representation of the key to be saved determines the binary format of the saved data. For example, a Tuple_Table saves its two IP addresses as the 4-byte representations returned by Socket::inet_aton, the protocol and ports_ok as 8-bit values, and the ports as 16-bit values. All values are stored in network byte order. However, a table based on strings (such as an AS_Matrix) stores the ASCII characters directly, with fields separated by the NUL character. With this in mind, the binary data order for an entry is as follows:

    key_len, full_count, key_data, count_data

key_len is the byte length of key_data, and full_count is a boolean that indicates whether count_data contains first and latest. key_data is the data stored in the key, and count_data is the data stored in the counter. The end-of-table token is an 8-bit zero value (in the position where a new entry would start), followed by a newline character and a line beginning with '#'. As with the text format, this format currently assumes that the counter is a FlowCounter.
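To make the key encodings more concrete, here is a minimal Python sketch of how key_data might be built for the two kinds of table mentioned above. It is only an illustration under stated assumptions: the function names are hypothetical, and the order of the Tuple_Table fields within key_data (source address, destination address, protocol, ports_ok, source port, destination port) is assumed, since the text gives the field widths but not their order. The layout of count_data is not described here, so it is not sketched.

```python
import socket
import struct


def pack_tuple_key(src: str, dst: str, proto: int, ports_ok: bool,
                   sport: int, dport: int) -> bytes:
    """Pack a Tuple_Table-style key (field order is an assumption)."""
    # inet_aton yields the 4-byte form of a dotted-quad address.
    addrs = socket.inet_aton(src) + socket.inet_aton(dst)
    # "!" selects network byte order with no padding:
    # B = 8-bit value (proto, ports_ok), H = 16-bit value (ports).
    return addrs + struct.pack("!BBHH", proto, int(ports_ok), sport, dport)


def pack_string_key(*fields: str) -> bytes:
    """Pack a string-based key: ASCII fields separated by NUL."""
    return b"\0".join(f.encode("ascii") for f in fields)


def is_end_of_table(first_byte: bytes) -> bool:
    """True if the byte where a new entry would start is the 8-bit zero."""
    return first_byte == b"\x00"
```

Under these assumptions a packed Tuple_Table key is 14 bytes long (4 + 4 + 1 + 1 + 2 + 2), and len() of the packed bytes is the value that would be written as key_len.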