Researchers studying the Internet face a significant challenge when looking for traffic traces: the fundamental conflict between end-user privacy and the research utility of the data. When data is heavily anonymized, important attributes that reveal the structure and function of networks are obscured. When it is not, details about end users, including geographic and network location, organization, names, passwords, and other personal information, could be exposed to unauthorized access.
We provide the following list of anonymization techniques and tools to help those searching for such tools find them, and to make it easier to understand and track which methods are available.
- hiding: Value is replaced with a constant value (typically 0) of the same size. Sometimes called "black marker".
- hash: A hash function maps each value to a new (not necessarily unique) value.
- permutation: Maps each original value to a unique new value.
- prefix-preserving: Any two values that had the same n-bit prefix before anonymization will still have the same n-bit prefix as each other after anonymization. (Would be more accurately called "prefix-relationship-preserving", because the actual prefix values are not preserved.)
- shift: Adds a fixed offset to each value.
- enumeration: Maps each original value to a new value such that their ordering is preserved.
- partitioning: Possible values are partitioned into meaningful sets; actual values are replaced with a fixed value from the same set. E.g., TCP port numbers 0 to 1023 are replaced with 0, and 1024 to 65535 are replaced with 65535.
- updated: Checksums are recalculated to reflect changes made to other fields.
- truncation: Field is shortened, losing data at the end.
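The simpler techniques in the list above can be sketched in a few lines of Python. This is an illustrative sketch only, not code taken from any of the tools listed on this page: every function and class name is hypothetical, the salted SHA-256 digest is just one possible hash, and Python's `random` module is not cryptographically secure, so the permutation is shown for its structure rather than its strength.

```python
import hashlib
import random


def hide(value: bytes) -> bytes:
    """Hiding ("black marker"): constant zeros of the same size."""
    return bytes(len(value))


def hash_value(value: bytes, salt: bytes, size: int = 4) -> bytes:
    """Hash: salted digest cut to `size` bytes; distinct inputs may collide."""
    return hashlib.sha256(salt + value).digest()[:size]


class RandomPermutation:
    """Permutation: each original value is mapped to a unique new value."""

    def __init__(self, bits: int, seed=None):
        self._rng = random.Random(seed)  # assumption: not a secure RNG
        self._bits = bits
        self._forward = {}
        self._used = set()

    def map(self, value: int) -> int:
        if value not in self._forward:
            # Draw until an unused output is found (assumes the number of
            # distinct inputs stays well below 2**bits).
            candidate = self._rng.getrandbits(self._bits)
            while candidate in self._used:
                candidate = self._rng.getrandbits(self._bits)
            self._used.add(candidate)
            self._forward[value] = candidate
        return self._forward[value]


def shift(timestamp: float, offset: float) -> float:
    """Shift: add a fixed offset to each value."""
    return timestamp + offset


def enumerate_values(values):
    """Enumeration: replace each value with its rank, preserving order."""
    ranks = {v: i for i, v in enumerate(sorted(set(values)))}
    return [ranks[v] for v in values]


def partition_port(port: int) -> int:
    """Partitioning: ports 0 to 1023 become 0, 1024 to 65535 become 65535."""
    return 0 if port <= 1023 else 65535


def truncate(field: bytes, keep: int) -> bytes:
    """Truncation: keep the first `keep` bytes and drop the rest."""
    return field[:keep]
```

For example, `partition_port(443)` returns `0`, and `enumerate_values([30.5, 10.0, 20.0, 10.0])` returns `[2, 0, 1, 0]`: the relative order of the timestamps survives, but their actual values and spacing do not.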
| Tool | Input | Fields | Anonymization methods | Notes |
|---|---|---|---|---|
| AnonTool | Netflow (v5 and v9) traces in tcpdump format or on live interfaces | IP address | partial hiding, random permutation, prefix-preserving permutation, hashing, etc. | Built on AAPI. |
| | | most NetFlow fields | random permutation, hash, hiding, prefix-preserving permutation, etc. | |
| | | NetFlow checksums | updated, etc. | |
| CANINE | Cisco NetFlow (v5 and v7), NFDUMP, CiscoNCSA, ArgusNCSA | IPv4 address | partial hiding, random permutation, prefix-preserving permutation | GUI. Predecessor to FLAIM. |
| | | timestamp | partial hiding, shift, enumeration | |
| | | port number | high/low partitioning, hiding | |
| | | protocol number, byte count | hiding | |
| CoralReef | network interfaces; DAG, FORE, and POINT capture cards; trace files in CoralReef (.crl), tcpdump/pcap, DAG (legacy and ERF), or TSH (.tsh) formats | IPv4 address (including those in ICMP headers) | partial hiding, cryptographic prefix-preserving permutation (using Crypto-PAn) | The CoralReef suite provides many other analysis tools, and C and Perl APIs; all allow anonymization. |
| | | IPv4, TCP, and UDP checksums | updated | |
| | | headers below IP layer | discard | |
| | | payload of any layer 1-4 | truncation | |
| FLAIM | tcpdump/pcap, netfilter (iptable) syslogs, NFDUMP, Linux process accounting logs, etc. | IPv4/IPv6 addresses | partial hiding, random permutation, cryptographic prefix-preserving permutation, hash, etc. | Scriptable command line tool. Extensible via dynamically loadable modules. Successor to CANINE. |
| | | Ethernet addresses, other numbers | partial hiding, random permutation, hash, etc. | |
| | | various other fields in Ethernet, IP, TCP, UDP, ICMP | partial hiding, partitioning (for numbers), shift (for timestamps), enumeration (for timestamps), hash, truncation, etc. | |
| ipsumdump | tcpdump/pcap, DAG (legacy and ERF), FR, FR+, TSH, ipsumdump (text), NetFlow summary (text), linux network device | IPv4 address (outer header only) | prefix-preserving permutation, class-preserving permutation (based on tcpdpriv) | Outputs only tcpdump (pcap) format or ipsumdump text format |
| | | most Ethernet, IP, TCP, UDP, ICMP fields; payload | discarded | |
| NFDUMP | NetFlow (v5, v7, v9) in NFDUMP format or on live interfaces | IPv4 address | cryptographic prefix-preserving permutation (using Crypto-PAn) | |
| SCRUB-tcpdump | tcpdump/pcap, network interface | IPv4 address | partial hiding, random permutation, subnet/host permutation | |
| | | TCP/UDP ports, TCP sequence number, TCP flags, TTL, packet length, transport protocol | hiding, random permutation, partitioning | |
| | | packet timestamp | partial hiding, enumeration, shift, random permutation | |
| | | fragmentation flag | hiding, random permutation | |
| tcpanon | tcpdump/pcap | fields in application layers: HTTP, SMTP, POP3, IMAP4, FTP, FTP-data | hiding | |
| tcpdpriv | tcpdump/pcap, network interface | IPv4 address (including those in nested headers) | permutation, prefix-preserving permutation, class-preserving permutation | |
| | | TCP/UDP port numbers | permutation | |
| tcpmkpub | tcpdump/pcap | IPv4 address (differentiable by external, internal, multicast, private, etc.), including those in ICMP headers | cryptographic prefix-preserving permutation (using Crypto-PAn algorithm), subnet-preserving permutation, etc. | IP address algorithm is particularly well suited for edge networks, which are especially vulnerable to signature attacks. Extensible via policy configuration files and C++ functions. |
| | | Ethernet address | vendor-preserving anonymization, etc. | |
| | | checksums | updated, hidden, etc. | |
| | | many other fields in Ethernet, ARP, IP, ICMP, UDP, TCP | hidden, etc. | |
| TCPurify | tcpdump/pcap, network interface | IPv4 address | hiding, random permutation within specified networks | |
| Library | Language | Input | Fields | Anonymization methods | Notes |
|---|---|---|---|---|---|
| AAPI | C | network interfaces, tcpdump/pcap, and Netflow (v5 and v9) traces in tcpdump format or on live interfaces | IPv4 address | partial hiding, random permutation, prefix-preserving permutation, hashing, etc. | Users may write their own decoders for new protocols. |
| | | | many other header fields in Ethernet, IPv4, TCP, UDP, NetFlow, HTTP, FTP | random permutation, hash, hiding, prefix-preserving permutation, etc. | |
| Crypto-PAn | C++ | IPv4 address | IPv4 address | cryptographic prefix-preserving permutation | original address can be recovered with a key |
| Lucent's extensions to Crypto-PAn | C++ | IPv4 address | IPv4 address | cryptographic prefix-preserving permutation, random permutation | output contains random permutation; one key can be used to recover a prefix-preserving permutation; two keys can be used to recover original address |
| IP::Anonymous | Perl | IPv4 address | IPv4 address | cryptographic prefix-preserving permutation | Perl port of Crypto-PAn |
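To illustrate what a keyed, reversible prefix-preserving permutation of IPv4 addresses looks like, here is a minimal Python sketch. It follows the general bit-by-bit construction (each output bit is the input bit XORed with a pseudorandom bit derived from the key and the preceding input bits), but it is not the Crypto-PAn implementation and its output is not compatible with any tool in the table: HMAC-SHA256 is swapped in as the pseudorandom function purely as an assumption for the example, and the function names `anonymize`, `deanonymize`, and `common_prefix_len` are hypothetical.

```python
import hashlib
import hmac
import ipaddress


def _prf_bit(key: bytes, prefix: int, length: int) -> int:
    """One pseudorandom bit derived from the key and the first `length` bits."""
    msg = length.to_bytes(1, "big") + prefix.to_bytes(4, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[0] & 1


def anonymize(key: bytes, addr: str) -> str:
    """Flip bit i of the address according to its first i original bits."""
    a = int(ipaddress.IPv4Address(addr))
    out = 0
    for i in range(32):
        prefix = a >> (32 - i)       # first i original bits (0 when i == 0)
        bit = (a >> (31 - i)) & 1    # original bit at position i
        out = (out << 1) | (bit ^ _prf_bit(key, prefix, i))
    return str(ipaddress.IPv4Address(out))


def deanonymize(key: bytes, addr: str) -> str:
    """Recover the original address, one bit at a time, using the same key."""
    c = int(ipaddress.IPv4Address(addr))
    orig = 0
    for i in range(32):
        bit = (c >> (31 - i)) & 1
        # `orig` holds the i original bits recovered so far, which is
        # exactly the prefix that anonymize() fed to the PRF at step i.
        orig = (orig << 1) | (bit ^ _prf_bit(key, orig, i))
    return str(ipaddress.IPv4Address(orig))


def common_prefix_len(x: str, y: str) -> int:
    """Number of leading bits shared by two IPv4 addresses."""
    diff = int(ipaddress.IPv4Address(x)) ^ int(ipaddress.IPv4Address(y))
    return 32 - diff.bit_length()
```

Because output bit i depends only on the key and on original bits 0 through i, two addresses that share an n-bit prefix before anonymization share an n-bit prefix afterwards, and they first differ at the same bit position. Anyone holding the key can reverse the mapping, which is why the key has to be protected as carefully as the original trace.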
- FPGA-based Packet Header Anonymization
- As part of the tcpreplay suite, this is primarily intended to rewrite a trace so it can be replayed on a different network.
- http://bittwist.sourceforge.net/. Primarily intended for generating and rewriting packets for replay.
- http://netdude.sourceforge.net/. GUI packet editor.
- Bro IDS
- http://sourceforge.net/projects/anonymizer. Appears to be Linux-only, incomplete, undocumented, and unmaintained.