Corsaro ships with several tools which leverage the libcorsaro library. This section describes the purpose of each tool and how to use it.
The corsaro tool is the main tool in the Corsaro suite; it provides a lightweight wrapper around the Corsaro-Out features of libcorsaro. It processes trace files and uses a set of plugins to analyze the packets they contain and generate aggregated statistics about them. For more information about this process, see the Corsaro-Out and Plugins sections of this manual.
In addition to processing existing trace files, corsaro can capture packets from a live interface by using the special pcapint:<interface> trace URI parameter. This feature is still in the alpha testing phase and so only has limited functionality, e.g., output files are not rotated.
usage: corsaro -o outfile [-i interval] [-m mode] [-n name]
[-p plugin] [-f filter] trace_uri [trace_uri...]
-o <outfile> use <outfile> as a template for file names.
%P will be replaced with the plugin name
-i <interval> distribution interval in seconds (default: 60)
-m <mode> output in 'ascii' or 'binary'. (default: ascii)
-n <name> monitor name (default: <hostname>)
-p <plugin> enable the given plugin, -p can be used
multiple times (default: all)
-f <filter> BPF filter to apply to packets
corsaro takes two mandatory arguments: an output filename template, and an input trace URI.
The output template must contain the string %P, which will be expanded to the name of the plugin (and possibly a plugin-specific identifier) for each file created. corsaro will scan the filename to determine which, if any, compression should be used when creating the file. A gz extension will cause gzip compression to be used, whereas a bz2 extension will cause bzip compression to be used. Uncompressed files will be created for all other extensions.
For example, at CAIDA, we use:
/path/to/corsaro/data/telescope.<timestamp>.%P.cors.gz
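The template rules described above can be illustrated with a small Python sketch. It is not how corsaro itself is implemented; the helper names expand_template and compression_for are hypothetical, and the sketch only mirrors the behavior stated in this section (%P expansion, and compression chosen from the gz or bz2 extension).

```python
import os


def expand_template(template: str, plugin: str) -> str:
    """Replace every %P in the template with the plugin name.

    Hypothetical helper: mirrors the documented rule that the
    template must contain %P.
    """
    if "%P" not in template:
        raise ValueError("output template must contain %P")
    return template.replace("%P", plugin)


def compression_for(filename: str) -> str:
    """Pick a compression scheme from the file extension.

    gz -> gzip, bz2 -> bzip, anything else -> uncompressed,
    as described in the text above.
    """
    ext = os.path.splitext(filename)[1]
    if ext == ".gz":
        return "gzip"
    if ext == ".bz2":
        return "bzip"
    return "none"


name = expand_template("telescope.%P.cors.gz", "flowtuple")
print(name, compression_for(name))
```

The plugin name flowtuple is used here only because it appears in the file names shown later in this section.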
The input URI will most commonly just be the path to a pcap (or similar) file. The IO library (libwandio) used by libcorsaro can automatically detect gzip and bzip compression if it is used. Multiple trace URIs can be supplied and will be processed in the order they are listed. Take care to ensure packets are sorted in chronological order - plugin behavior is undefined for unordered packets.
The remaining arguments are all optional and alter how the trace is processed.
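interval: the distribution interval, in seconds (default: 60).
mode: the output format; binary and ascii are supported values (default: ascii).
name: the monitor name (default: the hostname).
plugin: a plugin to enable; may be given multiple times (default: all plugins).
filter: a BPF filter to apply to packets.

The output files generated by the corsaro tool can be viewed either with a standard text viewer (for the ASCII output format), or using the cors2ascii tool (for the binary output format).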
The cors2ascii tool converts binary output from any corsaro plugin that implements the Corsaro-In API to an ASCII format. The output from cors2ascii depends on the specific plugin used, but the output will be in a format which is mostly human-readable, as well as supporting ad-hoc analysis scripts (e.g. written in Perl).
Currently cors2ascii supports the FlowTuple and RS DoS binary output formats, as well as the Corsaro Global Output File. See the File Formats page for details about the output formats for plugins.
usage: cors2ascii input_file
cors2ascii takes a single argument: the path to the file to be converted to ASCII. Because cors2ascii uses the IO Framework, gzip and bzip compressed files are also supported.
The cors-ft-aggregate tool re-aggregates FlowTuple data based on time and sub-tuples.
The re-aggregation features of cors-ft-aggregate provide a powerful method for analyzing specific dimensions of a dataset, much more efficiently and reliably than parsing and manually aggregating the data output by cors2ascii.
The current version of the tool only produces output in the FlowTuple ASCII format. The fields of the tuple which are not included in the re-aggregation will be zeroed out, as shown in the example output below. Also, the tool does not preserve the classes from the original data: tuples from all classes are aggregated into a single table. Support for binary output and class preservation is planned for a future release.
usage: cors-ft-aggregate [-l] [-i interval] [-v value_field] -f field [-f field]... file_list
-l treat the input files as containing legacy format data
-i <interval> new distribution interval in seconds. (default: 0)
a value of -1 aggregates to a single interval
a value of 0 uses the original interval
-v <value> field to use as aggregation value (default: packet_cnt)
-f <field> a tuple field to re-aggregate with
Supported field names are:
src_ip, dst_ip, src_port, dst_port, protocol, ttl, tcp_flags,
ip_len, packet_cnt
field: a tuple field to re-aggregate with; can be given multiple times.
file_list: the list of files to process; use - to read the list from stdin (for use with find(1) etc).
l[egacy]: treat the input files as containing legacy format data.
interval: -1 indicates that all data should be aggregated into a single interval; 0 uses the original interval in the file (60 seconds for CAIDA data).
value: the tuple field to use as the aggregation value. The default is packet_cnt, as in a raw FlowTuple file. For a field other than packet_cnt, the value will be the number of unique elements in the set; for example, src_ip will give a value for each tuple which is the number of unique source IP addresses which match the sub-tuple (as specified by the field arguments).

Re-aggregating data with a 24 hour interval, using protocol as the field, and src_ip as the value, gives per-day tables describing the number of unique source IP addresses observed for each protocol.
Command used:
find /path/to/flowtuple/data/ -type f -name "*.flowtuple.cors.gz" | sort | \
  cors-ft-aggregate -i 86400 -v src_ip -f protocol -
Sample output:
# CORSARO_INTERVAL_START 0 1325390400
0.0.0.0|0.0.0.0|0|0|0|0|0x00|0,551
0.0.0.0|0.0.0.0|0|0|1|0|0x00|0,5741
0.0.0.0|0.0.0.0|0|0|6|0|0x00|0,151336
0.0.0.0|0.0.0.0|0|0|8|0|0x00|0,2
0.0.0.0|0.0.0.0|0|0|17|0|0x00|0,1042968
0.0.0.0|0.0.0.0|0|0|28|0|0x00|0,5
Note: this output has been sorted in post-processing.
The output shows the familiar FlowTuple ASCII format, albeit with all fields except the protocol zeroed out. Also, the packet count value has been replaced with a count of the number of unique source IPs observed for the corresponding sub-tuple over the interval. In this example, we can see that UDP (protocol 17, 6th line of output) packets were received from a total of 1,042,968 different sources during the interval.
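The unique-source counting described above can be sketched in a few lines of Python. This is an illustration of the re-aggregation idea only, not the tool's actual code: the reaggregate function and the toy records are invented for the example, and summing packet_cnt across merged tuples is an assumption about how the default value field behaves.

```python
from collections import defaultdict


def reaggregate(tuples, key_fields, value_field="packet_cnt"):
    """Group tuples by key_fields (the sub-tuple).

    Assumption: for packet_cnt the counts of merged tuples are summed.
    For any other value field, the result is the number of unique
    values of that field seen for each sub-tuple.
    """
    sums = defaultdict(int)
    uniques = defaultdict(set)
    for t in tuples:
        key = tuple(t[f] for f in key_fields)
        if value_field == "packet_cnt":
            sums[key] += t["packet_cnt"]
        else:
            uniques[key].add(t[value_field])
    if value_field == "packet_cnt":
        return dict(sums)
    return {k: len(v) for k, v in uniques.items()}


# Toy records (invented data), reduced to the fields used here.
toy = [
    {"src_ip": "10.0.0.1", "protocol": 17, "packet_cnt": 3},
    {"src_ip": "10.0.0.2", "protocol": 17, "packet_cnt": 1},
    {"src_ip": "10.0.0.1", "protocol": 6, "packet_cnt": 2},
    {"src_ip": "10.0.0.1", "protocol": 17, "packet_cnt": 5},
]

# Roughly what "-v src_ip -f protocol" asks for: unique sources per protocol.
print(reaggregate(toy, ["protocol"], "src_ip"))
# Roughly what "-v packet_cnt -f protocol" asks for: packets per protocol.
print(reaggregate(toy, ["protocol"]))
```

Keeping a set per sub-tuple makes the memory cost proportional to the number of distinct values, which is one reason to aggregate over as few fields as the analysis needs.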
Re-aggregate and filter data using protocol and dst_port fields, leaving the interval and value unchanged.
Command Used:
find /path/to/flowtuple/data/ -type f -name "*.flowtuple.cors.gz" | sort | \
  cors-ft-aggregate -i 0 -v packet_cnt -f protocol -f dst_port - | \
  fgrep -e "0.0.0.0|0.0.0.0|0|5060|17|" -e "CORSARO_INTERVAL_START"
Sample Output:
# CORSARO_INTERVAL_START 0 1325390400
0.0.0.0|0.0.0.0|0|5060|17|0|0x00|0,3577
# CORSARO_INTERVAL_START 1 1325390460
0.0.0.0|0.0.0.0|0|5060|17|0|0x00|0,3412
# CORSARO_INTERVAL_START 2 1325390520
0.0.0.0|0.0.0.0|0|5060|17|0|0x00|0,3018
We re-aggregate the data using the protocol and dst_port fields, maintaining the original interval (60 seconds) and leaving the packet count as the value field. We then use a simple grep to filter the output to only records which have a destination port of 5060 (SIP) and a protocol value of 17 (UDP). In this example, the first minute of data, which begins at 01/01/12 04:00 UTC (1325390400) contained 3,577 packets to UDP port 5060. Note, future versions of cors-ft-aggregate will directly support filters such as this to greatly improve processing speed.
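For ad-hoc analysis, the same filter can be applied after parsing the ASCII output. The Python sketch below is hedged: the line layout (eight pipe-separated fields, then a comma and the value) and the "# CORSARO_INTERVAL_START <number> <start time>" header are inferred from the sample output in this section rather than from a format specification, and parse_flowtuple_ascii is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Field order as listed in the cors-ft-aggregate usage text.
FIELDS = ("src_ip", "dst_ip", "src_port", "dst_port",
          "protocol", "ttl", "tcp_flags", "ip_len")


def parse_flowtuple_ascii(lines):
    """Yield (interval_start, fields, value) for each tuple line.

    Assumes the layout seen in the sample output; other comment
    lines starting with '#' are skipped.
    """
    interval_start = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            parts = line.split()
            if len(parts) == 4 and parts[1] == "CORSARO_INTERVAL_START":
                interval_start = int(parts[3])
            continue
        key, value = line.rsplit(",", 1)
        fields = dict(zip(FIELDS, key.split("|")))
        yield interval_start, fields, int(value)


sample = """\
# CORSARO_INTERVAL_START 0 1325390400
0.0.0.0|0.0.0.0|0|5060|17|0|0x00|0,3577
# CORSARO_INTERVAL_START 1 1325390460
0.0.0.0|0.0.0.0|0|5060|17|0|0x00|0,3412
"""

for start, fields, value in parse_flowtuple_ascii(sample.splitlines()):
    # Keep only UDP (17) records with destination port 5060 (SIP).
    if fields["protocol"] == "17" and fields["dst_port"] == "5060":
        when = datetime.fromtimestamp(start, tz=timezone.utc)
        print(when.isoformat(), value)
```

Field values are kept as strings so that tcp_flags (printed as 0x00 in the sample) needs no special handling.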
Planned improvements for cors-ft-aggregate:
extend the tool to allow writing binary output again
respect the tuple classes for re-aggregation (currently classes are discarded)
add a BPF-like filter
The cors-splitascii.pl tool takes the output from cors2ascii and splits each interval into a separate file. This is useful for generating a single file per interval for further processing without the need to detect interval start and end records.
usage: cors-splitascii.pl output_pattern [input_file]
where output_pattern must include %INTERVAL% to be replaced with the interval timestamp
Similar to the corsaro tool, an output file name template must be specified, in which the string %INTERVAL% will be replaced with the interval start time in each output file.
The input file can be any file which contains ASCII formatted corsaro interval data. To read data from stdin, use - as the input file. This allows cors-splitascii.pl to be chained directly to cors2ascii like this:
cors2ascii <input_file> | cors-splitascii.pl <output_pattern> -
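The splitting step can be pictured with the following Python sketch. It is not the Perl script itself: split_by_interval is a hypothetical function, the interval header layout is inferred from the sample output earlier in this section, and the sketch drops all lines starting with # (whether the real tool keeps them in each output file is not stated here). It returns the grouped lines instead of writing files, to keep the example self-contained.

```python
def split_by_interval(lines, output_pattern):
    """Group data lines by interval, keyed by the output file name.

    Assumes each interval begins with a line of the form
    '# CORSARO_INTERVAL_START <number> <start time>'.
    """
    if "%INTERVAL%" not in output_pattern:
        raise ValueError("output_pattern must include %INTERVAL%")
    files = {}
    current = None
    for line in lines:
        parts = line.split()
        if len(parts) == 4 and parts[:2] == ["#", "CORSARO_INTERVAL_START"]:
            # Start a new output file named after the interval start time.
            current = output_pattern.replace("%INTERVAL%", parts[3])
            files[current] = []
            continue
        if line.startswith("#") or current is None:
            continue  # skip other marker lines and anything before the first interval
        files[current].append(line)
    return files


ascii_lines = [
    "# CORSARO_INTERVAL_START 0 1325390400",
    "0.0.0.0|0.0.0.0|0|5060|17|0|0x00|0,3577",
    "# CORSARO_INTERVAL_START 1 1325390460",
    "0.0.0.0|0.0.0.0|0|5060|17|0|0x00|0,3412",
]
for name, content in split_by_interval(ascii_lines, "sip.%INTERVAL%.txt").items():
    print(name, len(content))
```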
The cors-trace2tuple tool quickly converts a trace file into an easily parseable list of tuples.
While cors-trace2tuple does not use the libcorsaro framework, it is useful for quickly generating a high-level representation of the packets contained in a trace file.
Usage: cors-trace2tuple [-H|--libtrace-help] [--filter|-f bpf ]... libtraceuri...
Each packet in the input trace (which is accepted by the optional BPF filter) is represented by a single line in the tab-separated ASCII output.
Output is in the following format:
<timestamp> <src_ip> <dst_ip> <src_port> <dst_port> <protocol> <ipid> <ip_len>
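Because the output is tab-separated with a fixed field order, it is straightforward to load into a script. The Python sketch below makes assumptions that the format line above does not state: that the timestamp parses as a decimal number and that the port, protocol, ipid and ip_len fields are decimal integers. PacketTuple and parse_tuple_line are hypothetical names, and the sample line is invented data.

```python
from typing import NamedTuple


class PacketTuple(NamedTuple):
    """One line of cors-trace2tuple output (field order from the format above)."""
    timestamp: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    ipid: int
    ip_len: int


def parse_tuple_line(line: str) -> PacketTuple:
    """Split a tab-separated line into typed fields.

    Assumption: numeric fields are written in decimal.
    """
    ts, src, dst, sport, dport, proto, ipid, ip_len = line.rstrip("\n").split("\t")
    return PacketTuple(float(ts), src, dst, int(sport), int(dport),
                       int(proto), int(ipid), int(ip_len))


# Invented sample line using documentation-range addresses.
line = "1325390400\t192.0.2.1\t198.51.100.7\t5060\t5060\t17\t4242\t60\n"
pkt = parse_tuple_line(line)
print(pkt.src_ip, pkt.dst_port, pkt.protocol)
```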
Not yet implemented