- NAME
- SYNOPSIS
- DESCRIPTION
- ERRORS
- EXAMPLES
- ENVIRONMENT
- SEE ALSO
- NOTES
- WARNINGS
- DIAGNOSTICS
- BUGS
- RESTRICTIONS
- AUTHORS
NAME
CAIDA::Tables - General purpose table objects
SYNOPSIS
    use FileHandle;
    use CAIDA::Traffic2::FlowCounter;
    use CAIDA::Tables::Tuple_Table;    # Tuple_Table is only an example

    $table = new CAIDA::Tables::Tuple_Table;
    $counter = new CAIDA::Traffic2::FlowCounter($packets, $bytes, $flows,
                                                $first, $latest);
    $table->entry_add($src_ip, $dst_ip, $ip_protocol, $ports_ok,
                      $src_port, $dst_port, $counter);
    $counter = $table->entry_get($src_ip, $dst_ip, $ip_protocol, $ports_ok,
                                 $src_port, $dst_port);

    $data_hash_ref = $table->data();
    while (($opaque_key, $counter) = each %$data_hash_ref) {
        @fields = $table->get_key_fields($opaque_key);
        # Do stuff with @fields.
    }

    @top5 = $table->sort_by_counter_field('bytes', 5);
    foreach $opaque_key (@top5) {
        # Do stuff with $opaque_key
    }
    $size = $table->num_entries();

    # Many many aggregators...
    %agg_options = ('table_size' => 100);
    $ip_matrix = $table->aggregate_columns(0, 1);     # Avoid using.
    $ip_matrix = $table->naggregate_columns(0, 1);    # Avoid using.
    $ip_matrix = $table->make_IP_Matrix(\%agg_options);
    # IP_Matrix is only one example, see Conversion functions below.

    $save_file = new FileHandle("> save_file");
    $load_file = new FileHandle("< save_file");
    $table->save_text($save_file, $full_count);
    $table->load_text($load_file);
    $table->save_binary($save_file, $full_count);
    $table->load_binary($load_file);

    $table->add($other_table_of_same_type);
    $table->nadd($other_table_of_same_type);

    $table->clear();
DESCRIPTION
CAIDA::Tables is a set of containers for holding large amounts of data, all of which are associated with a counter. The default is to use a FlowCounter to count the numbers of bytes, packets, and flows associated with a set of data.
The API for these tables is in Perl, but the processing backends are in both Perl and C++. Other than adding a few extra options, the C++ backend should be transparent (except for the speed increase). When both are available, the C++ backend is tried first; if it fails, the Perl backend is used. (See also FORCE_C and FORCE_PERL.)
The Perl versions exist mainly to support systems with archaic C++ compilers; they may be removed in future releases.
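The fallback order described above can be sketched in plain Perl. This is only an illustration of the documented behavior, not the module's own code: the helper name make_table, the force_c and force_perl hash keys, and the two constructor callbacks are hypothetical stand-ins.

```perl
use strict;
use warnings;

# Hypothetical helper: pick a backend in the order the text describes.
# $opts->{force_c} and $opts->{force_perl} stand in for the global flags;
# $c_ctor and $perl_ctor are code refs that build a table or die.
sub make_table {
    my ($opts, $c_ctor, $perl_ctor) = @_;

    return $perl_ctor->() if $opts->{force_perl};

    # Try the C++ backend first; eval traps a failure to construct.
    my $table = eval { $c_ctor->() };
    return $table if defined $table;

    # With force_c set there is no fallback to Perl.
    die "C++ backend unavailable: $@" if $opts->{force_c};

    return $perl_ctor->();
}

# Example: the C++ constructor fails, so the Perl one is used.
my $table = make_table({}, sub { die "no C++ library\n" }, sub { 'perl-backend' });
print "$table\n";    # prints "perl-backend"
```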
Each table type has its own module name. For example, to use an IP_Matrix:
    use CAIDA::Tables::IP_Matrix;
All tables support a general set of member functions and, in addition, have specific transform functions to convert one table into another. (See diagrams Table_layout, Tables_convert_from and Tables_convert_to for different ways of visualizing how to convert tables.)
The available tables (and associated keys) are:
- Tuple_Table
- (source IP address, destination IP address, IP protocol, ports ok, source port, destination port) Note: ports ok is a boolean regarding the validity of the source and destination ports.
- IP_Table
- (IP address)
- IP_Matrix
- (source IP address, destination IP address)
- Proto_Ports_Table
- (IP protocol, ports ok, source port, destination port) Note: ports ok is a boolean regarding the validity of the source and destination ports.
- Proto_Port_Table
- (IP protocol, ports ok, port) Note: ports ok is a boolean regarding the validity of the port.
- IP_Proto_Ports_Table
- (IP address, IP protocol, ports ok, source port, destination port) Note: ports ok is a boolean regarding the validity of the source and destination ports.
- IP_Proto_Port_Table
- (IP address, IP protocol, ports ok, port) Note: ports ok is a boolean regarding the validity of the port.
- Port_Table
- (port)
- Port_Matrix
- (source port, destination port)
- Proto_Table
- (IP protocol)
- AS_Table
- (AS) Note: AS is a string.
- AS_Matrix
- (source AS, destination AS) Note: AS is a string.
- Country_Table
- (country)
- Country_Matrix
- (source country, destination country)
- App_Table
- (application)
- AppInfo_Table
- (description, name, group, contrib, date, notes, reference, url) Note: Unlike other tables, a call to entry_get() will only match on the 'name' field.
- VPVC_Table
- (vp/vc pair)
- Prefix_Table
- (prefix/masklength)
- Prefix_Matrix
- (source prefix/masklength, destination prefix/masklength)
- Length_Table
- (length)
- LatLon_Table
- (latitude and longitude as a single comma-separated string, e.g. '33.023,-117.276')
- String_Table
- (user string)
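Because a LatLon_Table key is a single comma-separated string, callers that hold latitude and longitude separately need to join and split them. The following is a minimal sketch; the helper names latlon_key and split_latlon_key are hypothetical, and any precision or whitespace rules the module may expect are not covered here.

```perl
use strict;
use warnings;

# Build the single-string key form shown for LatLon_Table.
sub latlon_key {
    my ($lat, $lon) = @_;
    return join ',', $lat, $lon;
}

# Split such a key back into (latitude, longitude).
sub split_latlon_key {
    my ($key) = @_;
    my ($lat, $lon) = split /,/, $key, 2;
    return ($lat, $lon);
}

my $key = latlon_key('33.023', '-117.276');
print "$key\n";                      # prints "33.023,-117.276"

my ($lat, $lon) = split_latlon_key($key);
print "lat=$lat lon=$lon\n";         # prints "lat=33.023 lon=-117.276"
```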
Global flags
- DEBUG
- When set, enables certain error messages that would otherwise be silent.
- FORCE_C
- When set, only allows the usage of the C++ backend.
- FORCE_PERL
- When set, only allows the usage of the Perl backend.
Common functions
These examples assume Tuple_Table as the object being used.
- new (COUNTER, OPTIONS)
- new (COUNTER)
- new ()
-
Creates a new table object whose class name specifies its key type. For example, new CAIDA::Tables::Tuple_Table creates a table with tuple keys (as shown above), new CAIDA::Tables::IP_Matrix creates a table of IP address pairs, etc. COUNTER refers to any object that implements new() and add() member functions. COUNTER's add() method takes another object of the same type as an argument, adds that object to itself, and returns a reference to itself. If COUNTER is omitted or undef, the table defaults to using FlowCounter.
-
NOTE: The C++ backend currently only supports using a FlowCounter object; the use of a non-FlowCounter for COUNTER will force the use of the Perl backend.
-
OPTIONS is a reference to a hash containing configuration options. The options are:
- force_perl
- Boolean. Rarely needed by users, this option forces a new table to use the Perl backend instead of the C++ backend. See also FORCE_PERL.
- table_size
- Used only for the C++ backend, this specifies the size of the underlying hash table to help optimize memory usage.
- entry_add (LIST, COUNTER)
-
Adds an entry to the table. LIST contains all the fields that make up the table's key, as listed above. COUNTER is a counter object of the type specified in new(). If there is an existing entry, COUNTER is added to the existing counter object. Returns a reference to the entry's counter.
- entry_get (LIST)
- Returns the counter data from the table. LIST contains all the fields that make up the table's key, as listed above.
- data ()
-
Returns a hash reference that can be used to read the data in the table directly. The hash keys are opaque; use get_key_fields() to decode them.
- get_key_fields (KEY)
-
Returns the individual fields of a particular opaque KEY (such as that returned by each or keys on the hash referenced by data()'s return value).
- add (TABLE)
- nadd (TABLE)
-
Merges another table into the current one. Returns a reference to the original table. TABLE must be of the same type as the original. nadd() is the same as add(), but it is free to do destructive operations on TABLE. TABLE should not be used again for anything after calling nadd().
- clear ()
- Removes all entries from the table.
- sort_by_counter_fields (FIELD, NUMBER, ASCEND)
- sort_by_counter_fields (FIELD, NUMBER)
- Returns a list of opaque keys, sorted by a specific counter field. FIELD specifies the field by which the keys are sorted. FIELD must be the same name as a method of the counter; for the default FlowCounter, acceptable field names are pkts, bytes, and flows.
-
NOTE: The C++ backend currently only supports using FlowCounter; thus only pkts, bytes, and flows fields are supported in the C++ backend.
-
NUMBER specifies how many of the top keys should be listed. If NUMBER is set to -1, then all keys are returned. ASCEND is a boolean which determines whether the results are sorted in ascending order. If ASCEND is omitted or set to false, then the results are in descending order.
- sort_by_keys (NUMBER, ASCEND)
- sort_by_keys (NUMBER)
-
Returns a list of opaque keys, sorted by the key fields. The first field is the primary sort key, the second field breaks ties in the first, and so on. NUMBER specifies how many of the top keys should be listed. If NUMBER is set to -1, then sort_by_keys() returns all keys. ASCEND is a boolean which determines whether the results are sorted in ascending order. If ASCEND is omitted or set to false, then the results are in descending order.
- num_entries ()
- Returns the number of entries in the table.
- load_text (FILE)
- load_binary (FILE)
-
Loads data into the table from FILE. FILE can be either an open filehandle or a file name. Data already in the table is not overwritten; instead, the file data is added as if via a set of entry_add() calls. The file format is the one used by the save functions, although other applications (such as crl_flow) can output in that format as well.
-
The return value is the string which terminates the table, which applications might use to store useful information. An example of this might be:
-
# end of Tuple Table, interface 0, vp:vc 2:134
-
In the case of a loading error, these functions return undef. The file format is described here.
- save_text (FILE, SAVE_ALL)
- save_text (FILE)
- save_binary (FILE, SAVE_ALL)
- save_binary (FILE)
- Saves table data to FILE. FILE can either be an open filehandle or a file name. These functions are primarily meant to provide persistent storage of the table and not for reading by humans. However, the text format is provided for those who wish to do quick debugging or scanning of data. SAVE_ALL is a boolean that relates to the FlowCounter object, and when true, saves the first and latest fields. This is, sadly, a hack, which will be removed in the future. When SAVE_ALL is omitted, it defaults to true.
-
The return value is a boolean indicating whether the function successfully saved the table.
-
WARNING: These functions assume that the table uses a FlowCounter object. Use of these functions with a different counter object is a very bad idea.
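The COUNTER contract described under new() (a class with new() and add(), where add() merges another object of the same type and returns itself) can be met by a small class like the one below. This is a sketch under stated assumptions: the class name My::HitCounter and its hits field are hypothetical, and the accessor is included only because the sort functions expect FIELD to name a counter method.

```perl
use strict;
use warnings;

package My::HitCounter;    # hypothetical class name

# new() with no arguments yields a zeroed counter.
sub new {
    my ($class, $hits) = @_;
    my $self = { hits => defined $hits ? $hits : 0 };
    return bless $self, $class;
}

# add() merges another counter of the same type and returns itself.
sub add {
    my ($self, $other) = @_;
    $self->{hits} += $other->{hits};
    return $self;
}

# Accessor, so that 'hits' can serve as a sortable counter field name.
sub hits { return $_[0]{hits} }

package main;

my $total = My::HitCounter->new(3);
$total->add(My::HitCounter->new(4));
print $total->hits, "\n";    # prints 7
```

As the notes above state, a table built with a counter like this runs on the Perl backend only, and the save and load functions should not be used with it.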
Conversion functions
Several tables have functions to create a different table from their own data. For example, one can turn an IP_Matrix (with a source and destination IP address) into an IP_Table by combining any entries with the same source IP address.
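The aggregation such a conversion performs can be pictured with plain Perl hashes. This sketch does not use CAIDA::Tables at all: byte counts stand in for counter objects, and the "src dst" string keys and the by_source helper are assumptions made purely for illustration (real table keys are opaque).

```perl
use strict;
use warnings;

# Stand-in for an IP_Matrix: "src dst" => byte count.
my %ip_matrix = (
    '10.0.0.1 10.0.0.2' => 100,
    '10.0.0.1 10.0.0.3' => 50,
    '10.0.0.4 10.0.0.2' => 25,
);

# Collapse to a source-IP table by summing entries that share a source.
sub by_source {
    my ($matrix) = @_;
    my %ip_table;
    for my $pair (keys %$matrix) {
        my ($src) = split ' ', $pair;
        $ip_table{$src} += $matrix->{$pair};
    }
    return \%ip_table;
}

my $ip_table = by_source(\%ip_matrix);
print "$_ $ip_table->{$_}\n" for sort keys %$ip_table;
# prints:
#   10.0.0.1 150
#   10.0.0.4 25
```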
All of the following accept a reference to a hash containing optional arguments to the function. For most functions (the exceptions being make_AS_Matrix, make_AS_Table, make_Country_Matrix and make_Country_Table), OPTIONS may be omitted entirely.
The only option that applies to all tables is 'table_size', which is used with the C++ backend to specify the size of the internal hash table.
- make_IP_Matrix (OPTIONS)
- make_IP_Matrix ()
- Member function of Tuple_Table; makes an IP_Matrix.
- make_src_IP_Table (OPTIONS)
- make_src_IP_Table ()
- Member function of Tuple_Table and IP_Matrix; makes an IP_Table from the source (first) IP address.
- make_dst_IP_Table (OPTIONS)
- make_dst_IP_Table ()
- Member function of Tuple_Table and IP_Matrix; makes an IP_Table from the destination (second) IP address.
- make_IP_Table (OPTIONS)
- make_IP_Table ()
- Member function of IP_Proto_Ports_Table and IP_Proto_Port_Table; makes an IP_Table from the IP address.
- make_Proto_Ports_Table (OPTIONS)
- make_Proto_Ports_Table ()
- Member function of Tuple_Table and IP_Proto_Ports_Table; makes a Proto_Ports_Table.
- make_Port_Matrix (OPTIONS)
- make_Port_Matrix ()
- Member function of Tuple_Table, IP_Proto_Ports_Table and Proto_Ports_Table; makes a Port_Matrix. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting Port_Matrix will only include ports used by that protocol.
- make_src_Proto_Port_Table (OPTIONS)
- make_src_Proto_Port_Table ()
- Member function of Tuple_Table, IP_Proto_Ports_Table and Proto_Ports_Table; makes a Proto_Port_Table from the protocol, ports_ok and source (first) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting Proto_Port_Table will only include ports used by that protocol.
- make_dst_Proto_Port_Table (OPTIONS)
- make_dst_Proto_Port_Table ()
- Member function of Tuple_Table, IP_Proto_Ports_Table and Proto_Ports_Table; makes a Proto_Port_Table from the protocol, ports_ok and destination (second) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting Proto_Port_Table will only include ports used by that protocol.
- make_Proto_Port_Table (OPTIONS)
- make_Proto_Port_Table ()
- Member function of IP_Proto_Port_Table; makes a Proto_Port_Table from the protocol, ports_ok and port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting Proto_Port_Table will only include ports used by that protocol.
- make_src_Port_Table (OPTIONS)
- make_src_Port_Table ()
- Member function of Tuple_Table, Port_Matrix, IP_Proto_Ports_Table and Proto_Ports_Table; makes a Port_Table from the source (first) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting Port_Table will only include ports used by that protocol.
- make_dst_Port_Table (OPTIONS)
- make_dst_Port_Table ()
- Member function of Tuple_Table, Port_Matrix, IP_Proto_Ports_Table and Proto_Ports_Table; makes a Port_Table from the destination (second) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting Port_Table will only include ports used by that protocol.
- make_Port_Table (OPTIONS)
- make_Port_Table ()
- Member function of Proto_Port_Table and IP_Proto_Port_Table; makes a Port_Table from the port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting Port_Table will only include ports used by that protocol.
- make_Proto_Table (OPTIONS)
- make_Proto_Table ()
- Member function of Tuple_Table, Proto_Ports_Table, Proto_Port_Table, IP_Proto_Ports_Table and IP_Proto_Port_Table; makes a Proto_Table.
- make_src_IP_Proto_Ports_Table (OPTIONS)
- make_src_IP_Proto_Ports_Table ()
- Member function of Tuple_Table; makes an IP_Proto_Ports_Table from the source (first) IP address, protocol, ports_ok, source (first) port, and destination (second) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting IP_Proto_Ports_Table will only include ports used by that protocol.
- make_dst_IP_Proto_Ports_Table (OPTIONS)
- make_dst_IP_Proto_Ports_Table ()
- Member function of Tuple_Table; makes an IP_Proto_Ports_Table from the destination (second) IP address, protocol, ports_ok, source (first) port, and destination (second) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting IP_Proto_Ports_Table will only include ports used by that protocol.
- make_src_IP_Proto_src_Port_Table (OPTIONS)
- make_src_IP_Proto_src_Port_Table ()
- Member function of Tuple_Table; makes an IP_Proto_Port_Table from the source (first) IP address, protocol, ports_ok and source (first) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting IP_Proto_Port_Table will only include ports used by that protocol.
- make_src_IP_Proto_dst_Port_Table (OPTIONS)
- make_src_IP_Proto_dst_Port_Table ()
- Member function of Tuple_Table; makes an IP_Proto_Port_Table from the source (first) IP address, protocol, ports_ok and destination (second) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting IP_Proto_Port_Table will only include ports used by that protocol.
- make_dst_IP_Proto_src_Port_Table (OPTIONS)
- make_dst_IP_Proto_src_Port_Table ()
- Member function of Tuple_Table; makes an IP_Proto_Port_Table from the destination (second) IP address, protocol, ports_ok and source (first) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting IP_Proto_Port_Table will only include ports used by that protocol.
- make_dst_IP_Proto_dst_Port_Table (OPTIONS)
- make_dst_IP_Proto_dst_Port_Table ()
- Member function of Tuple_Table; makes an IP_Proto_Port_Table from the destination (second) IP address, protocol, ports_ok and destination (second) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting IP_Proto_Port_Table will only include ports used by that protocol.
- make_IP_Proto_src_Port_Table (OPTIONS)
- make_IP_Proto_src_Port_Table ()
- Member function of IP_Proto_Ports_Table; makes an IP_Proto_Port_Table from the IP address, protocol, ports_ok and source (first) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting IP_Proto_Port_Table will only include ports used by that protocol.
- make_IP_Proto_dst_Port_Table (OPTIONS)
- make_IP_Proto_dst_Port_Table ()
- Member function of IP_Proto_Ports_Table; makes an IP_Proto_Port_Table from the IP address, protocol, ports_ok and destination (second) port. OPTIONS may include an entry whose key is 'protocol' and whose value is a protocol number. If the protocol is set, then the resulting IP_Proto_Port_Table will only include ports used by that protocol.
- make_AS_Matrix (OPTIONS)
- Member function of Tuple_Table and IP_Matrix; makes an AS_Matrix. OPTIONS must include an entry whose key is 'as_finder' and whose value is an ASFinder object.
- make_src_AS_Table (OPTIONS)
- make_src_AS_Table ()
- Member function of Tuple_Table, IP_Matrix and AS_Matrix; makes an AS_Table from the source (first) IP address or AS. For Tuple_Table and IP_Matrix, OPTIONS must include an entry whose key is 'as_finder' and whose value is an ASFinder object.
- make_dst_AS_Table (OPTIONS)
- make_dst_AS_Table ()
- Member function of Tuple_Table, IP_Matrix and AS_Matrix; makes an AS_Table from the destination (second) IP address or AS. For Tuple_Table and IP_Matrix, OPTIONS must include an entry whose key is 'as_finder' and whose value is an ASFinder object.
- make_AS_Table (OPTIONS)
- Member function of IP_Table; makes an AS_Table. OPTIONS must include an entry whose key is 'as_finder' and whose value is an ASFinder object.
- make_Country_Matrix (OPTIONS)
- Member function of Tuple_Table, IP_Matrix and AS_Matrix; makes a Country_Matrix. OPTIONS must either include an entry whose key is 'netgeo' and whose value is a NetGeo or NetGeoClient object, or an entry whose key is 'netacq' and whose value is the path to a netacq executable. In addition, if netacq is not used, then Tuple_Table and IP_Matrix require that OPTIONS also include an entry whose key is 'as_finder' and whose value is an ASFinder object. Because NetGeo returns 2-letter country codes and netacq returns 3-letter country codes, OPTIONS may include an entry whose key is 'countries' and whose value is a Countries object, which is used to convert netacq's responses into 2-letter country codes.
- make_src_Country_Table (OPTIONS)
- make_src_Country_Table ()
- Member function of Tuple_Table, IP_Matrix, AS_Matrix and Country_Matrix; makes a Country_Table from the source IP/AS/Country. For Tuple_Table, IP_Matrix, and AS_Matrix, OPTIONS must either include an entry whose key is 'netgeo' and whose value is a NetGeo or NetGeoClient object, or an entry whose key is 'netacq' and whose value is the path to a netacq executable. In addition, if netacq is not used, then Tuple_Table and IP_Matrix require that OPTIONS also include an entry whose key is 'as_finder' and whose value is an ASFinder object. Because NetGeo returns 2-letter country codes and netacq returns 3-letter country codes, OPTIONS may include an entry whose key is 'countries' and whose value is a Countries object, which is used to convert netacq's responses into 2-letter country codes.
- make_dst_Country_Table (OPTIONS)
- make_dst_Country_Table ()
- Member function of Tuple_Table, IP_Matrix, AS_Matrix and Country_Matrix; makes a Country_Table from the destination IP/AS/Country. For Tuple_Table, IP_Matrix, and AS_Matrix, OPTIONS must either include an entry whose key is 'netgeo' and whose value is a NetGeo or NetGeoClient object, or an entry whose key is 'netacq' and whose value is the path to a netacq executable. In addition, if netacq is not used, then Tuple_Table and IP_Matrix require that OPTIONS also include an entry whose key is 'as_finder' and whose value is an ASFinder object. Because NetGeo returns 2-letter country codes and netacq returns 3-letter country codes, OPTIONS may include an entry whose key is 'countries' and whose value is a Countries object, which is used to convert netacq's responses into 2-letter country codes.
- make_Country_Table (OPTIONS)
- Member function of IP_Table and AS_Table; makes a Country_Table. For IP_Table, OPTIONS must either include an entry whose key is 'netgeo' and whose value is a NetGeo or NetGeoClient object, or an entry whose key is 'netacq' and whose value is the path to a netacq executable. In addition, if netacq is not used, then IP_Table requires that OPTIONS also include an entry whose key is 'as_finder' and whose value is an ASFinder object. Because NetGeo returns 2-letter country codes and netacq returns 3-letter country codes, OPTIONS may include an entry whose key is 'countries' and whose value is a Countries object, which is used to convert netacq's responses into 2-letter country codes.
- make_src_LatLon_Table (OPTIONS)
- Member function of Tuple_Table, IP_Matrix, AS_Matrix and Country_Matrix; makes a LatLon_Table from the source IP/AS/Country. For Tuple_Table, IP_Matrix, and AS_Matrix, OPTIONS must either include an entry whose key is 'netgeo' and whose value is a NetGeo or NetGeoClient object, or an entry whose key is 'netacq' and whose value is the path to a netacq executable. In addition, if netacq is not used, then Tuple_Table and IP_Matrix require that OPTIONS also include an entry whose key is 'as_finder' and whose value is an ASFinder object. For Country_Matrix, OPTIONS must include an entry whose key is 'countries' and whose value is a Countries object.
- make_dst_LatLon_Table (OPTIONS)
- Member function of Tuple_Table, IP_Matrix, AS_Matrix and Country_Matrix; makes a LatLon_Table from the destination IP/AS/Country. For Tuple_Table, IP_Matrix, and AS_Matrix, OPTIONS must either include an entry whose key is 'netgeo' and whose value is a NetGeo or NetGeoClient object, or an entry whose key is 'netacq' and whose value is the path to a netacq executable. In addition, if netacq is not used, then Tuple_Table and IP_Matrix require that OPTIONS also include an entry whose key is 'as_finder' and whose value is an ASFinder object. For Country_Matrix, OPTIONS must include an entry whose key is 'countries' and whose value is a Countries object.
- make_LatLon_Table (OPTIONS)
- Member function of IP_Table, AS_Table, and Country_Table; makes a LatLon_Table. For IP_Table and AS_Table, OPTIONS must either include an entry whose key is 'netgeo' and whose value is a NetGeo or NetGeoClient object, or an entry whose key is 'netacq' and whose value is the path to a netacq executable. In addition, if netacq is not used, then IP_Table requires that OPTIONS also include an entry whose key is 'as_finder' and whose value is an ASFinder object. For Country_Table, OPTIONS must include an entry whose key is 'countries' and whose value is a Countries object.
- make_App_Table (OPTIONS)
- Member function of Tuple_Table and Proto_Ports_Table; makes an App_Table. OPTIONS must include an entry whose key is 'app_ports' and whose value is an AppPorts object loaded with application mapping information. In addition, OPTIONS may include an entry whose key is 'name_func' and whose value is a reference to a Perl subroutine that will name any applications not found via AppPorts. This function takes (in order) the protocol, ports ok, source port and destination port. If this function is not given, the default naming for unknown applications is UNKNOWN_$proto_($src_port,$dst_port). Also note that because AppPorts can contain IP-based rules, using a Tuple_Table as the source table can lead to greater accuracy in the mapping.
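A 'name_func' callback for make_App_Table receives the protocol, ports ok, source port and destination port, in that order. The following is a minimal sketch of such a callback; the subroutine name name_unknown_app, the OTHER_ labels, and the choice to label by the lower port number are all illustrative assumptions, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical naming callback for the 'name_func' option.
# Arguments arrive in the documented order:
# protocol, ports ok, source port, destination port.
sub name_unknown_app {
    my ($proto, $ports_ok, $src_port, $dst_port) = @_;

    # Without valid ports, fall back to a per-protocol bucket.
    return "OTHER_proto_$proto" unless $ports_ok;

    # Label by the lower port, on the assumption that it is
    # more likely to be the service port.
    my $port = $src_port < $dst_port ? $src_port : $dst_port;
    return "OTHER_${proto}_$port";
}

# The callback would be passed by reference alongside the AppPorts object:
#   $app_table = $table->make_App_Table({
#       app_ports => $app_ports,          # an AppPorts object built elsewhere
#       name_func => \&name_unknown_app,
#   });

my $tcp_name = name_unknown_app(6, 1, 51324, 8080);
print "$tcp_name\n";    # prints "OTHER_6_8080"

my $udp_name = name_unknown_app(17, 0, 0, 0);
print "$udp_name\n";    # prints "OTHER_proto_17"
```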
ERRORS
EXAMPLES
ENVIRONMENT
- LD_LIBRARY_PATH
- If using the C++ backend, it might be necessary to set your LD_LIBRARY_PATH to find the proper libraries, such as libstdc++.
SEE ALSO
NOTES
There are three additional tables (only in Perl) which are meant to be used only for the creation of new tables. They are called Generic::Split, Generic::SingleKey, and Generic::Pack (within the CAIDA::Tables namespace). They are not meant for the general user, but anyone wishing to create a new table can use them as a starting point.
WARNINGS
DIAGNOSTICS
BUGS
RESTRICTIONS
Because the text file format used by load_text and save_text is tab-delimited, tab characters cannot be used in any keys if the text format is used. However, they are allowable if only load_binary and save_binary are used.
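A caller that builds keys from untrusted strings (for example in a String_Table) may want to check for tabs before choosing a save format. The following is a minimal sketch; the helper name key_is_text_safe is hypothetical and is not provided by the module.

```perl
use strict;
use warnings;

# Return true when no key field contains a tab, i.e. the key is
# safe for the tab-delimited text format.
sub key_is_text_safe {
    my (@fields) = @_;
    return !grep { /\t/ } @fields;
}

for my $key ( ['AS1234', 'AS5678'], ["bad\tkey"] ) {
    my $verdict = key_is_text_safe(@$key) ? 'ok for save_text' : 'use save_binary';
    print "$verdict\n";
}
# prints "ok for save_text" then "use save_binary"
```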
AUTHORS
Ryan Koga <rkoga@caida.org>, David Moore <dmoore@caida.org>