NAME

t2_report++ - Generate HTML reports from the output of crl_flow.


SYNOPSIS

    t2_report++ <options> [source file]


DESCRIPTION

t2_report++ is an HTML report generator that is part of the CAIDA CoralReef Internet Traffic Monitoring Software suite. It reads output from the crl_flow program and produces a series of HTML pages that can include pie charts and time-series data stored in RRDtool Round Robin Databases. t2_report++ produces pie charts using either the freely available GD::Graph module for Perl or the JClass Chart Java applet. The JClass Chart Java applet is a commercial product. See http://www.sitraka.com for more information.


REQUIREMENTS

t2_report++ requires Perl 5.004 or greater and two modules from CPAN: GD.pm and GD::Graph.pm. These are not included with CoralReef but may be obtained at no cost from any Internet site that provides Perl distributions and the CPAN archive. These modules depend on other modules; check the module documentation for exact details. GD.pm in turn depends on the C graphics library gd, which has its own requirements.

To get AS information, you will need the ASFinder module, which is included with CoralReef, and a BGP routing table. Information on getting such a table can be found in the ASFinder documentation. Country information requires the NetGeo client, which is also included in the CoralReef distribution.

Finally, if you wish to use Round Robin Databases, you will need RRDtool. For more information on RRDtool, see: http://oss.oetiker.ch/rrdtool/index.en.html


CONFIGURATION

Because t2_report++ supports long command line options, all options require a space between the letter (or, now, the word) and the value. In addition, the command line option formerly named -c (for combining) has been changed to -m (or --merge). -c (or --command) now allows the user to specify extra options.

t2_report++ looks for a configuration file in the etc directory of the CoralReef file tree, then in the ~/.Coral directory of the user running t2_report++, and then in any configuration files specified in the command line arguments. This permits nested configuration commands. A configuration file specified on the command line is processed before any other command line arguments. The default name for the configuration file is t2_report.conf, but any filename can be passed as a command line argument. Because the same parser handles both command line arguments and configuration file items, the same syntax is used for both. The only difference is that on the command line, configuration variable names are preceded by one or two dashes. It is recommended that users modify on the command line only those variables for which single-character aliases exist.

 +-------------------------------------------------------------------------+
 |              Config file variables and command line options             |
 +--------------+-----+---------------------------+------------------------+
 | Long name    |short| Controls                  | Arguments              |
 +--------------+-----+---------------------------+------------------------+
 | ASfinder     |  a  | Using ASFinder for IP->AS | 0, 1, or BGP table     |
 | backdate     |  b  | Backdate RRDtool database | RRDtool date format    |
 | command      |  c  | Configuration commands    | see below              |
 | debug        |  d  | Turn on debugging messages| 0, 1                   |
 | configfile   |  f  | read in config file       | path to config file    |
 | graphtool    |  g  | set graphing output tool  | none, JCChart, GDGraph |
 | help         |  h  | access to UserGuide       | none, list, all, ...   |
 | internetapps |  i  | Track Top-N applications  | 0, 1                   |
 | javaserver   |  j  | Java server for JCChart   | URL                    |
 | merge        |  m  | time or samples to merge  | see below for format   |
 | netgeo       |  n  | enable IP->country data.  | 0, 1                   |
 | promiscuous  |  p  | turn off IP encoding      | 0, 1                   |
 | RRDtool      |  r  | enable time-series plots  | path to RRD data       |
 | samples      |  s  | limit processing to N     | number of samples      |
 | todisplay    |  t  | change top-10 to top-N    | number to display      |
 | zaptotals    |  z  | suppress totals calcs     | 0, 1                   |
 +--------------+-----+---------------------------+------------------------+

There is considerable flexibility in specifying command line arguments. For example, long command line arguments can be given with either one or two dashes. Case is also ignored, so -r, -rrdtool, and --RRDtool have exactly the same effect.
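The matching rules just described can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the parser t2_report++ actually uses. The function name canonical_option is hypothetical, and the name table is copied (lowercased) from the option table above. Abbreviated long names are not handled here.

```python
# Short-to-long option names, lowercased, taken from the option table.
SHORT_TO_LONG = {
    "a": "asfinder", "b": "backdate", "c": "command", "d": "debug",
    "f": "configfile", "g": "graphtool", "h": "help", "i": "internetapps",
    "j": "javaserver", "m": "merge", "n": "netgeo", "p": "promiscuous",
    "r": "rrdtool", "s": "samples", "t": "todisplay", "z": "zaptotals",
}
LONG_NAMES = set(SHORT_TO_LONG.values())


def canonical_option(arg):
    """Map an option as typed (e.g. '-r', '--RRDtool') to its long name.

    Hypothetical helper: accepts one or two leading dashes and ignores case.
    """
    if arg.startswith("--"):
        name = arg[2:]
    elif arg.startswith("-"):
        name = arg[1:]
    else:
        raise ValueError("not an option: %r" % arg)
    name = name.lower()
    if name in SHORT_TO_LONG:
        return SHORT_TO_LONG[name]
    if name in LONG_NAMES:
        return name
    raise ValueError("unknown option: %r" % arg)
```

With this sketch, canonical_option("-r"), canonical_option("-rrdtool"), and canonical_option("--RRDtool") all return the same name.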

Usage information is available through the --help or -h option. Specifying that option alone prints brief usage information along with the list of command line options shown above. --help list provides the same information as --help alone. --help <name> provides details on a particular command line option. --help all produces a verbose listing of all usage information.

As a matter of efficiency, t2_report++ only handles absolute pathnames for files in the configuration and command-line system.

Four command line options require further explanation. The -c or --command option allows t2_report++ configuration options to be specified in a manner somewhat resembling commands passed to other CoralReef programs. This is intended to simplify the use of t2_report++ for coral users. For example, the option --debug 1 can also be written --command 'debug 1'. This also allows configuration file variables that aren't normally changed on the command line to be changed. One command line option is not accessible in this way: the configuration file option (--configfile or -f) will not work via the -c syntax because it is processed before any other command line arguments.

The merge option (not implemented in the current release) allows data from a shorter crl_flow sampling interval to be combined into reports covering a longer time span. Merging can be specified either as a number of samples or as a time interval. For example, if crl_flow produces output every minute (60 seconds), then --merge 5 would produce 5 minute reports. Specifying a time is more complicated. All times require a colon (to distinguish them from a fixed number of samples). Fields are read starting from hours, and omitted trailing fields (the smaller time units) are assumed to be zero. Thus, -m 5: means 5 hours, not 5 seconds. Five minutes would be -merge 0:5 and five seconds would be --merge 0:0:5.
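The merge argument format can be made concrete with a short Python sketch. This is only an interpretation of the rules stated above (a bare number is a sample count; a value with colons is hours:minutes:seconds with missing trailing fields taken as zero). The function name parse_merge and its return format are hypothetical, and since the option is not implemented in the current release this does not reflect actual program code.

```python
def parse_merge(value):
    """Interpret a --merge argument as described in this document.

    Returns ("samples", n) for a bare number, or ("seconds", n) for a
    colon-separated hours:minutes:seconds value.
    """
    if ":" not in value:
        # No colon: merge a fixed number of samples.
        return ("samples", int(value))
    fields = value.split(":")
    if len(fields) > 3:
        raise ValueError("too many time fields: %r" % value)
    # Pad omitted trailing fields, e.g. "5:" -> ["5", "", ""].
    fields += [""] * (3 - len(fields))
    hours, minutes, seconds = (int(f) if f else 0 for f in fields)
    return ("seconds", hours * 3600 + minutes * 60 + seconds)
```

For the examples in the text, parse_merge("5") gives ("samples", 5), parse_merge("5:") gives ("seconds", 18000), parse_merge("0:5") gives ("seconds", 300), and parse_merge("0:0:5") gives ("seconds", 5).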

The zaptotals option (--zaptotals or -z) allows t2_report++ to skip computing the totals for all the subinterfaces on this link. This can be useful on a heavily congested link where the report generator is unable to keep up with the crl_flow feed.

The InternetApps option (--InternetApps or -i) enables time-series tracking of the top ``N'' (user configurable) Internet applications. Internet applications are identified by the port numbers they use, as recorded in the AppPorts.pm module. Unlike the configuration settings below, which allow users to track a fixed number of applications, the purpose of this option is to allow t2_report++ to track trends in the top applications. It can thus only be on or off. It displays the top ``N'' applications as observed in the last sample period, so the cumulative plots do not necessarily correspond to the top applications over the longer term.

In addition to the variables listed above, there are some other variables that may be set only in configuration files (or through the --command option). These are listed below.

 +----------------------------------------------------------------------+
 |                 Config file variables                                |
 +-----------------+---------------------------+------------------------+
 |   Name          | Controls                  | Arguments              |
 +-----------------+---------------------------+------------------------+
 | RRD_DIR         | path to RRD databases     | path to RRD databases  |
 | RRD_GRAPH_PROTOS| Protocols to graph        | RRD protocols to be    |
 |                 |                           | graphed                |
 | PROTO_NAMES     | names displayed for proto.| hash numbers to names  |
 | subif_names     | names displayed for subifs| hash subifs to name    |
 | RRDCOLORS       | values for colors in RRD  | list of hex values     |
 | html_dir        | location for html files   | directory path         |
 | incremental_read| read data in parts instead| 0, 1 (default to 1)    |
 |                 | of all at once            |                        |
 | site_name       | name of site for reports  | text                   |
 | route_table     | BGP table used by ASFinder| path to BGPtable       |
 | source_table    | source data file          | path to file           |
 | rrd_graph_width | width of timeseries graphs| width in pixels        |
 | rrd_graph_height| height of graphs          | height in pixels       |
 | rrd_time_samples| Intervals for RRDtool     | list of times in hours |
 |                 | timeseries databases      |                        |
 | RRD_nograph     | RRD updates but no graphs | 0, 1                   |
 | pie_slices      | Number of slices shown    | number                 |
 | 3d_pie          | Enables 3d pie chart      | 0, 1                   |
 | rrd_graph_apps  | application to graph      | list of names to graph |
 |traffic2_dump_dir| directory for data dumps  | directory for data     |
 +-----------------+---------------------------+------------------------+

Since t2_report++ can be left running for extended periods of time, it permits you to reload the configuration settings by sending a HUP signal to the report generator while it is running. This causes t2_report++ to go through the same initialization process that occurs at the start of the program. Because of this, the command line arguments that were passed to the program at startup are retained. While this could be seen as a feature, it means that these settings cannot be changed while the program is running. Items in the configuration files, on the other hand, can be edited while t2_report++ is running. For this reason, it is best to place most configuration information in the configuration files.

If you send a USR1 signal, t2_report++ will finish processing its current interval and then exit.
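As a rough illustration, the signals understood by t2_report++ can be sent from a small Python helper. The action names and the send_action function are hypothetical, and the process ID must be obtained by your own means. The USR2 behavior is described under DUMPING ADDITIONAL DIAGNOSTIC INFORMATION.

```python
import os
import signal

# Signal meanings as described in this document; the action names
# on the left are illustrative only.
ACTIONS = {
    "reload": signal.SIGHUP,        # re-run initialization, re-read config files
    "finish": signal.SIGUSR1,       # finish the current interval, then exit
    "toggle_dump": signal.SIGUSR2,  # start or stop dumping samples
}


def send_action(pid, action):
    """Send the signal for `action` to the running process `pid`."""
    os.kill(pid, ACTIONS[action])
```

For example, send_action(pid, "reload") has the same effect as running kill -HUP on that process ID from a shell.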


CONFIGURATION FILE FORMAT

The sample configuration file example_t2_report.conf is included in the CoralReef distribution. The format is consistent with the AppConfig.pm module and most UNIX applications. Comment lines begin with the pound sign (#). Identifiers are followed by values. To initialize arrays and hashes, each value must be specified on a separate line. Below are some sample settings from the t2_report++ configuration file:

     #  Disable ASfinder by default (requires a routing table).
     ASFinder = 0 
     # Disable NetGeo client (it requires ASfinder).
     NetGeo = 0
     # Disable RRDtool unless installed. 
     RRDtool = 0
     # Protocols to graph.
     RRD_GRAPH_PROTOS = 6
     RRD_GRAPH_PROTOS = 17 
     RRD_GRAPH_PROTOS = 1 
     RRD_GRAPH_PROTOS = 4
     # Set directory to store resulting HTML
     html_dir = "/usr/local/Coral/data/CoralReef_html_dir"
     # Set default graph tool to GDGraph
     # graphtool = "GDGraph"
     # Enable debugging information
     debug = 1

Note that for the sake of consistency, the logical values assigned to a variable should be explicitly listed as 0 or 1. The values for arrays are entered by listing the same variable multiple times. Hashes are handled in a similar fashion, with key/value pairs being listed.
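To show how repeated lines accumulate into arrays and hashes, here is a minimal Python sketch. It is not the AppConfig.pm parser and makes several assumptions: tokens are separated by whitespace (including around the equals sign), names are compared without regard to case, and the caller states which variables are arrays or hashes. The function name parse_settings is hypothetical. The hash line form follows the PROTO_NAMES examples given later in this document.

```python
import shlex


def parse_settings(text, arrays=(), hashes=()):
    """Collect settings into a dict; repeated array/hash lines accumulate."""
    arrays = {name.lower() for name in arrays}
    hashes = {name.lower() for name in hashes}
    settings = {}
    for line in text.splitlines():
        # shlex strips quotes and '#' comments; '=' is treated as optional.
        tokens = [t for t in shlex.split(line, comments=True) if t != "="]
        if not tokens:
            continue
        name = tokens[0].lower()
        if name in hashes:
            key, value = tokens[1], tokens[2]
            settings.setdefault(name, {})[key] = value
        elif name in arrays:
            settings.setdefault(name, []).append(tokens[1])
        else:
            settings[name] = tokens[1]
    return settings
```

Given the two RRD_GRAPH_PROTOS lines "= 6" and "= 17" with RRD_GRAPH_PROTOS declared as an array, the sketch yields the list ["6", "17"] under the key "rrd_graph_protos".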


APPLICATION PORTS FILE

t2_report++ can track applications based on the ports they use. This capability is implemented in the Perl module AppPorts.pm and the configuration file t2_report.ports. More information can be found in the AppPorts documentation. An example ports file is provided in the CoralReef distribution in: etc/Application_ports_Master.txt.

The module provides both a UNIX/RRDtool compatible name for unambiguous identification and a human-friendly description for display in graphs. While not presently used, a group field exists for coarser aggregation. Information is also stored to facilitate tracking the source of each entry.

t2_report++ will attempt to collect data on every item in the t2_report.ports file. However, it will not display more items than the web page size limits set in the configuration file.


CONFIGURING RRDTOOL TIMESERIES GRAPHS

t2_report++ can optionally use RRDtool to keep time-series data on protocols and applications. To use RRDtool with t2_report++, first install RRDtool. Then, when running configure to install the CoralReef software suite, use the --with-rrd option to set the directory where t2_report++ will look for the RRDtool Perl libraries.

NOTE: For CoralReef 3.5, the RRD directory structure changed from its previous form. If you wish to use RRDs created before 3.5, you will need to convert some of the files. See the file README.RRDtool in the apps/traffic/Reporting directory.

RRDtool can be used to collect data and display graphs. All protocols and applications identified are automatically collected, and any subset of them can be graphed.

You may use the variable PROTO_NAMES to assign descriptive text to each protocol number.

     # Hash of protocol numbers to names
     PROTO_NAMES 1 = 'ICMP'
     PROTO_NAMES 2 = 'IGMP'
     PROTO_NAMES 4 = 'IPENCAP'
     PROTO_NAMES 6 = 'TCP'
     PROTO_NAMES 17 = 'UDP'
     PROTO_NAMES 47 = 'GRE'
     PROTO_NAMES 50 = 'IPSEC-ESP'
     PROTO_NAMES 51 = 'IPSEC-AH'
     PROTO_NAMES 93 = 'AX.25'
     PROTO_NAMES 94 = 'IPIP'

RRD_GRAPH_PROTOS is an array variable that specifies which protocols will be graphed.

     # Protocols to graph.
     RRD_GRAPH_PROTOS = 6
     RRD_GRAPH_PROTOS = 17

For t2_report++, configuring applications is similar, but the application names come from AppPorts and thus need not be explicitly specified.

As with protocols, t2_report++ can graph any application that it can identify. It is likely that you will wish only to graph a subset of these. This can be done by assigning values to the rrd_graph_apps array. This variable expects application identifiers instead of protocol numbers. These identifiers come from the t2_report.ports file.

     # Applications to be graphed.
     rrd_graph_apps     NAPSTER_DATA
     rrd_graph_apps     REALAUDIO_UDP
     rrd_graph_apps     GNUTELLA
     rrd_graph_apps     QUAKE
     rrd_graph_apps     AOL


DUMPING ADDITIONAL DIAGNOSTIC INFORMATION

t2_report++ takes data from crl_flow in order to generate the report pages. While crl_flow output is not as complete as raw trace data, it can nonetheless be useful for diagnosing events in a network. t2_report++ provides a toggle switch to permit the saving of crl_flow output for further processing. Sending a USR2 signal to t2_report++ will cause it to dump each sample processed into the directory assigned by the configuration variable traffic2_dump_dir. If traffic2_dump_dir is not set, the USR2 signal will be ignored. To stop the data dumps, send another USR2 signal.

Note: While crl_flow files are much smaller than raw trace files, they still grow very quickly. Leaving t2_report++ in dump mode is likely to fill up all available disk space in a short time. When possible, use a separate partition to store the dumps from t2_report++, and always be cautious when using dump mode.


T2_REPORT++ USAGE

    t2_report++ <options> [source file]

If a configuration file is used, often no other command line options are needed. The source file is also optional. If no file is specified, t2_report++ will assume it is getting data from standard input. This allows t2_report++ to be used as a real-time report generator. Visit the CoralReef web site at http://www.caida.org/tools/measurement/coralreef/ to see an example of the report generator being used in this way.


EXAMPLE

Assuming that all other options are included in the configuration file ~/.Coral/t2_report.conf, the command to run t2_report++ on the file vbns_sample would be:

    t2_report++ vbns_sample

To do the same thing but limit the samples read to 9 and turn off IP encoding, the command would be:

    t2_report++ -p -s 9 vbns_sample

or using long command line options:

    t2_report++ -promiscuous -samples 9 vbns_sample

To run the report generator on live data, it is important NOT to use a pipe to connect crl_flow and t2_report++. If the pipe ever fills up, crl_flow will block until it has cleared, often causing it to lose data. A better solution is to dump to intermediate files and have t2_report++ read those:

    crl_flow -I -O%s.t2 /dev/fatm0
    spoolcat '*.t2' | t2_report++ -c 'site_name="vBNS data"'

(spoolcat is a small script that waits forever in a directory for any files that match a pattern, prints them to stdout, and can delete/move them.)
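To give a feel for what such a spooling script does, the following Python sketch performs a single pass over the matching files. It is not the spoolcat script shipped with CoralReef. The function name drain_once is hypothetical, processing files in sorted name order is an assumption, and the sketch makes no attempt to skip files that crl_flow is still writing.

```python
import glob
import os


def drain_once(pattern, out, delete=False):
    """Copy every file matching `pattern` to the binary stream `out`.

    Files are handled in sorted name order (an assumption). If `delete`
    is true, each file is removed after it has been copied. Returns the
    list of file names that were processed.
    """
    names = sorted(glob.glob(pattern))
    for name in names:
        with open(name, "rb") as f:
            out.write(f.read())
        if delete:
            os.remove(name)
    return names
```

A spooling loop would call drain_once('*.t2', sys.stdout.buffer, delete=True) repeatedly, sleeping briefly between passes, with its standard output connected to t2_report++.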

Note the use of the -c option to change the site name. This can be a convenient method to ``tweak'' multiple runs of t2_report++ that otherwise use the same configuration file.

Also note that the input files to t2_report++ MUST have been generated using the -I option of crl_flow. The reports require that all flows be expired at the end of every interval.


HTML OUTPUT

The default output of t2_report++ includes a link to CAIDA's website, a logo, and an email address for reaching CAIDA. This is done by a Perl module called CustomCaidaHTML.pm (in libsrc/Traffic2, in the source directory), which is derived from CustomBaseHTML.pm. To customize the output for your own site, create your own output module derived from CustomBaseHTML.pm, and change the source file (t2_report++.pl) to use it.


KNOWN BUGS

In this release the feature to accumulate samples is disabled.

t2_report++ can create a large memory footprint if it reports on data with many flows. When listening to a link with many subinterfaces (such as vp:vcs on an ATM link), the incremental_read option can significantly reduce memory used. Memory use can also be reduced by using a shorter interval with crl_flow.

If memory and/or CPU limitations prevent you from running t2_report++ on a collection machine, t2_report++ can be run on a separate machine. This only requires copying the output files (created with crl_flow's -O option) to the reporting machine.


AUTHORS

CoralReef Development team, CAIDA <coral-info@caida.org>


COPYRIGHT

Copyright 1999, 2000, 2001, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2011, 2012 The Regents of the University of California All Rights Reserved

Permission to use, copy, modify and distribute any part of this CoralReef software package for educational, research and non-profit purposes, without fee, and without a written agreement is hereby granted, provided that the above copyright notice, this paragraph and the following paragraphs appear in all copies.

Those desiring to incorporate this into commercial products or use for commercial purposes should contact the Technology Transfer Office, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0910, Ph: (858) 534-5815, FAX: (858) 534-7345.

IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

THE SOFTWARE PROVIDED HEREIN IS ON AN ``AS IS'' BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. THE UNIVERSITY OF CALIFORNIA MAKES NO REPRESENTATIONS AND EXTENDS NO WARRANTIES OF ANY KIND, EITHER IMPLIED OR EXPRESS, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, OR THAT THE USE OF THE SOFTWARE WILL NOT INFRINGE ANY PATENT, TRADEMARK OR OTHER RIGHTS.

The CoralReef software package is developed by the CoralReef development team at the University of California, San Diego under the Cooperative Association for Internet Data Analysis (CAIDA) Program. Support for this effort is provided by the CAIDA grant NCR-9711092, DARPA NGI Contract N66001-98-2-8922, DARPA NMS Grant N66001-01-1-8909, and by CAIDA members.

Report bugs and suggestions to coral-bugs@caida.org.