NAME

AppPorts.pm - Application name lookup from port numbers

SYNOPSIS

    # Use module
    use CAIDA::AppPorts;

    # Create a new AppPorts object.
    my $app_port_obj = CAIDA::AppPorts->new();

    # Clear the contents of the rules list
    $app_port_obj->clear();

    # Load in a rule file.
    $app_port_obj->load_rules($rules_file);

    # Match some rule.
    my $match = $app_port_obj->match_rule($src_ip, $dst_ip, $ip_proto,
                                            $ports_ok, $sport, $dport);
    my @other = $app_port_obj->match_rule($ip_proto, $ports_ok, $sport, $dport);
    print $match->desc, "\n";
    foreach my $proto ($match->protocols) {
        print $proto, "\n";
    }
    ... etc

    # Look up rules by name.
    my @rules = $app_port_obj->get_rule("HTTP");

    # Dump rules for debugging purposes.
    $app_port_obj->dump_rules();

DESCRIPTION

AppPorts is a module that reads in a formatted text file of application rules, and then uses those rules to convert application port numbers and protocols to application names.

The module provides methods to manage the rules and to search them for applications matching a given protocol and source and destination ports. The rule format and module architecture are extensible, so the module can be used in a wide variety of settings.

RULE FILE FORMAT

NOTE: Prior to CoralReef 3.5.0, the file format was different. Most importantly, any string containing spaces required double quotes around it, and rule priority was implicit in the order of the rules. To use an older rule file with the current module, strip out the extra double quotes and add explicit priorities. With no priorities set, conflicting rule matches become much more likely.

AppPorts expects a file with entries defining the characteristics of an application. The required field is:

name

A unique name to identify the application. Must be usable as a UNIX filename.

The optional fields are:

description

The full name of the application.

group

A category for organizing applications (for example, in directory structures) or for aggregating similar applications. Must be usable as a UNIX filename.

srcnet
dstnet

IP subnets to match (source and destination, respectively). Written in the form x.x.x.x/x; multiple subnets can be separated by commas. Defaults to matching all subnets (can also be defined explicitly with 0.0.0.0/0). If the mask length is omitted, it defaults to /32.
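
The subnet syntax above can be illustrated with a small sketch. This is not the module's own code: ip_to_int and ip_in_subnets are hypothetical helpers written only for this example, and they assume plain dotted-quad IPv4 addresses.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CAIDA::AppPorts): dotted quad -> integer.
sub ip_to_int {
    my ($ip) = @_;
    my @octets = $ip =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/
        or return;
    return if grep { $_ > 255 } @octets;
    return ($octets[0] << 24) | ($octets[1] << 16) | ($octets[2] << 8) | $octets[3];
}

# Hypothetical helper: is $ip inside any subnet of a comma-separated spec
# such as "10.0.0.0/8, 192.0.2.1"?
sub ip_in_subnets {
    my ($ip, $spec) = @_;
    my $addr = ip_to_int($ip);
    return 0 unless defined $addr;
    foreach my $subnet (split /\s*,\s*/, $spec) {
        my ($net, $len) = split m{/}, $subnet;
        $len = 32 unless defined $len;      # omitted mask length means /32
        my $base = ip_to_int($net);
        next unless defined $base;          # skip anything unparseable
        # Build the netmask; /0 is handled separately to avoid a 32-bit shift.
        my $mask = $len == 0 ? 0 : (0xFFFFFFFF << (32 - $len)) & 0xFFFFFFFF;
        return 1 if ($addr & $mask) == ($base & $mask);
    }
    return 0;
}
```

For instance, ip_in_subnets("10.1.2.3", "10.0.0.0/8") is true, while ip_in_subnets("192.0.2.2", "192.0.2.1") is false because the second spec is treated as a /32.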

sport
dport

The source and destination TCP/UDP/ICMP ports, respectively, to match. In the case of ICMP, 'sport' means 'type' and 'dport' means 'code'. Defined as a list of ports or port ranges, separated by commas. The special character * indicates all ports, and a port range is specified with a hyphen. For example:

        sport: 123, 200-300, 4567
        dport: *

Either sport or dport (or both) can be set to 'none', which specifies that the rule only matches when the ports_ok argument is 0. For other values of sport and dport, ports_ok must be non-zero. If sport and dport are omitted, the rule will ignore ports_ok completely.
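
As a rough illustration of the port syntax and the ports_ok conditions described above, the following sketch may help. Both functions are hypothetical and are not taken from the module. The sketch assumes that port ranges are inclusive, and that a rule giving only one of sport or dport (with a value other than 'none') still requires ports_ok to be non-zero; the text above does not settle either point.

```perl
use strict;
use warnings;

# Hypothetical helper: does $port satisfy a spec such as "123, 200-300, 4567"?
sub port_matches {
    my ($port, $spec) = @_;
    foreach my $item (split /\s*,\s*/, $spec) {
        return 1 if $item eq '*';               # wildcard: all ports
        if ($item =~ /^(\d+)-(\d+)$/) {         # range, assumed inclusive
            return 1 if $port >= $1 && $port <= $2;
        }
        elsif ($item =~ /^\d+$/) {              # single port
            return 1 if $port == $item;
        }
    }
    return 0;
}

# Hypothetical helper: apply the ports_ok conditions.
# A spec is undef when the field is omitted from the rule.
sub ports_ok_allows {
    my ($sport_spec, $dport_spec, $ports_ok) = @_;
    my @given = grep { defined } ($sport_spec, $dport_spec);
    return 1 unless @given;                     # both omitted: ports_ok ignored
    return $ports_ok ? 0 : 1 if grep { $_ eq 'none' } @given;
    return $ports_ok ? 1 : 0;                   # other values need ports_ok != 0
}
```

With the example spec above, port_matches(250, "123, 200-300, 4567") is true and port_matches(199, "123, 200-300, 4567") is false.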

sym

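Either 0 or 1. Indicates whether the source and destination ports and subnets are symmetric, that is, whether the rule also matches when source and destination are swapped. Defaults to 0 (false). Ports and networks are matched pairwise: the swap applies to both together.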

protocol

The IP protocol number(s) to match. Multiple protocols are separated by commas. No ranges are currently supported, but * can be used to indicate all protocols.

priority

Used for sorting multiple matching rules. 1 is highest priority; defaults to 50. If two or more matching rules with equal priority exist, their position relative to each other is undefined.
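
The ordering can be sketched as follows. The rules here are plain hashes invented for the example (the module itself returns objects), and by_priority is a hypothetical helper, not a module method.

```perl
use strict;
use warnings;

# Hypothetical rule records as plain hashes, for illustration only.
my @matches = (
    { name => 'UNKNOWN_TCP', priority => 90 },
    { name => 'WWW' },                          # no priority: defaults to 50
    { name => 'APPFOO',      priority => 10 },
);

# Sort ascending by priority value, so that 1 (highest priority) comes first.
# The relative order of rules with equal priority is not guaranteed.
sub by_priority {
    my @rules = @_;
    return sort { ($a->{priority} // 50) <=> ($b->{priority} // 50) } @rules;
}

my @ordered = by_priority(@matches);
print join(', ', map { $_->{name} } @ordered), "\n";
# prints "APPFOO, WWW, UNKNOWN_TCP"
```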

contributor

Who added this rule or most recently updated it.

date

When the rule was last updated.

notes

Extra information.

reference

Source of information for the rule.

url

URL with the most definitive source for the rule.

An example entry for the World Wide Web would be:

        description:    World Wide Web
        name:           WWW
        group:          WWW
        srcnet:         0.0.0.0/0
        dstnet:         0.0.0.0/0
        sport:          80,8080
        dport:          *
        sym:            1
        protocol:       6
        priority:       50
        contributor:    bigj
        date:           1999-07-08
        notes:
        reference:      IANA Port assignments
        url:            http://www.iana.org/assignments/port-numbers

Note that srcnet and dstnet can (and should) be omitted, since they default to matching all subnets. Similarly, the notes and priority values shown here are the same as the defaults.

To override this rule with a domain-specific application, a rule might look like:

        description:    Our non-web app
        name:           APPFOO
        group:          INTERNAL
        srcnet:         10.0.0.0/8
        sport:          8080
        dport:          *
        sym:            1
        protocol:       6
        priority:       10

Because ports and networks are matched pairwise, this rule matches traffic from the local network with a source port of 8080, or traffic to the local network with a destination port of 8080. It does not match, for example, traffic from the local network with a destination port of 8080.
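
The three cases just described can be checked with a minimal sketch. It does not use the module: in_local_net, is_port_8080, matches_forward and matches_symmetric are hypothetical stand-ins that hard-code the APPFOO rule, and the addresses are made up for the example.

```perl
use strict;
use warnings;

# Simplified stand-ins for the subnet and port tests (assumptions of this sketch).
sub in_local_net { my ($ip)   = @_; return $ip =~ /^10\./ ? 1 : 0 }    # 10.0.0.0/8
sub is_port_8080 { my ($port) = @_; return $port == 8080 ? 1 : 0 }

# One direction of the APPFOO rule: srcnet 10.0.0.0/8 and sport 8080.
# dstnet and dport match anything, so they are not tested.
sub matches_forward {
    my ($src_ip, $dst_ip, $sport, $dport) = @_;
    return (in_local_net($src_ip) && is_port_8080($sport)) ? 1 : 0;
}

# With sym: 1, also try the flow with source and destination swapped together.
sub matches_symmetric {
    my ($src_ip, $dst_ip, $sport, $dport) = @_;
    return 1 if matches_forward($src_ip, $dst_ip, $sport, $dport);
    return matches_forward($dst_ip, $src_ip, $dport, $sport);
}

print matches_symmetric('10.1.1.1',  '192.0.2.9', 8080,  40000), "\n";  # 1
print matches_symmetric('192.0.2.9', '10.1.1.1',  40000, 8080),  "\n";  # 1
print matches_symmetric('10.1.1.1',  '192.0.2.9', 40000, 8080),  "\n";  # 0
```

Swapping the network and the port independently would make the third case match as well, which is what pairwise matching avoids.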

It is possible to create a rule that matches all cases, as a fall-through. To be useful, it should have a low priority (that is, a numerically large priority value):

        description:    Unknown TCP
        name:           UNKNOWN_TCP
        protocol:       6
        priority:       90

        description:    Unknown
        name:           UNKNOWN
        protocol:       *
        priority:       100
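
To make the entry layout concrete, here is a minimal parsing sketch. It is not the module's load_rules: parse_rules is a hypothetical function, and it assumes that each field is a "field: value" line and that a blank line ends an entry, as the examples above suggest.

```perl
use strict;
use warnings;

# Hypothetical parser, for illustration only.
sub parse_rules {
    my ($text) = @_;
    my (@rules, %current);
    foreach my $line (split /\n/, $text) {
        if ($line =~ /^\s*$/) {                     # blank line: entry ends
            push @rules, {%current} if %current;
            %current = ();
            next;
        }
        if ($line =~ /^\s*(\w+):\s*(.*?)\s*$/) {    # "field: value"
            $current{$1} = $2;
        }
    }
    push @rules, {%current} if %current;            # last entry, no trailing blank
    return @rules;
}

my $text = <<'END';
description:    Unknown TCP
name:           UNKNOWN_TCP
protocol:       6
priority:       90

description:    Unknown
name:           UNKNOWN
protocol:       *
priority:       100
END

my @rules = parse_rules($text);
print scalar(@rules), " rules; first is $rules[0]{name}\n";
# prints "2 rules; first is UNKNOWN_TCP"
```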

METHODS

new ()

Creates a new AppPorts object.

clear ()

Removes all entries from the AppPorts object. Useful before reloading a rules file.

load_rules (FILENAME)

Takes a path to a rules file and converts it into rules that can be applied to application port numbers.

match_rule (SRC_IP, DST_IP, PROTOCOL, PORTS_OK, SPORT, DPORT)
match_rule (PROTOCOL, PORTS_OK, SPORT, DPORT)

Does the application matching for the flow data. If the source and destination IP addresses are omitted, no rules with subnets will match. If no match occurs, it returns undef. Otherwise, it returns a list of references (or, in scalar context, the highest-priority reference) to objects with accessor methods for each field, such as name, desc, and protocols.

Note that protocols, src_nets, dst_nets, src_ports and dst_ports are arrays of strings. Port ranges and wildcards are in the same form as in the rule file. The no_ports accessor indicates whether the rule specifically matches ports_ok == 0, and check_ports indicates whether the rule looks at ports_ok at all.

get_rule (NAME)

Returns a list of rules that have the name NAME. Returns a single rule in scalar context.

dump_rules ()

A debugging method that outputs the rules in the same format as the input file. Presently it just dumps to standard output.

API CHANGES

In CoralReef 3.8.0, the ports_ok rule field was removed and made implicit based on the sport/dport fields. The ports_ok accessor function was replaced with no_ports and check_ports.

CAIDA::AppPorts from CoralReef < 3.5.0 had a different API. The primary differences are:

Removed functions

Changed return type

match_rule now returns an opaque object, whereas match_first_rule would return an array reference. To access the fields of a rule, we now use accessor functions instead of reading the array directly. For example, the old way was:

        $match = $obj->match_first_rule($sport, $dport, $proto);
        $name = $match->[NAME];

The current way is:

        $match = $obj->match_rule($proto, $ok, $sport, $dport);
        $name = $match->name;

AUTHOR

CoralReef Development team, CAIDA <coral-info@caida.org>