Streams, Flows and Torrents

Nevil Brownlee (CAIDA / The University of Auckland)
and Margaret Murray (CAIDA)

PAM2001 Workshop, April 2001


  1. Overview

    • Introduction: Flows, RTFM, NeTraMet
    • Flows, Streams, NeTraMet implementation
      • Meter Data Structures
      • Timeout Algorithms
      • Rulesets, Packet Matching
    • Example: DNS Response Time Distributions
    • Conclusion

  2. Torrents, Flows, RTFM, NeTraMet

    • Traffic Measurement
      • Active, using probe packets. Requires careful design so as not to perturb the network
      • Passive, using trace files. A large-scale data analysis problem
      • Passive, using real-time data reduction

    • RTFM
      • Flows are bi-directional
      • Flows specified by end-point attributes
      • Meters - gather flow data
      • Meter Readers - collect flow data
      • Managers - configure meters and meter readers
      • Flows of interest (and data reduction) specified by Rulesets, written in SRL

    • NeTraMet: an open-source implementation of RTFM

  3. Streams and Flows

    • Torrent = sum of all flows on a link
    • Flow: as defined above for RTFM (bi-directional, specified by end-point attributes)
      • elephants = flows carrying lots of traffic for a long time
      • mice = flows carrying small amounts of traffic

    • Stream = bi-directional IP microflow
    • Streams are individual IP sessions, each identified by its 5-tuple (protocol, source and destination addresses, source and destination ports)
    • A flow can comprise few or many streams
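
    The taxonomy above can be sketched in a few lines of Python. This is only an illustration: the packet tuples, the helper names stream_key and flow_key, and the choice of "protocol plus lower port" as the flow definition are assumptions made for the example, not anything taken from NeTraMet.

```python
from collections import defaultdict

def stream_key(proto, src, sport, dst, dport):
    """Direction-independent 5-tuple, so both directions map to one stream."""
    a, b = (src, sport), (dst, dport)
    return (proto,) + (a + b if a <= b else b + a)

def flow_key(proto, src, sport, dst, dport):
    """Hypothetical coarse flow: all traffic for one protocol and lower port."""
    return (proto, min(sport, dport))

packets = [  # (proto, src, sport, dst, dport, bytes), made-up values
    ("tcp", "10.0.0.1", 40001, "192.0.2.8", 80, 300),
    ("tcp", "192.0.2.8", 80, "10.0.0.1", 40001, 1500),
    ("tcp", "10.0.0.2", 40002, "192.0.2.8", 80, 200),
    ("udp", "10.0.0.1", 5353, "192.0.2.9", 53, 60),
]

streams = defaultdict(int)   # bytes per stream
flows = defaultdict(set)     # set of streams belonging to each flow
torrent = 0                  # all bytes seen on the link

for *tup, size in packets:
    s = stream_key(*tup)
    streams[s] += size
    flows[flow_key(*tup)].add(s)
    torrent += size
```

    Here the two TCP streams to port 80 fall into one flow, the UDP stream forms a second flow, and the torrent is simply the total over everything seen.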

  4. NeTraMet implementation: Data Structures

    • Control Blocks provide RMON-style control of the meter
    • Flow Table stores data about flows. Accessed via Flow Hash Table
    • Stream data accessed via Stream Info Block (minimises overhead for flows which don't use streams)
      • Stream data blocks chain streams together for each flow
      • Each stream may maintain a queue of data for packet pair matching
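
    A rough sketch of how these structures could relate, in Python. All class and field names are hypothetical, and a dict stands in for both the flow hash table and the chain of stream data blocks; the real meter's layout is not given in the slides. The point illustrated is that a flow only gets a Stream Info Block when stream data is actually requested.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Stream:
    key: tuple
    packets: int = 0
    # Queue of outstanding packets, available for packet-pair matching
    pending: deque = field(default_factory=deque)

@dataclass
class StreamInfoBlock:
    # Stream key -> Stream (stands in for the chained stream data blocks)
    streams: dict = field(default_factory=dict)

@dataclass
class Flow:
    key: tuple
    packets: int = 0
    stream_info: Optional[StreamInfoBlock] = None  # allocated only on demand

class FlowTable:
    def __init__(self):
        self._by_key = {}  # the dict plays the role of the flow hash table

    def count(self, flow_key, stream_key=None):
        """Count one packet for a flow and, optionally, for one of its streams."""
        flow = self._by_key.setdefault(flow_key, Flow(flow_key))
        flow.packets += 1
        if stream_key is not None:  # the ruleset asked for stream data
            if flow.stream_info is None:
                flow.stream_info = StreamInfoBlock()
            streams = flow.stream_info.streams
            stream = streams.setdefault(stream_key, Stream(stream_key))
            stream.packets += 1
        return flow
```

    A flow counted without a stream key never allocates a StreamInfoBlock, which mirrors the "minimises overhead" remark above.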

  5. Timeout Algorithms

    • For flows, RFC 2720 specifies a fixed InactivityTimeout
    • Streams are often of short duration
    • TCP streams have session state, UDP and ICMP don't
    • NeTraMet uses dynamic timeouts for streams
    • Stream Timeout Algorithm has two parameters
      • FixedTime: stream can't be timed out before this
      • TimeMultiplier: specifies minimum space/mark ratio for active streams
      • NeTraMet uses FixedTime = 5s and TimeMultiplier = 20x

    • Note that any dynamic timeout scheme must make assumptions about stream behaviour
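
    One possible reading of the two-parameter scheme, sketched in Python. The slides do not define "mark" precisely, so this sketch ASSUMES it is the stream's mean inter-packet gap and that "space" is the current idle time; the function name and arguments are hypothetical. Only the values 5 s and 20x come from the slides.

```python
FIXED_TIME = 5.0       # seconds, value quoted in the slides
TIME_MULTIPLIER = 20   # value quoted in the slides

def stream_timed_out(first_seen, last_seen, packets, now,
                     fixed_time=FIXED_TIME, multiplier=TIME_MULTIPLIER):
    """Return True if an idle stream may be timed out (assumed semantics)."""
    idle = now - last_seen                 # the "space"
    if idle < fixed_time:
        return False                       # never time out before FixedTime
    if packets < 2:
        return True                        # no gap observed; FixedTime alone decides
    # ASSUMPTION: "mark" is the mean gap between packets seen so far
    mean_gap = (last_seen - first_seen) / (packets - 1)
    return idle >= multiplier * mean_gap
```

    Under this reading, a stream that sent 11 packets in one second is dropped soon after FixedTime expires, while one that sends a packet every second is kept until it has been silent for 20 s.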

  6. Rulesets, Packet Matching

    • A ruleset configures an RTFM meter to gather flow data. It must
      1. Specify flows of interest
      2. Determine each flow's direction (which endpoint is the source?)
      3. Specify each flow's address granularity
      4. Specify any required data reduction, e.g. computed or distribution-valued attributes to be saved

    • Example ruleset (in SRL) builds Flow Lifetime distributions for streams to web servers
    • It creates two flows:
      • FlowKind 1: Server is outside our HOME network
      • FlowKind 2: Server is inside HOME network
         if SourceTransType == TCP save;
         else ignore;  # Only interested in TCP flows
      
         if SourcePeerAddress == HOME {
            if DestTransAddress == WWW
               store FlowKind := 1;  # Server outside HOME
            else if SourceTransAddress == WWW
               store FlowKind := 2;  # Server inside HOME
      
            save FlowTime = 50.0.0!0 & 2.4.1!12000;
            # 50 buckets, log transform, 10**4 scale factor
            #              => buckets from 10 ms to 120 s
            count;
            }
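
    The comment in the ruleset describes 50 log-spaced buckets covering 10 ms to 120 s. A minimal Python sketch of such a bucketing follows. The function names are hypothetical, and how the meter itself rounds values or handles overflow is not stated in the slides, so the under-range and over-range return values here are assumptions.

```python
import math

def log_bucket_edges(lower, upper, buckets):
    """Edges of `buckets` log-spaced buckets spanning [lower, upper]."""
    ratio = (upper / lower) ** (1.0 / buckets)
    return [lower * ratio ** i for i in range(buckets + 1)]

def log_bucket_index(value, lower, upper, buckets):
    """Bucket number for value; -1 below range, `buckets` above (assumed)."""
    if value < lower:
        return -1
    if value >= upper:
        return buckets
    pos = math.log(value / lower) / math.log(upper / lower)
    return int(pos * buckets)

# Limits from the ruleset comment: 10 ms to 120 s, 50 buckets
edges = log_bucket_edges(0.010, 120.0, 50)
one_second = log_bucket_index(1.0, 0.010, 120.0, 50)
```

    With these limits each bucket is about 21% wider than the one before it, so short and long lifetimes are resolved with similar relative precision.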
         

  7. Avoiding Ambiguity in Rulesets

    • Above ruleset is ambiguous, i.e. packets in each direction of a stream could belong to two different flows
    • Streams with one endpoint inside HOME and the other outside work as expected
    • Streams with both ends inside HOME are ambiguous:
      • HOME.1 port 12345 -> HOME.2 port 80 sets FlowKind 1
      • HOME.2 port 80 -> HOME.1 port 12345 sets FlowKind 2
      • But FlowKind is used by the meter as an address attribute, i.e. as part of the flow's address hash
      • The meter will create two (unidirectional) flows!

    • The ruleset is easily modified to remove the ambiguity:
         if SourcePeerAddress == HOME {
      
            if DestPeerAddress == HOME
               store FlowKind := 3;  # Host, Server both HOME
      
            else if DestTransAddress == WWW
               store FlowKind := 1;  # Server outside HOME
            else if SourceTransAddress == WWW
               store FlowKind := 2;  # Server inside HOME
      
            save FlowTime = 50.0.0!0 & 2.4.1!12000;
            # 50 buckets, log transform, 10**4 scale factor
            #              => buckets from 10 ms to 120 s
            count;
            }
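
    The ambiguity and its fix can be checked with a small Python simulation of the FlowKind tests. This is a simplification: it applies the tests to each packet as seen on the wire and does not model how an RTFM meter matches packets in both directions. The HOME prefix, the addresses and the function names are made up for the example, and 0 is assumed to mean "FlowKind not set".

```python
HOME_PREFIX = "10.1."   # hypothetical HOME network
WWW = 80

def in_home(addr):
    return addr.startswith(HOME_PREFIX)

def kind_original(src, sport, dst, dport):
    """FlowKind as assigned by the first (ambiguous) ruleset."""
    if in_home(src):
        if dport == WWW:
            return 1            # server outside HOME
        elif sport == WWW:
            return 2            # server inside HOME
    return 0                    # assumed: FlowKind left unset

def kind_fixed(src, sport, dst, dport):
    """FlowKind as assigned by the modified ruleset."""
    if in_home(src):
        if in_home(dst):
            return 3            # host and server both in HOME
        elif dport == WWW:
            return 1
        elif sport == WWW:
            return 2
    return 0

# Both endpoints inside HOME: one stream, two packet directions
request = ("10.1.0.1", 12345, "10.1.0.2", 80)
response = ("10.1.0.2", 80, "10.1.0.1", 12345)

original_kinds = {kind_original(*request), kind_original(*response)}
fixed_kinds = {kind_fixed(*request), kind_fixed(*response)}
```

    Here original_kinds contains two different values for the same stream, while fixed_kinds contains only 3, so both directions land in a single flow.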
        

  8. Stream Distributions Example: DNS Response Times

    • TurnaroundTimes - as specified in RFC 2724 - use packet pair matching of request and response packets
    • We use this to observe behaviour of global DNS servers, both root and gTLD
    • This work is described elsewhere; we present only a brief summary here
    • We have installed a NeTraMet meter at UCSD, monitoring traffic between three UCSD netblocks and the Internet
    • The following is an overview of the ruleset which collects our DNS TurnaroundTime data:
         if SourcePeerType == IPv4 save;
         else ignore;  # Not IP
         if SourceTransType == UDP save;
         else ignore;  # Not UDP
      
         TestDestAddress;  # Sets FlowKind
         if FlowKind == 0 nomatch;  # Not root server
         else {
            if DestTransAddress == DNS save;
            else ignore;  # Dest not DNS port
      
            save ToTurnaroundTime = 50.11.0!0 & 2.3.7!700;
            # 50 buckets, log transform, 10**3 scale factor
            #              => buckets from 7 ms to 700 ms
            count;
            }
         

  9. Root response time
    Medians of 5-minute distributions. Filtered: at least 10 requests per 5-min interval

    • Most roots have steady response times, with short bursts of slower response
    • A, F, J, K and L show many of the same bursts, indicating a common route
    • B behaved well during the weekend, but badly on Monday and Tuesday, suggesting a server performance problem
    • Hour-long step at 30T18 is common to A, F, G, J and K. The paths to these go via CERFnet, suggesting a denial-of-service attack on our CERFnet link
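
    The plotted quantity (median per 5-minute interval, keeping only intervals with at least 10 requests) can be sketched in Python as below. The meter collects bucketed distributions, so real medians would be estimated from bucket counts; this sketch works from raw samples instead, and its function name and input format are assumptions.

```python
from collections import defaultdict
from statistics import median

INTERVAL_S = 300     # 5-minute intervals, as in the slides
MIN_REQUESTS = 10    # filter threshold, as in the slides

def interval_medians(samples, interval=INTERVAL_S, min_requests=MIN_REQUESTS):
    """samples: iterable of (timestamp_s, turnaround_ms).

    Returns {interval_start_s: median_ms} for intervals that pass the filter.
    """
    by_interval = defaultdict(list)
    for ts, rtt in samples:
        by_interval[int(ts // interval) * interval].append(rtt)
    return {start: median(values)
            for start, values in sorted(by_interval.items())
            if len(values) >= min_requests}

# Made-up data: 11 samples in the first interval, only 3 in the second
samples = [(i, 20 + i) for i in range(11)] + [(300 + i, 100) for i in range(3)]
medians = interval_medians(samples)
```

    The second interval is dropped by the filter, which is what keeps sparsely sampled intervals from adding noise to the strip charts.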

  10. gTLD response time
    Medians of 5-minute distributions. Filtered: at least 10 requests per 5-min interval

    • Generally better-behaved than the roots
    • B, C and D shared the step at 30T18
    • Similar step at 29T23 was common to all gTLDs, suggesting an attack on the UCSD-SDSC link
    • Steps for A and G at 29T03 indicate a routing change for one or more links in common to those servers

  11. Conclusion

    • Streams, Flows and Torrents provide a useful taxonomy for network traffic. Bi-directional streams are required for TurnaroundTime measurements; they fit naturally within the RTFM architecture
    • NeTraMet implements stream-related attributes in ways which don't impose penalties on rulesets which don't use streams
    • NeTraMet's dynamic timeout algorithm for streams works well, largely because most streams are very short-lived. Any dynamic timeout algorithm must make assumptions about the temporal behaviour of streams
    • Strip charts for the root and gTLD servers provide a good indication of Internet performance on the paths to them. They are produced using passive measurements from a single point

  12. Nevil Brownlee (nevil@caida.org)
    Last updated: 14 April 2001