Methodology for Passive Analysis of a University Internet Link

Nevil Brownlee (CAIDA / The University of Auckland)
kc claffy, Margaret Murray and Evi Nemeth (CAIDA)

PAM2001 Workshop, April 2001


  1. Overview

    • Measurement in large production networks
    • Measurement infrastructure design issues
    • Passive Measurement Methodology
      • UCSD Network Topology
      • Measurement goals, Writing rulesets

    • Case Studies
      • Short-term data rates
      • Time Variations of Stream Lifetime and Size

    • Conclusion

  2. Measurement Design Issues

    • Active vs passive
    • Monitor placement within the network
      • Need to understand the physical topology
      • Can discover the IP Address ranges (netblocks) in use
      • Could monitor every link, but this doesn't scale
      • Simpler approaches: monitor only busiest links, monitor only at edges

    • Selecting Metrics
      • Many metrics to choose from, e.g. IPPM, CAIDA Metrics FAQ
      • We use NeTraMet (RTFM) attributes, i.e. those described in RFC 2720 and RFC 2724

    • Data Collection and Archiving
      • What data is to be measured and stored?
        (NeTraMet flows, 5-minute meter readings)
      • How will the data be stored and accessed?
        (Flow data files, one per day)
      • What interface will be provided to make the data easily accessible to users?
        (Web page for daily/weekly plots)

  3. UCSD/SDSC network topology

    • SDSC has links to CERFnet, vBNS+ and Abilene
    • UCSD network has CalREN link and SDSC links
    • Routing is asymmetric across the four Internet links
    • NeTraMet meters are installed at two points, `UCSD' and `SDSC'

  4. Developing Rulesets

    • Write ruleset to select flows of interest, and gather the required flow data
    • Careful testing is essential!
    • Must make sure ruleset covers all possible pairs of hosts
      (even the ones one didn't expect)
    • Must check that distribution parameters (number of bins, upper and lower limits, etc.) work well for the expected traffic load
    • Rulesets evolve - they are refined as one's understanding of the measured traffic improves

  5. Case Study 1: Short-term Data Rates

    • We measure the total data rate into and out of SDSC via the CERFnet link. The link is rate-limited to 20 Mbps
    • We determine each packet's direction by testing whether its Source IP Address lies within one of UCSD/SDSC's 14 netblocks
    • Our ruleset for this is as follows:
      # Ruleset to get 10-second data rates for CERFnet link
      
      define CAIDA          = 192.172.226/24;
      define HYPERNET       = 153.105/16;
      define MPL106         = 192.135.237/24;
      define MPL4           = 192.135.238/24;
      define NET_NSI        = 198.133.185/24;
      define SCRIPPSNET_BIG = 137.131/16;
      define SDSCFDDIDMZ    = 198.17.46/24;
      define SDSC2          = 132.249/16;
      define SDSC_APOLLO    = 192.31.21/24;
      define SDSCNET_CBLK   = 198.202.64/18;
      define UCSD           = 128.54/16;
      define UCSD_CERF      = 199.105.0/18;
      define UCSD_EXTRN     = 137.110/16;
      define UCSD_SUB       = 132.239/16;
      
      define UCSD_NETS = 
         UCSD, UCSD_SUB, UCSD_EXTRN, MPL106, MPL4, UCSD_CERF;
      define SDSC_NETS =
         SDSC2, SCRIPPSNET_BIG, HYPERNET, SDSC_APOLLO, CAIDA,
         SDSCFDDIDMZ, SDSCNET_CBLK, NET_NSI;
      
      define SOURCE_NETS = UCSD_NETS, SDSC_NETS;
      
         if SourcePeerType == IPv4 save;
         else ignore;
      
         if SourcePeerAddress == (SOURCE_NETS) {
            # To means 'away from SOURCE'
      
            save ToBitRate   = 48.10.0!0 & 1.3.1!24000;
            save FromBitRate = 48.10.0!0 & 1.3.1!24000;
            # 48 buckets, 10s rates, linear, **3 => 1k..24M b/s
            count;
            }
      
      set data_rate_n;
      format
        FlowRuleSet FlowIndex FirstTime SourcePeerType
        "  " ToPDUs FromPDUs "  " ToOctets FromOctets
        "  (" ToBitRate
        ") (" FromBitRate
        ")";
        

    • This ruleset builds a single flow, with `all netblocks within UCSD/SDSC' as its Source
    • The `save ToBitRate' and `save FromBitRate' statements tell the meter to build distributions of n-second bit rates in the To and From directions
    • The distribution parameters are described in the comment line: we are using 10-second bit rate distributions with 48 bins
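
    The direction test and the 48-bin linear distribution described above can be sketched in Python. This is only an illustration, not NeTraMet's implementation: the names `direction` and `rate_bin` are hypothetical, the netblocks are copied from the ruleset with the SRL prefixes written out in full, and clamping out-of-range rates into the first or last bin is an assumption.

```python
import ipaddress

# The 14 UCSD/SDSC netblocks from the ruleset above, written out in full.
SOURCE_NETS = [ipaddress.ip_network(n) for n in (
    "192.172.226.0/24", "153.105.0.0/16", "192.135.237.0/24",
    "192.135.238.0/24", "198.133.185.0/24", "137.131.0.0/16",
    "198.17.46.0/24", "132.249.0.0/16", "192.31.21.0/24",
    "198.202.64.0/18", "128.54.0.0/16", "199.105.0.0/18",
    "137.110.0.0/16", "132.239.0.0/16",
)]

N_BINS = 48            # number of buckets
LOW_BPS = 1_000        # lower limit, 1 kb/s
HIGH_BPS = 24_000_000  # upper limit, 24 Mb/s


def direction(src_ip: str) -> str:
    """Return 'to' (away from UCSD/SDSC) if the source address lies in
    one of the netblocks, otherwise 'from'."""
    addr = ipaddress.ip_address(src_ip)
    return "to" if any(addr in net for net in SOURCE_NETS) else "from"


def rate_bin(bits_per_second: float) -> int:
    """Index of the linear bin holding a 10-second bit rate.

    Assumption: rates outside the limits are clamped into the first
    or last bin.
    """
    width = (HIGH_BPS - LOW_BPS) / N_BINS
    index = int((bits_per_second - LOW_BPS) // width)
    return max(0, min(N_BINS - 1, index))
```

    Each bin is roughly 500 kb/s wide, so the 20 Mbps rate limit falls inside the binned range rather than in the overflow bin.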

  6. 10-second Data Rates for week from Sat 17 Feb 2001

    • 18T09 means 0900 (UTC) on 18 Feb 2001, i.e. 0100 (PST)
    • Diurnal variations In and Out (minimum around 1200 UTC, i.e. 0400 PST)
    • Maximum is clearly rate-limited. This limiting is not at all visible in the 10-second medians
    • More data out than in for this week (probably not true for the other three research/education networks)
    • Could do this with SNMP, but would need to read interface counters every 10 seconds
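
    For comparison, the SNMP alternative amounts to differencing successive octet-counter readings. The sketch below assumes a 32-bit counter such as `ifInOctets` that wraps at most once between polls; the function name `bit_rate` and the sample readings are hypothetical, and no actual SNMP polling is shown.

```python
COUNTER32_MODULUS = 2 ** 32  # a Counter32 wraps back to 0 after 2**32 - 1


def bit_rate(prev_octets: int, curr_octets: int,
             interval_s: float = 10.0) -> float:
    """Average bit rate between two readings of an octet counter.

    The modulo handles a single counter wrap between the two polls.
    """
    delta = (curr_octets - prev_octets) % COUNTER32_MODULUS
    return delta * 8 / interval_s


# 25,000,000 octets in 10 seconds corresponds to the 20 Mbps rate limit.
rate_at_limit = bit_rate(0, 25_000_000)
```

    At 20 Mbps a 32-bit octet counter takes roughly half an hour to wrap, so the single-wrap assumption is safe for 10-second polling.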

  7. Case Study 2: Time Variations of Stream Lifetime and Size

    • We extended NeTraMet to build stream size and lifetime distributions.
      When a stream terminates, its size (packets and kB) and duration (ms) are added into the distributions for its flow
    • As well as time variations, we are interested in differences between protocols: UDP, non-web TCP, web
    • We also distinguish `outside' web (data imported to UCSD) and `inside' web (data exported from UCSD)
    • Our ruleset for this is as follows:
      # Collect stream lifetime and size distributions
      
      define UCSD_SUB    = 132.239/16;
      define UCSD_EXTRN  = 137.110/16;
      define UCSD_CERF   = 199.105.0/18;
      
      define SOURCE_NETS = UCSD_SUB, UCSD_EXTRN, UCSD_CERF;
      
      define WWW = 80;  # www port number
      
         if SourcePeerType == IPv4 save;
         else ignore;
         if SourceTransType == TCP save,
            store FlowKind := 2;
         else if SourceTransType == UDP save,
            store FlowKind := 1;
         else ignore;
      
         if SourcePeerAddress == (SOURCE_NETS) {
            # To means 'away from SOURCE'
      
            if DestPeerAddress == (SOURCE_NETS)
               ignore;  # Internal UCSD flow, ambiguous
      
            if SourceTransType == TCP {
               if SourceTransAddress == WWW && 
                     DestTransAddress == WWW
                  store FlowKind := 5;  # Would be ambiguous
               else if DestTransAddress == WWW
                  store FlowKind := 3;  # Server outside UCSD
               else if SourceTransAddress == WWW
                  store FlowKind := 4;  # Server inside UCSD
               }
      
            save ToFlowOctets   = 50.0.0!0 & 2.2.1!1000;
            save FromFlowOctets = 50.0.0!0 & 2.2.1!1000;
               # 50 buckets, PP_NO_TEST, log, 100..100k B
            save FlowTime = 50.0.0!0 & 2.4.1!12000;
               # 50 buckets, PP_NO_TEST, log, 10 ms .. 120 s
            count;
            }
      
      set flow_stats_size;
      format
        FlowRuleSet FlowIndex FirstTime SourcePeerType
        SourceTransType "  " FlowKind
        "  " ToPDUs FromPDUs "  " ToOctets FromOctets
        "  (" ToFlowOctets ") (" FromFlowOctets
        ") (" FlowTime
        ")";
        
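
    The FlowKind logic in this ruleset can be mirrored in Python to check that every source-destination pair is classified unambiguously. This is a sketch under assumptions: `flow_kind` and `is_inside` are hypothetical names, only two of the ruleset's netblocks are listed for brevity, and the port swap models the meter retrying a match with source and destination reversed.

```python
import ipaddress
from typing import Optional

# Two of the UCSD netblocks from the ruleset (abbreviated for illustration).
SOURCE_NETS = [ipaddress.ip_network(n)
               for n in ("132.239.0.0/16", "137.110.0.0/16")]

WWW = 80           # www port number
TCP, UDP = 6, 17   # IP protocol numbers


def is_inside(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in SOURCE_NETS)


def flow_kind(proto: int, src_ip: str, src_port: int,
              dst_ip: str, dst_port: int) -> Optional[int]:
    """Return the FlowKind for a packet, or None if it would be ignored."""
    if proto not in (TCP, UDP):
        return None
    src_in, dst_in = is_inside(src_ip), is_inside(dst_ip)
    if src_in == dst_in:
        return None  # internal UCSD flow, or neither end inside UCSD
    if not src_in:
        # Orient the packet so that the UCSD host is the source.
        src_port, dst_port = dst_port, src_port
    if proto == UDP:
        return 1
    if src_port == WWW and dst_port == WWW:
        return 5     # ambiguous: both ends use the www port
    if dst_port == WWW:
        return 3     # server outside UCSD
    if src_port == WWW:
        return 4     # server inside UCSD
    return 2         # non-web TCP
```

    Running such a function over a table of expected and unexpected host pairs is one way to test a ruleset's logic before loading it on a meter.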

  8. Stream Lifetimes for 5 days from Fri 2 Feb 2001

      Medians of 5-minute distributions.

    • Bottom trace shows number of streams for each protocol per 5-minute interval.
    • UDP streams are mostly short; median <= 10 ms
    • Non-WWW streams last much longer
    • `Outside' web streams have median around 300 ms
    • `Inside' web streams are very similar to `outside'
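
    A median such as those plotted here can be estimated from a binned distribution. The sketch below assumes 50 logarithmic bins spanning 10 ms to 120 s, as in the FlowTime parameters, and reports the upper edge of the bin where the cumulative count reaches the requested fraction; the function names and the sample counts are hypothetical, and the exact method used for the plots is not stated in the slides.

```python
def log_bin_edges(low: float, high: float, n_bins: int) -> list:
    """Edges of n_bins logarithmically spaced bins from low to high."""
    ratio = high / low
    return [low * ratio ** (i / n_bins) for i in range(n_bins + 1)]


def percentile_from_bins(counts: list, edges: list, fraction: float) -> float:
    """Upper edge of the bin in which the cumulative count first
    reaches `fraction` of the total (0.5 gives the median)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty distribution")
    target = fraction * total
    running = 0
    for i, count in enumerate(counts):
        running += count
        if running >= target:
            return edges[i + 1]
    return edges[-1]


# Stream lifetimes in milliseconds: 50 log bins from 10 ms to 120 s.
edges_ms = log_bin_edges(10.0, 120_000.0, 50)

# Made-up counts for one 5-minute interval: most streams are very short.
counts = [0] * 50
counts[0] = 6
counts[10] = 4
median_ms = percentile_from_bins(counts, edges_ms, 0.5)
```

    Because only bin counts are kept, the estimate is accurate to one bin width, which on a log scale is a fixed ratio rather than a fixed number of milliseconds.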

  9. Inbound Stream Sizes for 5 days from Fri 2 Feb 2001

      Medians of 5-minute distributions.

    • Most UDP streams are very small; 95% import < 500 Bytes
    • Non-WWW streams are bigger; 75% <= 10 kB
    • `Outside' web streams have median below 800 Bytes, but 95%-ile is higher, around 30 kB
    • `Inside' web streams have median below 800 Bytes, but 95%-ile about 200 Bytes. For inside servers, inbound packets carry only TCP acks for exported web objects

  10. Outbound Stream Sizes for 5 days from Fri 2 Feb 2001

      Medians of 5-minute distributions.

    • UDP outbound streams are even smaller than inbound ones
    • Non-WWW outbound streams are also similar to inbound ones
    • Web streams inbound are very similar to outbound, except that their `import' and `export' roles are reversed

  11. Cumulative Distributions for Streams

    • Some diurnal variations are visible, together with occasional bursty changes, but overall the distributions are surprisingly stable
    • The next three plots are cumulative distributions for the 5-minute interval ending at 2200 (UTC) on Fri 2 Feb

  12. Cumulative Stream Lifetime Distributions

      Medians of 5-minute distributions.

    • Nearly 60% of UDP streams last 10 ms or less
    • TCP streams are longer-lived; only 10% of them last 10 ms or less, and their
      60th percentile is close to 1 s
    • Non-WWW streams are longer-lived than web streams

  13. Cumulative Stream Inbound Size Distributions

      Medians of 5-minute distributions.

    • UDP streams reach 99% at about 10 kB, but there are a few larger ones.
    • Non-WWW and `outside' web streams have similar distribution shapes. These should be similar to distributions of file sizes.
    • `Inside' web streams show a sharp rise at 600 Bytes; these are TCP acks from outside web clients

  14. Conclusion

    • Building measurement infrastructure is a non-trivial task. It requires careful design and implementation so as to ensure that the measurements provide effective support for their users
    • A clear understanding of the network topology is vital, but it is not always easy to achieve.
    • NeTraMet is a very effective measurement tool, providing a very general way to specify flows, and a reasonable amount of front-end (within the meter) data reduction.
      However, care is needed when creating rulesets. In particular, a ruleset should be unambiguous for all possible source-destination pairs
    • Short-term bit rate distributions are very useful for monitoring links. NeTraMet can easily produce them for flows within a torrent (which is not possible using SNMP interface counters)
    • We have extended NeTraMet to collect Stream Size and Lifetime distributions. Their behaviour reinforces our earlier experiences (most UDP and TCP streams are short-lived, etc.)
      However, there is scope for more work ..
      • Overall, the distributions are fairly stable
      • But there are short-term bursts, which don't seem to be correlated between the various protocols
      • These distributions could provide a way to identify various kinds of network attacks (similar to FlowScan), in real time

    • Overall, NeTraMet provides a good platform on which to build network measurement systems

  15. Nevil Brownlee (nevil@caida.org)
    Last updated: 14 April 2001