Methodology for Passive Analysis of a University Internet Link
Nevil Brownlee (CAIDA / The University of Auckland)
kc claffy, Margaret Murray and Evi Nemeth (CAIDA)
PAM2001 Workshop, April 2001
- Overview
- Measurement in large production networks
- Measurement infrastructure design issues
- Passive Measurement Methodology
- UCSD Network Topology
- Measurement goals, Writing rulesets
- Case Studies
- Short-term data rates
- Time Variations of Stream Lifetime and Size
- Conclusion
- Measurement Design Issues
- Active vs passive
- Monitor placement within the network
- Need to understand the physical topology
- Can discover the IP Address ranges (netblocks) in use
- Could monitor every link, but this doesn't scale
- Simpler approaches: monitor only busiest links, monitor only at edges
- Selecting Metrics
- Many metrics to choose from, e.g. IPPM, CAIDA Metrics FAQ
- We use NeTraMet (RTFM) attributes, i.e. those described in RFC 2720 and RFC 2724
- Data Collection and Archiving
- What data is to be measured and stored?
(NeTraMet flows, 5-minute meter readings)
- How will the data be stored and accessed?
(Flow data files, one per day)
- What interface will be provided to make the data easily accessible to users?
(Web page for daily/weekly plots)
- UCSD/SDSC network topology
- SDSC has links to CERFnet, vBNS+ and Abilene
- UCSD network has CalREN link and SDSC links
- Routing is asymmetric across the four Internet links
- NeTraMet meters are installed at two points, `UCSD' and `SDSC'
- Developing Rulesets
- Write ruleset to select flows of interest, and gather the required flow data
- Careful testing is essential!
- Must make sure the ruleset covers all possible pairs of hosts
(even the ones one didn't expect)
- Must also check that the distribution parameters (number of bins, upper and lower limits, etc.) work well for the expected traffic load
- Rulesets evolve - they are refined as one's understanding of the measured traffic improves
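The distribution parameters mentioned above (number of bins, upper and lower limits) can be illustrated with a short Python sketch. It assumes evenly spaced bins on a linear or logarithmic axis and clamps out-of-range values into the first or last bin. Both choices are illustrative assumptions, the function names are hypothetical, and the meter's exact bin boundaries may differ.

```python
import bisect
import math


def bin_edges(lower, upper, bins, log_scale=False):
    """Return bins + 1 edges spanning [lower, upper], linear or log10-spaced."""
    if bins < 1 or not lower < upper:
        raise ValueError("need bins >= 1 and lower < upper")
    if log_scale:
        if lower <= 0:
            raise ValueError("log-spaced bins need a positive lower limit")
        lo, hi = math.log10(lower), math.log10(upper)
        step = (hi - lo) / bins
        return [10 ** (lo + i * step) for i in range(bins + 1)]
    step = (upper - lower) / bins
    return [lower + i * step for i in range(bins + 1)]


def bin_index(value, edges):
    """Index of the bin holding value; out-of-range values are clamped
    into the first or last bin (an assumption of this sketch)."""
    i = bisect.bisect_right(edges, value) - 1
    return min(max(i, 0), len(edges) - 2)
```

For example, `bin_edges(1e3, 24e6, 48)` lays out 48 linear bins between 1 k and 24 M, the range quoted in the comment of the data-rate ruleset in Case Study 1. Checking test traffic against such edges is one way to see whether most samples pile up in the first or last bin, which would suggest the limits need adjusting.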
- Case Study 1: Short-term Data Rates
- We measure the total data rate into and out from SDSC via the CERFnet link. Link is rate-limited to 20 Mbps
- We determine each packet's direction by testing whether its Source IP Address lies within one of UCSD/SDSC's 14 netblocks
- Our ruleset for this is as follows:
- This ruleset builds a single flow, with `all netblocks within UCSD/SDSC' as its Source
- The Save To/FromBitRate statements tell the meter it should compute n-second bit rates To and From
- The distribution parameters are given in the comment line. We are using 10-second bit rate distributions with 48 bins
# Ruleset to get 10-second data rates for CERFnet link
define CAIDA = 192.172.226/24;
define HYPERNET = 153.105/16;
define MPL106 = 192.135.237/24;
define MPL4 = 192.135.238/24;
define NET_NSI = 198.133.185/24;
define SCRIPPSNET_BIG = 137.131/16;
define SDSCFDDIDMZ = 198.17.46/24;
define SDSC2 = 132.249/16;
define SDSC_APOLLO = 192.31.21/24;
define SDSCNET_CBLK = 198.202.64/18;
define UCSD = 128.54/16;
define UCSD_CERF = 199.105.0/18;
define UCSD_EXTRN = 137.110/16;
define UCSD_SUB = 132.239/16;

define UCSD_NETS = UCSD, UCSD_SUB, UCSD_EXTRN, MPL106, MPL4, UCSD_CERF;
define SDSC_NETS = SDSC2, SCRIPPSNET_BIG, HYPERNET, SDSC_APOLLO, CAIDA, SDSCFDDIDMZ, SDSCNET_CBLK, NET_NSI;
define SOURCE_NETS = UCSD_NETS, SDSC_NETS;

if SourcePeerType == IPv4 save;
else ignore;

if SourcePeerAddress == (SOURCE_NETS) {  # To means 'away from SOURCE'
   save ToBitRate = 48.10.0!0 & 1.3.1!24000;
   save FromBitRate = 48.10.0!0 & 1.3.1!24000;
      # 48 buckets, 10s rates, linear, **3 => 1k..24M B/s
   count;
   }

set data_rate_n;

format FlowRuleSet FlowIndex FirstTime SourcePeerType
   " " ToPDUs FromPDUs " " ToOctets FromOctets
   " (" ToBitRate ") (" FromBitRate ")";
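The direction test described above (is the packet's source address inside a local netblock?) can be sketched in Python with the standard `ipaddress` module. This is an illustration only, not how the meter is implemented: only four of the 14 netblocks are listed, their prefixes are written out in full dotted-quad form (an adaptation for Python), and the helper names and the handling of packets with no local end are assumptions.

```python
import ipaddress

# A subset of the ruleset's netblocks, expanded to full CIDR notation.
LOCAL_NETS = [ipaddress.ip_network(n) for n in (
    "192.172.226.0/24",  # CAIDA
    "132.249.0.0/16",    # SDSC2
    "128.54.0.0/16",     # UCSD
    "137.110.0.0/16",    # UCSD_EXTRN
)]


def is_local(addr):
    """True if addr lies inside any of the listed local netblocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in LOCAL_NETS)


def direction(src, dst):
    """Return 'to' for packets sent by a local host (away from SOURCE),
    'from' for packets sent to a local host, and None when neither
    end is local (an assumption of this sketch)."""
    if is_local(src):
        return "to"
    if is_local(dst):
        return "from"
    return None
```

A packet from `128.54.1.1` to an outside host would be classed as `to`, matching the ruleset comment that To means away from SOURCE.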
- 10-second Data Rates for week from Sat 17 Feb 2001
- 18T09 means 0900 (UTC) on 18 Feb 2001, i.e. 0100 (PST)
- Diurnal variations In and Out (minimum around 1200 UTC, i.e. 0400 PST)
- Maximum is clearly rate-limited. This limiting is not at all visible in the 10-second medians
- More data out than in for this week (probably not true for the other three research/education networks)
- Could do this with SNMP, but would need to read interface counters every 10 seconds
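For comparison, the SNMP alternative amounts to differencing successive octet-counter readings. A minimal Python sketch follows, assuming a 32-bit wrapping counter (as with the standard `ifInOctets` / `ifOutOctets` objects) and at most one wrap between readings. The function name is hypothetical and no actual SNMP polling is shown.

```python
def bit_rate(prev_octets, curr_octets, interval_s, counter_bits=32):
    """Average bit rate between two octet-counter readings.

    The modulo step handles a single counter wrap between readings;
    more than one wrap per interval cannot be detected this way.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    modulus = 1 << counter_bits
    delta = (curr_octets - prev_octets) % modulus
    return delta * 8 / interval_s
```

At the 20 Mbps limit a link moves 25,000,000 octets in 10 seconds, so `bit_rate(0, 25_000_000, 10)` gives 20,000,000 bits per second. The per-interval rates would still have to be binned by the collector, whereas the meter builds the distribution itself.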
- Case Study 2: Time Variations of Stream Lifetime and Size
- We extended NeTraMet to build stream size and lifetime distributions.
When a stream terminates, its size (packets and kB) and duration (ms) are added into the distributions for its flow
- As well as time variations, we are interested in differences between protocols: UDP, non-web TCP, web
- We also distinguish `outside' web (data imported to UCSD) and `inside' web (data exported from UCSD)
- Our ruleset for this is as follows:
# Collect stream lifetime and size distributions
define UCSD_SUB = 132.239/16;
define UCSD_EXTRN = 137.110/16;
define UCSD_CERF = 199.105.0/26;
define SOURCE_NETS = UCSD_SUB, UCSD_EXTRN, UCSD_CERF;
define WWW = 80;  # www port number

if SourcePeerType == IPv4 save;
else ignore;

if SourceTransType == TCP save, store FlowKind := 2;
else if SourceTransType == UDP save, store FlowKind := 1;
else ignore;

if SourcePeerAddress == (SOURCE_NETS) {  # To means 'away from SOURCE'
   if DestPeerAddress == (SOURCE_NETS) ignore;  # Internal UCSD flow, ambiguous
   if SourceTransType == TCP {
      if SourceTransAddress == WWW && DestTransAddress == WWW
         store FlowKind := 5;  # Would be ambiguous
      else if DestTransAddress == WWW
         store FlowKind := 3;  # Server outside UCSD
      else if SourceTransAddress == WWW
         store FlowKind := 4;  # Server inside UCSD
      }
   save ToFlowOctets = 50.0.0!0 & 2.2.1!1000;
   save FromFlowOctets = 50.0.0!0 & 2.2.1!1000;
      # 50 buckets, PP_NO_TEST, log, 100..100k B
   save FlowTime = 50.0.0!0 & 2.4.1!12000;
      # 50 buckets, PP_NO_TEST, log, 10 ms .. 120 s
   count;
   }

set flow_stats_size;

format FlowRuleSet FlowIndex FirstTime SourcePeerType SourceTransType
   " " FlowKind " " ToPDUs FromPDUs " " ToOctets FromOctets
   " (" ToFlowOctets ") (" FromFlowOctets ") (" FlowTime ")";
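The FlowKind assignment in the ruleset can be restated as a small Python function, which is handy for checking the classification by hand. This is a sketch, not part of NeTraMet: it assumes the flow has already been oriented so that the source is the UCSD end, and the function name and label strings are illustrative.

```python
WWW = 80  # www port number, as in the ruleset

# Labels are illustrative; the numeric codes follow the ruleset.
KIND_NAMES = {
    1: "UDP",
    2: "non-web TCP",
    3: "outside web (server outside UCSD)",
    4: "inside web (server inside UCSD)",
    5: "ambiguous web (both ports 80)",
}


def flow_kind(proto, src_port, dst_port):
    """Return the FlowKind code for a flow whose source is the UCSD end,
    or None for protocols the ruleset ignores."""
    if proto == "UDP":
        return 1
    if proto != "TCP":
        return None
    if src_port == WWW and dst_port == WWW:
        return 5
    if dst_port == WWW:
        return 3
    if src_port == WWW:
        return 4
    return 2
```

For instance, a TCP flow from a UCSD client port to port 80 on an outside host maps to kind 3, while a flow whose UCSD end is port 80 maps to kind 4.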
- Stream Lifetimes for 5 days from Fri 2 Feb 2001
- Bottom trace shows number of streams for each protocol per 5-minute interval.
- UDP streams are mostly short; median <= 10 ms
- Non-WWW streams last much longer
- `Outside' web streams have median around 300 ms
- `Inside' web streams are very similar to `outside'
Medians of 5-minute distributions.
- Inbound Stream Sizes for 5 days from Fri 2 Feb 2001
- Most UDP streams are very small; 95% import < 500 Bytes
- Non-WWW streams are bigger; 75% <= 10 kB
- `Outside' web streams have median below 800 Bytes, but 95%-ile is higher, around 30 kB
- `Inside' web streams have median below 800 Bytes, but 95%-ile about 200 Bytes. For inside servers, inbound packets are only carrying TCP acks for exported web objects
Medians of 5-minute distributions.
- Outbound Stream Sizes for 5 days from Fri 2 Feb 2001
- UDP outbound streams are even smaller than inbound ones
- Non-WWW outbound streams are also similar to inbound ones
- Web streams inbound are very similar to outbound, except that their `import' and `export' roles are reversed
Medians of 5-minute distributions.
- Cumulative Distributions for Streams
- Some diurnal variations are visible, together with occasional bursty changes, but overall the distributions are surprisingly stable
- The next three plots are cumulative distributions for the 5-minute interval ending at 2200 (UTC) on Fri 2 Feb
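A cumulative distribution of this kind can be derived from the per-bin counts of a single 5-minute reading. The Python sketch below shows one way to do it; the function names are hypothetical, and any counts passed to it in examples are made-up illustrative values, not measured data.

```python
from itertools import accumulate


def cumulative_fractions(counts):
    """Running fraction of streams at or below each bin's upper edge."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty distribution")
    return [c / total for c in accumulate(counts)]


def percentile_edge(counts, upper_edges, fraction):
    """Smallest bin upper edge at which the cumulative fraction
    reaches the requested fraction (e.g. 0.5 for the median)."""
    for cum, edge in zip(cumulative_fractions(counts), upper_edges):
        if cum >= fraction:
            return edge
    return upper_edges[-1]
```

With hypothetical counts `[60, 20, 10, 10]` for bins ending at 10, 100, 1000 and 10000 ms, the cumulative fractions are 0.6, 0.8, 0.9 and 1.0, so the median falls in the first bin. Because the result is a bin edge, its resolution is limited by the bin widths chosen in the ruleset.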
- Cumulative Stream Lifetime Distributions
- Nearly 60% of UDP streams last 10 ms or less
- TCP streams are longer-lived; only 10% of them last 10 ms or less, and their 60th percentile is close to 1 s
- Non-WWW streams are more long-lived than web streams
Medians of 5-minute distributions.
- Cumulative Stream Inbound Size Distributions
- UDP streams reach 99% at about 10 kB, but there are a few larger ones.
- Non-WWW and `outside' web streams have similar distribution shapes. These should be similar to distributions of file sizes.
- `Inside' web streams show a sharp rise at 600 Bytes; these are TCP acks from outside web clients
Medians of 5-minute distributions.
- Conclusion
- Building measurement infrastructure is a non-trivial task. It requires careful design and implementation so as to ensure that the measurements provide effective support for their users
- A clear understanding of the network topology is vital, but it is not always easy to achieve.
- NeTraMet is a very effective measurement tool, providing
a very general way to specify flows, and a reasonable
amount of front-end (within the meter) data reduction.
However, care is needed when creating rulesets. In particular, a ruleset should be unambiguous for all possible source-destination pairs
- Short-term bit rate distributions are very useful for monitoring links. NeTraMet can easily produce them for flows within a torrnet (which is not possible using SNMP interface counters)
- We have extended NeTraMet to collect Stream Size and Lifetime
distributions. Their behaviour reinforces our earlier experiences
(most UDP and TCP streams are short-lived, etc.)
However, there is scope for more work:
- Overall, the distributions are fairly stable
- But there are short-term bursts, which don't seem to be correlated between the various protocols
- These distributions could provide a way to identify various kinds of network attacks (similar to FlowScan), in real time
- Overall, NeTraMet provides a good platform on which to build
network measurement systems
Last updated: 14 April 2001