NLANR AMP: Lessons Learned


Archived MagicPoint presentation slides, compiled into a single PDF document.

cominf.pdf (9 slides, 801 KB)

Slide text transcript

Slide 1: NLANR AMP

NLANR AMP 

Lessons Learned


Tony McGregor

tonym@nlanr.net

Slide 2: Outline

Outline

Intro to the existing NLANR AMP mesh
(optional)

The Lessons

Slide 3: Introduction to the NLANR AMP Mesh

Introduction to the NLANR AMP Mesh




AMP is the active measurement component of NLANR's NAI project

Monitors are inexpensive FreeBSD-based boxes

Slide 4: Introduction

Introduction

Aim to deploy 
a monitor at all HPC sites in the US 
one monitor in each other country with an HPC connection to the US
Currently 151 monitors deployed

Slide 5: AMP System Architecture

AMP System Architecture
(Mostly) Full mesh of measurement probe machines
Some destination only machines
                    

Central data repository and visualisation machines
Data available through
Web pages
An NLANR developed 3D animation tool (Cichlid)
Raw data interface (web and web services)
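To make the scale of a full mesh concrete, the sketch below enumerates the directed measurement pairs such a mesh would need. It is only an illustration: the function name `mesh_pairs`, the host names, and the assumption that all 151 deployed monitors probe each other are hypothetical, since the slides describe the mesh as only "mostly" full and do not say how many machines are destination only.

```python
from itertools import permutations


def mesh_pairs(probes, destination_only=()):
    """Return (source, target) pairs for a full mesh.

    Every probe measures every other probe, and every probe also
    measures each destination-only machine (which sends no probes).
    """
    pairs = list(permutations(probes, 2))
    pairs += [(src, dst) for src in probes for dst in destination_only]
    return pairs


# Hypothetical host names; 151 is the monitor count quoted on the slides.
probes = [f"amp-{i}" for i in range(151)]
pairs = mesh_pairs(probes)
print(len(pairs))  # 151 * 150 = 22650 directed pairs
```

The quadratic growth in pairs is one reason per-monitor cost and measurement frequency have to be traded against the size of the deployment.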

Slide 6: Re-implementation

Re-implementation

People would like to replicate AMP:

in other countries

within their own network

on a campus

in a distributed computing environment

But:
Original AMP wasn't designed with portability in mind

The code base was experimental

Slide 7: Highlights of the AMP package

Highlights of the AMP package

Packaged (tarball, GNU configure, make, make install, CVS, etc.)

More tests and test options

Modular: new tests can be added easily

More flexible scheduling

OpenSSL-based certificates, CA and encryption

IPv6 aware

Better web interface

Slide 8: Lessons

Lessons
Cost/Complexity per monitor vs Depth of Deployment Tradeoff
cf. Metcalfe and/or Reed
1 ms is usually enough, RTT will mostly tell the story
the more samples the less history

Large scale infrastructure is difficult but doable
96/4 rule
solid state

Reputation matters if you want people to deploy your box

There is a big demand for active measurement
lots of network operators
the more general you try to be the less likely you'll fly

Humans can be useful too

Slide 9: More Lessons

More Lessons
There's never enough time for analysis if you do infrastructure
there's never the infrastructure if you do analysis

It's hard to be right all the time
mostly right is easy but not very useful

Cleary, J.G., Graham, I., McGregor, A., Pearson, M., Ziedins, I., Curtis, J., Donnelly, S., Martens, J. and Martin, S. "High Precision Traffic Measurement by the WAND Research Group". IEEE Communications Magazine, Mar 2002, pp. 167-173.

Diurnal variation cancellation is important
event detection and network tomography need it to work well 
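One simple way to cancel diurnal variation is to subtract a per-time-of-day baseline from each RTT sample and look only at the residual. The sketch below does this with an hourly median; it is an assumed illustration, not the method AMP used, and the function name `remove_diurnal`, the synthetic data, and the 5 ms event threshold are all hypothetical.

```python
from collections import defaultdict
from statistics import median

SECONDS_PER_DAY = 86400


def remove_diurnal(samples, bucket_seconds=3600):
    """Subtract the median RTT for each time-of-day bucket.

    samples: list of (unix_timestamp, rtt_ms) tuples.
    Returns a list of (unix_timestamp, residual_ms) tuples.
    """
    buckets = defaultdict(list)
    for ts, rtt in samples:
        buckets[(ts % SECONDS_PER_DAY) // bucket_seconds].append(rtt)
    baseline = {b: median(values) for b, values in buckets.items()}
    return [
        (ts, rtt - baseline[(ts % SECONDS_PER_DAY) // bucket_seconds])
        for ts, rtt in samples
    ]


# Synthetic data: three days of hourly samples with a daytime bump of 10 ms.
samples = []
for day in range(3):
    for hour in range(24):
        rtt = 50.0 + (10.0 if 9 <= hour < 17 else 0.0)
        samples.append((day * SECONDS_PER_DAY + hour * 3600, rtt))

# Inject a 30 ms event at 12:00 on the second day.
idx = 1 * 24 + 12
samples[idx] = (samples[idx][0], samples[idx][1] + 30.0)

residuals = remove_diurnal(samples)
events = [ts for ts, r in residuals if abs(r) > 5.0]
print(events)
```

With the daily pattern removed, the ordinary daytime rise no longer crosses the threshold and only the injected event stands out, which is the property event detection depends on.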

If it's not on the web forget it

Good coders are gold