why is it so hard to do research with real data?


short answer: there hasn't been any since 1995 
long answer: there's way too much data floating around
disadvantage: inappropriate data can be distracting or worse
advantage: publishing inappropriate data can give people an incentive to offer you better data ("desperate times call for desperate measures" methodology)

2 outstanding talks about problems with Internet data:
vern's talk, aug 2001: www.icir.org/vern/talks/vp-nrdm01.ps.gz
david's talk, apr 2002: www.caida.org/publications/presentations/