CAIDA 2003-2005 Program Plan

A summary of research goals and plans for September 2003 through September 2005.

For further information contact k claffy, kc@caida.org
Executive Summary:
The Cooperative Association for Internet Data Analysis (CAIDA) is an
independent analysis and research group based at the University of
California's San Diego Supercomputer Center, seeking to foster
collaboration among the commercial, government, and research sectors
of the Internet industry.
Aimed at promoting greater cooperation in the engineering and
maintenance of a robust, scalable global Internet
infrastructure, CAIDA provides a neutral framework to support cooperative
technical endeavors in measurement, analysis, and tool development.
Mission Statement:
CAIDA investigates both practical and theoretical aspects of the Internet,
with particular focus on topics that:
- are macroscopic in nature and provide enhanced insight into the
function of Internet infrastructure worldwide
- provide free access to traffic analysis and visualization tools
to facilitate network measurement and management
Research Program Areas:
CAIDA is actively engaged in the following six program areas:
| | program area | goal |
| I | routing, addressing and topology | Develop a calculus to describe and model the structure and dynamics of global Internet topology. |
| II | workload characterization | Analyze pertinent (and, to the extent possible, 'typical') features of Internet usage, including by protocol, application, and location dynamics. |
| III | network security | Monitor unsolicited Internet traffic and distill malicious activity, including DoS attacks, worms, host and port scanning, and novel attacks. |
| IV | DNS | Develop and evolve tools for improving DNS measurement and analysis, focused on implications for future DNS functionality (IPv4 & IPv6). |
| V | performance | Develop methodologies and tools for measuring Internet performance characteristics, in particular estimation of path capacity and available bandwidth. |
| VI | trends | Correlate heterogeneous network measurement data to identify, describe, and analyze Internet traffic trends. |
In each of the program areas, CAIDA
- collects Internet measurement data sets and makes them available to
other researchers
- develops software tools
- performs research and analysis
- provides multiple outreach and educational resources
CAIDA actively collaborates with other researchers by releasing
tools and data sets. This document describes in more detail
CAIDA's activities regarding research and analysis, tool development
and data availability. Outreach and educational activities are
described at: http://www.caida.org/publications/ and http://www.caida.org/workshops/.
Allocation of Effort:
-
Routing, Addressing and Topology
-
Research and Analysis
CAIDA investigates Internet topology growth and routing system
characteristics in support of future growth of the Internet.
Specific projects include:
-
Routing and peering analysis - "Routing and Peering
Analysis for Enhancing Internet Performance and Security" is funded by
NCS via NSF ANIR.
Main funded tasks are:
- investigate patterns of IP address space usage;
- develop methodologies to identify 'core' Internet nodes,
prefixes, ASes, geographic regions;
- track growth, refinement and churn and categorize contributors
to BGP dynamics;
- assess incongruity between actual and announced paths and
develop taxonomy of incongruity types.
Additional funds are needed to:
- evaluate effects of incongruities on Internet performance
and stability;
- create an apparatus for parameterizing and modeling of
peering policies;
- assist vendors and U.S. government agencies with
prioritized emergency notification processes.
-
Routing atoms - "Next Generation Routing (Atoms)"
is funded by NLnet and RIPE. CAIDA is researching and implementing
modifications to BGP routing that aggregate prefixes into
equivalence classes (policy atoms) based on a common AS path as seen from a
given topological location. The project is motivated by
the widely recognized concern that routing instability increases as the
global BGP table grows, because a larger table imposes additional
computation and communication costs.
Main funded tasks are:
- write software to compute policy atoms based on data from
the U. of Oregon Route Views project;
- investigate properties and dynamics of atoms with
respect to attributes other than AS path;
- compare models of atoms derived from CAIDA topology
measurements with those from Route Views BGP tables;
- develop atomized BGP routing software and
simulate and test it in a confined deployment scenario;
- make tested and optimized (zebra-based) implementations
publicly available.
Additional funds are needed to:
- analyze atoms under the assumption that providers themselves
aggregate prefixes originated by customers into atoms.
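The grouping step behind policy atoms can be illustrated with a short sketch. This is a minimal illustration, not CAIDA's software: the function name `compute_atoms`, the input layout (one prefix-to-AS-path table per vantage point), and the sample prefixes and AS numbers are all assumptions made for the example. With a single table it reduces to grouping prefixes by their AS path from one topological location.

```python
from collections import defaultdict


def compute_atoms(tables):
    """Partition prefixes into atoms (hypothetical helper).

    tables: list of dicts, one per vantage point, each mapping a prefix
            string to its AS path (a tuple of AS numbers).
    Two prefixes fall into the same atom when they have the same AS path
    in every table; a prefix missing from a table is recorded as None.
    """
    all_prefixes = set()
    for table in tables:
        all_prefixes.update(table)

    atoms = defaultdict(list)
    for prefix in sorted(all_prefixes):
        # The signature is the tuple of AS paths seen from each vantage point.
        signature = tuple(table.get(prefix) for table in tables)
        atoms[signature].append(prefix)
    return list(atoms.values())


# Illustrative input: private-use prefixes and made-up AS numbers.
example_table = {
    "10.0.0.0/8": (1, 2, 3),
    "10.1.0.0/16": (1, 2, 3),
    "192.0.2.0/24": (1, 4),
}
example_atoms = compute_atoms([example_table])
```

Adding a second vantage point can only split atoms, never merge them, which is why the number of atoms depends on how many locations contribute tables.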
-
AS connectivity and ranking - "Connectivity Ranking of
Autonomous Systems" is funded by Cisco. CAIDA has developed an
algorithm to rank ASes by their outdegree in the Internet graph
constructed from our macroscopic topology measurements. This ranking
reveals critical subsets of peering relationships as well as the extent
of connectivity coverage and relative market share among different
providers.
Main funded tasks are:
- evaluate inter-AS connectivity based on prefix-level
granularity as well as AS granularity;
- develop a set of utilities for processing BGP data (IOS and zebra
formats) for heuristic analysis of peering, transit, and
customer relationships.
Additional funds are needed to:
- develop a visualization tool for navigating BGP tables
capable of visualizing at least half a million nodes with a
graceful drill-down.
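Ranking by outdegree can be sketched in a few lines. The following is only an illustration under stated assumptions: the input is taken to be a list of directed AS-to-AS links (for example, adjacent ASes along measured forward paths), and the function name `rank_by_outdegree` and the AS numbers are hypothetical. It does not reproduce CAIDA's actual ranking algorithm.

```python
from collections import defaultdict


def rank_by_outdegree(as_links):
    """Rank ASes by the number of distinct ASes they have links to.

    as_links: iterable of (source_as, destination_as) pairs; duplicates
              and self-loops are tolerated and ignored.
    Returns a list of (as_number, outdegree) sorted by descending
    outdegree, with ties broken by ascending AS number.
    """
    neighbors = defaultdict(set)
    for src, dst in as_links:
        if src == dst:
            continue  # a self-loop says nothing about connectivity
        neighbors[src].add(dst)
        neighbors.setdefault(dst, set())  # keep ASes with outdegree 0
    return sorted(
        ((asn, len(peers)) for asn, peers in neighbors.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Illustrative links between made-up AS numbers.
example_ranking = rank_by_outdegree([(1, 2), (1, 3), (2, 3), (1, 2)])
```

Counting distinct neighbors rather than raw link observations keeps heavily probed paths from inflating an AS's rank.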
-
IPv4 and IPv6 macroscopic topology - funded by
WIDE (Japan) gift support. CAIDA has been conducting active
measurements of global IPv4 IP-level connectivity since 1998. We are now
extending these measurements to track IPv6 connectivity
as well.
Main funded tasks are:
- design and test the IPv6 monitoring tool scamper;
- investigate IPv6 tunnel links between IPv4 routers;
- compare current structure and characteristics of IPv4
and IPv6 topologies.
Additional funds will be needed to:
- develop and deploy IPv6 monitoring infrastructure to
support continuous collection, storage, and analysis of
IPv6 routing and topology data (monitors around the
world, hardware and software resources for continuous
data collection and downloading, sysadmin support).
-
Geolocation of IP resources.
Accurately identifying the geographic location of network objects is
critical to projects in all six of CAIDA's focus areas.
This program area is currently not funded.
As of September 2003 we are exploring a partnership with Digital
Envoy for providing IP geolocation services to CAIDA for internal use.
If this partnership succeeds, we will suggest
strategies and techniques for geolocation of IP
resources (including parsing registry databases, automated name
recognition in ISPs' host naming patterns, and using RTTs for
triangulation) and heuristics for integrating available techniques.
CAIDA will continue to try to support publicly accessible technology
for the mapping of AS numbers to geographic locations
according to the main regional Internet registries (RIRs): ARIN, APNIC,
LACNIC, RIPE.
The table below summarizes the status of the listed projects.
-
Supporting Tool Development
Building and maintaining software tools to measure and
analyze Internet topology is an
important part of CAIDA activities. Existing tools are:
-
skitter -
actively probes IPv4 connectivity
- update in 2003 to include intermediate hop RTTs & optimize storage
-
rocketfuel -
actively probes IPv4 connectivity (AS-specific). CAIDA will assume
support for U. Washington's
rocketfuel at the tool authors' request. (undecided)
-
scamper - actively probes IPv6 connectivity
- continue development in 2004 and 2005
-
iffinder -
identifies interfaces belonging to the same router
-
arts++ - a C++
class library used by CAIDA software packages
- needs update or replacement
-
NetGeo - a
database and collection of Perl scripts to map IP addresses and AS numbers to
geographic locations
- needs maintenance and update (Project I-v) or commercial alternative
-
Walrus -
interactively visualizes large directed graphs in 3D space
-
Otter -
visualizes arbitrary network data expressed as a set of nodes, links or paths
-
GeoPlot -
creates a geographical image of an arbitrary network data set
-
plot-latlong -
simple tool for plotting lat/long points on geographic maps
-
LibSea - a
Java library for representing large directed graphs
-
PlotPaths
- displays forward and reverse network paths from a single source to one or
more destinations
The project "Macroscopic Internet Data Measurement and Analysis"
funded by NCS via NSF-NPACI-CISE
(www.caida.org/funding/ncs/)
supports the following development of Internet topology tools:
- maintain Walrus 3D hyperbolic viewer and apply it to
specific tasks, e.g., Internet worm/virus spread;
- develop and implement navigational techniques and visual
representations for routing table and peering relationships.
This project is in the second quarter of year 1 funding.
CAIDA needs more funds to maintain and update existing tools and to
continue developing new, better tools for the Internet research community.
-
Data to Community
At the moment CAIDA makes available to other researchers the following
data sets relevant for routing and topology studies:
-
Workload Characterization
-
Research and Analysis
CAIDA measures and analyzes traffic on production Internet links
to better understand that traffic. The specific
projects listed below are
funded by NCS via NSF-NPACI-CISE
(www.caida.org/funding/ncs/) and are currently in the first half of
year 1 funding. Results and progress are at
www.caida.org/research/traffic-analysis/.
-
Flow estimation and taxonomy by size/speed/duration.
Main funded tasks are:
- develop self-tuning measurement algorithms that are robust in the
face of anomalous traffic patterns, e.g., port scans, DoS attacks;
- develop a measurement system that concisely summarizes
traffic on a link;
- test real-time algorithms and software using traffic at UCSD/SDSC,
e.g., SDNAP, to identify common network applications or
groups of applications.
-
Analysis of peer-to-peer traffic. CAIDA develops methods to
identify peer-to-peer (p2p) traffic that no longer uses fixed port numbers.
Main funded tasks are:
- develop algorithms for finding p2p command strings that are simple and fast
enough for real-time implementation, e.g., in NeTraMet;
- set up a long-term p2p monitor on a backbone Internet link;
- create a dynamic web page showing traffic levels of various
p2p applications.
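A port-independent classifier of the kind described above can be sketched as a lookup of known command strings at the start of a packet payload. This is a hedged illustration only: the function name `classify_payload`, the `max_bytes` limit, and the choice of the two signatures (the BitTorrent handshake prefix and the Gnutella connect string) are assumptions for the example, not the command strings or algorithm CAIDA uses.

```python
# Illustrative signatures only; a real deployment would need a maintained,
# validated list for each application of interest.
SIGNATURES = {
    "bittorrent": b"\x13BitTorrent protocol",
    "gnutella": b"GNUTELLA CONNECT/",
}


def classify_payload(payload, max_bytes=64):
    """Return the name of the first matching p2p application, or None.

    payload:   bytes of the transport-layer payload of one packet.
    max_bytes: only this many leading bytes are inspected, which bounds
               the per-packet cost regardless of packet size.
    """
    head = payload[:max_bytes]
    for application, signature in SIGNATURES.items():
        if head.startswith(signature):
            return application
    return None


example_label = classify_payload(b"GNUTELLA CONNECT/0.6\r\n")
```

Checking only a short, fixed-length prefix keeps the work per packet constant, which is one way to stay fast enough for real-time use.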
-
Modeling TCP dynamics. CAIDA researchers study
features of TCP flows that can be reliably estimated from packet header
trace data.
Main funded tasks are:
- compare various algorithms for determining the round trip time (RTT)
of TCP flows from captured traffic samples;
- identify and analyze the behavior of long TCP flows, i.e.,
those responsive to TCP's feedback and
congestion control algorithms;
- implement a new NeTraMet attribute indicating the status of TCP
control mechanisms for a given flow.
Additional funds are needed to:
- construct a measurement-based TCP traffic model.
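One widely used way to estimate RTT from header traces is to time the TCP three-way handshake at the monitor. The sketch below shows that idea only; it is not necessarily among the algorithms CAIDA compares. The `Packet` record, its field names, and the function name `handshake_rtt` are hypothetical, and the input is assumed to contain the packets of a single connection in capture order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Minimal header record (hypothetical layout for this sketch)."""
    time: float  # capture timestamp in seconds
    src: str     # source endpoint
    dst: str     # destination endpoint
    syn: bool    # TCP SYN flag
    ack: bool    # TCP ACK flag


def handshake_rtt(packets):
    """Estimate the RTT of one TCP connection from its handshake.

    At a monitor on the path, the time from the client's SYN to the
    client's first ACK covers monitor-to-server-and-back plus
    monitor-to-client-and-back, i.e. one full round trip.
    Returns None if the handshake is incomplete in the trace.
    """
    syn = None
    synack = None
    for pkt in packets:
        if syn is None:
            if pkt.syn and not pkt.ack:
                syn = pkt
        elif synack is None:
            if pkt.syn and pkt.ack and pkt.src == syn.dst and pkt.dst == syn.src:
                synack = pkt
        elif pkt.ack and not pkt.syn and pkt.src == syn.src and pkt.dst == syn.dst:
            return pkt.time - syn.time
    return None


example_rtt = handshake_rtt([
    Packet(0.000, "client", "server", syn=True, ack=False),
    Packet(0.030, "server", "client", syn=True, ack=True),
    Packet(0.050, "client", "server", syn=False, ack=True),
])
```

A retransmitted SYN or a delayed final ACK would inflate this estimate, which is one reason comparing several estimation algorithms on the same traces is useful.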
-
Compare workload characteristics between IPv4 and IPv6, and
correlate with topology data where applicable. CAIDA
will devise methodologies for joint analysis of data collected from its
macroscopic topology monitors and from passive traffic monitors.
Main funded tasks are:
- compare characteristics of IPv4 and IPv6 workload (e.g., flow
lengths, ports, protocols);
- correlate workload and topology characteristics, e.g.,
determine whether patterns differ between access and core links;
- establish real-time online tracking of workload characteristics
at representative locations.
Additional funds are needed to:
- track propagation of active probes continuously sent by
topology monitors through collected traffic samples;
- monitor IPv4 and IPv6 workload and performance over several years.
-
Supporting Tool Development
CAIDA develops device drivers and applications enabling network data
collection and workload
characterization, in real time or from trace files. Existing tools are:
-
CoralReef -
a comprehensive software suite to collect and analyze data from passive Internet
traffic monitors
-
NeTraMet -
an implementation of the RTFM architecture for Network Traffic Flow
Measurement
-
flowplot
- a new visualization module for displaying the
output of aguri, CoralReef, and new flow estimation tools
Current development and maintenance of workload characterization
tools is supported by DARPA NMS and WIDE.
The DARPA project is in the 3rd (last) year of funding. We will also use
Endace's gift of DAG traffic monitoring cards to deploy monitors at
strategic monitoring points in the backbone networks.
Main funded tasks are:
- make CoralReef and NeTraMet work with all current models of
Endace DAG cards;
- establish GPS synchronization of DAG cards in SDSC computer room;
- re-establish long-term CAIDA monitoring systems, as SDSC/UCSD network
changes from OC12 to Gigabit Ethernet links.
-
Data to Community
At the moment CAIDA makes available to other researchers the following
data sets relevant for workload characterization:
- NeTraMet anonymized traces
Archived flow data files used for DNS performance summaries (see
Research Area IV) can be requested from nevil@caida.org.
- backbone traces
CAIDA has a growing collection of traces from OC48 backbone links.
Visitors to CAIDA may use those data while at SDSC
(AUPs apply). Researchers requesting access may discuss visit schedule
options by contacting kc@caida.org.
-
Network Security
-
Research and Analysis
CAIDA researchers pioneered the application of the
backscatter technique to study denial-of-service
(DoS) attacks worldwide. We developed a network
telescope to study DoS attacks,
Internet worm spread, and host and port scan
characteristics. We are currently developing
real-time, publicly available reports that quantify
global network security threats. Research
tasks for this project are partially funded by a
Cisco URP grant and an NSF Trusted Computing grant
until September 2006. An additional Cisco URP
grant and a proposal to support operational tasks
are pending.
Main funded tasks are:
- identify the scope and characteristics of distributed
denial-of-service attacks and Internet worms
- develop a tool to monitor and generate graphical
representations of malicious traffic
- classify victims of wide-area network security events
- develop real-time, adaptive denial-of-service-attack
definitions
- quantify the damage experienced by DoS and worm attack
victims
- assess the efficacy of nascent efforts at distributed
attack mitigation
- identify trends in attack types over time
- investigate the ways that telescope size and position
influence results
Additional funds will be needed for:
- long-term network telescope operation (disk space,
network infrastructure)
- honeynet development to validate telescope results
- development of long-term patching/vulnerability profile
studies
- expansion of the network telescope to cover additional locations
- public release of recent worm, denial-of-service attack,
and host scan datasets
-
Supporting Tool Development
- CAIDA has developed software based on the CoralReef
API to capture and analyze denial-of-service
attacks and Internet worms. A modified version
of the CoralReef Report Generator helps to display
real-time security reports.
- Countries.pm: this Perl module provides country code,
country name, and continent location information.
- crl_attack_flow: specialized high-speed
security event monitoring and classification
software. Specialized event-based, rather than
flow-based, attack monitoring and reporting
software is also under development.
- plot_country_intervals: Perl software that uses
the fly and gifsicle open-source programs to
generate animations of denial-of-service attacks,
Internet worms, and host/port scans worldwide.
- The "Macroscopic Internet Data Measurement and
Analysis" project, funded by NCS via NSF-NPACI-CISE
(www.caida.org/funding/ncs/),
supports the application of the Walrus 3D hyperbolic viewer
to tasks such as visualization of worm/virus spread
throughout the Internet. This project is in the first
half of year 1 funding.
-
Data to Community
Three weeks of backscatter data are available to academic researchers, US
government funded researchers, US agencies, and CAIDA members via
www.caida.org/data/passive/backscatter_request.xml. These
are some of the most comprehensive publicly
available datasets of distributed denial-of-service
attacks around the world. We will continue to support
community data needs as resources permit.
-
Domain Name System (DNS)
-
Research and Analysis
The Internet depends on reliable DNS service for correct, robust operation.
Continued Internet growth and the rise of IPv6 will place further load on
the DNS infrastructure. The U.S. research agenda must pursue better
understanding of this critical element of the global Internet.
CAIDA has been actively involved in DNS data analysis since
1999. Currently, our specific projects in this program area are:
-
Characterization of DNS workload and performance. This project
is supported by WIDE (Japan) gift funds and is currently in the 2nd half
of year 2 funding. Additional WIDE gift funds are pending.
Main funded tasks are:
- maintain NeTraMet meter in Tokyo; make measurements web-accessible;
- repeat analysis of private (RFC1918) update traffic to determine
whether there has been any improvement in vendor software to
alleviate spurious updates;
- investigate the impact of anycast on DNS root service.
-
DNS Modeling - "Network Modeling and Simulation"
funded by DARPA NMS is currently in the 1st half of year 3 funding.
CAIDA uses the simulation lab at
the Measurement Factory
for conducting laboratory experiments simulating DNS behavior
under controlled conditions. Results are at
www.caida.org/projects/dns/dns-root-gtld/.
Main funded tasks are:
- refine laboratory simulations of large-scale DNS behavior, e.g.,
use more realistic TTLs, add some percentage of lame delegations,
try another set of names to query, address the 'replaying trace too
fast' problem, etc.;
- develop statistical techniques to categorize the state of TLD
nameserver operation and monitor configuration changes as they
occur;
- collect and analyze BIND log files.
Additional funds will be needed to:
- define parameters of realistic DNS scenarios for use in
network models;
- investigate scalability of the proposed parameters from
laboratory environment to the global Internet.
-
DNS Security - The DNS-OARC proposal to incorporate research and analysis
capabilities into trusted operational centers has been submitted to NCS.
If funded,
CAIDA will conduct DNS performance and vulnerability analyses and produce
recommendations on hardening DNS for both IPv4 and IPv6 operations.
Main proposed tasks are:
- build tools for automatic analysis of BIND log files;
- develop statistical techniques to categorize the state of TLD
nameserver operation and recognize operating state changes as they
occur;
- investigate implications and effects of anycast on
root server operation;
- maintain continual communication with NCS and DHS
on DNS performance as viewed through our measurements.
Additional funds will be needed to:
- build and test tools enhancing server security;
- evaluate quality of other macroscopic DNS measurements procured
by the federal government for use in cybersecurity
monitoring;
- archive long-term performance data on root and gTLD DNS use;
- maintain existing measurement tools as protocols and
formats evolve.
-
Supporting Tool Development
CAIDA documents, packages, and distributes passive traffic monitoring tools
(CoralReef and NeTraMet; see II-B above) and methods for their use to monitor
the DNS infrastructure.
Other DNS related tools supported by CAIDA are:
-
dnsstat - collects accurate statistics of DNS queries on a specific nameserver (or
client)
-
dnsstop
- displays various tables of DNS traffic on your network
Further development of tools to monitor DNS behavior is
supported by DARPA NMS and is in the first half of year 3 funding.
Main funded tasks are:
- add DNS TLD attribute to NeTraMet to simplify collection of
country-code TLD (ccTLD) data;
- devise method of monitoring ccTLD performance and add reports to
public web pages.
-
Data to Community
Two of the three production NeTraMet meters, at Auckland (New Zealand) and
Boulder (Colorado, USA), continuously collect data for plotting daily root and
gTLD server performance. The third meter (at UCSD) is temporarily unusable. It
will be revived after the UCSD/SDSC network upgrade. The following
DNS-relevant data are available:
- DNS performance summaries - via
www.caida.org/cgi-bin/dns_perf/main.pl.
- root nameserver workload data sets (logs)
Visitors to CAIDA may use those data while at SDSC
(AUPs apply). Researchers requesting access may discuss visit schedule
options by contacting kc@caida.org.
-
Performance
-
Research and Analysis
CAIDA measures Internet performance and develops methodologies for its
improvement. Our tool for plotting RTTs and packet loss to
all IP hops along a specified forward IP path
(beluga)
remains unfunded.
Our main funded activity in the performance area this year
is the DOE-funded project "Bandwidth Estimation Research"
(www.caida.org/projects/bwest/),
currently in early year 3 funding. Results and progress are at
www.caida.org/research/performance/. We have surveyed
existing bandwidth estimation
tools and algorithms and given results to tool developers. We have set up
a test lab environment, developed testing procedures and
evaluated tools on 100 Mb/s links. We have started and will
continue testing available tools on GigEther-speed paths.
Remaining funded tasks are:
- evaluate performance of bandwidth estimation tools
on 3- and 4-hop GigEther paths;
- develop a GUI for scheduling and visualizing bandwidth
measurements on end-to-end paths;
- test implementation of bandwidth measurement methodology (capacity
and available bandwidth) in high-performance academic and
research networks;
- develop application-layer techniques to help TCP to achieve its
maximum feasible bandwidth on a path (SOBAS).
Additional funds will be needed to:
- develop middleware to support bandwidth estimation
in collaboration with SciDAC Grid Portals researchers;
- integrate bandwidth estimation technologies into DOE
network infrastructures.
-
Supporting Tool Development
Available BWEST tools are:
-
pathload - estimates end-to-end available bandwidth
-
pathrate - estimates end-to-end bandwidth capacity
-
beluga - plots
RTTs and packet loss to all IP hops along a specified forward IP path
The pathload and pathrate tools are maintained by C. Dovrolis at
Georgia Tech.
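The difference between capacity and available bandwidth is easiest to see from how capacity can be estimated. The sketch below shows the textbook packet-pair idea: two back-to-back packets are spread apart by the narrowest link, so packet size divided by the spacing observed at the receiver approximates that link's capacity. It is a simplified illustration under stated assumptions, not the algorithm implemented in the tools listed above; the function name `packet_pair_capacity` and the sample values are hypothetical.

```python
from statistics import median


def packet_pair_capacity(packet_bits, dispersions):
    """Estimate path capacity in bits per second from packet pairs.

    packet_bits: size of each probe packet in bits.
    dispersions: receiver-side time gaps, in seconds, between the two
                 packets of each pair.
    Each pair yields packet_bits / dispersion; the median is used here
    as a simple guard against pairs distorted by cross traffic.
    """
    estimates = [packet_bits / gap for gap in dispersions if gap > 0]
    if not estimates:
        raise ValueError("need at least one positive dispersion sample")
    return median(estimates)


# Illustrative samples: 1500-byte probes, two clean pairs and one pair
# stretched by cross traffic.
example_capacity = packet_pair_capacity(1500 * 8, [0.00012, 0.00012, 0.00024])
```

Available bandwidth, by contrast, depends on the competing traffic at measurement time and cannot be read off a single pair spacing in this way.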
The project "Macroscopic Internet Data Measurement and Analysis"
funded by NCS via NSF-NPACI-CISE
(www.caida.org/funding/ncs/)
supports integration of visualization tools with performance data in real time.
This project is in the first half of year 1 funding.
-
Data and Infrastructure to Community
- Survey of tools and methodologies (accepted by IEEE Network)
- Tool evaluation results given to tool developers
- BWEST testbed (at SDSC) available to qualified researchers
-
Trends
-
Research and Analysis
The research community suffers from a lack of coherent longitudinal datasets
for cross-domain analysis of traffic on the wide-area Internet. CAIDA
proposed to create a database (Internet Measurement Data Catalog, or IMDC)
that will index distributed repositories and archives of Internet data and
tools. Potential benefits from unifying individual heterogeneous data sets and
making them available to all interested researchers are enormous, and we
envision the IMDC database as supporting the field of network research
for the foreseeable future.
This research will also provide measurement-based input to answer
public policy and
regulatory questions regarding administration, stability and security of
Internet infrastructure. The project "Correlating heterogeneous measurement
data to achieve system-level analysis of Internet traffic trends" funded by
NSF
(www.caida.org/funding/trends/)
is currently at the end of year 1 funding. Results and progress are at
www.caida.org/funding/trends/imdc/.
Main funded tasks are:
- design a universal annotation system (and database support)
suitable for describing heterogeneous Internet data sets;
- make recommendations for meaningful, maintainable collection of
long-term passive traffic samples;
- apply IMDC to Internet research problems.
Additional funds will be needed for:
- strategic deployment of high-speed passive monitors;
- continuing analysis and visualization activities that address current
overarching Internet research questions.
-
Supporting Tool Development
Available NSF funding supports creating the IMDC database and populating
it (initially) with CAIDA data sets.
Additional funding will be needed to maintain, improve and
enlarge the database in the future.
-
Data and Infrastructure to Community
The use of the IMDC database will be open to the research community and highly
encouraged. (AUPs will apply.)
Personnel
CAIDA currently employs 15 researchers and support staff based at SDSC; 3 remotely based staff/consultants; 3 undergraduate student workers; and 4 graduate student researchers.
Financial support
Sponsors
CAIDA garnered significant corporate support through its
Membership program during the
Internet bubble, and lost several members when that bubble burst.
Currently, the following organizations
have made designated gifts to support CAIDA activities:
-
Cisco Systems -- the worldwide leader in networking for the Internet.
-
WIDE -- a consortium of
Japanese research organizations and companies working to establish
a Widely Integrated Distributed Environment.
-
Endace -- the only company
in the world that specializes in building high performance PCI cards
for remote network monitoring and surveillance
applications. The range of their products covers almost every physical
layer at every network speed up to OC192 and 10GigE.
Designated gifts to CAIDA enable us to maximize use of research dollars.
CAIDA could not survive without the generosity of its sponsors.
Past Performance