The ability to capture the fundamental characteristics that define
the Internet topology, and then apply this knowledge to produce a
tool that generates such a topology stands as a key component toward
the development and evaluation of new routing algorithms and protocols
and to the study of what drives Internet evolution.
Intuitively, one focuses on practically important characteristics
such as 1) distance (shortest path length) distribution, critical
to scalability of routing protocols; 2) spectrum (set of eigenvalues
of graphs adjacency matrix), important for network robustness and
performance analysis; and 3) betweenness (the number of shortest
paths passing through a node or link) distribution, important for
analyzing the hierarchical structure of the network and for estimating
the accuracy of traceroute-like explorations of the network when
developing methods for synthesizing topologies.
Brute force attempts at generating synthetic topologies that reproduce
any of the above three metrics do not work. We know of no known
algorithms for constructing a graph with a target distance distribution,
betweenness distribution, or spectrum. Furthermore, these approaches
suffer from a methodological problem whereby the discovery of new
metrics require updates to the underlying algorithm for the topology
generator.
So, CAIDA researchers postulated that one could reproduce most
simple characteristics that may not have direct practical importance
and, in the process, also capture other properties including
practically important attributes. More precisely, for any graph
G, we wish to identify a series of graph properties
that satisfy the requirements:
-
constructibility, we can construct graphs having these properties;
-
inclusion, any property Ρd subsumes all properties
Ρi with i = 0, ... , d - 1: that is,
a graph having property Ρd is guaranteed to
also have all properties Ρi for i <
d.
-
convergence: as d increases, the set of graphs having
property Ρd "converges" to G: that is, there exists a
value of index d = D such that all graphs having
property ΡD are isomorphic to G.
Our research shows that the connectivity
of the nodes in a network provide the most fundamental topology
metrics for use in synthetic topology generation. Table 1 below lists these
characteristics ordered by an increasing amount of information
about local connectivity structure of the network.
| Tag |
Name |
Degree correlations of nodes at distance |
| 0K | Average node degree | None |
| 1K | Degree distribution | 0 |
| 2K | Joint node degree distribution | 1 |
| 3K | Joint edge degree distribution | 2 |
| ... | ... | ... |
| (d+1)K | Full degree distribution | D |
Table 1: Connectivity characteristics of a network with maximum distance D.
Table 1 shows that as we increase the value of d in our
notation dK, where d represents the number of specified
nodes with degree correlations, we define distributions of degrees
of nodes located at a maximum distance D from each other.
In doing so, we gain information about the local structure of the
topology which allows a more accurate description of a node's
(increasingly wider) neighborhood. This area of research asks the
question: do we need to go all the way up to (d+1)K to
construct graphs that accurately reproduce values of commonly-used
graph metrics or can we obtain reasonable accuracy at a lower value
of d, such as 2 or 3?
To this end, we implemented efficient topology generators that
construct graphs for d = 0,1,2, and 3. We
find that these graphs reproduce, with increasing accuracy, important
properties of measured and modeled Internet topologies. In particular,
we find that the d = 2 case provides
sufficient accuracy for most practical purposes, while d = 3
reconstructs the Internet AS- and router-level topologies exactly. This
systematic method of analyzing and synthesizing topologies offers a
dramatic improvement to the set of tools available to network topology
and protocol researchers.
The coarsest approximation of the structure of a complex network, such as the Internet, is a
simple undirected unweighted graph. This approximation, however, loses too much detail. In
reality, objects represented by vertices and edges in such a graph possess some non-trivial internal
structure that varies across and differentiates among distinct types of links or nodes. In this
work, we abstract such additional information as network annotations. We introduce a network
topology modeling framework that treats annotations as an extended correlation profile of a
network. Assuming we have this profile measured for a given network, we present an algorithm to
rescale it in order to construct networks of varying size that still reproduce the original measured
annotation profile.
Using this methodology, we accurately capture the network properties essential for realistic simulations
of network applications and protocols, or any other simulations involving complex network
topologies, including modeling and simulation of network evolution. We apply our approach to the
Autonomous System (AS) topology of the Internet annotated with business relationships between
ASs. This topology captures the large-scale structure of the Internet. In depth understanding of
this structure and tools to model it are cornerstones of research on future Internet architectures
and designs. We find that our techniques are able to accurately capture the structure of annotation
correlations within this topology, thus reproducing a number of its important properties in
synthetically-generated random graphs.