Center for Applied Internet Data Analysis
The dK-series and dK Annotations    dK-series and dK annotations are overviewed here.

For further details on the dK-series and dK annotations, see the original paper "Systematic Topology Analysis and Generation Using Degree Correlations" (2006) describing the dK-series, and "How Small Are Building Blocks of Complex Networks" (2009) which discusses dK-randomization.

## Realistic Topology Generation: The dK-series

The ability to capture the fundamental characteristics that define the Internet topology, and then apply this knowledge to produce a tool that generates such a topology stands as a key component toward the development and evaluation of new routing algorithms and protocols and to the study of what drives Internet evolution.

Intuitively, one focuses on practically important characteristics such as 1) distance (shortest path length) distribution, critical to scalability of routing protocols; 2) spectrum (set of eigenvalues of graphs adjacency matrix), important for network robustness and performance analysis; and 3) betweenness (the number of shortest paths passing through a node or link) distribution, important for analyzing the hierarchical structure of the network and for estimating the accuracy of traceroute-like explorations of the network when developing methods for synthesizing topologies.

Brute force attempts at generating synthetic topologies that reproduce any of the above three metrics do not work. We know of no known algorithms for constructing a graph with a target distance distribution, betweenness distribution, or spectrum. Furthermore, these approaches suffer from a methodological problem whereby the discovery of new metrics require updates to the underlying algorithm for the topology generator.

So, CAIDA researchers postulated that one could reproduce most simple characteristics that may not have direct practical importance and, in the process, also capture other properties including practically important attributes. More precisely, for any graph G, we wish to identify a series of graph properties that satisfy the requirements:

1. constructibility, we can construct graphs having these properties;
2. inclusion, any property Ρd subsumes all properties Ρi with i = 0, ... , d - 1: that is, a graph having property Ρd is guaranteed to also have all properties Ρi for i < d.
3. convergence: as d increases, the set of graphs having property Ρd "converges" to G: that is, there exists a value of index d = D such that all graphs having property ΡD are isomorphic to G.

Our research shows that the connectivity of the nodes in a network provide the most fundamental topology metrics for use in synthetic topology generation. Table 1 below lists these characteristics ordered by an increasing amount of information about local connectivity structure of the network.

TagNameDegree correlations
of nodes at distance
0KAverage node degreeNone
1KDegree distribution0
2KJoint node degree distribution1
3KJoint edge degree distribution2
.........
(d+1)KFull degree distributionD

Table 1: Connectivity characteristics of a network with maximum distance D.

Table 1 shows that as we increase the value of d in our notation dK, where d represents the number of specified nodes with degree correlations, we define distributions of degrees of nodes located at a maximum distance D from each other. In doing so, we gain information about the local structure of the topology which allows a more accurate description of a node's (increasingly wider) neighborhood. This area of research asks the question: do we need to go all the way up to (d+1)K to construct graphs that accurately reproduce values of commonly-used graph metrics or can we obtain reasonable accuracy at a lower value of d, such as 2 or 3?

To this end, we implemented efficient topology generators that construct graphs for d = 0,1,2, and 3. We find that these graphs reproduce, with increasing accuracy, important properties of measured and modeled Internet topologies. In particular, we find that the d = 2 case provides sufficient accuracy for most practical purposes, while d = 3 reconstructs the Internet AS- and router-level topologies exactly. This systematic method of analyzing and synthesizing topologies offers a dramatic improvement to the set of tools available to network topology and protocol researchers.

## dK annotations

The coarsest approximation of the structure of a complex network, such as the Internet, is a simple undirected unweighted graph. This approximation, however, loses too much detail. In reality, objects represented by vertices and edges in such a graph possess some non-trivial internal structure that varies across and differentiates among distinct types of links or nodes. In this work, we abstract such additional information as network annotations. We introduce a network topology modeling framework that treats annotations as an extended correlation profile of a network. Assuming we have this profile measured for a given network, we present an algorithm to rescale it in order to construct networks of varying size that still reproduce the original measured annotation profile.

Using this methodology, we accurately capture the network properties essential for realistic simulations of network applications and protocols, or any other simulations involving complex network topologies, including modeling and simulation of network evolution. We apply our approach to the Autonomous System (AS) topology of the Internet annotated with business relationships between ASs. This topology captures the large-scale structure of the Internet. In depth understanding of this structure and tools to model it are cornerstones of research on future Internet architectures and designs. We find that our techniques are able to accurately capture the structure of annotation correlations within this topology, thus reproducing a number of its important properties in synthetically-generated random graphs.