The combinatorial core of a directed graph is its subset obtained by iterative stripping of nodes that have outdegree 0 and of 2-loops involving nodes that have no other outgoing edges except those connecting them to each other. Note that the indegree of either type of stripped node may be arbitrarily large.
In typical skitter snapshots we have studied, the combinatorial core contains approximately 10% of the nodes of the original non-stripped graph.
The iterative stripping procedure may be selectively applied only to the nodes of outdegree 0 that have indegree 1. In that case, the stubs of the graph, i.e., trees connected to the rest of the graph by a single link, are stripped. We call this superset of the core the extended core, and in our skitter topology snapshots the extended core typically contains 2/3 of the nodes of the original graph, a much greater fraction than in the positive outdegree part (our combinatorial core).
- We call a node B reachable from a node A if there is a directed path from A to B, i.e. a path in which each directed edge is taken toward B in the proper direction.
- A strongly connected component of a graph is a set of nodes such that each node pair (A,B) has a path from A to B and a path from B to A.
If A is reachable from B and B is reachable from A, then any node reachable from A is reachable from B and vice-versa. Conversely, if the set of nodes reachable from A is the same as the set of nodes reachable from B, then A is reachable from B and B is reachable from A, since we always assume that A is reachable from A. This property means that a connected component is uniquely defined by the set of nodes that are reachable from nodes of this component. Note that different connected components will reach different, although not necessarily disjoint, sets of nodes.
A connected component is called giant component if it contains more than 50% of the nodes in the graph.
In the cases of skitter data we have analyzed, the giant component and the combinatorial core are very close to each other - in all our examples the giant component contains over 85% of the core. Furthermore, almost all of the core is reachable from the giant component, typically near 99%.
Shortest Paths vs. Routing Policy
Shortest paths on Internet topology graphs suffer from potential non-conformance with routing policies. We can view them only as shortcuts or lower bounds for paths actually taken by packets. Although shortest-path physical connectivity is there, policy prevents its use. (Skitter data suggests that Internet paths actually traversed by packets were often twice as long as the shortest possible paths.)