# Graph Theory Applied to Topology Analysis

## Combinatorial Core

The **combinatorial core** of a directed graph is its subset
obtained by iterative stripping of nodes that have outdegree
0 and of 2-loops involving nodes that have no other outgoing
edges except those connecting them to each other. Note that
the indegree of either type of stripped node may be
arbitrarily large.

In typical skitter snapshots we have studied, the combinatorial core contains approximately 10% of the nodes of the original non-stripped graph.

## Extended Core

The iterative stripping procedure may be selectively applied
only to the nodes of outdegree 0 that have indegree 1. In
that case, the stubs of the graph, i.e., trees connected
to the rest of the graph by a single link, are stripped.
We call this superset of the core the **extended core**, and
in our skitter topology snapshots the extended core typically
contains 2/3 of the nodes of the original graph, a much
greater fraction than in the positive outdegree part (our
**combinatorial core**).

## Reachability

- We call a node B
**reachable**from a node A if there is a directed path from A to B, i.e. a path in which each directed edge is taken toward B in the proper direction. - A
**strongly connected component**of a graph is a set of nodes such that each node pair (A,B) has a path from A to B and a path from B to A.

## Connected Component

If A is reachable from B and B is reachable from A, then
any node reachable from A is reachable from B and vice-versa.
Conversely, if the set of nodes reachable from A is the
same as the set of nodes reachable from B, then A is
reachable from B and B is reachable from A, since we always
assume that A is reachable from A. This property means that
**a connected component is uniquely defined by the set of
nodes that are reachable from nodes of this component**. Note
that different connected components will reach different,
although not necessarily disjoint, sets of nodes.

## Giant Component

A connected component is called giant component if it contains more than 50% of the nodes in the graph.

In the cases of skitter data we have analyzed, the
**giant component** and the **combinatorial core** are very
close to each other - in all our examples the giant
component contains over 85% of the core. Furthermore,
almost all of the core is reachable from the **giant
component**, typically near 99%.

## Shortest Paths vs. Routing Policy

**Shortest paths** on Internet topology graphs suffer from
potential non-conformance with **routing policies**. We can
view them only as shortcuts or lower bounds for paths
actually taken by packets. Although shortest-path physical
connectivity is there, policy prevents its use. (Skitter
data suggests that Internet paths actually traversed by
packets were often twice as long as the shortest possible
paths.)