The contents of this legacy page are no longer maintained nor supported, and are made available only for historical purposes.

topostats

topostats is a package of programs that calculate various statistics on network topologies (that is, on graphs). The computed statistics are defined in the paper "Lessons from Three Views of the Internet Topology: Technical Report", which also gives a sample of the computed statistics. The topostats package computes most but not all of these statistics, and this package does not itself make any plots.

Download

topostats-1.1beta.tar.gz

Also available on GitHub

Description of results

The statistics currently output by these tools are:

  • Number of nodes
    The count of unique nodes found in the graph
  • Number of edges
    The count of undirected edges between nodes found in the graph
  • Average node degree
    The sum of the degrees for each node, divided by the number of nodes
  • Maximum node degree
    The largest degree found across all nodes
  • Degree distribution exponent
    The calculated exponent of a power law distribution that best fits the distribution of degrees in the graph
  • Power-law maximum degree
    The natural cutoff value of the power law distribution that best fits the distribution of degrees in the graph, calculated from the degree distribution exponent
  • Normalized average average neighbor degree
    The value of each node's average neighbor degree (the sum of its neighbors' degrees, divided by the number of neighbors), summed over all nodes and divided by the number of nodes, then further divided by the number of nodes minus 1
  • Normalized maximum average neighbor degree
    The maximum value of average neighbor degree (the sum of a node's neighbors' degrees, divided by the number of neighbors) across all nodes, divided by the number of nodes minus 1
  • Assortative coefficient
    The Pearson correlation coefficient, which indicates the amount of relationship between nodes of similar or dissimilar degrees
  • Mean clustering
    The local clustering of each node (defined as the ratio of the number of links between a node's neighbors to the maximum possible number of links), summed over all nodes and divided by the total number of nodes.
  • Clustering coefficient
    The percentage of 3-cycles (caused by three nodes that are all connected to each other) among all the connected node triplets in the graph
  • Top clique size
    The largest number of nodes, when sorted by descending degree, that are completely connected to each other
  • Minimum node coreness
    The smallest coreness (defined as the maximum k for which, after nodes with degree k are iteratively removed from the graph, the node is still in the graph) across all nodes
  • Average node coreness
    The coreness (defined as the maximum k for which, after nodes with degree k are iteratively removed from the graph, the node is still in the graph) of all nodes, divided by the number of nodes
  • Maximum node coreness
    The largest coreness (defined as the maximum k for which, after nodes with degree k are iteratively removed from the graph, the node is still in the graph) across all nodes
  • Core size
    The number of nodes that have maximum coreness (defined above)
  • Minimum degree in core
    The smallest degree (prior to node removal) found amongst those nodes with maximum coreness (defined above)
  • Fringe size
    The number of nodes that have minimum coreness (defined above)
  • Maximum degree in fringe
    The largest degree (prior to node removal) found amongst those nodes with minimum coreness (defined above)
  • Average distance
    The number of hops between all pairs of nodes, summed and divided by the number of node pairs.
  • Standard deviation of distance
    The standard deviation of the distribution of the number of hops between all pairs of nodes
  • Average eccentricity
    The eccentricity (defined as the maximum distance from one node to any other node) of each node, summed and divided by the number of nodes
  • Graph radius
    The minimum eccentricity (defined above) across all nodes
  • Minimum degree in graph center
    The smallest degree amongst all nodes that have minimum eccentricity (defined above)
  • Maximum degree in graph periphery
    The largest degree amongst all nodes that have maximum eccentricity (defined above)
  • Minimum node betweeness
    The smallest betweenness (which measures how often a node or edge is on the shortest path between other nodes) across all nodes
  • Average node betweeness
    The betweenness (which measures how often a node or edge is on the shortest path between other nodes) of all nodes, summed and divided by the number of nodes
  • Maximum node betweeness
    The largest betweenness (which measures how often a node or edge is on the shortest path between other nodes) across all nodes
  • Minimum edge betweeness
    The smallest betweenness (which measures how often a node or edge is on the shortest path between other nodes) across all edges
  • Average edge betweeness
    The betweenness (which measures how often a node or edge is on the shortest path between other nodes) of all edges, summed and divided by the number of edges
  • Maximum edge betweeness
    The largest betweenness (which measures how often a node or edge is on the shortest path between other nodes) across all edges

Tools and example output

The included tools are

  • topology_stats: calculates many simple statistics,
  • distance: calculates statistics on the distance between all nodes in the graph,
  • betweenness: calculates statistics on the betweenness of all nodes/links in the graph, and
  • components: a utility for viewing all connected components in a graph and for optionally extracting the largest connected component into a separate graph file.

Shown below is the output of these tools on the skitter dataset from the data supplement to "Lessons from Three Views of the Internet Topology: Technical Report".

topology_stats output:

Number of nodes:  9204
Number of edges:  28959
Avg node degree:  6.29269882659713
Max node degree:  2070
Degree dist exponent (via CCDF) [warning: can be inaccurate]:  2.13036587884151
Power-law maximum degree [warning: can be inaccurate]:  3212
Normalized avg avg neighbor degree:  0.0469423442421394
Normalized max avg neighbor degree:  0.0530352021237445
Assortative coefficient:  -0.235624710744895
Mean clustering:  0.456656295004884
Clustering coefficient:  0.0257810954106928
Top clique size:  16
Min node coreness:  0
Avg node coreness:  2.22761842677097
Max node coreness:  27
Core size:  47
Min degree in core:  68
Fringe size:  2460
Max degree in fringe:  5

distance output:

loaded 9204 nodes, 28959 undirected links, 42352206 pairs
... decreasing radius from 0 to 5 with node 1
... raising diameter from 0 to 5 with node 1
... raising diameter from 5 to 6 with node 9
... decreasing radius from 5 to 4 with node 22
... raising diameter from 6 to 7 with node 3287
... decreasing radius from 4 to 1 with node 21372
average distance = 3.115
std deviation of distance = 0.635
average eccentricity = 5.108
graph radius = 1
graph diameter = 7
min degree in graph center = 1
max degree in graph periphery = 1

betweenness output:

loaded 9204 nodes, 28959 undirected links, 42352206 pairs
min node betweenness = 0.0000e+00
average node betweenness = 2.2997e-04
max node betweenness = 2.4114e-01
min edge betweenness = 2.3617e-08
average edge betweenness = 1.0760e-04
max edge betweenness = 8.6025e-03

components output:

loaded 9204 nodes, 28959 undirected links
component at node 1: 9200 nodes, 28957 undirected links
component at node 21372: 2 nodes, 1 undirected links
component at node 21437: 2 nodes, 1 undirected links
3 components; largest at node ID 1
    

Related Objects

See https://catalog.caida.org/software/topostats/ to explore related objects to this document in the CAIDA Resource Catalog.
Published
Last Modified