Copy this text
Graphlet correlation distance to compare small graphs
Graph models are standard for representing mutual relationships between sets of entities. Often, graphs deal with a large number of entities with a small number of connections (e.g. social media relationships, infectious disease spread). The distances or similarities between such large graphs are known to be well established by the Graphlet Correlation Distance (GCD). This paper deals with small graphs (with potentially high densities of connections) that have been somewhat neglected in the literature but that concern important fora like sociology, ecology and fisheries, to mention some examples. First, based on numerical experiments, we study the conditions under which Erdős-Rényi, Fitness Scale-Free, Watts-Strogatz small-world and geometric graphs can be distinguished by a specific GCD measure based on 11 orbits, the GCD11. This is done with respect to the density and the order (i.e. the number of nodes) of the graphs when comparing graphs with the same and different orders. Second, we develop a randomization statistical test based on the GCD11 to compare empirical graphs to the four possible null models used in this analysis and apply it to a fishing case study where graphs represent pairwise proximity between fishing vessels. The statistical test rules out independent pairing within the fleet studied which is a standard assumption in fisheries. It also illustrates the difficulty to identify similarities between real-world small graphs and graph models.
Full Text
File | Pages | Size | Access | |
---|---|---|---|---|
Publisher's official version | 20 | 4 Mo | ||
S1 Fig. Quality of clustering (AUPR) for pairs of models. | - | 296 Ko | ||
S2 Fig. Probability of correctly distinguishing two random graph models. | - | 458 Ko | ||
S1 Eq. To account for the difference in variability between the correlation coefficients of each pair of orbits, we also computed the following standardised distance between GCM(Mk) and where σ(i, j) | 1 | 130 Ko | ||
S1 Table. Estimated p-values (std). | 1 | 107 Ko |