
Calculates goodness-of-fit statistics for relational data matrices, summarizing row and column heterogeneity as well as second-order (dyadic) and third-order (triadic) dependence patterns. These statistics are useful for assessing model fit in network analysis and relational data modeling.

Usage

gof_stats(Y)

Arguments

Y

a relational data matrix (n x n square matrix) where Y[i,j] represents the relationship from node i to node j. Missing values (NA) are allowed; see Details for how they are treated.

Value

A named numeric vector containing five goodness-of-fit statistics:

sd.rowmean

Standard deviation of row means. Measures the heterogeneity in out-degree centrality (sender effects). Higher values indicate more variation in how active nodes are as senders.

sd.colmean

Standard deviation of column means. Measures the heterogeneity in in-degree centrality (receiver effects). Higher values indicate more variation in how popular nodes are as receivers.

dyad.dep

Dyadic dependence/reciprocity correlation. Pearson correlation between Y[i,j] and Y[j,i] across all dyads. Positive values indicate reciprocity (mutual relationships); negative values indicate anti-reciprocity. Range: [-1, 1].

cycle.dep

Cyclic triadic dependence. Normalized sum of products along three-cycles (i to j to k to i). Positive values indicate a tendency for ties to close into directed cycles. Based on the trace of the cubed centered matrix, normalized by the trace of the cubed data availability matrix and the cubed standard deviation.

trans.dep

Transitive triadic dependence. Normalized sum of products along two-paths that are closed by a direct tie (i to j to k, with i to k). Positive values indicate transitive clustering, where 'a friend of a friend is a friend', that is, a tendency for open triads to close. Based on the trace of the product E*E'*E where E is the centered matrix, normalized appropriately.
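
The descriptions above can be read as the following base R sketch. It is an illustrative re-implementation written from this page's text, not the package source: the name gof_sketch is hypothetical, and the normalizer used for trans.dep (the trace of D %*% t(D) %*% D times the cubed standard deviation) is an assumption made by analogy with cycle.dep.

```r
# Illustrative sketch only; `gof_sketch` is a hypothetical name and the
# trans.dep normalizer is assumed by analogy with cycle.dep.
gof_sketch <- function(Y) {
  # Heterogeneity in sender (row) and receiver (column) activity
  sd_row <- sd(rowMeans(Y, na.rm = TRUE), na.rm = TRUE)
  sd_col <- sd(colMeans(Y, na.rm = TRUE), na.rm = TRUE)

  # Reciprocity: correlate Y[i,j] with Y[j,i], keeping only pairs
  # in which both directions are observed
  dyad <- cor(c(Y), c(t(Y)), use = "complete.obs")

  # Center the matrix; D flags observed cells, and missing cells are
  # set to 0 so they contribute nothing to the matrix products
  E <- Y - mean(Y, na.rm = TRUE)
  D <- 1 * (!is.na(E))
  E[is.na(E)] <- 0
  s3 <- sd(c(Y), na.rm = TRUE)^3

  # Three-cycles i -> j -> k -> i: trace of E %*% E %*% E
  cycle <- sum(diag(E %*% E %*% E)) / (sum(diag(D %*% D %*% D)) * s3)

  # Transitive triples i -> j -> k with i -> k: trace of E %*% t(E) %*% E
  trans <- sum(diag(E %*% t(E) %*% E)) / (sum(diag(D %*% t(D) %*% D)) * s3)

  c(sd.rowmean = sd_row, sd.colmean = sd_col,
    dyad.dep = dyad, cycle.dep = cycle, trans.dep = trans)
}

# Toy input: random ties with an undefined (NA) diagonal
set.seed(1)
Y <- matrix(rnorm(25), 5, 5)
diag(Y) <- NA
gof_sketch(Y)
```

For a symmetric matrix, E equals its transpose, so in this sketch dyad.dep is 1 and cycle.dep and trans.dep coincide.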

Details

The function computes network statistics that capture different aspects of network structure beyond simple density. These statistics are particularly useful for:

  • Model checking: comparing observed statistics to those from simulated networks

  • Model selection: choosing the model that better captures network dependencies

  • Descriptive analysis: summarizing key structural features of the network
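
As a rough illustration of the model-checking use, the sketch below compares an observed reciprocity statistic with its distribution under a deliberately simple null model. Everything in it is an assumption for illustration: the toy data, the helper names recip and simulate_null, and the null model itself (independent normal ties with no dyadic dependence). In practice the simulated networks would come from the fitted model under evaluation, and gof_stats would be applied to each of them.

```r
# Reciprocity statistic, following the dyad.dep description
# (`recip` is a hypothetical helper, not a package function)
recip <- function(Y) cor(c(Y), c(t(Y)), use = "complete.obs")

set.seed(42)
n <- 20

# Made-up "observed" network with built-in reciprocity:
# a shared symmetric component plus independent noise
S <- matrix(rnorm(n * n), n, n)
S <- (S + t(S)) / sqrt(2)
Y_obs <- S + matrix(rnorm(n * n), n, n)
diag(Y_obs) <- NA

# Hypothetical null model: i.i.d. normal ties matching the observed
# mean and standard deviation, with no dyadic dependence
simulate_null <- function(Y) {
  n <- nrow(Y)
  Ysim <- matrix(rnorm(n * n, mean(Y, na.rm = TRUE), sd(c(Y), na.rm = TRUE)), n, n)
  diag(Ysim) <- NA
  Ysim
}

obs <- recip(Y_obs)
sims <- replicate(500, recip(simulate_null(Y_obs)))

# Share of simulated statistics at least as large as the observed one;
# a value near 0 suggests the null model misses the reciprocity in Y_obs
p_upper <- mean(sims >= obs)
c(observed = obs, sim_mean = mean(sims), p_upper = p_upper)
```

The same pattern extends to the other statistics: compute each one on every simulated network and check where the observed value falls in the simulated distribution.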

Missing values in Y are handled by pairwise deletion for correlations and are excluded from matrix products in triadic calculations.

Author

Cassy Dorff, Shahryar Minhas, Tosin Salau

Examples


data(YX_nrm) 

gof_stats(YX_nrm$Y) 
#> sd.rowmean sd.colmean   dyad.dep  cycle.dep  trans.dep 
#> 0.92646818 0.27555881 0.66792884 0.06139376 0.07380099