Tests whether connected actors have similar attributes (homophily). Calculates the correlation between attribute similarity and tie presence, with support for multiple similarity metrics and significance testing.
Usage
homophily(
  netlet,
  attribute,
  method = "correlation",
  threshold = 0,
  significance_test = TRUE,
  n_permutations = 1000,
  alpha = 0.05,
  other_stats = NULL,
  ...
)
Arguments
- netlet
A netify object containing network data.
- attribute
Character string specifying the nodal attribute to analyze.
- method
Character string specifying the similarity metric:
- "correlation"
Negative absolute difference for continuous data (default)
- "euclidean"
Negative euclidean distance for continuous data
- "manhattan"
Negative Manhattan/city-block distance for continuous data
- "cosine"
Cosine similarity for continuous data
- "categorical"
Binary similarity (0/1) for categorical data
- "jaccard"
Jaccard similarity for binary/presence-absence data
- "hamming"
Negative Hamming distance for categorical data
- threshold
Numeric value or function to determine tie presence in weighted networks. If numeric, edges with weights > threshold are considered ties. If a function, it should take the network matrix and return a single numeric threshold. Default is 0 (any positive weight is a tie). Common values: 0 (default), mean(weights), median(weights), or quantile-based thresholds. To binarize a network beforehand, consider using mutate_weights() first.
- significance_test
Logical. Whether to perform a permutation test. Default TRUE.
- n_permutations
Number of permutations for significance testing. Default 1000.
- alpha
Significance level for confidence intervals. Default 0.05.
- other_stats
Named list of custom functions for additional statistics.
- ...
Additional arguments passed to custom functions.
Value
Data frame with homophily statistics per network/time period:
net
Network/time identifier
layer
Layer name
attribute
Analyzed attribute name
method
Similarity method used
threshold_value
Threshold used for determining ties (NA for binary networks)
homophily_correlation
Correlation between similarity and tie presence (binary: tie/no tie)
mean_similarity_connected
Mean similarity among connected pairs (weight > threshold)
mean_similarity_unconnected
Mean similarity among unconnected pairs (weight <= threshold or missing)
similarity_difference
Difference between connected and unconnected mean similarities
p_value
Permutation test p-value
ci_lower, ci_upper
Confidence interval bounds
n_connected_pairs
Number of connected pairs
n_unconnected_pairs
Number of unconnected pairs
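To make the homophily_correlation and p_value columns concrete, here is a minimal base R sketch of one way such quantities could be computed for a single binary network. It is an illustration under stated assumptions, not the function's actual implementation: the toy matrix adj, the attribute vector x, the helper perm_stat(), and the choice of a two-sided p-value are all hypothetical.

```r
set.seed(42)

# Hypothetical undirected binary network and a continuous nodal attribute
adj <- matrix(
  c(0, 1, 0, 0, 0,
    1, 0, 1, 0, 0,
    0, 1, 0, 0, 0,
    0, 0, 0, 0, 1,
    0, 0, 0, 1, 0),
  nrow = 5, byrow = TRUE
)
x <- c(1, 2, 3, 10, 11)

# Similarity as negative absolute difference between attribute values
sim <- -abs(outer(x, x, "-"))

# Use each unordered pair of actors once (upper triangle, no diagonal)
dyad <- upper.tri(adj)

# Correlation between pairwise similarity and tie presence (0/1)
observed <- cor(sim[dyad], adj[dyad])

# Permutation step (assumed scheme): shuffle attribute values across
# actors, rebuild the similarity matrix, and recompute the correlation
perm_stat <- function() {
  xp <- sample(x)
  sim_p <- -abs(outer(xp, xp, "-"))
  cor(sim_p[dyad], adj[dyad])
}
null_dist <- replicate(1000, perm_stat())

# Two-sided p-value: share of permuted statistics at least as extreme
p_value <- mean(abs(null_dist) >= abs(observed))
```

In this toy network the ties link actors with close attribute values, so observed is positive. Shuffling the attribute while holding the network fixed is only one possible null model; the permutation scheme homophily() uses may differ.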
Details
Similarity Metrics:
For continuous attributes:
- correlation: Based on absolute difference, good general-purpose metric
- euclidean: Similar to correlation for single attributes
- manhattan: Less sensitive to outliers than euclidean
- cosine: Useful for normalized data or when sign matters
For categorical/binary attributes:
- categorical: Simple matching (1 if same, 0 if different)
- jaccard: For binary data, emphasizes shared presence over shared absence
- hamming: Counts positions where values differ (negated for similarity)
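As a rough illustration of the metrics listed above, the following base R sketch builds pairwise similarity matrices for a single attribute. The vectors age and group are made-up data, and the formulas follow the verbal descriptions here (negative absolute difference, simple matching) rather than the package's source code.

```r
# Hypothetical node attributes for four actors (illustrative data only)
age   <- c(23, 25, 40, 41)
group <- c("x", "x", "y", "y")

# Continuous attribute: negative absolute difference, so values closer
# to 0 indicate more similar pairs
sim_continuous <- -abs(outer(age, age, "-"))

# Categorical attribute: 1 if two actors share a category, 0 otherwise
sim_categorical <- outer(group, group, "==") * 1

sim_continuous[1, 2]   # actors 1 and 2 differ by 2 years
sim_categorical[1, 3]  # actors 1 and 3 belong to different groups
```

Both results are symmetric n x n matrices whose entries line up with the dyads of the network, which is what allows similarity to be compared against tie presence.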
Threshold Parameter:
For weighted networks, the threshold parameter determines what edge weights constitute a "connection". You can specify:
- A numeric value: edges with weight > threshold are ties
- A function: should take a matrix and return a single numeric threshold
Common threshold functions:
- function(x) mean(x, na.rm = TRUE): mean weight
- function(x) median(x, na.rm = TRUE): median weight
- function(x) quantile(x, 0.75, na.rm = TRUE): 75th percentile
For more complex binarization needs (e.g., different thresholds by time period), consider using mutate_weights() to pre-process your network.
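The thresholding rule described above can be sketched in base R as follows. The weight matrix w and the name threshold_fn are hypothetical, and the sketch assumes the function is applied to the full matrix, diagonal included; how homophily() treats the diagonal and missing values internally is not documented here.

```r
# Hypothetical weighted, undirected adjacency matrix (illustrative only)
w <- matrix(
  c(0, 3, 0, 1,
    3, 0, 5, 0,
    0, 5, 0, 2,
    1, 0, 2, 0),
  nrow = 4, byrow = TRUE
)

# A threshold function reduces the weight matrix to one numeric cutoff
threshold_fn <- function(x) mean(x, na.rm = TRUE)
cutoff <- threshold_fn(w)

# Weights strictly above the cutoff count as ties
ties <- w > cutoff

# Number of undirected ties kept (each tie appears twice in the matrix)
n_ties <- sum(ties) / 2
```

With these weights the mean is 1.375, so the edge with weight 1 is dropped and the edges with weights 2, 3, and 5 are kept.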
Examples
if (FALSE) { # \dontrun{
  # `net` is assumed to be an existing netify object that carries the
  # nodal attributes used below

  # Basic homophily analysis with default threshold (> 0)
  homophily_default <- homophily(net, attribute = "group")

  # Using different similarity metrics for continuous data
  homophily_manhattan <- homophily(
    net,
    attribute = "age",
    method = "manhattan" # Less sensitive to outliers
  )

  # For binary attributes (e.g., gender, membership)
  homophily_jaccard <- homophily(
    net,
    attribute = "member",
    method = "jaccard" # Better for binary data than correlation
  )

  # For categorical attributes
  homophily_categorical <- homophily(
    net,
    attribute = "department",
    method = "categorical"
  )

  # Combining method and threshold
  homophily_combined <- homophily(
    net,
    attribute = "score",
    method = "manhattan",
    threshold = function(x) quantile(x, 0.75, na.rm = TRUE)
  )
} # }