
The problem

Actors in political and social networks do not behave independently. When a state initiates conflict with a particular target, that action may reshape how other states behave toward the same or related targets in subsequent periods. Alliance partners may follow suit, rivals may escalate, and third parties may recalibrate their own strategies in response. These dynamics generate higher-order dependencies across the network that standard regression approaches cannot capture. The fundamental challenge is not merely that observations are correlated, but that the structure of influence itself is substantively meaningful: understanding who influences whom, and through what channels, is often the central question.

Traditional latent variable models for networks describe the overall structure of interactions, positioning actors in a social space based on transitivity or stochastic equivalence. These models characterize broad network patterns well, but they rarely explain the specific influence that actors exert on one another: the drivers of influence are attributed to latent dimensions rather than to the observable actor-level and dyad-level covariates that substantive theories emphasize. To address this limitation, the Social Influence Regression (SIR) model regresses influence patterns directly on observable covariates. The model operates on longitudinal network data (a time series of $n \times n$ relational matrices) and estimates how past interactions across the network predict current outcomes as a function of covariates such as alliances, trade ties, and geographic proximity.

The methodological framework is introduced in:

Minhas, S. & Hoff, P.D. (2025). Decomposing Network Dynamics: Social Influence Regression. Political Analysis.

Model specification

Let $Y = \{Y_t : t = 1, \ldots, T\}$ be a time series of $n \times n$ relational matrices, where $y_{i,j,t}$ represents the directed outcome from node $i$ to node $j$ at time $t$. The SIR model specifies:

$$\mu_{i,j,t} = \boldsymbol{\theta}^\top \mathbf{z}_{i,j,t} + \boldsymbol{\alpha}^\top \tilde{X}_{i,j,t} \boldsymbol{\beta}$$

The first term is a standard regression: $\mathbf{z}_{i,j,t}$ collects exogenous covariates for dyad $(i,j)$ at time $t$ (geographic distance, alliance status, trade flows), and $\boldsymbol{\theta}$ gives their direct effects on the outcome. The second term is where the model departs from standard approaches. The quantity $\tilde{X}_{i,j,t}$ is constructed from the lagged network state interacted with influence covariates $\mathbf{W}$, and the bilinear form $\boldsymbol{\alpha}^\top \tilde{X}_{i,j,t} \boldsymbol{\beta}$ allows each dyad’s outcome to depend on the entire prior network, weighted by sender ($\boldsymbol{\alpha}$) and receiver ($\boldsymbol{\beta}$) influence parameters. This structure is what allows the model to capture third-order dependencies: how one actor’s past behavior toward a third party predicts another actor’s current behavior.
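To make the third-order structure explicit, the bilinear term can be expanded over all lagged dyads. The expansion below is a sketch, not a quotation from the paper: it assumes the usual bilinear autoregression construction, in which entry $(r,s)$ of $\tilde{X}_{i,j,t}$ combines the $r$-th influence covariate on the sender side and the $s$-th on the receiver side with the lagged network. The symbols $\tilde{x}_{i,j,t}^{(r,s)}$ and $w_{i,i',r}$ are notation introduced here for illustration.

```latex
\begin{aligned}
\tilde{x}_{i,j,t}^{(r,s)} &= \sum_{i'} \sum_{j'} w_{i,i',r}\, w_{j,j',s}\, y_{i',j',t-1}, \\
\boldsymbol{\alpha}^\top \tilde{X}_{i,j,t} \boldsymbol{\beta}
  &= \sum_{i'} \sum_{j'} \Big(\sum_{r} \alpha_r w_{i,i',r}\Big) \Big(\sum_{s} \beta_s w_{j,j',s}\Big)\, y_{i',j',t-1} \\
  &= \sum_{i'} \sum_{j'} a_{i,i'}\, b_{j,j'}\, y_{i',j',t-1}.
\end{aligned}
```

Under this reading, the outcome for dyad $(i,j)$ depends on every lagged dyad $(i',j')$, with weight $a_{i,i'}$ for how closely sender $i$ tracks sender $i'$ and weight $b_{j,j'}$ for how closely receiver $j$ tracks receiver $j'$.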

For any pair of actors $(i, i')$, the entry $a_{i,i'}$ of the sender influence matrix tells us how predictive $i'$’s past sending behavior is of $i$’s current sending behavior. If $a_{\text{GBR}, \text{USA}} > 0$ in a conflict network, it indicates that countries the USA initiated conflict with in period $t-1$ tend to also face conflict from the UK in period $t$. The influence is directional and asymmetric: the USA’s conflict behavior is predictive of the UK’s, but the reverse need not hold with the same magnitude.

Influence matrices

The influence parameters $a_{i,i'}$ and $b_{j,j'}$ are not estimated freely for every pair, as that would require $O(n^2)$ parameters. Instead, the model explains influence in terms of covariates:

$$a_{i,i'} = \alpha^\top w_{i,i'} \qquad b_{j,j'} = \beta^\top w_{j,j'}$$

where $w_{i,i'}$ is a vector of covariates describing the relationship between actors $i$ and $i'$ (distance, alliance status, trade ties). The full influence matrices are then:

$$\mathbf{A} = \sum_{r=1}^{p} \alpha_r \mathbf{W}_r \qquad \mathbf{B} = \sum_{r=1}^{p} \beta_r \mathbf{W}_r$$

with $\alpha_1 = 1$ fixed for identifiability. This parameterization brings the parameter count down to $O(p)$, where $p$ is the number of influence covariates, typically a handful. The $\alpha$ and $\beta$ coefficients tell us which covariates matter for influence and by how much: a positive $\alpha$ on the alliance covariate, for instance, would indicate that allied countries tend to initiate conflict with the same targets. This emphasis on covariate-driven explanation is what distinguishes the SIR framework from latent variable approaches. Rather than describing influence through unobserved dimensions, the model links influence directly to measured actor and dyad attributes.
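The construction of the influence matrices can be sketched in a few lines of plain Python. The example below is illustrative only: the toy covariate matrices, coefficients, and lagged network are invented, the helper names are hypothetical and not part of the package, and the form assumed for $\tilde{X}_{i,j,t}$ (entry $(r,s)$ equal to $[\mathbf{W}_r Y_{t-1} \mathbf{W}_s^\top]_{i,j}$) is the standard bilinear construction rather than something confirmed by the text above.

```python
def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(M):
    return [list(col) for col in zip(*M)]

def influence_matrix(coefs, W):
    """Return sum_r coefs[r] * W[r] for a list of n x n covariate matrices W."""
    n = len(W[0])
    return [[sum(c * Wr[i][k] for c, Wr in zip(coefs, W)) for k in range(n)]
            for i in range(n)]

# Toy inputs (all values invented): n = 3 actors, p = 2 influence covariates.
W = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],  # W_1: identity, each actor's own past
    [[0, 1, 0], [1, 0, 1], [0, 1, 0]],  # W_2: a symmetric alliance indicator
]
alpha = [1.0, 0.5]   # alpha_1 = 1 is the identifiability constraint
beta = [0.8, -0.2]
Y_lag = [[0, 2, 1], [3, 0, 0], [1, 4, 0]]  # lagged network Y_{t-1}

A = influence_matrix(alpha, W)
B = influence_matrix(beta, W)

# Bilinear term for every dyad at once: entry (i, j) is
# sum over (i', j') of a[i][i'] * b[j][j'] * y[i'][j'].
bilinear = matmul(matmul(A, Y_lag), transpose(B))

def x_tilde(i, j):
    """Assumed p x p matrix with entry (r, s) = (W_r Y_lag W_s^T)[i][j]."""
    return [[matmul(matmul(Wr, Y_lag), transpose(Ws))[i][j] for Ws in W] for Wr in W]

def bilinear_from_x_tilde(i, j):
    """alpha' X_tilde beta for a single dyad."""
    X = x_tilde(i, j)
    p = len(W)
    return sum(alpha[r] * X[r][s] * beta[s] for r in range(p) for s in range(p))
```

Both routes give the same number for every dyad, which is the sense in which $2p$ coefficients stand in for two full $n \times n$ influence matrices.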

Distribution families

The framework is based on a generalized bilinear model and extends naturally to different outcome types through the choice of distribution family and link function $g$:

| Family | Link | Example |
|---|---|---|
| Poisson | $g(\mu) = \log(\mu)$ | Monthly conflict event counts between countries |
| Normal | $g(\mu) = \mu$ | Bilateral trade volumes |
| Binomial | $g(\mu) = \text{logit}(\mu)$ | Presence or absence of a diplomatic tie |
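As a small illustration of what the link column means in practice, the sketch below maps a linear predictor back to an expected outcome using the inverse of each link. The dictionary keys and the function name are hypothetical labels chosen for this example and may not match the family names the package accepts.

```python
import math

# Inverse links map the linear predictor eta back to the mean mu.
INVERSE_LINKS = {
    "poisson": lambda eta: math.exp(eta),                   # inverse of log
    "normal": lambda eta: eta,                              # inverse of identity
    "binomial": lambda eta: 1.0 / (1.0 + math.exp(-eta)),   # inverse of logit
}

def expected_outcome(family, eta):
    """Expected outcome implied by linear predictor eta under the given family."""
    return INVERSE_LINKS[family](eta)
```

For example, a linear predictor of 0 corresponds to an expected count of 1 under the Poisson family and to a tie probability of 0.5 under the binomial family.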

Identifiability

The bilinear term $\boldsymbol{\alpha}^\top X \boldsymbol{\beta}$ has a scale ambiguity: multiplying $\boldsymbol{\alpha}$ by $1/c$ and $\boldsymbol{\beta}$ by $c$ yields the same product for any nonzero scalar $c$. To resolve this, the first element of $\boldsymbol{\alpha}$ is fixed at 1. The package handles this constraint automatically during estimation.
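The scale ambiguity is easy to verify numerically. The following sketch uses an arbitrary $2 \times 2$ matrix and made-up coefficient vectors; it shows that rescaling leaves the bilinear product unchanged, and that dividing through by the first element of $\boldsymbol{\alpha}$ picks out a single representative.

```python
def bilinear(alpha, X, beta):
    """Compute alpha' X beta for vectors alpha, beta and matrix X (lists of rows)."""
    return sum(alpha[r] * X[r][s] * beta[s]
               for r in range(len(alpha)) for s in range(len(beta)))

# Arbitrary illustrative values, not estimates from any model.
X = [[2.0, -1.0], [0.5, 3.0]]
alpha = [2.0, 1.0]
beta = [0.4, -0.6]

# Rescale by any nonzero c: alpha / c and beta * c give the same product.
c = 4.0
alpha_scaled = [a / c for a in alpha]
beta_scaled = [b * c for b in beta]

# Normalize so the first element of alpha equals 1, absorbing the scale into beta.
scale = alpha[0]
alpha_norm = [a / scale for a in alpha]
beta_norm = [b * scale for b in beta]
```

All three parameter pairs produce the same fitted value, so without the constraint the likelihood cannot distinguish between them.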

A simpler alternative is to fix $\mathbf{B} = \mathbf{I}$ entirely (fix_receiver = TRUE), which removes the identification issue and reduces the model to a standard GLM. This is a useful starting point when the research question concerns sender-side influence alone, and it produces well-conditioned standard errors without requiring bootstrap corrections.
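The reduction to a GLM can be written out in one step. The derivation below is a sketch that assumes the bilinear term expands as a double sum over lagged dyads weighted by $a_{i,i'}$ and $b_{j,j'}$, the usual form for bilinear network autoregressions; the bracket notation $[\cdot]_{i,j}$ for a matrix entry is introduced here for convenience.

```latex
\begin{aligned}
b_{j,j'} &= \mathbb{1}\{j = j'\} \quad (\mathbf{B} = \mathbf{I}), \\
\sum_{i'} \sum_{j'} a_{i,i'}\, b_{j,j'}\, y_{i',j',t-1}
  &= \sum_{i'} a_{i,i'}\, y_{i',j,t-1}
   = \sum_{r=1}^{p} \alpha_r\, [\mathbf{W}_r Y_{t-1}]_{i,j}, \\
\mu_{i,j,t} &= \boldsymbol{\theta}^\top \mathbf{z}_{i,j,t} + \sum_{r=1}^{p} \alpha_r\, [\mathbf{W}_r Y_{t-1}]_{i,j}.
\end{aligned}
```

Each $[\mathbf{W}_r Y_{t-1}]_{i,j}$ can be computed from the data before fitting, so it enters as an ordinary regressor and the specification is linear in $(\boldsymbol{\theta}, \boldsymbol{\alpha})$.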

Estimation

Estimating $\{\boldsymbol{\theta}, \boldsymbol{\alpha}, \boldsymbol{\beta}\}$ jointly is difficult because the model is bilinear in $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$. The package addresses this with an iterative block coordinate descent algorithm that exploits a key structural property: for fixed $\boldsymbol{\beta}$, the model is linear in $\boldsymbol{\theta}$ and $\boldsymbol{\alpha}$ (and vice versa). The procedure initializes $\boldsymbol{\beta}$, then alternates between two steps: first, fixing $\boldsymbol{\beta}$ and estimating $(\boldsymbol{\theta}, \boldsymbol{\alpha})$ via GLM; second, fixing $\boldsymbol{\alpha}$ and estimating $(\boldsymbol{\theta}, \boldsymbol{\beta})$ via GLM. Iteration continues until the relative change in deviance falls below a tolerance. Each sub-problem is a standard generalized linear model solved by iteratively reweighted least squares, so the full estimation reduces to a sequence of low-dimensional optimizations. This is substantially faster than the Bayesian approach originally used for bilinear network autoregressions.
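The alternating scheme can be illustrated on a deliberately small problem. The sketch below is not the package's implementation: it assumes a Gaussian outcome (so each block update is closed-form least squares rather than an iteratively reweighted GLM fit), drops the $\boldsymbol{\theta}$ term, takes $p = 2$, and simulates the $\tilde{X}$ matrices directly instead of building them from a network. All function names are hypothetical.

```python
import random

def predict(alpha, X, beta):
    """Bilinear term alpha' X beta for one 2 x 2 matrix X."""
    return sum(alpha[r] * X[r][s] * beta[s] for r in range(2) for s in range(2))

def rss(alpha, beta, data):
    """Residual sum of squares, which is the deviance of a Gaussian model."""
    return sum((y - predict(alpha, X, beta)) ** 2 for X, y in data)

def update_alpha(beta, data):
    """Fix beta. With alpha = (1, a2), row 0 of X times beta acts as an offset."""
    num = den = 0.0
    for X, y in data:
        v0 = X[0][0] * beta[0] + X[0][1] * beta[1]
        v1 = X[1][0] * beta[0] + X[1][1] * beta[1]
        num += v1 * (y - v0)
        den += v1 * v1
    return [1.0, num / den]

def update_beta(alpha, data):
    """Fix alpha. Solve the 2 x 2 normal equations for beta by Cramer's rule."""
    s00 = s01 = s11 = t0 = t1 = 0.0
    for X, y in data:
        u0 = alpha[0] * X[0][0] + alpha[1] * X[1][0]
        u1 = alpha[0] * X[0][1] + alpha[1] * X[1][1]
        s00 += u0 * u0
        s01 += u0 * u1
        s11 += u1 * u1
        t0 += u0 * y
        t1 += u1 * y
    det = s00 * s11 - s01 * s01
    return [(t0 * s11 - t1 * s01) / det, (t1 * s00 - t0 * s01) / det]

def fit_als(data, tol=1e-10, max_iter=500):
    """Alternate the two block updates until the relative deviance change is tiny."""
    alpha, beta = [1.0, 0.0], [1.0, 1.0]   # arbitrary starting values
    history = [rss(alpha, beta, data)]
    for _ in range(max_iter):
        alpha = update_alpha(beta, data)
        beta = update_beta(alpha, data)
        history.append(rss(alpha, beta, data))
        if abs(history[-2] - history[-1]) <= tol * (abs(history[-2]) + tol):
            break
    return alpha, beta, history

# Simulated data (invented values): y = alpha' X beta + Gaussian noise.
rng = random.Random(0)
true_alpha, true_beta = [1.0, 0.5], [0.8, -0.3]
data = []
for _ in range(300):
    X = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(2)]
    data.append((X, predict(true_alpha, X, true_beta) + rng.gauss(0, 0.1)))

alpha_hat, beta_hat, history = fit_als(data)
```

Because each block update exactly minimizes the residual sum of squares over its own parameters, the deviance recorded in `history` can never increase from one iteration to the next. That monotonicity is what makes the alternating scheme stable, although it only guarantees convergence to a stationary point.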

The package also provides a direct BFGS method (method = "optim") that optimizes all parameters simultaneously using analytical gradients computed via C++. This approach can converge faster for small networks but tends to be less stable when $p$ is large.
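The analytical gradients follow directly from the bilinear form. The expressions below are a sketch of the pieces such a gradient would be built from, writing $\eta_{i,j,t}$ for the right-hand side of the model equation and assuming the canonical links listed in the families table; they are not taken from the package's C++ source.

```latex
\begin{aligned}
\eta_{i,j,t} &= \boldsymbol{\theta}^\top \mathbf{z}_{i,j,t} + \boldsymbol{\alpha}^\top \tilde{X}_{i,j,t} \boldsymbol{\beta}, \\
\frac{\partial \eta_{i,j,t}}{\partial \boldsymbol{\theta}} &= \mathbf{z}_{i,j,t}, \qquad
\frac{\partial \eta_{i,j,t}}{\partial \boldsymbol{\alpha}} = \tilde{X}_{i,j,t} \boldsymbol{\beta}, \qquad
\frac{\partial \eta_{i,j,t}}{\partial \boldsymbol{\beta}} = \tilde{X}_{i,j,t}^\top \boldsymbol{\alpha}, \\
\frac{\partial \ell}{\partial \boldsymbol{\alpha}} &\propto \sum_{i,j,t} \big(y_{i,j,t} - g^{-1}(\eta_{i,j,t})\big)\, \tilde{X}_{i,j,t} \boldsymbol{\beta}.
\end{aligned}
```

The gradients with respect to $\boldsymbol{\theta}$ and $\boldsymbol{\beta}$ have the same residual-weighted form, and the first component of the $\boldsymbol{\alpha}$ gradient is dropped because $\alpha_1$ is fixed at 1.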

Inference

Standard errors come from the Hessian of the log-likelihood at the MLE. The package computes classical standard errors from the inverse of the observed information matrix, $H^{-1}$, as well as robust (sandwich) standard errors from $H^{-1} S H^{-1}$, where $S$ is the empirical score covariance. The sandwich estimator remains valid under some forms of model misspecification, such as an incorrectly specified variance function.
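The two variance formulas are easiest to see in a one-parameter model, where $H$ and $S$ are scalars. The sketch below fits a toy Poisson regression with a single slope and no intercept, $\log \mu_i = b x_i$, on invented data; it is not the SIR model, and the function names are hypothetical.

```python
import math

def fit_poisson_slope(x, y, iters=50):
    """Newton's method for the MLE of b in log(mu_i) = b * x_i."""
    b = 0.0
    for _ in range(iters):
        mu = [math.exp(b * xi) for xi in x]
        score = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        info = sum(xi * xi * mi for xi, mi in zip(x, mu))
        b += score / info
    return b

def standard_errors(x, y, b):
    """Classical SE from 1/H and sandwich SE from S / H^2 (scalar case)."""
    mu = [math.exp(b * xi) for xi in x]
    H = sum(xi * xi * mi for xi, mi in zip(x, mu))                  # observed information
    S = sum((xi * (yi - mi)) ** 2 for xi, yi, mi in zip(x, y, mu))  # sum of squared scores
    return math.sqrt(1.0 / H), math.sqrt(S / (H * H))

# Invented toy data.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 0.2, 1.2, 1.8]
y = [1, 1, 3, 4, 8, 0, 3, 7]

b_hat = fit_poisson_slope(x, y)
classical_se, robust_se = standard_errors(x, y, b_hat)
```

When the Poisson variance assumption holds, the two standard errors should be close in large samples; a large gap between them is a sign that the assumed variance does not match the data.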

The Hessian can be ill-conditioned in bilinear models, and the package warns when this occurs. In such cases, bootstrap standard errors via boot_sir() provide a more reliable basis for inference. The bootstrap supports both block resampling of time periods and parametric simulation from the fitted model.
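One way to read block resampling of time periods is that each period, together with its lagged predictors, is kept intact and whole periods are drawn with replacement. The sketch below illustrates that reading with a generic estimator. It is an assumption about the general technique, not a description of how boot_sir() works internally, and every name in it is hypothetical.

```python
import random
import statistics

def block_bootstrap_se(periods, estimator, n_boot=200, seed=0):
    """Resample whole time periods with replacement and re-estimate each time.

    periods   : list with one element per time period (for a network model,
                each element would bundle Y_t with its lagged predictors)
    estimator : function mapping a list of periods to a single number
    """
    rng = random.Random(seed)
    T = len(periods)
    estimates = []
    for _ in range(n_boot):
        sample = [periods[rng.randrange(T)] for _ in range(T)]
        estimates.append(estimator(sample))
    # The bootstrap standard error is the spread of the re-estimated values.
    return statistics.stdev(estimates)

def mean_estimator(sample):
    """Stand-in estimator for illustration: the average of the period values."""
    return sum(sample) / len(sample)

toy_periods = [3.0, 5.0, 4.0, 6.0, 2.0, 5.0]   # invented per-period summaries
se = block_bootstrap_se(toy_periods, mean_estimator)
```

In a real application the estimator would refit the model on each resampled set of periods, which is why bootstrap inference costs far more than Hessian-based standard errors.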

Choosing a model configuration

| Consideration | Full bilinear | Fixed receiver |
|---|---|---|
| Both sender and receiver influence | ✓ | |
| Standard GLM with well-conditioned standard errors | | ✓ |
| Bilinear identification resolved | | ✓ |
| Richer influence structure | ✓ | |
| Fewer parameters | | ✓ |

For exploratory analysis, the fixed-receiver specification is a reasonable starting point: fewer parameters, no identification issues, and proper standard errors from the GLM. The full bilinear model is appropriate when there is reason to believe that both sender and receiver channels contribute to influence dynamics.

| Consideration | ALS | Direct optimization |
|---|---|---|
| High-dimensional problems | ✓ | |
| Small problems (fast convergence) | | ✓ |
| Numerical stability | ✓ | |
| Simultaneous parameter optimization | | ✓ |

Citation

If you use this package in your research, please cite:

Minhas, S. & Hoff, P.D. (2025). Decomposing Network Dynamics: Social Influence Regression. Political Analysis.

The package source code and documentation are available at https://github.com/netify-dev/sir.