Title: | Non-Regularized Gaussian Graphical Models |
Description: | Estimate non-regularized Gaussian graphical models, Ising models, and mixed graphical models. The current methods consist of multiple regression, a non-parametric bootstrap <doi:10.1080/00273171.2019.1575716>, and Fisher z transformed partial correlations <doi:10.1111/bmsp.12173>. Parameter uncertainty, predictability, and network replicability <doi:10.31234/osf.io/fb4sa> are also implemented. |
Authors: | Donald Williams [aut, cre] |
Maintainer: | Donald Williams <[email protected]> |
License: | GPL-2 |
Version: | 1.1.0 |
Built: | 2025-02-09 03:30:08 UTC |
Source: | https://github.com/donaldrwilliams/ggmnonreg |
The goal of GGMnonreg is to estimate non-regularized graphical models. Note that the title is a bit of a misnomer, in that Ising and mixed graphical models are also supported. Graphical modeling is quite common in fields with wide data, that is, when there are more variables than observations. Accordingly, many regularization-based approaches have been developed for those kinds of data. There are key drawbacks of regularization when the goal is inference, including, but not limited to, the fact that obtaining a valid measure of parameter uncertainty is very (very) difficult.
More recently, graphical modeling has emerged in psychology, where the data are typically long or low-dimensional (Williams and Rast 2019; Williams et al. 2019). The primary purpose of GGMnonreg is to provide methods specifically for low-dimensional data.
Supported Models
Gaussian graphical model. The following data types are supported.
Ising model
Mixed graphical model
Additional Methods
Expected network replicability (Williams 2020)
Compare Gaussian graphical models
Measure of uncertainty (Williams 2021)
Edge inclusion "probabilities"
Network visualization
Constrained precision matrix (the network, given an assumed graph)
Predictability (variance explained)
Williams DR (2020).
“Learning to live with sampling variability: Expected replicability in partial correlation networks.”
Williams DR (2021).
“The Confidence Interval that Wasn’t: Bootstrapped “Confidence Intervals” in L1-Regularized Partial Correlation Networks.”
doi:10.31234/osf.io/kjh2f, psyarxiv.com/kjh2f.
Williams DR, Rast P (2019).
“Back to the basics: Rethinking partial correlation network methodology.”
British Journal of Mathematical and Statistical Psychology.
ISSN 0007-1102, doi:10.1111/bmsp.12173.
Williams DR, Rhemtulla M, Wysocki AC, Rast P (2019).
“On nonregularized estimation of psychological networks.”
Multivariate behavioral research, 54(5), 719–750.
A correlation matrix with 17 variables in total (autism: 9; OCD: 8). The sample size was 213.
A correlation matrix including 17 variables. These data were measured on a 4-level Likert scale.
Circumscribed interests
Unusual preoccupations
Repetitive use of objects or interests in parts of objects
Compulsions and/or rituals
Unusual sensory interests
Complex mannerisms or stereotyped body movements
Stereotyped utterances/delayed echolalia
Neologisms and/or idiosyncratic language
Verbal rituals
Concern with things touched due to dirt/bacteria
Thoughts of doing something bad around others
Continual thoughts that do not go away
Belief that someone/higher power put reoccurring thoughts in their head
Continual washing
Continual checking CntCheck
Continual counting/repeating
Repeatedly do things until it feels good or just right
Jones, P. J., Ma, R., & McNally, R. J. (2019). Bridge centrality: A network approach to understanding comorbidity. Multivariate behavioral research, 1-15.
Ruzzano, L., Borsboom, D., & Geurts, H. M. (2015). Repetitive behaviors in autism and obsessive-compulsive disorder: New perspectives from a network analysis. Journal of Autism and Developmental Disorders, 45(1), 192-202. doi:10.1007/s10803-014-2204-9
data("asd_ocd")

# generate continuous
Y <- MASS::mvrnorm(n = 213,
                   mu = rep(0, 17),
                   Sigma = asd_ocd,
                   empirical = TRUE)
This dataset and the corresponding documentation were taken from the psych package. We refer users to that package for further details (Revelle 2019).
A data frame with 25 variables and 2800 observations (including missing values)
Am indifferent to the feelings of others. (q_146)
Inquire about others' well-being. (q_1162)
Know how to comfort others. (q_1206)
Love children. (q_1364)
Make people feel at ease. (q_1419)
Am exacting in my work. (q_124)
Continue until everything is perfect. (q_530)
Do things according to a plan. (q_619)
Do things in a half-way manner. (q_626)
Waste my time. (q_1949)
Don't talk a lot. (q_712)
Find it difficult to approach others. (q_901)
Know how to captivate people. (q_1205)
Make friends easily. (q_1410)
Take charge. (q_1768)
Get angry easily. (q_952)
Get irritated easily. (q_974)
Have frequent mood swings. (q_1099)
Often feel blue. (q_1479)
Panic easily. (q_1505)
Am full of ideas. (q_128)
Avoid difficult reading material. (q_316)
Carry the conversation to a higher level. (q_492)
Spend time reflecting on things. (q_1738)
Will not probe deeply into a subject. (q_1964)
Males = 1, Females = 2
1 = HS, 2 = finished HS, 3 = some college, 4 = college graduate, 5 = graduate degree
Revelle W (2019). psych: Procedures for Psychological, Psychometric, and Personality Research. Northwestern University, Evanston, Illinois. R package version 1.9.12, https://CRAN.R-project.org/package=psych.
Extract Confidence Intervals from ggm_inference Objects
## S3 method for class 'ggm_inference'
confint(object, ...)
object |
An object of class |
... |
Currently ignored. |
A matrix including bootstrap confidence intervals.
# data
Y <- ptsd

# fit with bootstrap
fit <- ggm_inference(Y, method = "spearman",
                     boot = TRUE, B = 100)

# cis
confint(fit)
Compute the maximum likelihood estimate of the precision matrix, given a known graphical structure (i.e., an adjacency matrix). This approach was originally described in "The Elements of Statistical Learning" (see pg. 631, Hastie et al. 2009).
constrained(Sigma, adj)
Sigma |
Covariance matrix |
adj |
An adjacency matrix that encodes the constraints, where a zero indicates that element should be zero. |
A list containing the following:
Theta: Inverse of the covariance matrix (precision matrix), that encodes the conditional (in)dependence structure.
Sigma: Covariance matrix.
wadj: Weighted adjacency matrix, corresponding to the partial correlation network.
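The constrained estimate can be illustrated with a plain R sketch of the iterative regression procedure described in Hastie et al. (2009). This is not the package's C++ routine: the function name constrained_sketch, the convergence settings, and the toy matrices are assumptions made for illustration, and only the names of the returned elements follow the list above.

```r
# Hypothetical helper (not the package's C++ code): constrained MLE of a
# precision matrix via iterative regressions on the allowed neighbours
constrained_sketch <- function(Sigma, adj, tol = 1e-10, max_iter = 100) {
  p <- ncol(Sigma)
  W <- Sigma                       # working estimate of the covariance matrix
  for (iter in seq_len(max_iter)) {
    W_old <- W
    for (j in seq_len(p)) {
      nb <- adj[-j, j] != 0        # allowed edges between node j and the rest
      beta <- numeric(p - 1)
      if (any(nb)) {
        W11 <- W[-j, -j, drop = FALSE]
        # regress node j on its allowed neighbours only
        beta[nb] <- solve(W11[nb, nb, drop = FALSE], Sigma[-j, j][nb])
        w12 <- drop(W11 %*% beta)
      } else {
        w12 <- numeric(p - 1)
      }
      W[-j, j] <- w12
      W[j, -j] <- w12
    }
    if (max(abs(W - W_old)) < tol) break
  }
  # invert, then set the constrained elements to exactly zero
  Theta <- solve(W) * (adj != 0 | diag(p) == 1)
  wadj <- -cov2cor(Theta)          # partial correlations
  diag(wadj) <- 0
  list(Theta = Theta, Sigma = W, wadj = wadj)
}

# Toy example (made-up values): remove the edge between nodes 1 and 3
Sigma <- matrix(c(1.0, 0.5, 0.3,
                  0.5, 1.0, 0.4,
                  0.3, 0.4, 1.0), nrow = 3, byrow = TRUE)
adj <- matrix(1, 3, 3)
diag(adj) <- 0
adj[1, 3] <- adj[3, 1] <- 0

fit <- constrained_sketch(Sigma, adj)
fit$wadj
```

In the fitted covariance matrix, the elements for the retained edges match the input, whereas the element for the removed edge is replaced by the value implied by the remaining structure.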
The algorithm is written in C++ and should scale to high dimensions.
Note there are a variety of algorithms for this purpose. Simulation studies indicated that this approach is both accurate and computationally efficient (HFT therein, Emmert-Streib et al. 2019).
Emmert-Streib F, Tripathi S, Dehmer M (2019).
“Constrained covariance matrices with a biologically realistic structure: Comparison of methods for generating high-dimensional Gaussian graphical models.”
Frontiers in Applied Mathematics and Statistics, 5, 17.
Hastie T, Tibshirani R, Friedman J (2009).
The elements of statistical learning: data mining, inference, and prediction.
Springer Science & Business Media.
# data
Y <- ptsd

# estimate graph
fit <- ggm_inference(Y, boot = FALSE)

# constrain to zero
constrained_graph <- constrained(cor(Y), fit$adj)
A dataset containing items from the Contingencies of Self-Worth Scale (CSWS). There are 35 variables and 680 observations.
A data frame with 35 variables and 680 observations (7-point Likert scale)
When I think I look attractive, I feel good about myself
My self-worth is based on God's love
I feel worthwhile when I perform better than others on a task or skill.
My self-esteem is unrelated to how I feel about the way my body looks.
Doing something I know is wrong makes me lose my self-respect
I don't care if other people have a negative opinion about me.
Knowing that my family members love me makes me feel good about myself.
I feel worthwhile when I have God's love.
I can’t respect myself if others don't respect me.
My self-worth is not influenced by the quality of my relationships with my family members.
Whenever I follow my moral principles, my sense of self-respect gets a boost.
Knowing that I am better than others on a task raises my self-esteem.
My opinion about myself isn't tied to how well I do in school.
I couldn't respect myself if I didn't live up to a moral code.
I don't care what other people think of me.
When my family members are proud of me, my sense of self-worth increases.
My self-esteem is influenced by how attractive I think my face or facial features are.
My self-esteem would suffer if I didn’t have God's love.
Doing well in school gives me a sense of self-respect.
Doing better than others gives me a sense of self-respect.
My sense of self-worth suffers whenever I think I don't look good.
I feel better about myself when I know I'm doing well academically.
What others think of me has no effect on what I think about myself.
When I don’t feel loved by my family, my self-esteem goes down.
My self-worth is affected by how well I do when I am competing with others.
My self-esteem goes up when I feel that God loves me.
My self-esteem is influenced by my academic performance.
My self-esteem would suffer if I did something unethical.
It is important to my self-respect that I have a family that cares about me.
My self-esteem does not depend on whether or not I feel attractive.
When I think that I’m disobeying God, I feel bad about myself.
My self-worth is influenced by how well I do on competitive tasks.
I feel bad about myself whenever my academic performance is lacking.
My self-esteem depends on whether or not I follow my moral/ethical principles.
My self-esteem depends on the opinions others hold of me.
"M" (male) or "F" (female)
There are seven domains
FAMILY SUPPORT: items 7, 10, 16, 24, and 29.
COMPETITION: items 3, 12, 20, 25, and 32.
APPEARANCE: items 1, 4, 17, 21, and 30.
GOD'S LOVE: items 2, 8, 18, 26, and 31.
ACADEMIC COMPETENCE: items 13, 19, 22, 27, and 33.
VIRTUE: items 5, 11, 14, 28, and 34.
APPROVAL FROM OTHERS: items 6, 9, 15, 23, and 35.
Briganti, G., Fried, E. I., & Linkowski, P. (2019). Network analysis of Contingencies of Self-Worth Scale in 680 university students. Psychiatry research, 272, 252-257.
A data frame containing 403 observations (n = 403) and 16 variables (p = 16) measured on a 4-point Likert scale (depression: 9; anxiety: 7).
A data frame containing 403 observations (n = 403) and 16 variables (p = 16) measured on a 4-point Likert scale.
Little interest or pleasure in doing things?
Feeling down, depressed, or hopeless?
Trouble falling or staying asleep, or sleeping too much?
Feeling tired or having little energy?
Poor appetite or overeating?
Feeling bad about yourself, or that you are a failure or have let yourself or your family down?
Trouble concentrating on things, such as reading the newspaper or watching television?
Moving or speaking so slowly that other people could have noticed? Or so fidgety or restless that you have been moving a lot more than usual?
Thoughts that you would be better off dead, or thoughts of hurting yourself in some way?
Feeling nervous, anxious, or on edge
Not being able to stop or control worrying
Worrying too much about different things
Trouble relaxing
Being so restless that it's hard to sit still
Becoming easily annoyed or irritable
Feeling afraid as if something awful might happen
Forbes, M. K., Baillie, A. J., & Schniering, C. A. (2016). A structural equation modeling analysis of the relationships between depression, anxiety, and sexual problems over time. The Journal of Sex Research, 53(8), 942-954.
Forbes, M. K., Wright, A. G., Markon, K. E., & Krueger, R. F. (2019). Quantifying the reliability and replicability of psychopathology network characteristics. Multivariate behavioral research, 1-19.
Jones, P. J., Williams, D. R., & McNally, R. J. (2019). Sampling variability is not nonreplication: a Bayesian reanalysis of Forbes, Wright, Markon, & Krueger.
data("depression_anxiety_t1")

labels <- c("interest", "down", "sleep",
            "tired", "appetite", "selfest",
            "concen", "psychmtr", "suicid",
            "nervous", "unctrworry", "worrylot",
            "relax", "restless", "irritable", "awful")
A data frame containing 403 observations (n = 403) and 16 variables (p = 16) measured on a 4-point Likert scale (depression: 9; anxiety: 7).
A data frame containing 403 observations (n = 403) and 16 variables (p = 16) measured on a 4-point Likert scale.
Little interest or pleasure in doing things?
Feeling down, depressed, or hopeless?
Trouble falling or staying asleep, or sleeping too much?
Feeling tired or having little energy?
Poor appetite or overeating?
Feeling bad about yourself, or that you are a failure or have let yourself or your family down?
Trouble concentrating on things, such as reading the newspaper or watching television?
Moving or speaking so slowly that other people could have noticed? Or so fidgety or restless that you have been moving a lot more than usual?
Thoughts that you would be better off dead, or thoughts of hurting yourself in some way?
Feeling nervous, anxious, or on edge
Not being able to stop or control worrying
Worrying too much about different things
Trouble relaxing
Being so restless that it's hard to sit still
Becoming easily annoyed or irritable
Feeling afraid as if something awful might happen
Forbes, M. K., Baillie, A. J., & Schniering, C. A. (2016). A structural equation modeling analysis of the relationships between depression, anxiety, and sexual problems over time. The Journal of Sex Research, 53(8), 942-954.
Forbes, M. K., Wright, A. G., Markon, K. E., & Krueger, R. F. (2019). Quantifying the reliability and replicability of psychopathology network characteristics. Multivariate behavioral research, 1-19.
Jones, P. J., Williams, D. R., & McNally, R. J. (2019). Sampling variability is not nonreplication: a Bayesian reanalysis of Forbes, Wright, Markon, & Krueger.
data("depression_anxiety_t2")

labels <- c("interest", "down", "sleep",
            "tired", "appetite", "selfest",
            "concen", "psychmtr", "suicid",
            "nervous", "unctrworry", "worrylot",
            "relax", "restless", "irritable", "awful")
Compute the proportion of bootstrap samples in which each relation was selected, corresponding to an edge inclusion "probability".
eip(Y, method = "pearson", B = 1000, progress = TRUE)
Y |
The data matrix of dimensions n (observations) by p (nodes). |
method |
Character string. Which type of correlation coefficients
to be computed. Options include |
B |
Integer. Number of bootstrap replicates (defaults to |
progress |
Logical. Should a progress bar be included (defaults to |
The edges are ordered as in the upper-triangular elements of the network.
An object of class eip, including a matrix of edge inclusion "probabilities".
In the context of regression, this general approach was described in Hastie et al. (2015; see Figure 6.4). In this case, the selection is based on classical hypothesis testing instead of L1-regularization.
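The bootstrap procedure described above can be sketched in base R. This is an illustration under stated assumptions, not the internals of eip(): the data are simulated, the helper select_edges is a hypothetical name, and selection uses a two-sided Fisher z test on Pearson partial correlations.

```r
set.seed(1)

# Simulated data (made up): 200 observations on 4 variables,
# with a strong relation between the first two
n <- 200
p <- 4
Y <- matrix(rnorm(n * p), n, p)
Y[, 2] <- Y[, 1] + rnorm(n)

# Hypothetical helper: which partial correlations are significantly
# different from zero at level alpha (upper-triangular order)
select_edges <- function(Y, alpha = 0.05) {
  n <- nrow(Y)
  p <- ncol(Y)
  pcors <- -cov2cor(solve(cor(Y)))
  z <- atanh(pcors[upper.tri(pcors)])
  se <- 1 / sqrt(n - (p - 2) - 3)   # p - 2 variables are conditioned on
  abs(z / se) > qnorm(1 - alpha / 2)
}

# Resample rows with replacement and repeat the selection
B <- 200
selected <- replicate(B, {
  Yb <- Y[sample(n, replace = TRUE), , drop = FALSE]
  select_edges(Yb)
})

# Proportion of bootstrap samples in which each edge was selected
eip_hat <- rowMeans(selected)
eip_hat
```

The first element corresponds to the relation between the first two variables, which is selected in essentially every bootstrap sample here; the remaining edges are absent in the simulated data, so their proportions should stay low.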
Hastie T, Tibshirani R, Wainwright M (2015). Statistical Learning with Sparsity: The Lasso and Generalizations. CRC Press, Boca Raton. ISBN 978-1-4987-1217-0, doi:10.1201/b18401.
# data
Y <- ptsd

# eip
fit_eip <- eip(Y, method = "spearman")

# print
fit_eip
Investigate network replicability for any kind of partial correlation, assuming there is an analytic solution for the standard error (e.g., Pearson's or Spearman's).
enr(net, n, alpha = 0.05, replications = 2, type = "pearson")
net |
True network of dimensions p by p. |
n |
Integer. The sample size, assumed equal in the replication attempts. |
alpha |
The desired significance level (defaults to |
replications |
Integer. The desired number of replications. |
type |
Character string. Which type of correlation coefficients
to be computed. Options include |
A list of class enr, including the following:
ave_power: Average power.
cdf: cumulative distribution function.
p_s: Power for each edge, or the probability of success for a given trial.
p: Number of nodes.
n_nonzero: Number of edges.
n: Sample size.
replication: Replication attempts.
var_pwr: Variance of power.
type: Type of correlation coefficient.
This method was introduced in Williams (2020).
The basic idea is to determine the replicability of edges in a partial correlation network. This requires defining the true network, which can include edges of various sizes, and then solving for the proportion of edges that are expected to be replicated (e.g., in two, three, or four replication attempts).
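A rough base R sketch of this idea follows. It assumes that each edge is tested with a two-sided Fisher z test and that replication attempts are independent, so that an edge is detected in every attempt with probability equal to its power raised to the number of replications. The toy network and the object names are made up, and enr() may differ in its details.

```r
# Toy "true" partial correlation network (made-up values)
net <- matrix(0, 4, 4)
net[1, 2] <- 0.30
net[2, 3] <- 0.15
net[3, 4] <- 0.05
net <- net + t(net)

n <- 500
alpha <- 0.05
replications <- 2

p <- ncol(net)
rho <- net[upper.tri(net)]
rho <- rho[rho != 0]              # nonzero edges only

# Power of a two-sided Fisher z test for each edge
se <- 1 / sqrt(n - (p - 2) - 3)
crit <- qnorm(1 - alpha / 2)
ncp <- atanh(abs(rho)) / se
power <- pnorm(ncp - crit) + pnorm(-ncp - crit)

# Probability that an edge is detected in every replication attempt
p_s <- power^replications

# Expected proportion of edges that replicate
expected_prop <- mean(p_s)
expected_prop
```

Larger edges replicate with higher probability, so the expected proportion depends on the whole distribution of edge sizes and not only on the average power.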
Williams DR (2020). “Learning to live with sampling variability: Expected replicability in partial correlation networks.” PsyArXiv. doi:10.31234/osf.io/fb4sa.
# (1) define partial correlation network

# correlations from ptsd symptoms
cors <- cor(GGMnonreg::ptsd)

# inverse
inv <- solve(cors)

# partials
pcors <- -cov2cor(inv)

# set values to zero
# (this is the partial correlation network)
pcors <- ifelse(abs(pcors) < 0.05, 0, pcors)

# compute ENR in two replication attempts
fit_enr <- enr(net = pcors, n = 500, replications = 2)

# intuition for the method:
# The above did not require simulation, and here I use simulation
# for the same purpose.

# location of edges
# (where the edges are located in the network)
index <- which(pcors[upper.tri(diag(20))] != 0)

# convert network into a correlation matrix
# (this is needed to simulate data)
diag(pcors) <- 1
cors_new <- corpcor::pcor2cor(pcors)

# replicated edges
# (store the number of edges that were replicated)
R <- NA

# simulate how many edges replicate in two attempts
# (increase 100 to, say, 5,000)
for(i in 1:100){

  # two replications
  Y1 <- MASS::mvrnorm(500, rep(0, 20), cors_new)
  Y2 <- MASS::mvrnorm(500, rep(0, 20), cors_new)

  # estimate network 1
  fit1 <- ggm_inference(Y1, boot = FALSE)

  # estimate network 2
  fit2 <- ggm_inference(Y2, boot = FALSE)

  # number of replicated edges (detected in both networks)
  R[i] <- sum(
    rowSums(
      cbind(fit1$adj[upper.tri(diag(20))][index],
            fit2$adj[upper.tri(diag(20))][index])
    ) == 2)
}

# combine simulation and analytic
cbind.data.frame(
  data.frame(simulation = sapply(seq(0, 0.9, 0.1), function(x) {
    mean(R > round(length(index) * x))
  })),
  data.frame(analytic = round(fit_enr$cdf, 3))
)

# now compare simulation to the analytic solution
# average replicability (simulation)
mean(R / length(index))

# average replicability (analytic)
fit_enr$ave_pwr
Transform correlations to Fisher's Z
r |
correlation (can be a vector) |
Fisher Z transformed correlation(s)
Back-transform Fisher's Z to correlations
z |
Fisher Z |
Correlation(s) (back-transformed)
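For reference, both transformations are available in base R as atanh() and tanh(). The short sketch below uses made-up correlations and does not call the package's own helpers.

```r
# Made-up correlations
r <- c(-0.5, 0, 0.3)

# Fisher z transformation: z = 0.5 * log((1 + r) / (1 - r))
z <- atanh(r)

# Back-transformation to the correlation scale
r_back <- tanh(z)

# The round trip recovers the original correlations
all.equal(r_back, r)
```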
Simulate a Partial Correlation Matrix
gen_net(p = 20, edge_prob = 0.3, lb = 0.05, ub = 0.3)
p |
number of variables (nodes) |
edge_prob |
connectivity |
lb |
lower bound for the partial correlations |
ub |
upper bound for the partial correlations |
A list containing the following:
pcor: Partial correlation matrix, encoding the conditional (in)dependence structure.
cors: Correlation matrix.
adj: Adjacency matrix.
trys: Number of attempts to obtain a positive definite matrix.
The function checks for a valid matrix (positive definite), but sometimes this will still fail. For example, with larger p, large partial correlations require a sparse GGM (accomplished by setting edge_prob to a small value).
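The interplay between p, the size of the partial correlations, and sparsity can be seen with a small base R check. This is a sketch with hypothetical helper names (make_dense_pcor, is_valid_pcor) and made-up values; it only tests whether the precision matrix implied by a partial correlation matrix is positive definite, and it is not the check used inside gen_net().

```r
# Hypothetical helper: fully connected network with a common partial correlation
make_dense_pcor <- function(p, value) {
  pc <- matrix(value, p, p)
  diag(pc) <- 1
  pc
}

# Hypothetical helper: is the implied (scaled) precision matrix positive definite?
is_valid_pcor <- function(pcor) {
  Theta <- -pcor                 # off-diagonals change sign
  diag(Theta) <- 1               # unit diagonal
  ev <- eigen(Theta, symmetric = TRUE, only.values = TRUE)$values
  all(ev > 0)
}

# Dense network, partial correlations of 0.2
is_valid_pcor(make_dense_pcor(4, 0.2))    # TRUE
is_valid_pcor(make_dense_pcor(10, 0.2))   # FALSE

# Sparse network with 10 nodes and a single edge of 0.2
sparse <- diag(10)
sparse[1, 2] <- sparse[2, 1] <- 0.2
is_valid_pcor(sparse)                     # TRUE
```

With 10 nodes, the same edge size that is valid in a small dense network no longer yields a positive definite matrix unless most edges are set to zero.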
true_net <- gen_net(p = 10)
Extract the necessary ingredients to visualize the conditional dependence structure.
x |
An object of class |
A list including two matrices (the weighted adjacency and adjacency matrices)
# data
Y <- ptsd

# estimate graph
fit <- ggm_inference(Y, boot = FALSE)

# get info for plotting
get_graph(fit)
Establish whether each of the corresponding edges is significantly different between two groups.
ggm_compare(Yg1, Yg2, method = "spearman", alpha = 0.05)
Yg1 |
The data matrix of dimensions n (observations) by p (nodes) for group one. |
Yg2 |
The data matrix of dimensions n (observations) by p (nodes) for group two. |
method |
Character string. Which type of correlation coefficients
to be computed. Options include |
alpha |
The desired significance level (defaults to |
An object of class ggm_compare
adj: Adjacency matrix, where a 1 indicates a difference.
wadj: Weighted adjacency matrix (partial correlation differences that were significantly different)
cis: Confidence intervals for the partial correlation differences.
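One way to obtain such an edge-wise comparison is a Fisher z test for the difference between two independent partial correlations. The base R sketch below uses simulated groups and Pearson correlations (the function's default is "spearman"), and the helper pcor_mat is a hypothetical name; it is meant to convey the logic and is not necessarily how ggm_compare() is implemented.

```r
set.seed(1)

# Simulated groups (made up): a strong relation between variables 1 and 2
# in group one, and no relations in group two
n1 <- 400
n2 <- 400
p <- 3
Yg1 <- matrix(rnorm(n1 * p), n1, p)
Yg1[, 2] <- Yg1[, 1] + rnorm(n1)
Yg2 <- matrix(rnorm(n2 * p), n2, p)

# Hypothetical helper: partial correlation matrix with a zero diagonal
pcor_mat <- function(Y) {
  pc <- -cov2cor(solve(cor(Y)))
  diag(pc) <- 0
  pc
}

pc1 <- pcor_mat(Yg1)
pc2 <- pcor_mat(Yg2)

# Standard error of the difference on the Fisher z scale
alpha <- 0.05
se_diff <- sqrt(1 / (n1 - (p - 2) - 3) + 1 / (n2 - (p - 2) - 3))
z_stat <- (atanh(pc1) - atanh(pc2)) / se_diff

# 1 indicates a significant difference
adj <- (abs(z_stat) > qnorm(1 - alpha / 2)) * 1
diag(adj) <- 0

# Differences that were significantly different
wadj <- (pc1 - pc2) * adj
wadj
```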
# data
Yg1 <- na.omit(subset(bfi, gender == 1)[, 1:10])
Yg2 <- na.omit(subset(bfi, gender == 2)[, 1:10])

# compare relations
fit <- ggm_compare(Yg1, Yg2)
Learn the conditional dependence structure with null hypothesis significance testing. This provides a valid measure of parameter uncertainty.
ggm_inference(
  Y,
  alpha = 0.05,
  control_precision = FALSE,
  boot = TRUE,
  B = 1000,
  cores = 1,
  method = "pearson",
  progress = TRUE
)
Y |
The data matrix of dimensions n (observations) by p (nodes). |
alpha |
The desired significance level (defaults to |
control_precision |
Logical. Should precision (i.e., 1 - false discovery rate)
be controlled at the level alpha (defaults to |
boot |
Logical. Should a non-parametric bootstrap be employed (defaults to |
B |
Integer. Number of bootstrap replicates (defaults to |
cores |
Integer. Number of cores to be used when executing in parallel (defaults to 1). |
method |
Character string. Which type of correlation coefficients
to be computed. Options include |
progress |
Logical. Should a progress bar be included (defaults to |
An object of class ggm_inference
wadj: Weighted adjacency matrix, corresponding to the partial correlation network.
adj: Adjacency matrix (detected effects).
pcors: Partial correlations.
n: Sample size.
p: Number of nodes.
Y: Data.
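For intuition, the analytic (non-bootstrap) route to wadj and adj can be sketched in base R with Fisher z confidence intervals: an edge is detected when its interval excludes zero. The data are simulated and the object names are illustrative assumptions; this is not the function's actual code, which also offers the bootstrap and precision control.

```r
set.seed(1)

# Simulated data (made up): variables 1 and 2 are related
n <- 300
p <- 4
Y <- matrix(rnorm(n * p), n, p)
Y[, 2] <- 0.8 * Y[, 1] + rnorm(n)

alpha <- 0.05

# Partial correlations from the inverse correlation matrix
pcors <- -cov2cor(solve(cor(Y)))
diag(pcors) <- 0

# Confidence intervals on the Fisher z scale, then back-transformed
se <- 1 / sqrt(n - (p - 2) - 3)
crit <- qnorm(1 - alpha / 2)
lower <- tanh(atanh(pcors) - crit * se)
upper <- tanh(atanh(pcors) + crit * se)

# Detected effects: intervals that exclude zero
adj <- (lower > 0 | upper < 0) * 1
diag(adj) <- 0

# Partial correlation network
wadj <- pcors * adj
wadj
```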
Y <- ptsd

fit <- ggm_inference(Y)
Data mining to learn the graph.
ggm_search(
  x,
  IC = "BIC",
  type = "neighborhood_selection",
  method = "forward",
  n = NULL
)
x |
A data matrix of dimensions n (observations) by p (nodes) or a correlation matrix of dimensions p by p. |
IC |
Character string. The desired information criterion. Options include
type |
Character string. Which search method should be used? The options included
method |
Character string. The desired subset selection method.
Options include |
n |
Integer. Sample size. Required if a correlation matrix is provided. |
type = "neighborhood_selection" was described in Williams et al. (2019) and type = "approx_L0" was described in Williams (2020). The penalty for type = "approx_L0" is called seamless L0 (Dicker et al. 2013).
An object of class ggm_search
wadj: Weighted adjacency matrix, corresponding to the partial correlation network.
adj: Adjacency matrix (detected effects).
pcors: Partial correlations.
n: Sample size.
p: Number of nodes.
Y: Data.
type = "neighborhood_selection" employs multiple regression to estimate the graph (requires the data), whereas type = "approx_L0" directly estimates the precision matrix (data or a correlation matrix are acceptable). If data is provided and type = "approx_L0", by default Pearson correlations are used. For another correlation coefficient, provide the desired correlation matrix.
type = "approx_L0" is a continuous approximation to (non-regularized) best subset model selection. This is accomplished by using regularization, but the penalty (approximately) mimics non-regularized estimation.
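The neighborhood selection idea can be sketched with base R's lm() and step(): each node is regressed on all others, predictors are added by forward selection with the BIC penalty (k = log(n)), and the node-wise results are combined into a graph. The simulated data, the object names, and the "and" rule used to combine neighborhoods are assumptions made for illustration; ggm_search() may differ.

```r
set.seed(1)

# Simulated data (made up): V1 and V2 are related, V3 and V4 are noise
n <- 300
dat <- data.frame(V1 = rnorm(n))
dat$V2 <- dat$V1 + rnorm(n)
dat$V3 <- rnorm(n)
dat$V4 <- rnorm(n)

nodes <- names(dat)
p <- length(nodes)
adj <- matrix(0, p, p, dimnames = list(nodes, nodes))

for (j in seq_len(p)) {
  # start from the intercept-only model for node j
  f_null <- reformulate("1", response = nodes[j])
  f_full <- reformulate(nodes[-j])
  fit0 <- lm(f_null, data = dat)

  # forward selection with the BIC penalty
  sel <- step(fit0,
              scope = list(lower = ~ 1, upper = f_full),
              direction = "forward", k = log(n), trace = 0)

  # selected neighbours of node j
  adj[j, attr(terms(sel), "term.labels")] <- 1
}

# "and" rule (an assumption): keep an edge only if both nodes selected each other
adj_and <- adj * t(adj)
adj_and
```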
Dicker L, Huang B, Lin X (2013).
“Variable selection and estimation with the seamless-L0 penalty.”
Statistica Sinica, 929–962.
Williams DR (2020).
“Beyond Lasso: A Survey of Nonconvex Regularization in Gaussian Graphical Models.”
Williams DR, Rhemtulla M, Wysocki AC, Rast P (2019).
“On nonregularized estimation of psychological networks.”
Multivariate behavioral research, 54(5), 719–750.
# data
Y <- ptsd

# search data
fit <- ggm_search(Y)
A data frame containing 1002 rows and 7 variables measured on various scales, including binary and ordered categorical (with varying numbers of categories). There are also missing values in each variable.
Income of the respondent in 1000s of dollars, binned into 21 ordered categories.
Highest degree ever obtained (none, HS, Associates, Bachelors, or Graduate)
Number of children ever had.
Financial status of respondent's parents when respondent was 16 (on a 5-point scale).
Maximum of mother's and father's highest degree
Number of siblings of the respondent plus one
Age of the respondent in years.
A data frame containing 1190 observations (n = 1190) and 6 variables (p = 6) measured on the binary scale (Fowlkes et al. 1988). The variable descriptions were copied from section 4, Hoff (2007).
Fowlkes EB, Freeny AE, Landwehr JM (1988).
“Evaluating logistic models for large contingency tables.”
Journal of the American Statistical Association, 83(403), 611–622.
Hoff PD (2007).
“Extending the rank likelihood for semiparametric copula estimation.”
The Annals of Applied Statistics, 1(1), 265–283.
A data frame containing 8 variables and nearly 200 observations. There are two subjects, each of whom provided data every day for over 90 days. Six variables are from the PANAS scale (positive and negative affect); the remaining two are the daily number of steps and the subject id.
Subject id
Steps recorded by a Fitbit
A data frame containing 197 observations and 8 variables. The data have been used in OLaughlin et al. (2020) and Williams et al. (2019).
OLaughlin KD, Liu S, Ferrer E (2020).
“Use of Composites in Analysis of Individual Time Series: Implications for Person-Specific Dynamic Parameters.”
Multivariate Behavioral Research, 1–18.
Williams DR, Liu S, Martin SR, Rast P (2019).
“Bayesian Multivariate Mixed-Effects Location Scale Modeling of Longitudinal Relations among Affective Traits, States, and Physical Activity.”
A dataset containing items from the Interpersonal Reactivity Index (IRI; an empathy measure). There are 28 variables and 1973 observations.
A data frame with 28 variables and 1973 observations (5-point Likert scale)
I daydream and fantasize, with some regularity, about things that might happen to me.
I often have tender, concerned feelings for people less fortunate than me.
I sometimes find it difficult to see things from the "other guy's" point of view.
Sometimes I don't feel very sorry for other people when they are having problems.
I really get involved with the feelings of the characters in a novel.
In emergency situations, I feel apprehensive and ill-at-ease.
I am usually objective when I watch a movie or play, and I don't often get completely caught up in it.
I try to look at everybody's side of a disagreement before I make a decision.
When I see someone being taken advantage of, I feel kind of protective towards them.
I sometimes feel helpless when I am in the middle of a very emotional situation.
I sometimes try to understand my friends better by imagining how things look from their perspective.
Becoming extremely involved in a good book or movie is somewhat rare for me.
When I see someone get hurt, I tend to remain calm.
Other people's misfortunes do not usually disturb me a great deal.
If I'm sure I'm right about something, I don't waste much time listening to other people's arguments.
After seeing a play or movie, I have felt as though I were one of the characters.
Being in a tense emotional situation scares me.
When I see someone being treated unfairly, I sometimes don't feel very much pity for them.
I am usually pretty effective in dealing with emergencies.
I am often quite touched by things that I see happen.
I believe that there are two sides to every question and try to look at them both.
I would describe myself as a pretty soft-hearted person.
When I watch a good movie, I can very easily put myself in the place of a leading character.
I tend to lose control during emergencies.
When I'm upset at someone, I usually try to "put myself in his shoes" for a while.
When I am reading an interesting story or novel, I imagine how I would feel if the events in the story were happening to me.
When I see someone who badly needs help in an emergency, I go to pieces.
Before criticizing somebody, I try to imagine how I would feel if I were in their place.
"M" (male) or "F" (female)
There are four domains
Fantasy: items 1, 5, 7, 12, 16, 23, 26
Perspective taking: items 3, 8, 11, 15, 21, 25, 28
Empathic concern: items 2, 4, 9, 14, 18, 20, 22
Personal distress: items 6, 10, 13, 17, 19, 24, 27
Briganti, G., Kempenaers, C., Braun, S., Fried, E. I., & Linkowski, P. (2018). Network analysis of empathy items from the interpersonal reactivity index in 1973 young adults. Psychiatry research, 265, 87-92.
Data mining to learn the graph of binary variables with an Ising model (Lenz 1920; Ising 1925).
ising_search(Y, IC = "BIC", progress = TRUE)
Y |
A data matrix of dimensions n (observations) by p (nodes). |
IC |
Character string. The desired information criterion. Options include
progress |
Logical. Should a progress bar be included (defaults to |
Only backwards selection is currently implemented.
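To make the idea of backwards selection with an information criterion concrete, here is a minimal base R sketch for a single binary node. It is not the package's implementation: the simulated variables (`y`, `x1`, `x2`, `x3`) are hypothetical, and `stats::step()` with `k = log(n)` is used as a stand-in that penalizes each parameter by log(n), which corresponds to BIC.

```r
# Hypothetical data: one binary outcome node and three binary predictor nodes.
set.seed(1)
n  <- 500
x1 <- rbinom(n, 1, 0.5)
x2 <- rbinom(n, 1, 0.5)
x3 <- rbinom(n, 1, 0.5)
# Only x1 is truly related to y (assumption made for the illustration).
y  <- rbinom(n, 1, plogis(-1 + 2 * x1))
dat <- data.frame(y, x1, x2, x3)

# Start from the full logistic regression of the node on all other nodes.
full <- glm(y ~ ., data = dat, family = binomial())

# Backwards elimination; k = log(n) makes the criterion BIC rather than AIC.
sel <- step(full, direction = "backward", k = log(n), trace = 0)

# Predictors retained for this node (its selected "neighbors").
kept <- attr(terms(sel), "term.labels")
kept
```

Repeating such a search for every node, and then combining the node-wise neighborhoods, is one common way to arrive at an adjacency matrix; how `ising_search()` combines them is not stated here.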
An object of class ising_search
wadj: Weighted adjacency matrix, corresponding to the partial correlation network.
adj: Adjacency matrix (detected effects).
pcors: Partial correlations.
n: Sample size.
p: Number of nodes.
Y: Data.
For an excellent overview of the Ising model see Marsman et al. (2018).
Ising E (1925).
“Beitrag zur theorie des ferromagnetismus.”
Zeitschrift für Physik, 31(1), 253–258.
Lenz W (1920).
“Beiträge zum Verständnis der magnetischen Eigenschaften in festen Körpern.”
Physikalische Z, 21, 613–615.
Marsman M, Borsboom D, Kruis J, Epskamp S, Van Bork R, Waldorp LJ, Maas Hvd, Maris G (2018).
“An introduction to network psychometrics: Relating Ising network models to item response theory models.”
Multivariate behavioral research, 53(1), 15–35.
# data
Y <- ifelse(ptsd[, 1:5] == 0, 0, 1)
# search data
fit <- ising_search(Y)
Data mining to learn the graph.
mixed_search(Y, data_type = NULL, IC = "BIC")
Y |
A data matrix of dimensions n (observations) by p (nodes) |
data_type |
Vector of length p. The type of data, with options of "b" (binary), "p" (Poisson), and "g" (Gaussian). |
IC |
Character string. The desired information criterion. Options include
Only backwards selection is currently implemented. Only an adjacency matrix is provided.
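The `data_type` codes can be read as a choice of regression family for each node. The base R sketch below illustrates that reading with node-wise generalized linear models; it is an assumption about the general approach, not the internals of `mixed_search()`, and the simulated columns and the `fam` lookup list are hypothetical.

```r
# Hypothetical mixed data: one binary, one count, and one continuous node.
set.seed(1)
n <- 300
Y <- data.frame(
  b = rbinom(n, 1, 0.5),
  p = rpois(n, 2),
  g = rnorm(n)
)
data_type <- c("b", "p", "g")

# Map each data_type code to a GLM family (assumed mapping).
fam <- list(b = binomial(), p = poisson(), g = gaussian())

# Regress each node on all remaining nodes with the matching family.
fits <- lapply(seq_along(data_type), function(j) {
  form <- reformulate(names(Y)[-j], response = names(Y)[j])
  glm(form, data = Y, family = fam[[data_type[j]]])
})

# Family actually used for each node-wise regression.
families <- vapply(fits, function(f) family(f)$family, character(1))
families
```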
An object of class mixed_search
wadj: Weighted adjacency matrix, corresponding to the partial correlation network.
adj: Adjacency matrix (detected effects).
pcors: Partial correlations.
n: Sample size.
p: Number of nodes.
Y: Data.
# data
Y <- ifelse(ptsd[, 1:5] == 0, 0, 1)
# search data (ising model)
fit <- mixed_search(Y, data_type = rep("b", 5))
Plot the probability mass function for ENR.
plot_enr(x, iter = 1e+05, fill = "#009E73", alpha = 0.5, ...)
x |
An object of class |
iter |
Integer. How many draws from the Poisson-binomial distribution (defaults to 100,000)? |
fill |
Which color to fill the density? |
alpha |
Numeric (between 0 and 1). The transparency for the density. |
... |
Currently ignored |
An object of class ggplot
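As a rough illustration of what drawing from a Poisson-binomial distribution involves, the following base R sketch sums independent Bernoulli draws with unequal success probabilities and tabulates the resulting probability mass function. The per-edge probabilities in `probs` are hypothetical and are not taken from an `enr()` fit.

```r
# Hypothetical per-edge detection probabilities (not from a fitted object).
set.seed(1)
probs <- c(0.9, 0.6, 0.3, 0.8)
iter  <- 1e4

# Each draw is the number of "successes" across edges with unequal probabilities.
draws <- replicate(iter, sum(rbinom(length(probs), size = 1, prob = probs)))

# Monte Carlo estimate of the probability mass function over 0..4 successes.
pmf <- table(factor(draws, levels = 0:length(probs))) / iter
pmf

# The mean of the draws should be close to sum(probs).
mean(draws)
```

A larger `iter` gives a smoother estimate of the mass function at the cost of more computation.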
# correlations
cors <- cor(GGMnonreg::ptsd)
# inverse
inv <- solve(cors)
# partials
pcors <- -cov2cor(inv)
# set values to zero
pcors <- ifelse(abs(pcors) < 0.05, 0, pcors)
est <- enr(net = pcors, n = 500, replications = 2)
# plot
plot_enr(est)
Visualize the conditional (in)dependence structure.
## S3 method for class 'ggmnonreg'
plot(
  x,
  layout = "circle",
  neg_col = "#D55E00",
  pos_col = "#009E73",
  edge_magnify = 1,
  node_size = 10,
  palette = 2,
  node_names = NULL,
  node_groups = NULL,
  ...
)
x |
An object of class |
layout |
Character string. Which graph layout (defaults is |
neg_col |
Character string. Color for the negative edges (defaults to a colorblind friendly red). |
pos_col |
Character string. Color for the positive edges (defaults to a colorblind friendly green). |
edge_magnify |
Numeric. A value that is multiplied by the edge weights. This increases (> 1) or decreases (< 1) the line widths (defaults to 1). |
node_size |
Numeric. The size of the nodes (defaults to |
palette |
A character string specifying the palette for the |
node_names |
Character string. Names for nodes of length p. |
node_groups |
A character string of length p (the number of nodes in the model). This indicates groups of nodes that should be the same color (e.g., "clusters" or "communities"). |
... |
Currently ignored. |
An object of class ggplot
# data
Y <- ptsd
# estimate graph
fit <- ggm_inference(Y, boot = FALSE)
# plot graph
plot(fit)
Network Predictability (R2)
predictability(x, ci = 0.95)
x |
An object of class |
ci |
Numeric. The confidence interval to be computed (defaults to |
An object of class predictability, including a matrix of R2 for each node.
Predictability is variance explained for each node in the network (Haslbeck and Waldorp 2018).
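For Gaussian data, the variance explained in a node by all other nodes can be read off the inverse correlation matrix as 1 - 1/Theta_jj, where Theta is that inverse. The base R sketch below checks this identity against an ordinary regression on simulated data; the data are hypothetical, and this is not necessarily how `predictability()` computes its estimates or intervals.

```r
# Hypothetical Gaussian data with three nodes; node 1 depends on node 2.
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 3), n, 3)
X[, 1] <- X[, 1] + 0.7 * X[, 2]

# Inverse of the sample correlation matrix.
Theta <- solve(cor(X))

# Variance explained for each node by all remaining nodes.
r2 <- 1 - 1 / diag(Theta)
r2

# Cross-check for node 1 with a multiple regression.
r2_lm <- summary(lm(X[, 1] ~ X[, -1]))$r.squared
r2_lm
```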
Haslbeck JM, Waldorp LJ (2018). “How well do network models predict observations? On the importance of predictability in network models.” Behavior Research Methods, 50(2), 853–861. ISSN 15543528, doi:10.3758/s13428-017-0910-x, 1610.09108, http://www.ncbi.nlm.nih.gov/pubmed/28718088.
# data
Y <- ptsd
# estimate graph
fit <- ggm_inference(Y, boot = FALSE)
# predictability
r2 <- predictability(fit)
# print
r2
Print a ggmnonreg object.
## S3 method for class 'ggmnonreg'
print(x, ...)
x |
An object of class |
... |
Currently ignored |
No return value.
A dataset containing items that measure Post-traumatic stress disorder symptoms (Armour et al. 2017). There are 20 variables (p) and 221 observations (n).
A dataframe with 221 rows and 20 variables
Intrusive Thoughts
Emotional cue reactivity
Psychological cue reactivity
Avoidance of thoughts
Avoidance of reminders
Trauma-related amnesia
Negative beliefs
Negative trauma-related emotions
Loss of interest
Restricted affect
Self-destructive/reckless behavior
Exaggerated startle response
Difficulty concentrating
Sleep disturbance
Armour C, Fried EI, Deserno MK, Tsai J, Pietrzak RH (2017). “A network analysis of DSM-5 posttraumatic stress disorder symptoms and correlates in US military veterans.” Journal of anxiety disorders, 45, 49–59. doi:10.31234/osf.io/p69m7.
A correlation matrix that includes 16 variables. The correlation matrix was estimated from 526 individuals (Fried et al. 2018).
A correlation matrix with 16 variables
Intrusive Thoughts
Physiological/psychological reactivity
Avoidance of thoughts
Avoidance of situations
Disinterest in activities
Feeling detached
Emotional numbing
Foreshortened future
Sleep problems
Concentration problems
Startle response
Fried EI, Eidhof MB, Palic S, Costantini G, Huisman-van Dijk HM, Bockting CL, Engelhard I, Armour C, Nielsen AB, Karstoft K (2018). “Replicability and generalizability of posttraumatic stress disorder (PTSD) networks: a cross-cultural multisite study of PTSD symptoms in four trauma patient samples.” Clinical Psychological Science, 6(3), 335–351.
data(ptsd_cor1)
Y <- MASS::mvrnorm(n = 526, mu = rep(0, 16), Sigma = ptsd_cor1, empirical = TRUE)
A correlation matrix that includes 16 variables. The correlation matrix was estimated from 365 individuals (Fried et al. 2018).
A correlation matrix with 16 variables
Intrusive Thoughts
Physiological/psychological reactivity
Avoidance of thoughts
Avoidance of situations
Disinterest in activities
Feeling detached
Emotional numbing
Foreshortened future
Sleep problems
Concentration problems
Startle response
Fried EI, Eidhof MB, Palic S, Costantini G, Huisman-van Dijk HM, Bockting CL, Engelhard I, Armour C, Nielsen AB, Karstoft K (2018). “Replicability and generalizability of posttraumatic stress disorder (PTSD) networks: a cross-cultural multisite study of PTSD symptoms in four trauma patient samples.” Clinical Psychological Science, 6(3), 335–351.
data(ptsd_cor2)
Y <- MASS::mvrnorm(n = 365, mu = rep(0, 16), Sigma = ptsd_cor2, empirical = TRUE)
A correlation matrix that includes 16 variables. The correlation matrix was estimated from 926 individuals (Fried et al. 2018).
A correlation matrix with 16 variables
Intrusive Thoughts
Physiological/psychological reactivity
Avoidance of thoughts
Avoidance of situations
Disinterest in activities
Feeling detached
Emotional numbing
Foreshortened future
Sleep problems
Concentration problems
Startle response
Fried EI, Eidhof MB, Palic S, Costantini G, Huisman-van Dijk HM, Bockting CL, Engelhard I, Armour C, Nielsen AB, Karstoft K (2018). “Replicability and generalizability of posttraumatic stress disorder (PTSD) networks: a cross-cultural multisite study of PTSD symptoms in four trauma patient samples.” Clinical Psychological Science, 6(3), 335–351.
data(ptsd_cor3)
Y <- MASS::mvrnorm(n = 926, mu = rep(0, 16), Sigma = ptsd_cor3, empirical = TRUE)
A correlation matrix that includes 16 variables. The correlation matrix was estimated from 965 individuals (Fried et al. 2018).
A correlation matrix with 16 variables
Intrusive Thoughts
Physiological/psychological reactivity
Avoidance of thoughts
Avoidance of situations
Disinterest in activities
Feeling detached
Emotional numbing
Foreshortened future
Sleep problems
Concentration problems
Startle response
Fried EI, Eidhof MB, Palic S, Costantini G, Huisman-van Dijk HM, Bockting CL, Engelhard I, Armour C, Nielsen AB, Karstoft K (2018). “Replicability and generalizability of posttraumatic stress disorder (PTSD) networks: a cross-cultural multisite study of PTSD symptoms in four trauma patient samples.” Clinical Psychological Science, 6(3), 335–351.
data(ptsd_cor4)
Y <- MASS::mvrnorm(n = 965, mu = rep(0, 16), Sigma = ptsd_cor4, empirical = TRUE)
A dataset containing items from the Resilience Scale for Adults (RSA). There are 33 items and 675 observations.
A data frame with 33 items and 675 observations (5 point Likert scale)
My plans for the future are
When something unforeseen happens
My family understanding of what is important in life is
I feel that my future looks
My goals
I can discuss personal issues with
I feel
I enjoy being
Those who are good at encouraging are
The bonds among my friends
My personal problems
When a family member experiences a crisis/emergency
My family is characterised by
To be flexible in social settings
I get support from
In difficult periods my family
My judgements and decisions
New friendships are something
When needed, I have
I am at my best when I
Meeting new people is
When I am with others
When I start on new things/projects
Facing other people, our family acts
Belief in myself
For me, thinking of good topics of conversation is
My close friends/family members
I am good at
In my family, we like to
Rules and regular routines
In difficult periods I have a tendency to
My goals for the future are
Events in my life that I cannot influence
"M" (male) or "F" (female)
There are 6 domains
Planned future: items 1, 4, 5, 32
Perception of self: items 2, 11, 17, 25, 31, 33
Family cohesion: items 3, 7, 13, 16, 24, 29
Social resources: items 6, 9, 10, 12, 15, 19, 27
Social competence: items 8, 14, 18, 21, 22, 26
Structured style: items 23, 28, 30
Briganti, G., & Linkowski, P. (2019). Item and domain network structures of the Resilience Scale for Adults in 675 university students. Epidemiology and psychiatric sciences, 1-9.
Protein expression in human immune system cells
A data frame containing 7466 cells (n = 7466) and flow cytometry measurements of 11 (p = 11) phosphorylated proteins and phospholipids
Sachs, K., Gifford, D., Jaakkola, T., Sorger, P., & Lauffenburger, D. A. (2002). Bayesian network approach to cell signaling pathway modeling. Sci. STKE, 2002(148), pe38-pe38.
A dataset containing items from the Toronto Alexithymia Scale (TAS). There are 20 variables and 1925 observations
A data frame with 20 variables and 1925 observations (5 point Likert scale)
I am often confused about what emotion I am feeling
It is difficult for me to find the right words for my feelings
I have physical sensations that even doctors don’t understand
I am able to describe my feelings easily
I prefer to analyze problems rather than just describe them
When I am upset, I don’t know if I am sad, frightened, or angry
I am often puzzled by sensations in my body
I prefer just to let things happen rather than to understand why they turned out that way
I have feelings that I can’t quite identify
Being in touch with emotions is essential
I find it hard to describe how I feel about people
People tell me to describe my feelings more
I don’t know what’s going on inside me
I often don’t know why I am angry
I prefer talking to people about their daily activities rather than their feelings
I prefer to watch “light” entertainment shows rather than psychological dramas
It is difficult for me to reveal my innermost feelings, even to close friends
I can feel close to someone, even in moments of silence
I find examination of my feelings useful in solving personal problems
Looking for hidden meanings in movies or plays distracts from their enjoyment
"M" (male) or "F" (female)
There are three domains
Difficulty identifying feelings: items 1, 3, 6, 7, 9, 13, 14
Difficulty describing feelings: items 2, 4, 11, 12, 17
Externally oriented thinking: items 10, 15, 16, 18, 19
Briganti, G., & Linkowski, P. (2019). Network approach to items and domains from the Toronto Alexithymia Scale. Psychological reports.
A data frame containing 1190 observations (n = 1190) and 6 variables (p = 6) measured on the binary scale.
A data frame containing 1190 observations (n = 1190) and 6 variables (p = 6) measured on the binary scale (Fowlkes et al. 1988). These data have been analyzed in Tarantola (2004) and in Madigan and Raftery (1994). The variable descriptions were copied from Talhouk et al. (2012, section 5.2).
Lecture attendance (attend/did not attend)
Gender (male/female)
School type (urban/suburban)
“I will be needing Mathematics in my future work” (agree/disagree)
Subject preference (math/science vs. liberal arts)
Future plans (college/job)
Fowlkes EB, Freeny AE, Landwehr JM (1988).
“Evaluating logistic models for large contingency tables.”
Journal of the American Statistical Association, 83(403), 611–622.
Madigan D, Raftery AE (1994).
“Model selection and accounting for model uncertainty in graphical models using Occam's window.”
Journal of the American Statistical Association, 89(428), 1535–1546.
Talhouk A, Doucet A, Murphy K (2012).
“Efficient Bayesian inference for multivariate probit models with sparse inverse correlation matrices.”
Journal of Computational and Graphical Statistics, 21(3), 739–757.
Tarantola C (2004).
“MCMC model determination for discrete graphical models.”
Statistical Modelling, 4(1), 39–61.