Intro to lavaan


This introduction to R is very brief and only geared toward providing some basics so that one can understand and run the code associated with the content. It is geared toward an audience that likely has no programming experience whatsoever, but may have had some exposure to traditional statistics packages.

If you have some basic familiarity with R you may skip this chapter, though it might serve as a refresher for some. As mentioned previously, to begin with R on your own machine, you just need to go to the R website, download it for your operating system, and install it.

Then go to the RStudio website, download RStudio, and install it. From there on you only need RStudio to use R. As soon as you install it, R is already the most powerful statistical environment within which to work. However, its real strength comes from the community, which has added thousands of packages that provide additional or enhanced functionality.

You will regularly find packages that specifically do something you want, and will need to install them in order to use them. RStudio provides a Packages tab, but it is usually just as efficient, or more so, to use the install.packages function. At this point there are thousands of packages available through standard sources, and many more through unofficial ones. To start getting some ideas of what you want to use, you will want to spend time at places like CRAN Task Views, Rdocumentation, or with a list like this one.

The main thing to note is that if you want to use a package, you have to load it with the library function.

Sometimes, you only need the package for one thing and there is no reason to keep it loaded, in which case you can use the following approach. However, note that the increasing popularity and ease of using R means that packages can vary quite a bit in terms of quality, so you may need to try out a couple packages with seemingly similar functionality to find the best for your situation.
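The two ways of using a package described above can be sketched as follows. This is only a minimal illustration: the stats package (which ships with R) stands in for whatever package you actually need, and the install line is commented out so the script does not try to download anything.

```r
# Install once per machine (commented out so the script can be re-run offline).
# install.packages("lavaan")

# Option 1: load the package for the whole session.
library(stats)  # stand-in package; it is already part of every R installation

# Option 2: call a single function without loading the package,
# using the package::function notation.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
spread <- stats::sd(x)
print(spread)
```

The second form is handy when you need a package for one thing only, and it also makes clear where a function comes from.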

RStudio is an integrated development environment (IDE) specifically geared toward R, though it works for other languages too. At the very least it will make your programming far easier and more efficient; at best you can create publish-ready documents, manage projects, create interactive websites regarding your research, use version control, and much more.

I have an overview here.


See Emacs Speaks Statistics for an alternative. The point is, base R is not an efficient way to use R, and you have at least two very powerful options to make your coding experience easier and more efficient. It is as easy to import data into R from other programs and text files as it is in any other statistical program.

As a first step, feel free to use the menu-based approach via the Environment tab. Note that you have easy access to the code that actually does the job. It is easier to select specific options as well.


Some key packages to note:

There are many other packages that would be of use for special situations and other less common file types and structures. The first thing to note for those new to R is that R is a language that is oriented toward, but not specific to, dealing with statistics. The better you get at statistical programming the further you can explore your data and take your research. This holds for any statistical endeavor whether using R or not.

With R, the script is where you write your R code. While you could do everything at the console, this would be difficult at best and unreproducible.


The console is where the results are produced from running the script. Again you can do one-liner stuff there, such as getting help for a function.

The graphics device is where visualizations are produced, and in RStudio you have two: one for static plots, and a viewer for potentially interactive ones.

There are several freely available packages for structural equation modeling (SEM), both in and outside of R. In the R world, the three most popular are lavaan, OpenMX, and sem.

I have tended to prefer lavaan because of its user-friendly syntax, which mimics key aspects of Mplus. Although OpenMX provides a broader set of functions, the learning curve is steeper. SEM is largely a multivariate extension of regression in which we can examine many predictors and outcomes at once. SEM also provides the innovation of examining latent structure. This is a nice dataset for regression because there are many interdependent variables: crime, pollutants, age of properties, etc.

And the syntax even has many similarities with lm. The regression coefficient is identical (good!). This highlights an important difference: basic SEM often focuses on the covariance structure of the data. For example, do males and females differ on mean level of a depression latent factor? Note that we can get standardized estimates in lavaan as well.
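To make the comparison with lm concrete, here is a minimal sketch. It assumes the housing data are the Boston data shipped with the MASS package (the variable names crim, nox, rad and medv come from that data set, and the choice of data set is my assumption, not something stated above).

```r
library(lavaan)

# Assumption: MASS::Boston stands in for the housing data discussed in the text.
data(Boston, package = "MASS")

# Ordinary least squares regression of home value on crime rate.
lm_fit <- lm(medv ~ crim, data = Boston)

# The same single-predictor regression written in lavaan syntax.
sem_fit <- sem("medv ~ crim", data = Boston)

# Compare the slope from both approaches.
b_lm  <- unname(coef(lm_fit)["crim"])
b_sem <- unname(coef(sem_fit)["medv~crim"])
print(c(lm = b_lm, lavaan = b_sem))

# Standardized estimates for the lavaan model.
print(standardizedSolution(sem_fit))
```

The slopes should agree up to optimizer tolerance, since maximum likelihood and least squares give the same point estimate for a single regression.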

This is a more complicated topic in SEM because we can standardize with respect to the latent variables alone (std.lv) or with respect to both the latent and observed variables (std.all). The latter is usually what is reported as standardized estimates in SEM papers. What if we believe that the level of nitric oxides (nox) also predicts home prices alongside crime? We can add this as a predictor as in standard multiple regression. Furthermore, we hypothesize that the proximity of a home to large highways (rad) predicts the concentration of nitric oxides, which in turn predicts lower home prices.
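The extended model just described might be written as below. Again this is a sketch that assumes the MASS::Boston data; expect lavaan to warn about very different variances, which is the scaling issue that comes up next.

```r
library(lavaan)

# Assumption: MASS::Boston stands in for the housing data discussed in the text.
data(Boston, package = "MASS")

path_model <- '
  medv ~ crim + nox   # home prices predicted by crime and nitric oxides
  nox  ~ rad          # highway proximity predicts nitric oxides
'

path_fit <- sem(path_model, data = Boston)

# standardized = TRUE adds the std.lv and std.all columns to the output.
summary(path_fit, standardized = TRUE)
```

Each line in the model string is one regression, so adding a predictor or another outcome is just a matter of adding to the formula or adding a line.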

The model looks like this using the handy semPaths function from semPlot:

Parameter estimation can be hampered when the variances of variables in the model differ substantially (by orders of magnitude). We can rescale variables in this case by multiplying by a constant.

This has no effect on the fit or interpretation of the model; we just have to recall what the new units represent. Also, you can always divide out the constant from the parameter estimate to recover the original units, if important. You can request more detailed global fit indices from lavaan in the model summary output using fit.measures = TRUE.
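A sketch of the rescaling idea, under the same assumption that MASS::Boston is the data set. The constant 100 and the variable name nox100 are illustrative choices of mine.

```r
library(lavaan)

# Assumption: MASS::Boston stands in for the housing data discussed in the text.
data(Boston, package = "MASS")

# Rescale nox so its variance is closer to that of the other variables.
# The constant 100 is an illustrative choice.
Boston$nox100 <- Boston$nox * 100

scaled_model <- '
  medv   ~ crim + nox100
  nox100 ~ rad
'
scaled_fit <- sem(scaled_model, data = Boston)

# The slope of nox100 on rad is in rescaled units;
# dividing by the constant recovers the original units of nox.
b_scaled   <- unname(coef(scaled_fit)["nox100~rad"])
b_original <- b_scaled / 100
print(c(rescaled = b_scaled, original_units = b_original))
```

Note that for a path where the rescaled variable is the predictor rather than the outcome, the conversion goes the other way (multiply instead of divide).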


You can also get just the fit measures, including additional statistics, using fitmeasures:

This suggests the need to examine the fit in more detail.
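Both ways of requesting fit indices can be sketched like this, again assuming the MASS::Boston data and the path model from above. The particular indices selected are just a common subset, not a recommendation from the original text.

```r
library(lavaan)

# Assumption: MASS::Boston stands in for the housing data discussed in the text.
data(Boston, package = "MASS")
Boston$nox100 <- Boston$nox * 100  # illustrative rescaling constant

fit <- sem('
  medv   ~ crim + nox100
  nox100 ~ rad
', data = Boston)

# Fit indices printed as part of the model summary.
summary(fit, fit.measures = TRUE)

# Only the fit measures, here restricted to a handful of common ones.
fm <- fitMeasures(fit, c("chisq", "df", "pvalue", "cfi", "tli", "rmsea", "srmr"))
print(fm)
```

Calling fitMeasures(fit) without the second argument returns the full set.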

First, we can look at the mismatch between the model-implied and observed covariance matrices. Conceptually, the goal of structural equation modeling (SEM) is to test whether a theoretically motivated model of the covariance among variables provides a good approximation of the data. Formally, we are seeking to develop a model whose model-implied covariance matrix approaches the sample (observed) covariance matrix.

We might be able to interpret this more easily in correlational (standardized) units.


The inspect function in lavaan gives access to a number of model details, including this:

In particular, getting the misfit of the bivariate associations is very helpful.
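As a sketch of how these pieces can be pulled out of a fitted model (assuming the MASS::Boston path model used as a stand-in above): lavInspect is the same facility as inspect, fitted returns the model-implied moments, and resid returns the residuals.

```r
library(lavaan)

# Assumption: MASS::Boston stands in for the housing data discussed in the text.
data(Boston, package = "MASS")
Boston$nox100 <- Boston$nox * 100  # illustrative rescaling constant

fit <- sem('
  medv   ~ crim + nox100
  nox100 ~ rad
', data = Boston)

# Observed (sample) covariance matrix.
print(lavInspect(fit, "sampstat"))

# Model-implied covariance matrix.
print(fitted(fit))

# Residuals expressed as differences between observed and implied correlations.
res_cor <- resid(fit, type = "cor")
print(res_cor$cov)
```

Large entries in the residual correlation matrix point to the pairs of variables whose association the model reproduces poorly.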


Here, we ask for residuals in correlational units, which can be more intuitive than dealing with covariances that are unstandardized.

Moreover, Iacobucci accompanied the paper with a data-set and step-by-step explanation of the syntax to analyze it. This allowed all the adventurous readers willing to start with structural equation modeling to reproduce her analysis.

Now, however, with lavaan available, when I encountered the paper again I saw an opportunity to make her approach reproducible by a wider audience since lavaan is available to anyone and its syntax is very intuitive.


NOTE: all code is stored in a script file which is available here together with the data file. The data Iacobucci used are displayed in the covariance matrix in table 2 of her paper. I copied the data into a text file, which I named iacobucci2. I pasted the data into a text file to avoid cluttering my blog post with meaningless numbers. Before transforming the data into a covariance matrix I transposed the data from column to row.

Another important bit of information is the number of respondents. This must be assigned to the sample.nobs argument. In fact, lavaan does not require specifying matrices and such, but only the specification of which variables load on which factors. This is as simple as writing syntax resembling equations in which a factor is composed of the sum of the variables contributing to that factor.

Below is the model specification:

Note that in the model specification above the latent variable repeaT has a capital T. This is to distinguish it from the flow-control construct repeat (see here), a reserved word which initiates a repeat loop.
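Since Iacobucci's covariance matrix is not reproduced here, the sketch below shows the same workflow (covariance matrix plus number of respondents as input) on lavaan's built-in HolzingerSwineford1939 data. The factor names visual, textual and speed belong to that example, not to Iacobucci's model.

```r
library(lavaan)

# Stand-in data: compute a covariance matrix from a built-in data set,
# as if only the published matrix were available.
vars <- paste0("x", 1:9)
S <- cov(HolzingerSwineford1939[, vars])
n <- nrow(HolzingerSwineford1939)

# Each factor is written as the "sum" of the variables loading on it.
cov_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

# Fit from the covariance matrix; the number of respondents goes to sample.nobs.
cov_fit <- cfa(cov_model, sample.cov = S, sample.nobs = n)
summary(cov_fit)
```

When the matrix is only available as text (for example the lower triangle copied from a paper), lavaan's getCov function can turn it into a full symmetric matrix first.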

Iacobucci first addresses the fit of the model. Pasting the syntax below into the R console makes lavaan print the indexes used by Iacobucci. The code may appear convoluted because of the formatting instructions. To increase readability I wrapped all the indexes into a line through the command paste and print them to screen with the print command:

A quick comparison between the output printed in the R console and Iacobucci's output shows that the two outputs are consistent and almost identical.

Factor loadings can be extracted using the parameterEstimates function. Note that some p values are NA. These belong to the variables whose factor loadings were fixed to one, also known as marker variables.

By convention, the first variable contributing to each factor is assigned the role of marker variable. Marker variables are constrained to 1. This is to provide a unit of measurement for the latent variables, which do not have a scale of their own since they are not directly observed. In R NAs can be filtered out with the function na. Furthermore, we can check if factor loadings were significant by testing whether they were below.

Significance of factor loadings is important because a nonsignificant loading implies that the variable should not be included in the model. A quick way to check whether all the variables included in the model contributed significantly to their respective factors is to check whether the number of p values below the threshold equals the number of tests. R-trickery comes in handy to solve these questions. Since boolean vectors in R behave as series of 0 (false) and 1 (true) in arithmetic, we can check if the sum of the vector (the number of TRUEs) matches its length (the number of tests).
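The check described above could look like the sketch below. It uses lavaan's built-in HolzingerSwineford1939 example instead of Iacobucci's data, and the threshold alpha = 0.05 is my assumption.

```r
library(lavaan)

# Stand-in model on a built-in data set.
hs_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
hs_fit <- cfa(hs_model, data = HolzingerSwineford1939)

# Keep only the factor loadings (operator "=~").
pe <- parameterEstimates(hs_fit)
loadings <- pe[pe$op == "=~", ]

# Marker variables were fixed to 1, so they have no p value (NA).
is_marker <- is.na(loadings$pvalue)
p <- loadings$pvalue[!is_marker]

# Sum of a logical vector counts the TRUEs; compare it with the number of tests.
alpha <- 0.05  # assumed significance threshold
all_significant <- sum(p < alpha) == length(p)
print(all_significant)
```

The same result can be obtained with all(p < alpha); the sum/length form is shown because it mirrors the reasoning in the text.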

Wrapping this code into an if statement we can print to console whether or not all our factor loadings were significant:

In our case the line above returns an empty array, confirming Iacobucci's findings that all the factor loadings were significant. The range spanned by the factor loadings can be extracted using the range function on either the est or std.

Lavaan is a free open source package for latent variable modeling in R. The name lavaan refers to latent variable analysis.

Lavaan can be used to estimate a variety of statistical models: path analysis, structural equation models (SEM) and confirmatory factor analyses (CFA).


The adolescents were asked:

The questions on life satisfaction can be divided into four domains. We rename the variables in such a way that it becomes clear which LS domain they belong to.

To bring the order of the variables in line with their content domains, we also change the order using select. The ML estimation assumes a multivariate normal distribution. Both skewness and kurtosis are very high, pointing to multivariate outliers that might inflate standard errors of coefficients (the coefficients themselves are mostly unaffected by non-normality). By removing univariate outliers we also want to make sure that influential data points play no role for the factor solution.

The calculation of a CFA with lavaan is done in two steps: in the first step, a model defining the hypothesized factor structure has to be set up; in the second step this model is estimated using cfa. This function takes as input the data as well as the model definition.

Model definitions in lavaan all follow the same type of syntax. In the syntax, certain characters (operators) are predefined and a number of default settings are applied. For example, by default the scaling of the latent variable is achieved by fixing the loading of the first indicator (manifest variable) for a certain latent variable to the value of 1.


The parameters of the model do not have to be explicitly defined. However, they could be defined explicitly and we will see below that this is necessary in some situations. Another default is that factor variances and covariances are automatically specified for all latent variables in a CFA.


Variances are defined as covariances of a variable with itself. The same holds for the residual variances of the manifest variables but not for potential residual covariances!
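To illustrate these defaults, the sketch below fits the same CFA twice: once relying on the defaults and once spelling out the marker loadings and the factor variances and covariances. The life satisfaction data are not available here, so lavaan's built-in HolzingerSwineford1939 data serve as a stand-in.

```r
library(lavaan)

# Step 1a: model definition relying on the cfa() defaults.
model_default <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

# Step 1b: the same model with the defaults written out explicitly.
model_explicit <- '
  visual  =~ 1*x1 + x2 + x3      # first loading fixed to 1
  textual =~ 1*x4 + x5 + x6
  speed   =~ 1*x7 + x8 + x9

  visual  ~~ visual              # factor variances
  textual ~~ textual
  speed   ~~ speed

  visual  ~~ textual             # factor covariances
  visual  ~~ speed
  textual ~~ speed
'

# Step 2: estimate both models with cfa().
fit_default  <- cfa(model_default,  data = HolzingerSwineford1939)
fit_explicit <- cfa(model_explicit, data = HolzingerSwineford1939)

# Both specifications describe the same model.
chisq_default  <- unname(fitMeasures(fit_default,  "chisq"))
chisq_explicit <- unname(fitMeasures(fit_explicit, "chisq"))
print(c(default = chisq_default, explicit = chisq_explicit))
```

The residual variances of the indicators are left to the defaults in both versions; they could be written out the same way with lines such as x1 ~~ x1.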

However, the direct execution of this syntax results in only a very limited output, containing only the number of estimated parameters, the number of observations, and the chi-square statistics. Therefore, the result of cfa must first be assigned to an output object, which can then be passed to summary. For the summary function there are some additional arguments, such as fit.measures. For theoretical reasons, we first estimate a model with four factors. We define one factor for each content domain of life satisfaction (school, self, friends, family).

In addition, as usual we postulate a simple structure. The lavInspect function allows extracting information from a lavaan object. The argument what specifies which information should be extracted.
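A short sketch of lavInspect with the what argument, used here to pull out the observed and model-implied covariance matrices and subtract one from the other. The model and data are lavaan's built-in HolzingerSwineford1939 example, used as a stand-in for the life satisfaction data.

```r
library(lavaan)

hs_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
hs_fit <- cfa(hs_model, data = HolzingerSwineford1939)

# Observed (empirical) variance-covariance matrix.
observed <- lavInspect(hs_fit, what = "sampstat")$cov

# Variance-covariance matrix implied by the model.
implied <- lavInspect(hs_fit, what = "implied")$cov

# Residual matrix: observed minus implied.
residual_matrix <- observed - implied
print(round(residual_matrix, 3))
```

Values close to zero indicate that the model reproduces that particular variance or covariance well.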


The smaller the differences between these two matrices, the better the model fits the data. The residual matrix results from the subtraction of the variance-covariance matrix implied by the model from the observed empirical variance-covariance matrix.

I wrote this brief introductory post for my friend Simon. In the specific case of mediation analysis the transition to R can be very smooth because, thanks to lavaan, the R knowledge required to use the package is minimal.

Analysis of mediator effects in lavaan requires only the specification of the model; all the other processes are automated by the package. So, after reading in the data, running the test is trivial. This time, to keep the focus on the mediation analysis I will skip reading in the data and generate a synthetic dataset instead. Otherwise I would have to spend the next paragraph explaining the dataset and the variables it contains, and I really want to focus only on the analysis.

As shown on the lavaan website, performing a mediation analysis is as simple as typing in the code below:

For multiple mediators one simply needs to extend the model, recycling the code of the first mediator variable:
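Starting with the single-mediator case, the sketch below follows the simple mediation example from the lavaan tutorial, with a synthetic data set. The effect sizes used to generate the data are arbitrary.

```r
library(lavaan)

# Synthetic data: X affects M, and M affects Y.
set.seed(1234)
X <- rnorm(100)
M <- 0.5 * X + rnorm(100)
Y <- 0.7 * M + rnorm(100)
Data <- data.frame(X = X, Y = Y, M = M)

med_model <- '
  # direct effect
  Y ~ c*X
  # mediator
  M ~ a*X
  Y ~ b*M
  # indirect effect (a*b)
  ab := a*b
  # total effect
  total := c + (a*b)
'

med_fit <- sem(med_model, data = Data)
summary(med_fit)
```

The labels a, b and c name the paths, and the := operator defines new parameters (here the indirect and total effects) as functions of those labels.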

Note that with multiple mediators we must add the covariance of the two mediators to the model. Covariances are added using the notation below:
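A sketch of the two-mediator model, including the covariance between the mediators written with the ~~ operator. The data are synthetic and the generating coefficients, sample size and variable names (M1, M2) are arbitrary choices of mine.

```r
library(lavaan)

# Synthetic data with two mediators.
set.seed(1234)
n  <- 200
X  <- rnorm(n)
M1 <- 0.5 * X + rnorm(n)
M2 <- 0.4 * X + rnorm(n)
Y  <- 0.7 * M1 + 0.4 * M2 + rnorm(n)
Data2 <- data.frame(X = X, M1 = M1, M2 = M2, Y = Y)

multi_model <- '
  # outcome
  Y  ~ b1*M1 + b2*M2 + c*X
  # mediators
  M1 ~ a1*X
  M2 ~ a2*X
  # indirect effects and total effect
  indirect1 := a1*b1
  indirect2 := a2*b2
  total     := c + (a1*b1) + (a2*b2)
  # covariance between the two mediators
  M1 ~~ M2
'

multi_fit <- sem(multi_model, data = Data2)
summary(multi_fit)
```

With the mediator covariance included, this model is saturated, so it has zero degrees of freedom.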


There are two ways to test the null hypothesis that the indirect effects are equal to each other. The first is to specify a contrast for the two indirect effects.

In the definition of the contrast the two indirect effects are subtracted. If it is significant the two indirect effects differ. The second option to determine whether the indirect effects differ is to set a constraint in the model specifying the two indirect effects to be equal. Then, with the anova function one can compare the models and determine which one is better.
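Both options can be sketched as follows, on synthetic data with arbitrary generating coefficients. Writing the constraint as a1*b1 == a2*b2 is one way to express it in lavaan syntax; the original post may have phrased it differently.

```r
library(lavaan)

# Synthetic data with two mediators.
set.seed(1234)
n  <- 200
X  <- rnorm(n)
M1 <- 0.5 * X + rnorm(n)
M2 <- 0.4 * X + rnorm(n)
Y  <- 0.7 * M1 + 0.4 * M2 + rnorm(n)
Data2 <- data.frame(X = X, M1 = M1, M2 = M2, Y = Y)

base_syntax <- '
  Y  ~ b1*M1 + b2*M2 + c*X
  M1 ~ a1*X
  M2 ~ a2*X
  indirect1 := a1*b1
  indirect2 := a2*b2
  M1 ~~ M2
'

# Option 1: define a contrast (difference of the two indirect effects).
contrast_model <- paste(base_syntax, 'contrast := (a1*b1) - (a2*b2)', sep = "\n")
contrast_fit <- sem(contrast_model, data = Data2)
pe <- parameterEstimates(contrast_fit)
print(pe[pe$label == "contrast", ])

# Option 2: constrain the indirect effects to be equal and compare the models.
constrained_model <- paste(base_syntax, 'a1*b1 == a2*b2', sep = "\n")
constrained_fit <- sem(constrained_model, data = Data2)
comparison <- anova(contrast_fit, constrained_fit)
print(comparison)
```

In the first option the row labelled contrast carries the test; in the second, the chi-square difference test reported by anova plays that role.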

Including the constraint and comparing the models is simple:

In my case the test is not significant so there is no evidence that the indirect effects are different. For these toy models there is no further need of customizing the calls to sem. However, when performing a proper analysis one might prefer to have bootstrapped confidence intervals.

Bootstrap confidence intervals can be extracted with the function calls (1) summary, (2) parameterEstimates, or (3) bootstrapLavaan. Note that bootstrapLavaan will re-compute the bootstrap samples, requiring you to wait as long as it took the sem function to run if called with the bootstrap option. Since this post is longer than I wanted it to be, I will leave it as a brief introduction to mediation with lavaan.
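A sketch of option (2), requesting bootstrapped standard errors in the call to sem and reading the intervals with parameterEstimates. The number of bootstrap draws (200) is kept small only so the example runs quickly, and the percentile interval type is my choice.

```r
library(lavaan)

# Synthetic single-mediator data.
set.seed(1234)
X <- rnorm(100)
M <- 0.5 * X + rnorm(100)
Y <- 0.7 * M + rnorm(100)
Data <- data.frame(X = X, Y = Y, M = M)

med_model <- '
  Y ~ c*X
  M ~ a*X
  Y ~ b*M
  ab := a*b
'

# se = "bootstrap" triggers resampling; 200 draws is an illustrative, low number.
boot_fit <- sem(med_model, data = Data, se = "bootstrap", bootstrap = 200)

# Percentile bootstrap confidence intervals for all parameters.
boot_pe <- parameterEstimates(boot_fit, boot.ci.type = "perc", level = 0.95)
print(boot_pe[boot_pe$label == "ab", c("label", "est", "ci.lower", "ci.upper")])
```

For a real analysis a much larger number of draws is commonly used, at the cost of a longer wait.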

In this follow up post I describe multiple mediation with lavaan using an actual dataset. On github is the whole code in one .R file. Here is the first post.

The lavaan package is an excellent package for structural equation models, and the DiagrammeR package is an excellent package for producing nice looking graph diagrams.

As of right now, the lavaan package has no built in plotting functions for models, and the available options from external packages don't look as nice and aren't as easy to use as DiagrammeR, in my opinion.

Of course, you can use DiagrammeR to build path diagrams for your models, but it requires you to build the diagram specification manually. This package exists to streamline that process, allowing you to plot your lavaan models directly, without having to translate them into the DOT language specification that DiagrammeR uses. The package is very straightforward to use: simply call the lavaanPlot function with your lavaan model, adding whatever graph, node and edge attributes you want as a named list (graph attributes are specified as a standard default value that shows you what the other attribute lists should look like).

For your reference, the available attributes can be found here.

First fit your lavaan model. The package supports plotting lavaan regression relationships and latent variable - indicator relationships. Then, using that model fit object, simply call the lavaanPlot function, specifying your desired graph parameters. You can also label the plot edges with the coefficient values (standardized or not) for significant paths, and you can specify whatever significance level you want, so you can plot values for whatever coefficients you want.
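The workflow described above might look like the sketch below. The regression model on the built-in mtcars data and the specific node and edge attributes are illustrative choices; the plotting call is guarded so the script still runs when lavaanPlot is not installed.

```r
library(lavaan)

# First fit a lavaan model (illustrative regressions on a built-in data set).
plot_model <- '
  mpg  ~ cyl + disp + hp
  qsec ~ disp + hp + wt
'
plot_fit <- sem(plot_model, data = mtcars)

# Then pass the fitted object to lavaanPlot, with attributes as named lists.
if (requireNamespace("lavaanPlot", quietly = TRUE)) {
  diagram <- lavaanPlot::lavaanPlot(
    model = plot_fit,
    node_options = list(shape = "box", fontname = "Helvetica"),
    edge_options = list(color = "grey"),
    coefs = TRUE,   # label edges with coefficient values
    stand = TRUE,   # use standardized coefficients
    sig = 0.05      # only label paths significant at this level
  )
  print(diagram)
}
```

Setting coefs = FALSE gives the bare path diagram without edge labels.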
