## Exploratory Factor Analysis using R


**Cronbach's Alpha** - it is used to measure **internal consistency (reliability)**. It is mostly used when we have a Likert-scale questionnaire whose items form a scale and we wish to know whether the scale is reliable. It is based on the correlations between the items of the same test. Example - internal consistency means that if a respondent says he likes milk products, then says he likes curd, then says he has liked milk products for a long time, and then says he dislikes a diet without milk products, the data show high reliability: all the answers are in synchronisation. Conversely, if I say I don't like running, and in the next question say I like travelling on foot, the reliability goes down because it is a contradiction. A good alpha lies between .7 and .9. A value above .9 (e.g. .95) may suggest a high level of item redundancy, i.e. a number of items are asking the same question in slightly different ways.

> shape.v13 <- factor(file$v13.shape, levels = c("Strongly Disagree","Disagree","Neutral","Agree","Strongly Agree"), ordered = T) # this arranges the items in ascending order

To get the numeric values, one can use:

> as.numeric(shape.v13) # it will yield results as ranks 1,2,3,4,5, which is suitable for the Cronbach's alpha analysis

After making a data frame of the items that need to undergo the Cronbach's alpha analysis, we can use the "psych" package and its alpha() function to find Cronbach's alpha.
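The steps above can be sketched end-to-end as follows. This is a minimal sketch: only v13.shape comes from the note; the other column names (v14.size, v15.colour) and the data frame name `items` are illustrative assumptions.

```r
# Illustrative sketch: Cronbach's alpha with the psych package.
# Column names v14.size and v15.colour are hypothetical examples.
library(psych)

levels5 <- c("Strongly Disagree","Disagree","Neutral","Agree","Strongly Agree")

# Convert each Likert item to an ordered factor, then to numeric ranks 1-5
items <- data.frame(
  shape  = as.numeric(factor(file$v13.shape,  levels = levels5, ordered = TRUE)),
  size   = as.numeric(factor(file$v14.size,   levels = levels5, ordered = TRUE)),
  colour = as.numeric(factor(file$v15.colour, levels = levels5, ordered = TRUE))
)

# Cronbach's alpha for the scale formed by these items
alpha(items)
```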

- Before starting factor analysis we need to test whether the sample is adequate. To check adequacy we run the KMO (Kaiser-Meyer-Olkin) test; scores close to 1 are good, while scores towards 0 are bad. Below is the code for R.

> KMO(correlations)

- Bartlett's test is used to find whether the variables are correlated - if non-significant (p > .05) the variables are not correlated; if significant (p < .05) the variables are correlated, which is good for EFA, because some correlation among the variables is expected
- To do Bartlett's test we use the following code in R

>cortest.bartlett(correlations, n = nrow(dataset))
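Putting the two adequacy checks together, a minimal sketch (assuming `dataset` is a data frame of numeric Likert items):

```r
library(psych)

# Correlation matrix of the numeric Likert items
correlations <- cor(dataset)

# KMO measure of sampling adequacy: values near 1 are good
KMO(correlations)

# Bartlett's test of sphericity: significant (p < .05) means the
# variables are correlated, which is what EFA needs
cortest.bartlett(correlations, n = nrow(dataset))
```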

- Libraries needed to perform EFA: car, psych, GPArotation
- To reverse-code a value we use the recode() command (from the car package), where p is the dataset with song2 as one of its columns

- Sample size: at least 10-15 respondents per variable; a total sample below 100 is not acceptable; around 300 is most appropriate.
- At least 3-4 variables per factor are necessary
- R's base package has the factanal() function, which performs EFA using the maximum-likelihood estimation technique. The psych package is more advanced, with multi-functionality options.
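The reverse-coding step mentioned above can be sketched as follows, assuming a 5-point Likert item; the names `p` and `song2` follow the note and are illustrative:

```r
library(car)

# Reverse-code a 5-point Likert item: 1<->5, 2<->4, 3 stays 3
p$song2 <- recode(p$song2, "1=5; 2=4; 3=3; 4=2; 5=1")

# Equivalent arithmetic shortcut for a 1-5 scale:
# p$song2 <- 6 - p$song2
```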

Steps of Factor Analysis using R

__There are 2 techniques to find the no. of factors, which we will discuss.__

**1. Parallel Analysis**

> install.packages("psych")

> install.packages("GPArotation") # install these two packages and then use library() to bring them into action

> data <- read.csv(file.choose(), header = T) # bring in the Likert-scale data

# now we will use the parallel analysis technique, which uses a scree plot as a base to suggest the no. of factors from the data

> parallel <- fa.parallel(data, fm = "minres", fa = "fa") # if you don't set fa = "fa", the scree plot will show both PCA & FA

# below is the result

Parallel analysis suggests that the number of factors = 5 and the number of components = NA

The above structure is wrong to show in a paper, as it shows eigenvalues corresponding to factors, but we need eigenvalues corresponding to the variables/components, which can be found using the command


> VSS.scree(analyse) # use this plot in the research paper

COMMUNALITY

We check communality on the basis of PCA, not of factor analysis. Use PCA to remove variables on the basis of communality: PCA uses the complete variance, whereas FA uses only the shared variance, not the unique variance.

> principal(r = analyse, nfactors = 6, rotate = "oblimin") # PCA, for removal of variables with low communality

FA shows lower communality because, first, some variables have already been removed due to the PCA communality check and, second, FA does not use the complete variance - it uses the shared variance.

**Velicer's Minimum Average Partial (MAP)** or the **Very Simple Structure (VSS) criterion** is also used to find the no. of factors; one can use

> VSS(dataset)

**Exploratory Factor Analysis**

https://www.promptcloud.com/blog/exploratory-factor-analysis-in-r/

One can see the complete details of exploratory factor analysis at the above link.

The two major commands that will be used are below; the details of all the components are discussed at the above link.

> threefactor <- fa(data, nfactors = 5, rotate = "oblimin", fm = "minres")

> print(threefactor$loadings,cutoff = .3)

Now the aim is that the obtained factor matrix should have (a) no cross-loadings and (b) no component without any loading; to manage this you can play with nfactors and the cutoff in the two major commands above.

At the end, variables with cross-loadings or no loadings need to be removed.

After getting the factor loadings and results, we can make a diagram to see the factors and their variables

> fa.diagram(threefactor)

Kaiser Criterion (determine no. of factors)

The Kaiser criterion is also used to find the no. of factors; it is based on eigenvalues. The data are first converted into a correlation matrix, from which the eigenvalues are found. We discard factors having eigenvalues less than 1.

It is assumed at the beginning that the no. of factors equals the no. of variables; we then shortlist them on the basis of eigenvalues - those having eigenvalues less than 1 are discarded as factors. See the code below.

> alleigenvalue <- eigen(cor(data))

> alleigenvalue$values # it will show the eigenvalues column-wise; discard all less than 1

For more details visit the link http://yatani.jp/teaching/doku.php?id=hcistats:fa

Below is the Link for factor analysis using base package in r

https://www.youtube.com/watch?v=Ilf1XR-K3ps

Exploratory Factor Analysis - Result output analysis using R

The EFA output includes a chi-square goodness-of-fit test. Its null hypothesis is that "the model described by the factors predicts the data well". So in the results we want the null hypothesis to be retained, i.e. p > .05; the larger the p-value, the better the fit.
