The Quality of a Composite Score of a Concept with Reflective Indicators
Complex concepts with reflective indicators are latent variables that affect several observed indicators. An example is the latent variable “Job satisfaction”, with indicators such as “how satisfied are you with your job”, “how much do you like your job”, or “would you choose this job again”. The answers to these questions are determined by the latent variable “Job satisfaction”. A causal model, specified as a factor model, can be used to describe these relationships. Normally, a weighted sum score of the responses to the indicators is used as a measure of the latent variable.
This situation is presented in the picture below. The observed answers to the three questions are represented by $Y_{1}$, $Y_{2}$, $Y_{3}$. These observed responses contain errors ($e_{i}^{'}$). The composite score ($y_{cs}$) is the weighted sum of these three variables, so it will also contain errors. The question to answer is therefore: how good is this composite score as a measure of the complex concept satisfaction ($F$), given the indicators $F_{1}$, $F_{2}$, $F_{3}$ and the systematic effects connected with the chosen method ($M$)?
Estimating the weights and the loadings
From SQP one can get information about the reliability coefficient ($r_{i}$) and validity coefficient ($v_{i}$) for all three or more single questions while the coefficient $\mu_{i}$ can be computed because $\mu_{i}^2= 1-v_{i}^2$. That means that in the model only the effects of $F$ on its indicators ($F_{i}$) and the weights ($w_{i}$) are unknown.
The weights can be chosen to be equal to 1 or to differ from 1; this is up to the researcher. Lawley and Maxwell (1971) have suggested an approach to determine the weights that maximize the quality (correlation) of the composite score for the complex concept. Given the weights, the values of the composite score can be computed for all cases, so its variance and standard deviation are also known. To ensure that the composite score has a variance of 1, like all other variables, the chosen weights have to be divided by ${\sigma _{y_{cs}}}$, the standard deviation of the composite score before standardization.
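The rescaling of the weights can be sketched as follows. This is only an illustration: the function name `standardize_weights` and the correlation matrix are hypothetical, and the variance of the composite score is computed from the covariance (here correlation) matrix of the observed variables as $w^{T}\Sigma w$ rather than from case-level data.

```python
import numpy as np

def standardize_weights(weights, cov_y):
    """Rescale weights so that the composite score has variance 1.

    weights : chosen weights w_i (for example, all equal to 1)
    cov_y   : covariance or correlation matrix of the observed Y_i
    """
    w = np.asarray(weights, dtype=float)
    S = np.asarray(cov_y, dtype=float)
    var_cs = w @ S @ w          # variance of the unstandardized composite
    sd_cs = np.sqrt(var_cs)     # sigma_{y_cs}
    return w / sd_cs, sd_cs

# Hypothetical correlation matrix of three observed questions
R = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.0]])

w_std, sd_cs = standardize_weights([1, 1, 1], R)
var_standardized = w_std @ R @ w_std   # variance after rescaling
```

With unit weights the unstandardized variance is the sum of all elements of the matrix, and the rescaled weights give a composite score with variance 1.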
It can be shown that on the basis of the correlations between the observed variables $Y_{1}$, $Y_{2}$, $Y_{3}$ the correlations between the latent variables $F_{1}$, $F_{2}$,$F_{3}$ can be obtained.
On the basis of these correlations, the effects $\lambda_{i}$ of the concept of interest on the indicators can be estimated if there are at least 3 indicators. If only two questions are used, $\lambda_{1}$ and $\lambda_{2}$ can only be estimated by assuming that these two coefficients are equal. Because the indicators will not all perfectly represent the complex concept of interest, we also introduce unique components ($u_{1}$, $u_{2}$, $u_{3}$) for the indicators. They are another source of error.
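A minimal sketch of these two steps is given here, under assumptions that are not spelled out in the text: all variables are standardized, all questions share the same method factor $M$, and $F$, $M$, the unique components and the random errors are mutually uncorrelated. Under those assumptions the model implies $\rho(Y_{i},Y_{j}) = r_{i}v_{i}\,\rho(F_{i},F_{j})\,v_{j}r_{j} + r_{i}\mu_{i}\mu_{j}r_{j}$ for $i \neq j$, which can be solved for $\rho(F_{i},F_{j})$. For exactly three indicators, $\rho(F_{i},F_{j}) = \lambda_{i}\lambda_{j}$ gives a closed-form solution for the loadings; in practice a factor analysis program would normally be used instead. The function names and all numeric values are hypothetical.

```python
import numpy as np

def correct_correlations(R_y, r, v):
    """Correlations between the F_i implied by the observed correlations.

    Assumes standardized variables and one method factor shared by all items.
    """
    R_y = np.asarray(R_y, dtype=float)
    r = np.asarray(r, dtype=float)
    v = np.asarray(v, dtype=float)
    q = r * v                        # quality coefficients r_i * v_i
    m = r * np.sqrt(1.0 - v**2)      # method effects r_i * mu_i
    R_f = (R_y - np.outer(m, m)) / np.outer(q, q)
    np.fill_diagonal(R_f, 1.0)       # the F_i are standardized
    return R_f

def loadings_from_three(R_f):
    """Closed-form loadings for exactly three indicators of one factor."""
    l1 = np.sqrt(R_f[0, 1] * R_f[0, 2] / R_f[1, 2])
    l2 = np.sqrt(R_f[0, 1] * R_f[1, 2] / R_f[0, 2])
    l3 = np.sqrt(R_f[0, 2] * R_f[1, 2] / R_f[0, 1])
    return np.array([l1, l2, l3])

# Hypothetical reliability and validity coefficients (as obtained from SQP)
r = np.array([0.90, 0.85, 0.80])
v = np.array([0.95, 0.90, 0.92])
# Hypothetical loadings, used only to build a model-consistent example
lam_true = np.array([0.8, 0.7, 0.6])

q = r * v
m = r * np.sqrt(1.0 - v**2)
R_y = np.outer(q * lam_true, q * lam_true) + np.outer(m, m)
np.fill_diagonal(R_y, 1.0)

R_f = correct_correlations(R_y, r, v)
lam_hat = loadings_from_three(R_f)   # recovers lam_true in this example
```

Because the example matrix is generated from the model itself, the recovered loadings equal the ones used to build it; with real data the corrected correlations contain sampling error and the closed-form solution is only a rough estimate.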
The quality of $Y_{cs}$
Given this specification of the model it follows
$Y_{i} = r_{i}v_{i}\lambda_{i}F+r_{i}\mu_{i}M+r_{i}v_{i}u_{i}+e_{i}$ for all $i$
The unstandardized composite score is denoted by $y_{cs}$.
$y_{cs} = \sum_{i=1}^{k}w_{i}Y_{i}$ with variance $\sigma_{y_{cs}y_{cs}}$ and standard deviation $\sigma_{y_{cs}}$
In order to standardize $y_{cs}$, the weights are divided by $\sigma_{y_{cs}}$. Writing $q_{i}=r_{i}v_{i}$ for the quality coefficient of question $i$, this gives
$Y_{cs} = \sum_{i=1}^{k}\frac{w_{i}}{\sigma _{y_{cs}}}Y_{i}=\sum_{i=1}^{k}\frac{w_{i}}{\sigma _{y_{cs}}}(q_{i}\lambda_{i}F+q_{i}u_{i}+r_{i}\mu_{i}M+e_{i})$
We can also write
$Y_{cs} = \sum_{i=1}^{k}\frac{w_{i}}{\sigma _{y_{cs}}}r_{i}v_{i}\lambda_{i}F+\sum_{i=1}^{k}\frac{w_{i}}{\sigma _{y_{cs}}}r_{i}\mu_{i}M+\sum_{i=1}^{k}\frac{w_{i}}{\sigma _{y_{cs}}}(r_{i}v_{i}u_{i}+e_{i})$
or
$Y_{cs} = q_{_{Y_{cs}}}F+m_{_{Y_{cs}}}M+\sum_{i=1}^{k}\frac{w_{i}}{\sigma _{y_{cs}}}(r_{i}v_{i}u_{i}+e_{i})$
where
$q_{_{Y_{cs}}}=\sum_{i=1}^{k} \frac{w_{i}}{\sigma_{y_{cs}}}(r_{i}v_{i}\lambda_{i})$
$m_{_{Y_{cs}}}=\sum_{i=1}^{k} \frac{w_{i}}{\sigma_{y_{cs}}}(r_{i}\mu_{i})$
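The two coefficients can be computed directly from these sums. The sketch below is an illustration under assumptions: the function name `composite_coefficients` and all numeric values are hypothetical, and $\sigma_{y_{cs}}$ is obtained from the correlation matrix of the observed variables as $\sqrt{w^{T}Rw}$. The example matrix is generated from the model itself, with standardized and mutually uncorrelated $F$ and $M$.

```python
import numpy as np

def composite_coefficients(w, R_y, r, v, lam):
    """Return (q_cs, m_cs) for the standardized composite score.

    q_cs = sum_i (w_i / sd_cs) * r_i * v_i * lambda_i
    m_cs = sum_i (w_i / sd_cs) * r_i * mu_i, with mu_i^2 = 1 - v_i^2
    """
    w, r, v, lam = (np.asarray(a, dtype=float) for a in (w, r, v, lam))
    R_y = np.asarray(R_y, dtype=float)
    mu = np.sqrt(1.0 - v**2)
    sd_cs = np.sqrt(w @ R_y @ w)     # sd of the unstandardized composite
    w_std = w / sd_cs                # standardized weights
    q_cs = np.sum(w_std * r * v * lam)
    m_cs = np.sum(w_std * r * mu)
    return q_cs, m_cs

# Hypothetical coefficients for three questions
r = np.array([0.90, 0.85, 0.80])     # reliability coefficients
v = np.array([0.95, 0.90, 0.92])     # validity coefficients
lam = np.array([0.8, 0.7, 0.6])      # loadings of the indicators on F

# Model-implied correlation matrix of the observed variables
q = r * v
m = r * np.sqrt(1.0 - v**2)
R_y = np.outer(q * lam, q * lam) + np.outer(m, m)
np.fill_diagonal(R_y, 1.0)

q_cs, m_cs = composite_coefficients([1, 1, 1], R_y, r, v, lam)
```

Since the standardized composite score has variance 1, $q_{_{Y_{cs}}}^2 + m_{_{Y_{cs}}}^2$ cannot exceed 1 when the correlation matrix is consistent with the model; the remainder is the variance of the weighted unique components and random errors.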
The figure shown above can therefore be simplified to:
The quality of the composite score as a measure of $F$ is equal to $q_{_{Y_{cs}}}^2$.
The variance of the unstandardized complex concept corrected for measurement error is presented below:
$\sigma_{_{ff}}=q_{_{Y_{cs}}}^2\sigma_{y_{cs}y_{cs}}$
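As a small numeric illustration of this formula, with hypothetical input values: a composite score with $q_{_{Y_{cs}}}=0.8$ has a quality of $0.64$, so 64% of the variance of the unstandardized composite score is attributed to $F$. The function name `corrected_variance` is an assumption made for this sketch.

```python
def corrected_variance(q_cs, var_cs):
    """sigma_ff = q_cs^2 * variance of the unstandardized composite score."""
    return q_cs**2 * var_cs

# Hypothetical inputs: quality coefficient 0.8, unstandardized variance 5.4
quality = 0.8**2
sigma_ff = corrected_variance(q_cs=0.8, var_cs=5.4)   # 0.64 * 5.4
```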
The proofs of these results can be found in:
DeCastellarnau, A. & Saris, W. (2021). Correcting correlation and covariance matrices for measurement errors before further analysis. Structural Equation Modeling: A Multidisciplinary Journal. https://doi.org/10.1080/10705511.2020.1870229