Assessing Whether Measurement Invariance of the KIDSCREEN-27 across Child-Parent Dyad Depends on the Child Gender: A Multiple Group Confirmatory Factor Analysis

This study aims to assess the measurement invariance (MI) of the KIDSCREEN-27 questionnaire across girl-parent and boy-parent dyads to clarify how child gender affects the agreement between children’s and parents’ perception of the meaning of the items in the questionnaire. The child self-reports and parent proxy-reports of the KIDSCREEN-27 were completed by 1061 child-parent dyads. Multiple group categorical confirmatory factor analysis (MGCCFA) was applied to assess MI. The non-invariant items across girl-parent dyads were mostly detected in the psychological well-being and the social support and peers domains, whereas the boys and their parents differed mainly in the autonomy and parent relations domain. Detecting different non-invariant items across girl-parent dyads compared to boy-parent dyads underlines the importance of taking the child’s gender into account when assessing measurement invariance between children and their parents and, consequently, when deciding about children’s physical, psychological or social well-being from the parents’ viewpoint.

simply evaluated by comparing the mean of HRQOL scores. For a meaningful interpretation of mean differences, an essential assumption known as measurement invariance (MI) should be established (Meredith & Teresi, 2006; Teresi & Fleishman, 2007). MI shows that respondents from different groups interpret the concept of particular items similarly (Cheung & Rensvold, 2002; Vandenberg & Lance, 2000; Huang, Shenkman, Leite, Knapp, Thompson, & Revicki, 2009). If MI does not hold, it is not clear whether an observed disparity is a real difference in the underlying construct of interest or an artificial effect of different implicit or explicit interpretations of the items by children and their parents (Cheung & Rensvold, 2002; Kim & Yoon, 2011). To the best of our knowledge, only a limited number of studies have examined the MI of pediatric questionnaires across child-parent dyads in different cultures (Huang et al., 2009; Lin, Luh, Cheng, Yang, Su, & Ma, 2013; Jafari, Bagheri, Hashemi, & Shalileh, 2013; Jafari, Sharafi, Bagheri, & Shalileh, 2013). However, one limitation of these studies is that MI across children and their parents was assessed without taking the child's gender into account. Hence, it is not yet clear whether the agreement between children's and parents' perception of the items depends on the child's gender. This issue is investigated in this study for the first time.
To explore MI, various statistical techniques are available, each with its own advantages and disadvantages (Kankaras, Vermunt, & Moors, 2011). Multiple group categorical confirmatory factor analysis (MGCCFA) is a method that can appropriately model ordered-categorical responses (Kim & Yoon, 2011), but it is rarely used in practice in pediatric HRQOL studies. For example, two of the studies mentioned above (Huang et al., 2009; Lin et al., 2013) used the ordinary linear multiple group confirmatory factor analysis (MGCFA) model, which assumes that the response variable is continuous and normally distributed (Meredith, 1993).
The KIDSCREEN-27 is a well-known international generic HRQOL instrument with parallel child self-reports and parent proxy-reports. Although the MI of the KIDSCREEN has been established across different European countries (Ravens-Sieberer et al., 2007; Robitail, Ravens-Sieberer, Simeoni, Rajmil, Bruil, & Power, 2007), it has not yet been evaluated across child-parent dyads. Therefore, this study uses the MGCCFA approach to examine the MI of the KIDSCREEN-27 across girl-parent and boy-parent dyads to clarify how child gender affects the agreement between children's and parents' perception of the meaning of the items in the questionnaire.

Data and Measure
The target population was composed of 287551 school children aged 8-18 years and their parents in Shiraz, southern Iran, in the academic year 2011-2012. The participants were selected by a two-stage cluster random sampling technique from the four educational districts of Shiraz. In the first stage, 6 middle and 7 high schools in each educational district were chosen randomly by cluster sampling. Within the selected 24 middle schools and 28 high schools, 40 and 50 classes were chosen, respectively, by stratified sampling. Finally, in each class a simple random sample of students was selected. Both the child self-reports and the parent proxy-reports of the Persian version of the KIDSCREEN-27 questionnaire, together with informed parent consent forms, were distributed in each classroom by a trained researcher, who asked the students to take them home to their parents. About 75% of parents signed the informed consent and agreed to participate in the study with their children. Only one parent (mother or father) completed the proxy-report, but no information on the gender of the proxy rater was available in this study. The proxy-reports were signed by the parents to ensure that the parents themselves had completed the questionnaire. All the children and their parents in our sampling population spoke Persian. They filled in the self- and proxy-reports independently at home. Finally, 1061 completed child self-reports and parent proxy-reports of the questionnaire were returned to school by the students. Of the 1061 children, 593 (55.89%) were boys. The mean (± SD) age of the boys was 13.65 ± 2.11 years and that of the girls was 12.70 ± 2.65 years.
original version, and the results of Rasch analysis confirmed that all items belong to their own underlying construct (Jafari, Bagheri, & Safe, 2012).
This instrument encompasses five subscales: physical well-being (5 items), psychological well-being (7 items), autonomy and parent relations (7 items), social support and peers (4 items), and school environment (4 items). All items were scored on a 5-point Likert scale from 1=never to 5=always or from 1=not at all to 5=extremely. For ease of interpretation, the rating scale categories of negatively worded items were reversed so that higher scores indicated better HRQOL.
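The reverse coding described above can be sketched as follows. This is a minimal illustration, not the authors' scoring script: the function names and the item labels (`item_a`, `item_b`) are hypothetical, and which items are negatively worded is not listed in this text.

```python
from typing import Dict, Set


def reverse_score(response: int, low: int = 1, high: int = 5) -> int:
    """Mirror a Likert response so that a higher value means better HRQOL."""
    if not low <= response <= high:
        raise ValueError(f"response {response} is outside {low}-{high}")
    # On a 1-5 scale this maps 1->5, 2->4, 3->3, 4->2, 5->1.
    return low + high - response


def score_items(responses: Dict[str, int], negatively_worded: Set[str]) -> Dict[str, int]:
    """Reverse only the negatively worded items; leave the others unchanged."""
    return {
        item: reverse_score(value) if item in negatively_worded else value
        for item, value in responses.items()
    }


# Hypothetical example: item_b is assumed to be negatively worded.
scored = score_items({"item_a": 2, "item_b": 4}, negatively_worded={"item_b"})
print(scored)
```

Applying the same recoding to both the self-report and the proxy-report keeps the two versions on a common scale before any invariance testing.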

Statistical Analysis
In this study, the MI of each domain of the KIDSCREEN-27 across girl-parent and boy-parent dyads was assessed by the MGCCFA technique. An advantage of MGCCFA is that it can appropriately model ordered-categorical responses by including the threshold structure (Kim & Yoon, 2011; Lubke & Muthen, 2004). Interested readers are encouraged to consult Wirth and Edwards (2007) and Joreskog (2002) for the exact mathematical description and technical information.
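For orientation, the threshold structure can be written in the usual latent-response form. The notation below is an assumption introduced here for illustration (it does not appear in the original text): item j, respondent i, group g (child or parent), one factor per domain, and five response categories.

```latex
% Latent response for item j, respondent i, group g
y^{*}_{ijg} = \tau_{jg} + \lambda_{jg}\,\eta_{ig} + \varepsilon_{ijg},
\qquad \varepsilon_{ijg} \sim N(0,\ \theta_{jg})

% Observed ordered category c is determined by the thresholds \nu_{j,c}
y_{ijg} = c \quad \text{if} \quad \nu_{j,c-1} < y^{*}_{ijg} \le \nu_{j,c},
\qquad c = 1,\dots,5, \quad \nu_{j,0} = -\infty, \quad \nu_{j,5} = +\infty
```

Here \lambda denotes a factor loading, \tau an intercept, and \theta a residual variance; these are the parameters on which invariance constraints are placed. The thresholds carry no group index in this sketch, which corresponds to estimating them from pooled data and holding them fixed across groups.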
In the present study, four levels of invariance (configural, metric, strong, and strict) were tested by imposing four types of constraints on the model parameters in an increasingly restrictive hierarchical order. Configural invariance investigates whether an underlying construct of interest is measured by the same set of items across two groups, i.e., whether children and their parents use the same conceptual framework in their appraisal of the underlying construct (Meredith, 1993). At this level, no equality constraint is imposed on the parameters across groups (Meredith, 1993); however, the same configuration of salient (non-zero) and nonsalient (zero) factor loadings is specified for the items in both groups (Ritchey, Frank, Hursti, & Tuorila, 2003). After establishing configural invariance, the next step is to assess metric invariance, which is a prerequisite for testing higher levels of invariance (Cheung & Rensvold, 2002). Metric invariance is investigated by testing whether the factor loadings of a construct are equal across groups, which implies that the strength of the relations between the scale items and their corresponding underlying constructs is the same across the groups (Wu, Zhen, & Bruno, 2007). Strong invariance requires that, besides the factor loadings, the intercepts of like items are equal across groups. Equality of the intercepts shows that no systematic bias exists in the responses of one group to the given scale items (Wu et al., 2007). Strict invariance is the most constrained model, in which the residual variances of the items should be equal across groups in addition to the intercepts and factor loadings. Strict invariance implies that the scale items measure the latent construct with the same degree of measurement error in both groups (Wu et al., 2007).
www.ccsenet.org/gjhs Global Journal of Health Science Vol. 6, No. 5
In this study the concept of partial measurement invariance is applied. When a certain level of invariance is not satisfied for all the items in a specific domain, the equality constraints on the parameters of the items demonstrating non-invariance between groups are removed one at a time, and these parameters are then estimated freely (Byrne, Shavelson, & Muthen, 1989). Assessing partial invariance allows researchers to identify non-invariant or problematic items, which is the purpose of our study (Sass, 2011). We used the information from the modification indices to identify non-invariant items. A modification index is the expected reduction in the value of the chi-square statistic when a fixed or constrained parameter is freely estimated (Diamantopoulos & Siguaw, 2000); a large modification index for a constrained parameter may therefore indicate non-invariance.
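The one-at-a-time release of constraints can be sketched as a simple loop. This is a hedged illustration under stated assumptions, not the LISREL procedure itself: `fit_model` is a hypothetical callable standing in for an SEM fitting run, and the modification index values in the toy example are invented. The toy outcome loosely mirrors the girl-parent psychological well-being domain, where the loadings of items 2 and 7 were freed.

```python
from typing import Callable, Dict, List, Set, Tuple

# (is the fit acceptable?, modification index per item)
FitResult = Tuple[bool, Dict[str, float]]


def find_partial_invariance(
    items: List[str],
    fit_model: Callable[[Set[str]], FitResult],
) -> Tuple[Set[str], bool]:
    """Free the constraint with the largest modification index until the fit is acceptable."""
    freed: Set[str] = set()
    while True:
        acceptable, mod_indices = fit_model(freed)
        if acceptable:
            return freed, True
        # Only items that are still constrained can be released.
        candidates = {i: mod_indices.get(i, 0.0) for i in items if i not in freed}
        if not candidates:
            return freed, False  # nothing left to release: invariance is rejected
        worst = max(candidates, key=lambda item: candidates[item])
        freed.add(worst)


# Toy stand-in for a model fit (all numbers invented for illustration).
ITEMS = [f"item{k}" for k in range(1, 8)]
TOY_MI = {"item2": 12.0, "item7": 9.0}


def toy_fit_model(freed: Set[str]) -> FitResult:
    acceptable = {"item2", "item7"} <= freed
    return acceptable, {i: TOY_MI.get(i, 1.0) for i in ITEMS}


freed_items, ok = find_partial_invariance(ITEMS, toy_fit_model)
print(sorted(freed_items), ok)
```

Releasing a single constraint per step matters because modification indices are recomputed after every refit, so the ranking of the remaining items can change.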
MI hypotheses are assessed by comparing two nested models. For instance, to test strong invariance, a model requiring equality constraints on factor loadings and intercepts is compared with a model requiring only equal factor loadings across groups. The fit of each model can be assessed with the chi-square (χ²) statistic, and the relative fit of two nested models with the change in it (∆χ²). However, this well-known statistic is not a practical test of model fit because it detects even trivial differences in large samples (Cheung & Rensvold, 2002). Therefore, given the relatively large sample size in this study, although the values of χ² (with df) and ∆χ² were reported, the final decision about accepting or rejecting each hypothesis was based on other fit indices: the comparative fit index (CFI), ΔCFI (the change in CFI between two nested models), and the root mean square error of approximation (RMSEA). Cheung and Rensvold (2002) recommended these indices because they are more robust to model complexity and sample size. A nonsignificant ∆χ², CFI≥0.95, ΔCFI≤0.01, and RMSEA≤0.06 support acceptable model fit (Cheung & Rensvold, 2002).
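The decision rule based on CFI, ΔCFI, and RMSEA can be expressed as a small check. This is a sketch only: the class and function names are hypothetical, ΔCFI is treated as the magnitude of the change between two nested models, and the ∆χ² test is left out because the final decisions in this study rested on the other indices.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FitIndices:
    cfi: float
    rmsea: float
    delta_cfi: Optional[float] = None  # None when there is no nested comparison


def acceptable_fit(
    fit: FitIndices,
    cfi_min: float = 0.95,
    rmsea_max: float = 0.06,
    delta_cfi_max: float = 0.01,
) -> bool:
    """Apply the cutoffs CFI >= 0.95, RMSEA <= 0.06 and, if available, dCFI <= 0.01."""
    ok = fit.cfi >= cfi_min and fit.rmsea <= rmsea_max
    if fit.delta_cfi is not None:
        ok = ok and fit.delta_cfi <= delta_cfi_max
    return ok


# Values reported in this article for the partial metric invariance model
# (psychological well-being domain, girl-parent dyads).
partial_metric = FitIndices(cfi=0.968, rmsea=0.054, delta_cfi=0.001)

# Invented values, used only to show a rejected model.
illustrative_bad = FitIndices(cfi=0.96, rmsea=0.07, delta_cfi=0.02)

print(acceptable_fit(partial_metric), acceptable_fit(illustrative_bad))
```

Keeping the cutoffs as keyword arguments makes it easy to see how sensitive a conclusion is to a stricter or more lenient threshold.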
In this study, LISREL 8.52 (Joreskog & Sorbom, 2003) was used to estimate the MGCCFA model in a three-step process. In step 1, the thresholds of each latent response variable were estimated in PRELIS by pooling the data from the two groups and applying the maximum likelihood method. PRELIS is a companion program that serves as a pre-processor for LISREL; it is used for calculating sample correlation and covariance matrices from raw data and for estimating asymptotic covariances. In step 2, with the thresholds held fixed at the values estimated in step 1, the means, asymptotic covariance matrices, and polychoric correlations were estimated for each group using conditional maximum likelihood. Finally, in step 3, the model parameters were estimated using generalized least squares and the MI hypotheses were examined (Muthen & Asparouhov, 2002; Joreskog, 2002).

Results
Table 1 shows the items identified with each type of measurement non-invariance across girl-parent as well as boy-parent dyads. No items were flagged as non-invariant in the psychological well-being domain between boys and their parents, or in the school environment domain in either boy-parent or girl-parent dyads. Tables 2 and 3 present the values of the fit indices of MI for each domain of the KIDSCREEN-27 in girl-parent and boy-parent dyads, respectively.

Configural Invariance
Configural invariance was supported by CFI≥0.95, RMSEA≤0.06, and ∆CFI≤0.01 for all domains across girl-parent and boy-parent dyads. This means that, regardless of child gender, children and their parents used the same conceptual framework in their appraisal of all domains of the KIDSCREEN-27.

Metric Invariance
The fit indices of the metric invariance model were acceptable for all domains across boy-parent dyads. Across girl-parent dyads, however, this hypothesis was supported for all domains except psychological well-being (RMSEA=0.063 and ∆CFI=0.011), in which the modification indices (not shown here) indicated that items 2 (Been in a good mood) and 7 (Been happy with the way you are) had different factor loadings in the two groups. This result suggests that the associations of the items with their corresponding underlying constructs are equivalent across the groups except for these two items. After relaxing the equality constraints on the factor loadings of these two items across groups, partial metric invariance was accepted (RMSEA=0.054, CFI=0.968, and ∆CFI=0.001).

Strong Invariance
Full strong invariance held for the physical well-being and the school environment domains across girl-parent dyads (according to the acceptable values of CFI, ∆CFI, and RMSEA of Model 4 in Table 2) and for the psychological well-being, the school environment, and the social support and peers domains across boy-parent dyads (based on the satisfactory values of the fit indices of Model 4 in Table 3). Across girl-parent dyads, partial strong invariance was accepted after removing the invariance constraints on the intercept of item 1 (Your life been enjoyable) in the psychological well-being domain and on the intercepts of items 3 (Your parent had enough time for you), 4 (Your parent treated you fairly), and 7 (Had enough money for your expenses) in the autonomy and parent relations domain. Across boy-parent dyads, the modification indices suggested freely estimating the intercepts of items 1 (How would you say your health is) and 2 (Felt fit and well) in the physical well-being domain and of items 1 (Had enough time for yourself), 2 (Been able to do thing), 3 (Your parent had enough time for you), 4 (Your parent treated you fairly), and 6 (Had enough money to do things as your friend) in the autonomy and parent relations domain, which resulted in a well-fitting partial strong invariance model. The non-invariant intercepts of these items indicate that children and parents responded differently to them.

Strict Invariance
The most constrained assumption, i.e., strict invariance, was supported by the acceptable values of the fit indices of Model 6 for the psychological well-being, the autonomy and parent relations, and the school environment domains across girl-parent as well as boy-parent dyads. In both girl-parent and boy-parent dyads, partial strict invariance was verified in the physical well-being domain after relaxing the equality of the residual variances of items 2 (Felt fit and well) and 3 (Been physically active) (RMSEA<0.06, CFI≥0.95, and ∆CFI<0.01 for Model 7 in Tables 2 and 3). Partial strict invariance held for the social support and peers domain across boy-parent dyads after freely estimating the residual variance of item 4 (Been able to rely on your friends) across the groups, whereas across girl-parent dyads strict invariance of this domain was completely rejected. Therefore, there is adequate evidence that the items with non-invariant residual variances measure the latent construct with different degrees of measurement error in children and their parents.

Discussion
This is the first study assessing the MI of the KIDSCREEN-27 across girl-parent and boy-parent dyads separately. The findings revealed that the Persian version of the KIDSCREEN-27 is not an invariant measure across either boy-parent or girl-parent dyads.
The results showed that different non-invariant items were identified across girl-parent dyads as compared with boy-parent dyads. While the non-invariant items across girl-parent dyads were mostly detected in the psychological well-being and the social support and peers domains, the boys and their parents differed mainly in the autonomy and parent relations domain. This discrepancy can be attributed to extensive hormonal fluctuation in teenage girls, leading to more psychosomatic disorders and emotional disturbance than in boys (Viira & Koka, 2012). Moreover, girls and boys behave differently in their social relations because higher and more contradictory social expectations are placed on girls than on their male counterparts (Seiffge-Krenke, 1999), which may result in a higher level of disparity across girl-parent dyads in the social support and peers domain. On the other hand, boys have a stronger tendency than girls to be independent in their youth, leading to higher levels of disagreement within boy-parent dyads in the autonomy and parent relations domain (Bisegger, Cloetta, von Rueden, Abel, & Ravens-Sieberer, 2005).
However, it should be noted that these assertions cannot be made definitively without taking the parent's gender into account. For instance, fathers encourage independence and autonomy in their sons more than in their daughters (Seiffge-Krenke, 1999). Upton et al. (2008) asserted that sons and daughters have different relationships with each parent, and fathers and mothers also have different perspectives on their child's HRQOL (Hill, Kondryn, Mackie, McNally, & Eden, 2003). Hence, further studies should consider the effect of the sex composition of the child-parent dyad on the MI of HRQOL questionnaires across these two groups.
Previous studies evaluating the MI of other pediatric HRQOL instruments (PedsQL 4.0 and KINDL) across child-parent dyads did not consider child gender (Huang et al., 2009; Jafari, Sharafi et al., 2013; Lin et al., 2013). However, they arrived at the same conclusion that children and their parents interpret some items differently. In this condition, the observed mean differences may reflect differences in interpretation rather than true differences, so a direct comparison of children's and parents' ratings of pediatric HRQOL is not meaningful. Our study revealed that caution is warranted in comparing boys' and parents' ratings for all domains of the KIDSCREEN-27 except the psychological well-being and the school environment domains, which showed full MI. Likewise, comparisons between girls' and parents' ratings for all domains except the school environment domain must be made with caution.
In this case, it is important to develop a new or revised version of the instrument with invariant items for meaningful cross-group comparisons. However, non-invariance of an item should not per se lead to its elimination, especially when the instrument has high convergent and discriminant validity (Sireci, 2011). Regarding the KIDSCREEN-27, our previous study showed that both the child self-reports and the parent proxy-reports of the questionnaire had high convergent and discriminant validity, and the results of Rasch analysis confirmed that all items belong to their own underlying construct (Jafari, Bagheri, & Safe, 2012). Similarly, in spite of the detection of some non-invariant items in the Persian version of the PedsQL 4.0 across children and their parents, the instrument has good psychometric properties, including high convergent and discriminant validity in both the child self-report and the parent proxy-report versions (Jafari, Bagheri, Ayatollahi, & Soltani, 2012). Therefore, we believe that rewording or modifying the non-invariant items is a better choice than removing them.

Limitations
Our study has some limitations that merit attention when interpreting our findings. First, the effect of clustered data (individuals within district, school and class) was not taken into account, which may increase the likelihood of finding non-invariant items. Hence, we suggest applying a hierarchical ordinal logistic regression model or a multilevel multiple-indicator multiple-cause (MIMIC) approach in future studies to obtain more reliable and realistic results. Second, we assessed MI across child-parent dyads classified only by the gender of the children, and our participants were apparently healthy school children. Hence, further in-depth studies are needed to evaluate the effects of additional factors, such as the children's age or health status and the parents' age, gender, and mental and physical health status, on the MI of pediatric HRQOL questionnaires across these two groups. Finally, for comparability of our results with those of previous studies (Huang et al., 2009; Jafari, Sharafi et al., 2013), we ran the model separately for each domain. However, taking the correlations between domains into account (Cheung & Rensvold, 1999) could substantially change our results.

Conclusion
In conclusion, detecting different non-invariant items across the girl-parent dyad compared to the boy-parent dyad underlines the importance of taking the child's gender into account when assessing measurement invariance between children and their parents and consequently deciding about children's physical, psychological or social well-being from the parents' viewpoint. Therefore, future studies should replicate or extend our findings from the analysis of KIDSCREEN-27 to other pediatric HRQOL measures and also in other cultures to provide a questionnaire with measurement invariant items which is also stable regarding child gender.