A Determination of the Construct Validity of Both an Adapted Self-Confidence Questionnaire, the Personal Evaluation Inventory [PEI], (Shrauger & Schohn, 1995) and a Generalised Anxiety Disorder [GAD] Questionnaire (Taylor, 1953)

The objectives of this research were to determine the construct validity of both an adapted self-confidence questionnaire, the Personal Evaluation Inventory (PEI), developed by Shrauger and Schohn, and a Generalised Anxiety Disorder (GAD) questionnaire, which was adapted from the Taylor Manifest Anxiety Scale. The research was conducted in two girls’ primary schools in Saudi Arabia to collect data on the convergent and discriminant validity of the GAD and PEI questionnaires, using the Multi-Trait Multi-Method (MTMM) matrix to assess construct validity. Sixty students and two teachers filled in questionnaires, with each student evaluating themselves and then their peers. The teachers evaluated themselves, their self-confidence and generalised degree of their anxiety disorder. The MTMM analysis supported, to a large extent, both the convergent and the discriminant validity of the data from students and teachers on two traits (self-confidence and generalised anxiety disorder) across three methods of measurement (self-reporting, peer-rating and teacher-rating). The Mono-Trait Mono-Method coefficients were relatively high, and the Hetero-Trait Mono-Method coefficients were relatively strong. The Hetero-Trait Mono-Method coefficients were reasonable for the self-confidence and generalised anxiety disorder questionnaires, but the teacher-ratings for both traits were unexpected. Furthermore, the Hetero-Trait Hetero-Method coefficients were not constant and showed an unstable variance. In conclusion, the PEI and GAD questionnaires possess acceptable construct validity, but the teacher-ratings for both questionnaires need modification in order to attain the desired construct validity.


Introduction
The mental health condition 'Generalised Anxiety' is one in which an individual is often worried or very anxious about many things and finds this state of mind difficult to control. Such an individual finds even normal circumstances extremely difficult or even traumatising, and tends to be prone to depression (Zung, 1974). Although anxiety itself is understood to be an essential aspect of human life, the severity of Generalised Anxiety has stimulated the need for psychological research. The Taylor Manifest Anxiety Scale (TMAS) (McDowell & Newell, 1996) was devised to assess the degree to which individuals are affected. This scale was used in this research to help measure Generalised Anxiety Disorder (GAD) as a construct, and its suitability for that purpose can be examined by assessing construct validity. The researcher also selected the trait of self-confidence, which was examined using similar methods.
Self-confidence (SC) has been found to be an important component in the anonymously used Rosenberg Self-Esteem Scale (Rosenberg, Schooler, Schoenbach, & Rosenberg, 1995). Importantly, the Personal Evaluation Inventory (PEI) self-report questionnaire, created by Shrauger and Schohn (1995), can be used to measure self-confidence.
Turning to the background for this research, it was conducted in two primary schools for girls in Riyadh, the capital of Saudi Arabia, with the aim of testing the construct validity of the PEI and the GAD questionnaires. It did so by collecting relevant data on the convergent and discriminant validity of the GAD and PEI questionnaires using the Multi-Trait Multi-Method (MTMM) matrix.
The study considered three major elements: the individuals involved in the study; instrumentation; and the research procedure. Primary schools were selected for reasons of sample accessibility, teacher availability and sample size. In order to identify and confirm any possible variations in the sample, two primary schools were used. Cohen, Manion and Morrison (2011) recommend a minimum of 30 cases for effective statistical analysis, but note that the researcher also needs to consider the relationships within the sample to establish an appropriate sample size. As a result, 30 participants were used in a pilot study of the GAD questionnaire, while 60 different participants were used in the main study to give greater depth to the results. One teacher from each school also volunteered to participate. Each teacher supervised 30 participants in Grade 6. The questionnaires were administered in four classes in total. It is worth noting that all the students agreed to participate in the study voluntarily.
In this study, all the respondents and the supervisors were female, and as such, there was no variance due to gender. The researcher recognises that the findings cannot be generalised, as the sample is not representative. However, this is because of the constraints imposed by the social norms in Saudi Arabia, in which mixed primary schools do not exist. Furthermore, it is impossible for a female researcher to access male-only primary schools.

Instruments Used
The self-confidence questionnaire, the PEI, was the first instrument used, and is based on Shrauger and Schohn's (1995) work pertaining to the development of an instrument that could assess a student's confidence levels. The second instrument was the GAD self-report questionnaire. These instruments are discussed later with respect to three major areas: construct definition; instrument description; and available statistical analysis.
In relation to the PEI, it is worth noting that 'Self Confidence' has been defined as the subjective appraisal of an individual's capabilities, in a particular given context (Shrauger & Schohn, 1995). It is an individual's own assessment of their performance, in a certain subject. SC is considered to be a component of self-evaluation. To determine reliable SC areas, five subscales were developed: academic performance; general confidence; physical appearance; social interaction; and skills involved in addressing a number of people.
Based on the respondents, Shrauger and Schohn (1995) developed and used 54 items, re-naming the instrument the PEI. Carlock (1999) argues that the PEI is best used with adolescents and young adults, which made it appropriate for the present research. The PEI (hereafter known as the SC questionnaire) uses a Likert-Type Scale ranging from 1 to 5 representing a scale of truth-value, with 1 representing "Never True" and 5 for "Always True". The SC questionnaire comprised 20 elements.
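As a minimal sketch of how responses on such a five-point scale might be reduced to a single score, the Python below averages a respondent's item ratings and skips unanswered items. The function name `scale_score`, the choice of a mean rather than a sum, and the sample responses are illustrative assumptions; the study does not describe its scoring rule or whether any items were reverse-keyed.

```python
from typing import Optional, Sequence


def scale_score(responses: Sequence[Optional[int]], low: int = 1, high: int = 5) -> float:
    """Return the mean of the answered items on a low..high Likert-type scale."""
    # Unanswered items are represented as None and left out of the mean.
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no answered items")
    for r in answered:
        if not low <= r <= high:
            raise ValueError(f"response {r} is outside the range {low}..{high}")
    return sum(answered) / len(answered)


# Hypothetical 20-item response sheet with one unanswered item (None).
example = [4, 5, 3, 4, 2, 5, 4, 4, 3, None, 5, 4, 3, 4, 4, 2, 5, 3, 4, 4]
print(round(scale_score(example), 2))
```

Averaging over answered items keeps scores comparable when a few items are left blank, although other treatments of missing responses are equally defensible.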
Although critics have outlined that constructs of self-belief can only be measured using self-report methods (Shavelson, Hubner, & Stanton, 1976), Shrauger and Schohn (1995) observe that the validity of measuring some other constructs was reasonable between two people who knew each other. Thus, the rating could be adapted to peer-rating and teacher-rating questionnaires as the students and teachers had interacted for one academic year.
With regard to the reliability and validity of the research, the PEI's internal consistency for the five subscales ranged from 0.74 to 0.89, and the test-retest coefficients after one month ranged from 0.53 to 0.89, with a total-scale correlation of 0.80 (Blascovich & Tomaka, 1991).
According to Shrauger and Schohn (1995) the PEI converges with the Rosenberg Self-esteem Scale (correlation coefficient of 0.58), with the Janis-Field Feelings of Inadequacy Scale (0.59) and with optimism (0.53). However, Shrauger and Schohn (1995) observe that the PEI is minimally influenced by social desirability, religious affiliations, and socioeconomic status.
The TMAS, also known as the GAD, is a device used for physiological experiments and was created by Taylor (cited in McDowell & Newell, 1996). The scale, which has been regularly used in relation to assessing personal anxiety and measuring anxiety as a clinical entity (McDowell & Newell, 1996), has the indecisive option of "do not know" removed from the Likert-Type Scale, resulting in the items being simple self-descriptive statements (Lietz, 2009).
Addressing the issue of reliability and validity, following DiLoreto (2013), the 50-item scale has shown test-retest correlations of 0.89, 0.82 and 0.81 over intervals of three weeks, five months, and nine to 17 months, respectively. The correlations were viewed as possibly varying between ethnic groups and educational levels (McDowell & Newell, 1996). A wide study of construct validity examined correlations between the TMAS and Eysenck's (1964) Measure of Neuroticism in two samples. For the Measure of Neuroticism, the study showed correlations of 0.72 in one sample and 0.75 in the other, while for the Psychasthenia Scale the correlations were 0.81 in the first sample and 0.92 in the other (McDowell & Newell, 1996).

The Pilot Study
The pilot study was undertaken for the GAD questionnaire in the selected schools. Thirty students in Grade 6, in two classes, spent approximately 15 minutes filling in the questionnaires. Based on the responses, and in line with the purpose of the main study (to verify the convergent and discriminant validity), the questionnaire was amended. Further information on the self-rating questionnaire and the peer-rating questionnaire is available upon request.

Questionnaires
The questionnaires were paced in order to prevent participant fatigue, as recommended by Gorad (2013). In addition, as suggested by Crooks, Kane and Cohen (1996), the participants were encouraged to ask for clarification in the case of any difficulty, thus reducing possible problems with the validity of the assessment. Furthermore, following Paulhus (1984), the participants were encouraged to give sincere, accurate answers from their personal perspective, rather than answers shaped by social desirability or peer pressure.
All of the questionnaires were returned and accepted. Although the students had been asked not to leave blanks, there were some unanswered items in the peer-rating and self-rating sections. Nevertheless, the response rate remained acceptable, according to Bryman (2008). The data from the questionnaires were gathered and then analysed, with all indicators being examined carefully. No obvious conflicting data or major issues were observed, and conclusions were drawn on the validity of the less complex data, as recommended by Hedges (2012).

Results
As highlighted previously, the MTMM matrix was used to measure different traits by various methods simultaneously. Two traits, SC and GAD, were chosen and measured using three methods: self-reporting, peer-rating, and teacher-rating. These generated six variables. The study anticipated low correlations between measures of the two different traits and high correlations between the different methods measuring the same trait. The former is referred to as 'discriminant validity', whilst the latter is known as 'convergent validity' (Coe, 2012). Both convergent validity and discriminant validity are essential to establish construct validity (Campbell & Fisk, 1959). The results are recorded in Table 1. To help in understanding the MTMM matrix, Campbell and Fisk (1959) ordered the correlation coefficients as follows. The reliability coefficients (pink cells) are the highest values in the matrix, because they relate measures that share both trait and method. Reliability coefficients can be estimated in different ways; in this study, Cronbach's α was used. The Mono-Method blocks are represented by the figures in the blue and pink cells in Table 1 and consist of the correlations between measures that use the same method. The blocks that use different methods are known as Hetero-Method blocks. Table 1 shows that the Mono-Method blocks contain the higher coefficients, whereas the Hetero-Method blocks are characterised by lower values.
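Because Cronbach's α is named as the reliability estimate, a short sketch of the standard formula may be useful: α = k/(k-1) × (1 - Σ item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response data below are hypothetical and are not taken from the study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a list of item columns (one score per respondent in each)."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if n < 2 or any(len(col) != n for col in items):
        raise ValueError("each item needs the same number (>= 2) of respondents")
    # Total score of each respondent across all items.
    totals = [sum(col[i] for col in items) for i in range(n)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical responses: 3 items (columns) answered by 5 respondents on a 1-5 scale.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 3],
]
print(round(cronbach_alpha(items), 3))
```

Each of the six variables (two traits by three methods) would supply one such coefficient for the reliability diagonal.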
Every Hetero-Method block contains figures (green cells) that measure the same construct through different methods; these are called validity coefficients or Mono-Trait Hetero-Method correlations. They represent the convergent validity described previously. It was expected that these correlations would be lower than the reliability (Mono-Trait Mono-Method) coefficients and higher than the rest of the correlations. However, according to Campbell and Fisk (1959), the ideal state in psychological measurement is rare. In this study, the Hetero-Trait Mono-Method coefficients (blue cells) are higher. In theory, they should not be as high, because they relate measures of different traits. According to Campbell and Fisk (1959), these coefficients share method factors that produce higher values.
With regard to the Hetero-Trait Hetero-Method coefficients, the figures in the grey cells in Table 1 represent the other Hetero-Method block correlations. They share neither a construct nor a method. Trochim (2006) states that, together with the Hetero-Trait Mono-Method coefficients, they illustrate discriminant validity. Thus, they should show the lowest correlations. However, as mentioned previously, the Hetero-Trait Mono-Method coefficients are relatively high, as they share common method factors. Furthermore, the Hetero-Trait Hetero-Method coefficients are high, raising the possibility of a threat to construct validity and possibly indicating partial dependence of methods or trait overlap (Campbell & Fisk, 1959).
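The four kinds of coefficient described above follow mechanically from whether two variables share a trait, a method, both, or neither. The sketch below labels every pair among the six variables; the SC1 to GAD3 naming follows the paper (1 = self-reporting, 2 = peer-rating, 3 = teacher-rating), while the function and variable names are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations
from typing import Tuple

Variable = Tuple[str, int]  # (trait, method index)

METHODS = {1: "self-reporting", 2: "peer-rating", 3: "teacher-rating"}
VARIABLES = [(trait, m) for trait in ("SC", "GAD") for m in METHODS]


def classify(a: Variable, b: Variable) -> str:
    """Name the MTMM cell type for the correlation between variables a and b."""
    same_trait = a[0] == b[0]
    same_method = a[1] == b[1]
    if same_trait and same_method:
        return "reliability (Mono-Trait Mono-Method)"
    if same_trait:
        return "validity (Mono-Trait Hetero-Method)"
    if same_method:
        return "Hetero-Trait Mono-Method"
    return "Hetero-Trait Hetero-Method"


def label(v: Variable) -> str:
    """Render a variable as, for example, 'SC1' or 'GAD3'."""
    return f"{v[0]}{v[1]}"


# Classify all 15 off-diagonal pairs of the six variables.
pairs = {(label(a), label(b)): classify(a, b) for a, b in combinations(VARIABLES, 2)}
counts = Counter(pairs.values())
for pair, kind in pairs.items():
    print(pair, kind)
```

With two traits and three methods, this yields six validity coefficients, three Hetero-Trait Mono-Method coefficients and six Hetero-Trait Hetero-Method coefficients, in addition to the six reliabilities on the diagonal.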
In summary, the findings in Table 1 highlight that there are relatively high method factors and reasonable validity coefficients, except for those related to teacher-ratings for both SC and GAD. Students' self-reports and peer-ratings (SC1-SC2 and GAD1-GAD2) show a reasonable degree of validity, as revealed by the figures 0.598, 0.362, and 0.269. However, the correlations in the red rectangle of Table 1, namely those between students' self-reports and teacher-ratings (SC1-SC3, GAD1-GAD3) and between peer-ratings and teacher-ratings (SC2-SC3, GAD2-GAD3), are low: -0.39, -0.29, -0.28, -0.27, -0.23, -0.12, and -0.09. This is particularly evident in the Hetero-Method block between peer-ratings and teacher-ratings, shown in the yellow cell. This indicates that the SC and GAD constructs measured by the students' methods are, to a major extent, not the same constructs as those measured by the teachers' method. Therefore, the six coefficients in the red rectangle threaten both the convergent validity of the SC and GAD questionnaires and, when compared with other figures in the Hetero-Method block, their discriminant validity, and thus their construct validity. Since the validity coefficient for the self-reporting and peer-rating methods is acceptable, the issue of convergent validity is now addressed.
"The pattern of trait interrelationship" is another feature that does not correspond with MTMM principles (Campbell & Fisk, 1959, p. 83). This is shown in Table 2, which summarises the partial MTMM matrix patterns. The matrix should represent the correlation patterns of the Hetero-Trait Hetero-Method and Hetero-Trait Mono-Method coefficients. The Hetero-Trait Mono-Method coefficients follow almost the same pattern. However, the Hetero-Trait Hetero-Method coefficients show different relations and patterns from those coefficients. According to the rules, they should discriminate between the two traits, below and within the same range. However, as already discussed, Pattern 1 coefficients tended to be high, which is considered normal in psychological measurement. Patterns 2 and 3 did not indicate a trend, meaning that further investigation is needed. With regard to convergent validity, Campbell and Fisk (1959) argued that a reliability estimate such as split-half reliability or Cronbach's α can be taken to be a form of convergent validity, as it correlates measures that are alike. This is a strong indicator, as the reliability coefficients in the MTMM matrix (Table 1) are relatively acceptable. Although the reliability of the teacher-rating method is lower than that of the self-reporting and peer-rating methods, self-perceptions can be expected to differ among individuals and could therefore produce trait errors (Spector, 1994).
Furthermore, according to Campbell and Fisk (1959), it is important for validity coefficients to be above zero and lower than the reliability coefficients. The SC and GAD validity coefficients for self-reporting and peer-rating are 0.443 and 0.252, respectively, which are higher than the Hetero-Trait Hetero-Method coefficients. However, the SC and GAD validity coefficients exhibited odd behaviour in relation to the teacher-rating. If only the self-reporting and peer-rating methods were used, then the validity coefficients would appear as expected. However, the correlations -0.267, -0.278 and -0.124 are considered threatening, especially when the reliability coefficients are reasonable.
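The conditions just described can be expressed as a small check, sketched here under stated assumptions. The validity coefficients 0.443 and -0.267 are taken from the text above; the reliability values and the Hetero-Trait Hetero-Method comparison values are hypothetical placeholders, since Table 1 is not reproduced here, and the function name `check_validity` is illustrative.

```python
from typing import Dict, Sequence


def check_validity(
    validity: float,
    reliabilities: Sequence[float],
    hetero_trait_hetero_method: Sequence[float],
) -> Dict[str, bool]:
    """Check one validity coefficient against three simple MTMM conditions."""
    return {
        # Convergent validity: the coefficient should be clearly positive.
        "above_zero": validity > 0,
        # It should not exceed the reliabilities of the measures involved.
        "below_reliabilities": all(validity < r for r in reliabilities),
        # Discriminant validity: it should exceed correlations that share
        # neither trait nor method.
        "above_hetero_trait_hetero_method": all(
            validity > c for c in hetero_trait_hetero_method
        ),
    }


# Hypothetical reliabilities and comparison coefficients (not from Table 1).
reliabilities = [0.85, 0.80]
comparison = [0.25, 0.10]

print(check_validity(0.443, reliabilities, comparison))   # self-report vs peer-rating (SC)
print(check_validity(-0.267, reliabilities, comparison))  # a teacher-related coefficient
```

A negative teacher-related coefficient fails the first condition regardless of the placeholder values, which is why such figures are treated as a threat to convergent validity.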
This aspect was investigated to gain a better understanding of the issue. Data from the two schools were analysed to ascertain whether there were any differences. However, the analysis showed no significant differences (see Tables 3 and 4). Tables 3 and 4 highlight three main patterns. First, School 2 showed a relatively high reliability diagonal in relation to the three coefficients (in yellow). Second, School 1 exhibited greater discrimination in some of the Hetero-Trait Hetero-Method correlations (in turquoise). Third, the figures in the red rectangle show low Hetero-Trait Mono-Method correlation discriminants. This may be related to the method that is preferable for students. Thus, the pattern between SC and GAD, as well as between the two methods themselves, is relatively consistent across the three MTMM matrices. Additionally, the validity coefficients of SC and GAD related to teacher-rating follow a similar direction.
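A per-school comparison of this kind amounts to computing the same correlation matrix for the whole sample and for each school's subset. The sketch below shows one way this might be done with Pearson correlations; the scores, the school labels and all function names are hypothetical and are not the study's data or procedure.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant variable has no defined correlation")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(scores: Dict[str, List[float]]) -> Dict[Tuple[str, str], float]:
    """Lower triangle of the correlation matrix, excluding the diagonal."""
    names = list(scores)
    return {
        (a, b): pearson(scores[a], scores[b])
        for i, a in enumerate(names)
        for b in names[:i]
    }


def subset(scores: Dict[str, List[float]], schools: List[int], school: int) -> Dict[str, List[float]]:
    """Keep only the rows that belong to the given school."""
    return {
        name: [v for v, s in zip(values, schools) if s == school]
        for name, values in scores.items()
    }


# Hypothetical mean scores for eight students (four per school).
schools = [1, 1, 1, 1, 2, 2, 2, 2]
scores = {
    "SC1": [3.2, 4.1, 2.5, 3.8, 4.4, 2.9, 3.5, 4.0],
    "SC2": [3.0, 4.3, 2.8, 3.5, 4.0, 3.1, 3.3, 4.2],
    "SC3": [3.9, 3.1, 3.6, 3.0, 2.8, 3.7, 3.4, 3.0],
    "GAD1": [2.1, 1.8, 3.4, 2.0, 1.5, 3.0, 2.6, 1.9],
    "GAD2": [2.4, 1.6, 3.1, 2.2, 1.9, 2.8, 2.5, 2.0],
    "GAD3": [2.0, 2.7, 2.2, 2.9, 3.0, 2.1, 2.3, 2.6],
}

overall = correlation_matrix(scores)
by_school = {s: correlation_matrix(subset(scores, schools, s)) for s in (1, 2)}
for pair, r in overall.items():
    print(pair, round(r, 3), {s: round(m[pair], 3) for s, m in by_school.items()})
```

In a full MTMM matrix the diagonal would hold reliability estimates rather than the trivial self-correlations, which is why the diagonal is left out here.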
With regard to the remaining question about the convergent validity represented by the validity coefficients for SC and GAD, these were measured using the teacher-rating method. It can be concluded, therefore, that the teachers interpreted the constructs used in the SC and GAD questionnaires differently from the students. This could be true in the case of GAD because, according to Headley and Campbell (2011), recognising the mental health of children is not a part of teachers' training. Their study revealed that most teachers admitted their inadequacy in recognising and managing pupils with mental health problems. They noted that teachers' lack of knowledge of how students' mental health problems, in particular anxiety, manifest themselves is evidence of their insufficient training. This is reflected in numerous concerns raised in primary schools regarding such mentally ill children. Moreover, the teachers who volunteered to administer the questionnaires acknowledged that they only interacted with students through non-classroom activities once or twice every two weeks. Headley and Campbell (2013) assume that it is effective to refer the pupils with due care to a guidance counsellor employed by the school or by other organisations based within the community. This assumption rests on the understanding that teachers are aware of the typical behaviour of their students with GAD. However, they are not. It cannot be assumed, therefore, that teachers know how to identify and measure such disorders.
This phenomenon applies to SC as well. Chowdhury (2006) reported that the majority of teachers are preoccupied with the role of achievement or motivation. Consequently, they believe that if a child lags behind the rest of the class, it indicates certain problems with self-confidence. Interestingly, Chowdhury's (2006) study finds that the gap in understanding is frequently attributed to some unnoticed psychological peculiarities of the children, which ought to be considered. It is no surprise, therefore, that a student is measured according to common standards, and under such conditions it is very difficult to assess the child's personality correctly. The MTMM matrix in Table 1 thus shows, to a large extent, that students share similar constructs of SC and GAD, which are not the same as the teachers' constructs. Convergent validity in the peer-rating and self-reporting method blocks is consistent with the findings of Campbell and Fisk (1959), but in the teacher-rating method blocks it is not sufficient for either GAD or SC. Table 2 displays the discriminant validity results, which include the Hetero-Trait Mono-Method and Hetero-Trait Hetero-Method correlations. It is evident that there is relative inflation in the Hetero-Trait Mono-Method coefficients, which corresponds with the findings of Campbell and Fisk (1959). Following Podsakoff et al. (2003), the six questionnaires could share a common method variance because they had the same scale anchors, scale format, and time. The teacher-rating method had even more inflated coefficients. However, according to Nisbett and Wilson (1977), this can result from the so-called 'halo effect', which is a widely discussed and accepted cognitive bias, or from reliability diagonals that may also cause highly inflated coefficients (Wylie, 1974).
In general, there are not many concerns related to the Hetero-Trait Mono-Method coefficients for self-reporting and peer-rating, but the teacher-rating raises major concerns. It may be concluded that the SC and GAD questionnaires are topical. Therefore, it is necessary to investigate their components and influential factors. Teachers should think about the methods they apply to analysing children's personalities and realise that not everything that is considered to be right is really right.
The Hetero-Trait Hetero-Method coefficients relate measures that share neither method nor trait. Campbell and Fisk (1959) suggest that if the methods and traits were entirely independent, these coefficients would be zero; anything more indicates that trait or method factors are involved. Looking at the figures in the grey cells in Table 2, the coefficients ranged between 0.251 and -0.290, meaning that there was no unified pattern. Tables 3 and 4 also exhibited suspicious correlations. This was especially evident in the methods implemented by teachers (particularly in School 2). In general, most positive correlations shown in Table 2 display relatively high Hetero-Trait Hetero-Method coefficients in the MTMM. This is considered to be theoretically possible, indicating that the two instruments showed validity in this particular area. However, the Hetero-Trait Hetero-Method correlation pattern requires further investigation using different methods and samples.
Finally, although Campbell and Fisk (1959) emphasise that it is essential to interpret the MTMM matrix taking into account all correlations, this may be a limitation, as people can interpret correlations differently (Trochim, 2006). Table 5 summarises the findings in order to arrive at a possible conclusion.

Table 5. Summary of the findings in the MTMM matrix presented in Table 1

Convergent validity for SC measurement
(+) High reliability coefficients.
(+) Moderate validity coefficients for the self-reporting and peer-rating methods.
(-) Insufficient validity coefficients for the teacher-rating method.

Convergent validity for GAD measurement
(+) High reliability coefficients.
(+) Reasonable validity coefficients for the self-reporting and peer-rating methods.
(-) Insufficient validity coefficients for the teacher-rating method.

Discriminant validity for SC and GAD measurements
(+) Moderate but empirically accepted Hetero-Trait Mono-Method coefficients.
(+) Two positive correlation trends in the Hetero-Trait Hetero-Method coefficients, accepted theoretically.
(-) One pattern in the Hetero-Trait Hetero-Method coefficients.

Table 5 shows that the positive points (marked +) outnumber the negative ones (marked -), which leads to the conclusion that the two instruments used to measure GAD and SC were effective. However, the questionnaires could be refined to improve validity. Campbell and Fisk (1959) observe that the MTMM matrix could be used to improve constructs, methods, and their overlaps. It is possible to reduce common method variance in the three methods by replacing the teacher-rating questionnaire with observations or by adapting the items to rubrics. As the GAD and SC questionnaires contained only 20 items, it is suggested that more items should be developed to cover more contextual aspects.
On the other hand, with regard to the negative conclusions presented in Table 5, the different patterns in the Hetero-Trait Hetero-Method coefficients are suggested to be the result of low validity coefficients. Efforts should be made to educate teachers about the psychological aspects of students' personalities. According to Chowdhury (2006), at the very moment a child arrives at school, they lose something, because a range of procedural practices restricts their routines. From these first days, teachers begin to analyse and consider everything about the children and eventually gather their most vivid characteristics, but they cannot unveil all aspects of a child's personality from these interactions. Teachers cannot know everything, especially about a child's personality. All children are different; each has his/her individual peculiarities. According to the results of the questionnaires, teachers do not know exactly the components of the child's personality. However, as teachers hold key positions in the identification of a child's mental health concerns, it has generally been assumed that they can utilise their expertise in an appropriate manner to identify the appropriate health care providers (Garcia, 2009). Furthermore, different instruments can be implemented in different contexts. Applying these modifications should help to show that the SC and GAD questionnaires possess acceptable construct validity.