Validity and Reliability of Comprehensive Assessment Instruments for Handball and Badminton Games in Physical Education

This study aimed to determine the validity and reliability of the Comprehensive Assessment instruments for handball and badminton games in physical education. The comprehensive assessment comprises six types of instruments: handball cognitive assessment, handball psychomotor assessment, handball affective assessment, badminton cognitive assessment, badminton psychomotor assessment and badminton affective assessment. The measuring instruments are built on the levels of Bloom's taxonomy (1956) for the cognitive domain, Dave's taxonomy (1970) for the psychomotor domain and the taxonomy of Krathwohl et al. (1964) for the affective domain. The results showed that the validity of the comprehensive assessment was r = .82 for handball and r = .80 for badminton. The reliability was r = .78 (n = 36) for the handball cognitive assessment, r = .93 (n = 31) for the handball psychomotor assessment, r = .83 (n = 39) for the handball affective assessment, r = .75 (n = 40) for the badminton cognitive assessment, r = .81 (n = 40) for the badminton psychomotor assessment and r = .81 (n = 40) for the badminton affective assessment. The percentage of agreement between examiners (inter-observer agreement) was 70.11% (SD = 0.57) for handball and 70.03% (SD = 0.68) for badminton. Based on the findings, this comprehensive assessment is suitable for use as a standard instrument to assess students’ achievement in handball and badminton games in the Physical Education subject under School Based Assessment.


Introduction
Physical Education (PE) is a core and compulsory subject in the Integrated Curriculum for Secondary School under the Education Act 1996, through Professional Circular Number 25/1998 (Ministry of Education Malaysia, 1998). Physical Education plays a significant role in the comprehensive growth and development of students through learning experiences in the cognitive, psychomotor and affective domains (Darst & Pangrazi, 2006; Abdullah Sani, 2003; Freeman, 2001; Dauer & Pangrazi, 1995). To determine learning achievement, some form of measurement and evaluation should be carried out during the teaching process.
Teachers are required to conduct assessments to determine whether the goals and objectives of the PE subject have been achieved. According to Bhasah (2007), evaluations are designed to assess the status of an evaluated object and to compare that status against a set of standards or criteria for decision making. In this context, evaluation is a process that includes determining objectives, gathering information, processing information and forming conclusions. When all these processes are conducted systematically and scientifically, the decision will be more accurate and will meet the purpose of the evaluation (Abu Bakar & Bhasah, 2008).
Since 1989, formal summative assessment has been in place for the PE subject in the Integrated Curriculum for Secondary School (ICSS), and for the first time PE marks and grades were included in students' report cards based on the assessment results. The assessment consists of two parts, namely examinations and the National Physical Fitness Standard (SEGAK) test, based on Professional Circular No. 4/2008 (Ministry of Education Malaysia, 2008). This assessment is incomplete because students are assessed only by means of examinations (cognitive domain) and the SEGAK test (psychomotor domain), which focuses on the fitness aspect only.
Therefore, the PE subject evaluation currently implemented is considered incomplete, neither holistic nor balanced, owing to the lack of standardized instruments and standards that teachers can adopt to assess students in PE, especially with regard to game skills. Teachers should use a standard assessment process for effective assessment. This study recommends a comprehensive assessment that thoroughly covers the cognitive, psychomotor and affective domains of learning.
The objectives of the study are as follows:
1) Identifying the content validity of the comprehensive assessment instruments for handball and badminton in Physical Education.
2) Identifying the reliability of the comprehensive assessment instruments for handball and badminton in Physical Education.
3) Identifying the inter-observer agreement reliability of the comprehensive assessment instruments for handball and badminton in Physical Education.
This comprehensive assessment instrument is constructed on the levels of Bloom's taxonomy (1956) for the cognitive domain, Dave's taxonomy (1970) for the psychomotor domain and the taxonomy of Krathwohl et al. (1964) for the affective domain. The cognitive domain refers to thinking and intellectual power; cognitive evaluation measures the level of knowledge and intelligence of the students (Kamarudin & Siti Hajar, 2004), and it happens all the time and everywhere (Abu Bakar & Bhasah, 2008). There are six levels of cognitive classification based on Bloom's taxonomy (1956), namely (i) knowledge; (ii) comprehension; (iii) application; (iv) analysis; (v) synthesis and (vi) evaluation.
The psychomotor domain refers to skills related to a person's physical movement. Physical Education teachers always focus on controlling movements during training and games. Psychomotor assessment measures physical, motor, fitness, ability and efficiency aspects (Jansma & French, 1994). The psychomotor domain is very significant in the process of teaching and learning. There are five hierarchical levels based on Dave's taxonomy (1970), namely (i) imitation; (ii) manipulation; (iii) precision; (iv) articulation and (v) naturalization.
The affective domain involves spiritual aspects, with emphasis on the growth and development of attitudes, feelings, emotions and values. Feelings, attitudes and values are learnt, and they grow over time. If the environment is healthy, the feelings, attitudes and values inculcated will be positive (Abu Bakar, 1985). Krathwohl et al. (1964) classified the affective domain into five taxonomic levels, which are (i) receive; (ii) respond; (iii) value; (iv) organize and (v) characterize by value set.
Based on Table 1, the theoretical framework of this study rests on Bloom's taxonomy (1956) for the cognitive domain, Dave's taxonomy (1970) for the psychomotor domain and the taxonomy of Krathwohl et al. (1964) for the affective domain. The comprehensive assessment of this study is built on these basic theories and contains six types of assessments, namely cognitive, psychomotor and affective assessments for both handball and badminton games.

Sample
The sample consists of 16 teachers and 40 Form 2 students from selected schools in Larut, Matang and Selama, Perak. Ten expert panel members are involved in this study. Subject teachers and expert panel members are chosen using purposive sampling, while students are selected as an intact group: the teacher selects a class to teach Form 2 PE, and all of the students in that class become subjects of the study.

Instrument
This study uses six different types of instruments: handball cognitive assessment, handball psychomotor assessment, handball affective assessment, badminton cognitive assessment, badminton psychomotor assessment and badminton affective assessment. The comprehensive assessments in this study are instruments designed by the researchers based on Morrow et al. (2005). Figure 1 shows the flow chart for the construction of the comprehensive assessment instruments for Form 2 handball and badminton.
The first step in building a comprehensive assessment is to review the best evaluation criteria. This study refers to the objective requirements of PE based on the teaching syllabus (Curriculum Development Centre, 1999), which relate to the cognitive, psychomotor and affective domains for Form 2 handball and badminton games.
The second step is the analysis of the instrument. The researchers referred to the contents of the handball and badminton game skills in the syllabus, which lists eight basic skills for handball and six basic skills for badminton. The values to be applied and practiced in PE are the spirit of sportsmanship, fair play, tolerance, teamwork, discipline, competitiveness, leadership and participation. In this study, the aspect of fair play is evaluated in handball games while the aspect of sportsmanship is evaluated in badminton games.

Figure 1. Flowchart of comprehensive assessment instrument construction
The third step is the study related to the construction of the instrument. The comprehensive assessments of this study are based on the taxonomies of the cognitive, psychomotor and affective domains. The cognitive domain is based on Bloom's taxonomy (1956), the psychomotor domain on Dave's taxonomy (1970) and the affective domain on the taxonomy of Krathwohl et al. (1964). The next step is item selection for the instrument. The handball and badminton cognitive assessments consist of 40 questions divided into four sets of cognitive tests based on four PE lessons, each involving two periods (80 minutes per session), with each set containing 10 test questions by learning topic. This assessment is based on Bloom's taxonomy (1956), using the Table of Test Specifications and weights proposed by Hastad and Lacy (2002). The handball psychomotor assessment is divided into five basic skills: passing, catching, dribbling, checking and shooting. The badminton psychomotor assessment is divided into four basic skills: serving, stroke, smash and footwork. Assessments are carried out by teachers during training and game sessions in the handball and badminton learning process. The handball affective assessment involves the value of fair play, which consists of two sub-values: abiding by the rules and abiding by the laws. The value assessed in badminton games is the spirit of sportsmanship, which consists of two sub-values: accepting the loss and respecting the opponents.
The next step is to publish the procedures of comprehensive assessment implementation. This process is the first step in the preparation of the Form 2 handball and badminton comprehensive assessment. The completed procedures are submitted to experts for review (Validity Instrument 1). At this stage, the cognitive, psychomotor and affective assessments are referred to six expert panel members: two review and evaluate the content of the comprehensive assessment, two review the language, and the other two try out the comprehensive assessment. A few items were refined based on the experts' advice. The next step is to conduct a pilot study. The pilot study was conducted on 20 male and 20 female Form 2 students (N = 40) of the selected school and involved two PE teachers. The pilot study aimed to establish the validity of the cognitive, psychomotor and affective assessment instruments. Some sub-assessment items were reviewed and updated based on feedback from the teachers during the pilot study.
The eighth step is to build a complete comprehensive assessment instrument before the actual study on the comprehensive evaluation of Form 2 students in PE is conducted. At this stage, the complete comprehensive assessment is given to four expert panel members to assess the validity of the content items in the handball and badminton comprehensive assessment (Validity Instrument 2).
Based on the comments and recommendations of the expert panel members who evaluated the content items of the handball and badminton comprehensive assessment, the researchers made some modifications and item refinements to complete the comprehensive assessment. The completed comprehensive assessment was then presented to 16 PE teachers in a research workshop at the school. This one-day workshop aimed to expose the subject teachers to the research procedures and to the assessment techniques used in the comprehensive assessment. Two assessment tests were carried out on the subject teachers, namely the teachers' psychomotor assessment test in handball and the teachers' psychomotor assessment test in badminton. The purpose of these tests is to determine the validity value of the examiners, that is, the inter-observer agreement.
The final step in building the comprehensive assessment instrument is to conduct the actual study using the handball and badminton comprehensive assessment. The time allocated for the research on the handball game is four periods of teaching sessions, conducted in the first semester, while the time allocated for the research on the badminton game is four periods of teaching sessions, conducted in the second semester.

The Validity of Comprehensive Assessment Instrument for Handball and Badminton
To establish the content validity of the pilot study instruments, the researchers consulted several experts. The instruments' contents were submitted to the experts for review. At this stage, the comprehensive assessment was reviewed by six expert panel members. The researchers used a content validity questionnaire in the form of an 11-point semantic scale, with the rightmost point marked 10 (strongly agree), the leftmost point marked 0 (strongly disagree) and the center point marked 5. This questionnaire can be used to test the content validity of a specific research measuring instrument by expert referral (Sidek & Jamaludin, 2005). Some items were checked and corrected based on the feedback and advice from the experts. Based on Table 2, the validity of the comprehensive assessment in the pilot study was r = .90 (n = 6). According to Abu Bakar (1985), Sidek and Jamaludin (2005) and Tuckman and Waheed (1981), a value of 0.70 is considered to indicate mastery or a high level of achievement. Once the comprehensive assessment was in its complete form, the researchers again consulted a panel of experts to evaluate the content items of the comprehensive assessment for handball and badminton games. Four experts were asked to assess the validity of these content items. Based on Table 2, the results show that the item validity from expert 1 and expert 2 for the handball comprehensive assessment was r = 0.82 (n = 2), while the validity from expert 3 and expert 4 for the badminton comprehensive assessment was r = 0.80 (n = 2). According to Abu Bakar (1995), Sidek and Jamaludin (2005) and Tuckman and Waheed (1981), a value of 0.70 is considered to indicate mastery at the highest level. The expert panel provided reviews, comments and views on the validity of the handball and badminton assessments. Taking into account all the reviews, views and comments given by the expert panel, the researchers made some minor modifications to the handball and badminton comprehensive assessment so that it truly meets the needs of the PE curriculum.

The Reliability of Handball and Badminton Comprehensive Assessment Instruments
According to Ahmad (2004), Baumgartner and Jackson (1999) and Miller (2006), a reliable test gives consistent results when it is administered repeatedly. A reliable test produces stable and accurate data. According to Bhasah (2007), two procedures are commonly used to estimate the reliability of test scores: two-test administration and one-test administration.
The cognitive assessment uses the Kuder-Richardson 20 formula based on dichotomous (true-false) scoring, while the psychomotor and affective assessments use the correlation method. The study was conducted on 20 male students and 20 female students (N = 40) and two PE teachers in the selected school. Table 3 shows the reliability coefficients of the handball and badminton cognitive assessments.
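As an illustration of the Kuder-Richardson 20 calculation named above, the following is a minimal sketch and not the study's own computation. It assumes each student's answers are stored as a row of 0/1 item scores and that the population variance of the total scores is used; the function name `kr20` and the sample data are hypothetical.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for dichotomous (0/1) item scores.

    responses: list of rows, one row per student, one 0/1 entry per item.
    Assumption: population variance of total scores is used.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")

    # Proportion of students answering each item correctly (p); q = 1 - p.
    p = [sum(row[i] for row in responses) / n_students for i in range(n_items)]
    sum_pq = sum(pi * (1 - pi) for pi in p)

    # Variance of the students' total scores.
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: 5 students, 4 true-false items.
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(sample), 2))
```

In practice each of the four 10-question cognitive tests would be passed to such a function separately, since KR-20 is computed per test.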
Based on Table 3, the reliability for handball cognitive test 1 is r = 0.83, p 0. [...] 4, the results show that the scores of the 14 items in the handball psychomotor assessment range between 0.79 and 0.95 (r = 0.90), while the scores of the 15 items in the badminton psychomotor assessment range between 0.65 and 0.95 (r = 0.83). The reliability coefficient accepted by research practitioners in social science is more than r = 0.60. According to Mohd. Majid (2000), r = 0.71-0.99 is the best value, while Fraenkel and Wallen (1996) set the acceptable reliability coefficient at r = 0.70-0.99. Kubiszyn and Borich (2000) consider r = 0.80-0.90 acceptable, while Popham (1990) accepts values between 0.90 and 0.95.
Table 5 shows the values of the two items in handball (r = .83) and badminton (r = .81) for the affective assessment. According to Mohd. Majid (2000), r = 0.71-0.99 is the best value, while Fraenkel and Wallen (1996) set the acceptable value at r = 0.70-0.99. Kubiszyn and Borich (2000) consider r = 0.80-0.90 acceptable, while Popham (1990) accepts r = 0.90-0.95. Based on these findings, the affective assessment for handball and badminton is acceptable.
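The text does not specify which correlation coefficient was used for the psychomotor and affective assessments. The sketch below assumes the Pearson product-moment correlation between two sets of scores for the same students (for example, two administrations or two scorers); the function name `pearson_r` and the sample scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long lists with at least 2 scores")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the score lists has zero variance")

    return sxy / sqrt(sxx * syy)


# Hypothetical item scores for four students on two occasions.
first = [1, 2, 3, 4]
second = [1, 3, 2, 4]
print(round(pearson_r(first, second), 2))
```

A coefficient computed this way could then be compared against the thresholds cited above, such as r = 0.70 or more (Fraenkel and Wallen, 1996).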
To obtain the reliability of the examiners, two evaluation tests were conducted on the subject teachers (N = 16), namely the teachers' psychomotor evaluation in handball games and the teachers' psychomotor evaluation in badminton games. These tests aimed to obtain the inter-observer reliability. According to Bryington et al. (2002), there are two methods to obtain the inter-observer agreement: the percentage agreement method and the Kappa method. If the data are obtained on a nominal scale, the Kappa method is suitable, but if there is more than one examiner for a test item, the percentage agreement method can be used (Rink, 2002). Based on Table 6, the results show that the percentage of agreement between the examiners (inter-observer agreement) for the handball assessment, based on the evaluation of 28 video recordings, ranges between 37.50% and 93.80%, M = 70.11% (SD = 0.57), and for the badminton assessment, based on the evaluation of 30 video recordings, ranges between 43.80% and 100%, M = 70.03% (SD = 0.6). According to Rink (2002), the acceptable reliability value of agreement between examiners is at least 70% (0.70).
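The two agreement methods mentioned above can be sketched as follows. The exact scoring rule used in the study is not stated, so this is an illustrative assumption: percentage agreement is taken as the share of examiners who gave the most common rating for one video recording, and Kappa is Cohen's kappa for two examiners. The function names and ratings are hypothetical.

```python
from collections import Counter


def percent_agreement(ratings):
    """Percentage of examiners who gave the most common rating for one clip.

    Assumption: agreement is measured against the modal rating.
    """
    if not ratings:
        raise ValueError("no ratings given")
    modal_count = Counter(ratings).most_common(1)[0][1]
    return 100.0 * modal_count / len(ratings)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two examiners rating the same clips (nominal scale)."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("need two equally long, non-empty rating lists")

    # Observed proportion of clips on which both examiners agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each examiner's category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    return (observed - expected) / (1 - expected)


# Hypothetical ratings: 16 examiners scoring one video recording.
clip = [3] * 15 + [2]
print(percent_agreement(clip))  # 15 of 16 examiners agree

# Hypothetical ratings: two examiners scoring four recordings.
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

Averaging `percent_agreement` over all recordings would give a mean agreement that can be compared with the 70% threshold from Rink (2002).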

Conclusion
Based on the findings, this Comprehensive Assessment is suitable for use by teachers as a standard instrument to assess students' achievement in handball and badminton games in the PE subject. This comprehensive assessment is more realistic and holistic, and it is able to assess students as a whole in accordance with the national education philosophy, indirectly showing the 'power of knowledge' of the PE subject. This comprehensive assessment is also in line with school-based assessment, and its implementation is expected to restore the status quo of the PE subject in schools throughout Malaysia.

Recommendations
This study involved only handball and badminton games in the Form 2 PE subject. Thus, the researchers suggest a more comprehensive study covering all levels in secondary schools. The study may involve football, ping pong and basketball games for Form 1, volleyball and softball games for Form 3, hockey, tennis and sepak takraw games for Form 4, and basketball, cricket and rugby games for Form 5. The proposed study is intended to complement the use of the Comprehensive Assessment in PE subjects at each level.
A more comprehensive study can also be carried out by other researchers to examine the standard and level of student achievement in the cognitive, psychomotor and affective learning domains, using the Comprehensive Assessment with secondary school students throughout Malaysia. The aim of such a study would be to identify reference norms for students' levels of learning and achievement. The standard level of learning achievement can be identified and determined in such a study. The resulting norm-referenced scale is therefore expected to serve as a reference for teachers and students in these skills. Teachers can focus on student achievement levels, while students must achieve at least a minimum level in all the skills learned for each game in PE.

Table 2. Content validity of the expert panel of pilot studies (N = 6)

Table 3. The validity of content items of expert panels (N = 4)

Table 4. The reliability of cognitive assessment for handball and badminton (N = 40)