Development, Administration and Confirmatory Factor Analysis of a Secondary School Test Based on the Theory of Successful Intelligence

The present study investigated an application of the theory of Successful Intelligence (Sternberg, 1997) in Greek lower secondary schools through a school test, in the belief that school assessments should be based on solid, empirically investigated theoretical foundations. The test was administered to 2663 students with a mean age of 13.39 years, all studying in the second year of lower secondary school in 49 schools from different geographical areas of Greece. A confirmatory factor analysis supported the validity of the theory. The findings suggested that Greek pupils have a relatively developed analytical ability; however, the cultivation of their creative and practical skills should be at the focus of Greek schooling.


Introduction
Classroom assessment, according to Airasian (1997), is concerned with "the collection, synthesis, and interpretation of information to aid the teacher in decision making" (p. 7). Research on teachers' assessment practices, carried out mainly in the United States, shows that teachers rely strongly on teacher made tests, and to a lesser extent on essays and student papers (Cizek, Fitzgerald, & Rachor, 1995; Frary, Cross, & Weber, 1993; Gullickson, 1985; Marso & Pigge, 1993). Their grading practices appear to be affected by academic achievement as measured by paper and pencil achievement tests, effort, classroom participation, homework preparation, perceived ability and conduct (Brookhart, 1994; Cizek et al., 1995; Gullickson, 1985; Stiggins, Frisbie, & Griswold, 1989). These practices have been described by Brookhart (1991) as a "hodgepodge of attitude, effort and achievement" (p. 36), and they do not appear to be based on a clear theoretical background.
Assessment is a socio-political practice (Delandshere, 2001), operating in educational systems which are part of the political organization of a country. The Greek educational system has a hierarchical structure with a top down direction in decision making (Saiti, 2009), and thus it constitutes a closed system, not easily amenable to change and innovation (Alahiotis & Karatzia-Stavlioti, 2006; Ifanti, 2007; OECD, 1994). At the summit of the pyramid is the Ministry of Education, which "oversees the administration of all schools in the country through its Central and Regional Services" (Eurydice/Eurybase, 2008/9). In the past few decades, a number of efforts for curriculum reform have resulted in insignificant changes in teaching and assessment practices. The lack of long term planning, the frequent change of ministers of education with subsequent changes of policies as well as political administrators, and the lack of a coherent theoretical background for any reform efforts are only some of the reasons mentioned for the current situation (Kassotakis, 2010; Saiti, 2009). As a result, Greek teachers, operating in such an educational environment, have formed traditional, intuitive, implicit theories of assessment on which they base their assessment practices and grading criteria. Achievement as measured by teacher made tests, effort, classroom participation and homework preparation are only some of the criteria by which they grade their students (Zbainos & Hallam, 2002a, 2002b). Not much research has focused on the content and the format of Greek secondary teachers' self made tests but, according to the Ministry of Education, memorization of sections of textbooks is the predominant requirement for assessment both in teacher made tests and in final examinations (Ministry of Education, 2010). It seems, therefore, that, in accordance with the findings of the international literature, Greek teachers' grading is not based on a sound theoretical background.
The current paper attempts to contribute to the literature on assessment practices by investigating the development and administration of a school test that is theoretically grounded in successful intelligence. The theory of successful intelligence (Sternberg, 1985, 2002, 2005) focuses on the set of abilities needed for success in life rather than on success in school, as in the traditional approach. According to the theory (Sternberg, 1985, 2002, 2005), intelligence is defined as the ability to achieve success in life in terms of one's personal standards, within one's socio-cultural context, by capitalizing on one's strengths and correcting or compensating for one's weaknesses. Success is attained through a balance of analytical, creative, and practical abilities. Its dimensions (creativity, practical thinking and analytical reasoning) have been investigated in many areas of human action such as management, leadership, sales, academia, and the military (Hedlund et al., 2003; Hedlund, Wilt, Nebel, Ashford, & Sternberg, 2006; Wagner, Sujan, Sujan, Rashotte, & Sternberg, 1999).
Intelligence, according to the theory of successful intelligence, is not seen as a fixed general factor mainly determined by heredity, as described by traditional intelligence theories (Brand, 1987, 1996; Jensen, 1998a, 1998b). The abilities to "capitalize on strengths" and "compensate for weaknesses" (Sternberg, 2002, 2009c; Sternberg & Grigorenko, 2000) are considered inherent attributes of intelligence. Thus, both the person as an agent of his/her own behavior and the environment (Bandura, 2001, 2006) may play an extremely important role in its manifestation and cultivation. Sternberg (2004) describes the theory as wholly consistent with the Vygotskian notion of the "Zone of Proximal Development" (Vygotsky, 1978), according to which children's actual cognitive achievements differ from their potential. The level of actual achievement can be developed towards the potential one through interaction with important others (peers, adults).
In this framework, education has been considered one of the main areas in which all three dimensions of Successful Intelligence can be developed. Traditional schools tend to focus on mnemonic learning and analytical reasoning, while they pay little attention to the development of creativity and practical thinking (Grigorenko et al., 2004; Grigorenko & Sternberg, 2001; Sternberg, Nokes, et al., 2001). Teaching for successful intelligence can be seen within the relatively recent tradition of theory-based instruction aimed at the development of students' cognition or intelligence (Feuerstein, 1980; Gardner, 1993; Martinez, 2000). It does not take place through separate courses, but can be infused into any existing curriculum (Grigorenko, Jarvin, & Sternberg, 2002; Sternberg & Grigorenko, 2000; Sternberg, Jarvin, & Grigorenko, 2009a). The studies that have attempted to teach and to assess successful intelligence in schools appear to have been very effective: participating students at all educational levels appeared not only to have cultivated their analytical, creative and practical abilities, but also to have raised their academic achievement (Grigorenko et al., 2002; Sternberg, Ferrari, Clinkenbeard, & Grigorenko, 1996; Sternberg, Grigorenko, & Jarvin, 2001; Sternberg, Grigorenko, & Zhang, 2008).
The theory of successful intelligence has never been implemented in Greece, where teaching and assessment are very traditional. In Greece, a country in deep financial crisis, perhaps more than anywhere else, there is a need to assess "what matters, in ways that can help students develop the skills they need for success in school and life" (Sternberg, 2009a, p. 208). An attempt to devise and administer a school test based on the theory of successful intelligence is a completely novel effort and is thus believed to be of great research interest. The main research question that this paper attempts to answer is whether the theory of successful intelligence is meaningful for Greek secondary students; that is, whether its dimensions, namely analytical reasoning, creativity and practical thinking, can be identified and thereafter cultivated in school, in order to contribute to students' and eventually society's success.
In order to answer these questions, a test was devised and administered to Greek secondary school students. The process and the findings are presented below.

Sample
The sample included 2663 students who took the test. They came from all geographical areas of Greece, from rural, semi-rural and urban areas, as well as from different types of schools such as private and public, cross-cultural and laboratory. Although the participating schools were not chosen randomly, an effort was made to include all different types of schools in the sample (see table 1).

Insert table 1 about here
Participating teachers (N = 42) were asked to provide information about the demographics of their students, namely their gender, attainment, cultural background and any difficulties they faced, according to the records kept at school. Some teachers declared that they had no access to the records, and therefore a proportion of the data (varying from 21.7% to 29.5%, depending on the variable) was missing (see table 2). The large overall number of participants, however, still allowed a picture of the Greek educational reality to be drawn.

Insert table 2 about here
As seen in table 2, about half of the participating students were girls (N = 1348, 50.8%) and the other half were boys (N = 1304, 49.2%). The attainment scale used in Greek secondary schools ranges from 1 to 20. Students who receive a mean grade of 10 or above in all subjects are promoted to the next grade. The scale used in this study was a five-point Likert-type scale based on the previous year's grade: weak for grades from 10 to 11.9, average for grades from 12 to 13.9, good for grades from 14 to 15.9, very good for grades from 16 to 17.9, and excellent for grades from 18 to 20. In the present sample, data were not reported for 766 students (28.8%). Of the remaining 1897, the attainment of 103 students (5.4%) was weak, of 204 students (10.8%) average, of 376 students (19.8%) good, of 584 students (30.8%) very good, and of 630 students (33.2%) excellent. It is remarkable that a high percentage of the participants (64%) received grades in the highest two categories, which may imply leniency in assessment by teachers, or even low attainment standards.
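The banding of grades into the five attainment categories can be written out explicitly. The following Python sketch is illustrative only: the function name `attainment_category` is hypothetical, and two details are assumptions rather than reported procedure, namely that grades below 10 fall outside the five bands (they correspond to students who were not promoted) and that a grade lying between two stated bands (for example 11.95) is assigned to the lower band.

```python
from typing import Optional

# Lower bound of each band on the 1-20 Greek grading scale, highest first.
BANDS = [
    (18.0, "excellent"),
    (16.0, "very good"),
    (14.0, "good"),
    (12.0, "average"),
    (10.0, "weak"),
]


def attainment_category(mean_grade: float) -> Optional[str]:
    """Map last year's mean grade to one of the five attainment bands."""
    if not 1.0 <= mean_grade <= 20.0:
        raise ValueError("grade must lie on the 1-20 scale")
    for lower_bound, label in BANDS:
        if mean_grade >= lower_bound:
            return label
    # Assumption: grades below 10 are outside the five reported bands.
    return None
```

For instance, `attainment_category(16.4)` returns `"very good"`, and the two highest bands together account for the 64% noted above (30.8% + 33.2%).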
Teachers also reported participants' families' cultural backgrounds according to the records kept at school and their personal experience (see table 2). A child was categorized as having a "non Greek" cultural background if both of their parents were born in a country other than Greece, and a "Greek" one if their parents were born in Greece. No specification was made as to whether the participant was born in Greece or not, or how many years s/he had lived and studied in Greece. No data were received for 578 students (21.7%). Of the remaining 2085 students, 1919 (92%) came from a Greek cultural background and 166 (8%) came from a non Greek cultural background (see table 2). The figure of 8% is close to the 9% recorded by the Ministry of Education.
Teachers were also asked to report any difficulties that students were facing, but only if there was an official statement of the difficulty (see table 2). This information was not provided for 785 students (29.5%). Of the remaining 1878, 1797 students (95.7%) were reported with no difficulty, 69 (3.7%) with dyslexia, 11 (0.6%) with severe learning difficulties, and only 1 student (0.1%) with Attention Deficit Hyperactivity Disorder (ADHD). Thus, as expected, the most commonly mentioned difficulty was dyslexia (3.7%). It is also notable that ADHD is not a frequently stated problem in Greek secondary education. In conclusion, although the sample was not chosen randomly, it appears to reflect the picture of Greek secondary education.

Validity and Reliability
The effort to assure the validity and the reliability of the assessment used in the present study was based on a "unified view of validity" (Cronbach, 1988; Messick, 1995). Birenbaum (2007) has presented a framework to assist in establishing the quality of a given assessment practice, which was followed to a large degree.

Content Fidelity
Content fidelity refers to the relevance of the target domain to the purpose(s) of the assessment, the fit between the target domain and the assessment domain, and the representativeness of the actual set of tasks included (Birenbaum, 2007, p. 34).
The target domain of the assessment was students' ability in analytical reasoning, creativity and practical thinking. Since the assessment did not intend to assess knowledge of any particular curriculum area, its theme was chosen from an interdisciplinary frame, namely "nutrition around the world," which may have been taught under many curriculum subjects such as science, home economics, personal well being, or even geography, and which is also a common theme in the media. In accordance with Birenbaum's (2007) suggestions for the assurance of content fidelity, a panel of experts was formed. It consisted of the research team, comprised of 14 teachers and psychologists whose experience in working with children varied from 5 to 25 years; at the time of the study they were completing their Masters in Psychology of Education and Teaching. They operated as reviewers of the tasks, ensuring that the tasks tested what they were supposed to test, i.e. the three dimensions of successful intelligence. They ascertained that the tasks' content as well as their wording followed the theory's directions (Sternberg, 2004, 2009b; Sternberg & Grigorenko, 2007; Sternberg et al., 2009b; Sternberg & Spear-Swerling, 1996).
After discussions with the panel of experts, the assessment was designed as follows. At the beginning of the assessment there were 2 photos of children eating, which operated as advance organizers of the assessment. The first photo depicted a group of thin but not starving children somewhere in Africa eating something simple like cereals or rice with their fingers, while the second picture showed two fat children eating burgers and drinking Coca-Cola in a fast food restaurant. The tasks of the assessment were all related to these pictures. The first two (Analytical 1 and Analytical 2) assessed the analytical ability of the student. Analytical 1 asked the child to compare the nutrition of the two meals in the two pictures. Analytical 2 asked students to select the place shown in the photos where they would prefer to live and to justify their selection.
The next four tasks assessed creative thinking. In the first two (Creative 1 and Creative 2) children were asked to imagine a morning scene and write a short dialogue between the children in each of the pictures and their mothers before they went to school. The next two tasks (Creative 3 and Creative 4) presented an ending, "…and for all these the food I eat is to blame," and students were asked for the beginning of the story. The first two tasks assessed divergent creative thinking, while the last two assessed convergent creative thinking (Lubart, 2001, 2009).
Finally, the last 2 tasks (Practical 1 and Practical 2) assessed practical thinking by asking students to write things they would change in their diet, as well as advice they would give to others regarding their diet, after having thought about nutrition in the developing and the developed world. The completion time for this assessment was tested during the pilot study, and it appeared that students could manage it within the time limits of one teaching period, about 45 minutes.

Scoring and Scaling
The scoring and scaling criteria of the assessment, set by the group of experts, are described in this section. The same team undertook the scoring of the assessment. The first analytical task asked students to compare the nutrition of poor thin children in Africa and fat children somewhere in the western world, and the second asked them to justify where they would like to live if they had to choose between the two places. The teachers who administered the assessment were asked to inform children that in this task they should write as many and as deep comparisons as they could. In the pilot study, when the scoring rubric was being developed, it was found to be extremely difficult to place the answers to such an open task on a scale from 0 to 5 that attempted to capture the accuracy and level of ability of the students' analytical thinking, according to the suggestions by Sternberg et al. (2009b). Instead, a more analytical examination of the fluency and the elaboration of the students' comparisons was employed: each respondent's text was parsed into sentences, and each sentence was scored separately. The level of elaboration was scored in the following way: a sentence was given 1 point if it just provided information about the quantity of the food, 1.5 points if it mentioned the ingredients of the food, 2 points if it contained feelings, simple justifications, hypotheses, conclusions, contrasts and oppositions, and 3 points for deep justifications, hypotheses, conclusions, contrasts and oppositions. If a sentence consisted of irrelevant information, it was not given any score. Likewise, no score was given to a sentence that simply repeated what had previously been mentioned.
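The sentence-level rubric lends itself to a small worked example. The Python sketch below is not the team's scoring procedure; it only illustrates the point values listed above. The category labels and the function name are hypothetical, the classification of each sentence is assumed to have been made by a human rater beforehand, and summing the sentence scores into one task score is an assumption, since the text states only that each sentence was scored separately.

```python
# Points per sentence category, taken from the rubric described above.
# The category labels themselves are hypothetical shorthand.
POINTS = {
    "quantity": 1.0,          # information about the quantity of food
    "ingredients": 1.5,       # mentions the ingredients of the food
    "simple_reasoning": 2.0,  # feelings, simple justifications, contrasts
    "deep_reasoning": 3.0,    # deep justifications, hypotheses, conclusions
    "irrelevant": 0.0,        # irrelevant information: no score
    "repetition": 0.0,        # repeats an earlier point: no score
}


def analytical_score(sentence_codes):
    """Sum rubric points over rater-assigned sentence codes (assumed rule)."""
    return sum(POINTS[code] for code in sentence_codes)


example = ["quantity", "ingredients", "deep_reasoning", "repetition"]
print(analytical_score(example))  # 1 + 1.5 + 3 + 0 = 5.5
```

Under this reading, a longer answer with more distinct and deeper comparisons earns a higher score, which is how the rubric captures both fluency and elaboration.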
The method of assessment for the creativity tasks of the test was an issue that the team had to tackle. Originality, fluency, flexibility and elaboration are the most commonly mentioned criteria for assessing creativity (Guilford, 1967; Torrance, 1974a, 1974b, 1974c; Wallach & Kogan, 1965). However, these criteria have mainly been used in traditional divergent thinking tests. In this case, creativity was defined as "the ability to produce work that is novel, and appropriate" (Sternberg, Kaufman, & Pretz, 2002), a definition which according to Kaufman & Baer (2004) is endorsed by many theorists. Similarly, Lubart & Guignard (2004) define it "as the capacity to produce novel, original work that fits with task constraints". Thus, originality was assessed with one score and task appropriateness with another (Kaufman, Baer, & Plucker, 2008). Following the scoring system used in the Aurora-a test (Tan et al., 2009), which is in accordance with Sternberg's conception of creativity, creative ability was assessed on a 5 point scale, while accuracy (task appropriateness) was assessed on a 3 point scale. The score for creative ability was mainly determined by the originality of the answer; however, elaboration also had an effect on the summative score of creative ability. The sum of those two scores produced the overall creativity score for each task.
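As a minimal illustration of how the two ratings combine, the following Python sketch adds a creative ability rating to a task appropriateness rating. The function name is hypothetical, and the assumption that both scales start at 1 (so that task scores range from 2 to 8) is inferred from the statement elsewhere in this section that a score of 0 was avoided; the exact scale anchors are not reported.

```python
def creativity_task_score(creative_ability: int, appropriateness: int) -> int:
    """Overall creativity score for one task: ability plus appropriateness.

    Assumes a 1-5 creative ability scale and a 1-3 appropriateness scale.
    """
    if not 1 <= creative_ability <= 5:
        raise ValueError("creative ability is rated on a 5 point scale")
    if not 1 <= appropriateness <= 3:
        raise ValueError("appropriateness is rated on a 3 point scale")
    return creative_ability + appropriateness
```

An original and fully appropriate answer rated 4 and 3, for example, would receive `creativity_task_score(4, 3)`, that is, 7.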
The practical tasks asked students to describe how they would use their knowledge about nutrition to change their own eating habits and to advise others to do so. Using and employing knowledge have been mentioned as key practical abilities (Grigorenko et al., 2002; Sternberg & Grigorenko, 2007; Sternberg et al., 2009b). Scoring was based solely on the number of pieces of advice they gave.
In all assessment tasks, any answers that were completely irrelevant were treated as unanswered. This was done because they conveyed a probable lack of motivation in answering the assessment tasks rather than a complete lack of ability. Thus, weak answers were graded with the score 1, and the score 0 was avoided.

Reliability
About 5% of the scored assessments (N = 131) were first scored by members of the team and then scored again by the supervisor of the raters in order to check inter-rater correlations.

Insert table 3 about here
As seen in table 3, very high inter-rater correlations were produced. This indicates that the training method (described in detail in a following section) was effective. It is notable, however, that the relatively lower correlations were found in the creative ability scores, where scorers had to assess the originality and elaboration of the answers, while the highest reliability was found in the practical tasks, where just a count of the elements of the answers was needed.
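An inter-rater correlation of this kind can be computed for each task from the two sets of scores given to the same assessments. The sketch below uses Pearson's r, which is an assumption, since the type of correlation coefficient is not named in the text; the function name and the example scores are hypothetical and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one rater never varies")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five assessments: team rater vs. supervisor.
rater = [3, 5, 2, 4, 1]
supervisor = [3, 4, 2, 5, 1]
r = pearson_r(rater, supervisor)  # approximately 0.9 for these made-up scores
```

A count-based score such as the number of pieces of advice leaves little room for disagreement, which is consistent with the highest correlations being found for the practical tasks.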
The internal consistency (Cronbach's alpha) of the assessment was .65. As Sternberg et al. (2011) note, "if intelligence is indeed multifaceted, a very high reliability may mask the various facets, as it can occur only when a single facet is being measured" (p. 21). Therefore, the internal consistency score of .65 can be interpreted as consistent with the multifaceted nature of the measured constructs.
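For readers unfamiliar with the statistic, Cronbach's alpha compares the sum of the item variances with the variance of the total score: alpha = k/(k - 1) × (1 - Σ item variances / total variance), where k is the number of items. The Python sketch below implements this standard formula; the function name and data layout (one row of task scores per student, no missing values) are assumptions, and it is not the software used in the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-student rows of item scores."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("every row needs the same number (>= 2) of items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Alpha approaches 1 only when all items rise and fall together, which is why a moderate value is what one would expect from tasks designed to tap several distinct abilities.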

Relationships with other Variables (Concurrent validity)
Correlations with standardized tests measuring successful intelligence or any of its dimensions could not be obtained, due to the lack of such tests in Greece. However, the concurrent validity of the theory of successful intelligence has been substantiated by a number of studies. High correlations have been found between conventional intelligence tests and analytical intellectual abilities (Guyote & Sternberg, 1981; Sternberg, 1980; Sternberg & Gardner, 1983). Measures of creativity are weakly to moderately correlated with conventional intelligence test scores (Sternberg & Gastel, 1989; Sternberg & Lubart, 1995). Practical ability tests predict occupational success better and correlate weakly or even negatively with conventional intelligence test scores (Sternberg & Wagner, 1986; Sternberg, Wagner, & Okagaki, 1993; Wagner et al., 1999).

Response Processes
In order to ensure that the assessment actually elicited the intended cognitive processes for each task, students' responses to the tasks were collected during the pilot study: students were asked by the assessment administrators "what are you thinking now" while answering the tasks of the assessment. Students' responses demonstrated that they were indeed using the cognitive processes required by the tasks. Answers for the analytical tasks included: "I'm comparing", "I'm thinking of the consequences", "trying to decide", "thinking of the reasons why" etc. For the creative tasks, students mentioned "thinking how it is there", "imagining of his mother", "thinking how poor their home is", "thinking that he is suffering and he has health problems", and for the practical tasks: "I am thinking what to change in my diet", "I am thinking of a fat friend and what I want to tell him" etc.

Equality of Opportunities
Both the design and the administration of the test were planned to ensure that all test takers were given equal opportunities. As mentioned before, since the assessment examined here was designed for research purposes and was not included in the process of teaching, the material covered by the assessment did not require any specialized knowledge of a particular curriculum area but rather a general theme, namely "nutrition around the world," which has been extensively discussed in many curriculum areas. Thus, no preparation for the assessment was needed. The assessment was administered by volunteer teachers, who did not consider it a burden on their workload since they would not have to score it. They were given written directions about the test procedure as well as the time required for it, which was 40 minutes. The adequacy of the time was measured during the pilot study. The research team visited several classrooms at the time of the administration of the assessment; in general, the testing took place smoothly without any major problems.

Assessment Perceptions and Dispositions
Assessment perceptions and dispositions may play an important role in the obtained test scores. In order to secure a common perception among all test takers, they were informed by their teachers and the research team in advance about the theory and the purpose of the assessment. As mentioned earlier, the format of the assessment as well as the tasks per se were described by the panel of experts as motivating and as not resembling the usual teacher made tests, which require students to mnemonically reproduce chunks of material. For ethical reasons, however, students were clearly told that the assessment would not have any impact on their achievement and thus on their grades. This may have resulted in reduced motivation for some of them, since research has shown that students tend to perform less well in assessments that do not have an important impact on their grades (Wise & DeMars, 2005). Therefore, in order to avoid such lack of motivation, the test was taken on a voluntary basis.

Initial Statistics
Figures 1 and 2 show frequency distributions of participants' scores on the analytical tasks. Answers to the first task appeared to be relatively normally distributed.

Insert Figures 1 and 2 about here
In the second task there was a tendency toward lower scores. The mean scores of the two analytical tasks are significantly different (t(2448) = 37.1, p < .001), indicating that in the present assessment the task which required comparing and contrasting allowed significantly more analytical elaboration than the task which required justifying and reasoning (see table 4).
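Because the same students answered both analytical tasks, a comparison of this kind is naturally read as a paired (dependent samples) t test, with degrees of freedom equal to the number of complete pairs minus one. This reading is an assumption, as the type of test is not named in the text. The Python sketch below shows the computation on the per-student differences; the function name is hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a paired t test on two equally long score lists."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    # Work on the within-student differences between the two tasks.
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    if standard_error == 0:
        raise ValueError("t is undefined when all differences are identical")
    return mean(diffs) / standard_error, n - 1
```

With large samples, even modest mean differences yield large t values, so the size of the mean difference reported in the table is the more informative quantity.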

Insert Table 4 about here
Figures 3 and 4 show the distributions of the summed scores of the divergent creativity tasks and of the summed scores of the convergent creativity tasks. They are relatively normally distributed, a finding which supports the validity of the design and the implementation of the scoring rubric.
Insert Figures 3 and 4 about here
A comparison of the mean scores of the two types of creativity yielded a significant difference (t(2387) = -4.357, p < .001), showing that Greek secondary school children may be on average more competent in convergent creativity than in divergent (see table 5). It is notable, however, that higher percentages of high scores were given to divergent creativity than to convergent.

Insert Table 5 about here
The distributions of scores on the practical tasks appeared to be skewed towards lower scores, which suggests that Greek students are not very competent in making decisions about changing their own eating habits or in providing advice to people on changing eating habits.

Insert Figures 5 and 6 about here
However, a comparison between the mean scores of the practical tasks showed that pupils were significantly more competent in providing advice to others than in making decisions about changing their own lives (see table 6).
Insert Table 6 about here

Confirmatory Factor Analysis
In order to investigate the structural validity of the present assessment and the validity of the theory of successful intelligence with a Greek sample, a confirmatory factor analysis was performed on the collected data. A total of 1523 tests had no missing values and could be included in the factor analysis.
Insert figure 7 about here
A hypothesized model with 4 factors (Analytical, Creative Divergent, Creative Convergent, and Practical) was tested with maximum likelihood factor analysis in AMOS v 16.0. The model was evaluated by four fit measures: the chi square, the normed fit index (NFI), the comparative fit index (CFI) and the root mean square error of approximation (RMSEA). The results for the 3 fit indexes support the proposed model. The CFI and the NFI, which are measures of relative fit comparing the hypothesized model with the null model and have acceptance values of .95, yielded values of .992 and .987, respectively, indicating an excellent fit of the model (Hu & Bentler, 1999). The RMSEA measures the discrepancy between the sample coefficients and the population coefficients, with values closer to zero indicating a well-fitting model. The RMSEA was .032, indicating an excellent fit as well. The chi square had a value of 35.301 (N = 1523), p = .032, suggesting a non-acceptable match between the proposed model and the observed data. However, many theorists (e.g., Bentler, 1990; Thompson, 2004) have stressed that chi square should be treated cautiously as a sole fit indicator, especially for studies with large samples, because a large sample size may result in the rejection of a good-fitting model on the basis of trivial but statistically significant differences between the predicted and the observed values. Thus, the confirmatory factor analysis provided support for the internal structure of the assessment.
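For reference, the RMSEA is commonly computed from the model chi square, its degrees of freedom and the sample size as the square root of max(chi square - df, 0) / (df × (N - 1)). The Python sketch below implements this common definition. It is not AMOS output: the function name is hypothetical, some programs use N rather than N - 1 in the denominator, and the degrees of freedom of the tested model are not reported in this section, so they are left as an input rather than filled in.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Root mean square error of approximation (common N - 1 variant)."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n greater than 1")
    # Misfit beyond what chance alone would produce, floored at zero.
    excess = max(chi_square - df, 0.0)
    return math.sqrt(excess / (df * (n - 1)))
```

The formula makes the argument about sample size concrete: for a fixed amount of excess misfit, the chi square statistic grows with N, whereas the RMSEA divides by N - 1 and therefore stays small when the misfit per degree of freedom is trivial.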

Discussion
The present study demonstrated that Greek pupils' analytical ability is relatively normally distributed, like any other cognitive ability. This was shown in their answers both to the task related to critical thinking (comparison task) and to the task related to reasoning (why task). This contrasts with the claim of the Ministry of Education (2010), and with a notion frequently reproduced by newspapers (e.g., Lakasas, 2010) on the basis of Greek students' performance in PISA (OECD, 2010), that the Greek educational system teaches memorization only and not critical thinking. It is in line, however, with other studies that have shown that critical thinking can be, and is being, cultivated in Greek schools (Malamitsa, Kasoutas, & Kokkotas, 2009). Of course, the fact that Greek 14 year old children have developed some analytical thinking does not mean that it cannot be further developed.
In contrast, the creative ability of 14 year old Greek students, as reflected in the measurements of the present assessment, did not appear to be as developed. The very low percentages of high ability scores demonstrate that most Greek children fail to produce new original ideas. The finding that the convergent creativity mean score was significantly higher than the divergent creativity one allows the conclusion that children are significantly less creative in tasks that resemble those which are frequently used by teachers in schools. In Greece, traditionally, children are asked from their early school years to write dialogues, paragraphs, and essays on given topics, whilst they are rarely asked to think of a paragraph that precedes a final sentence. The findings indicate that for the "average student" common school activities which could help develop the creative potential of students have, due to the formalistic expression often imposed by teachers, ended up as an obstacle to creativity, whereas new frames for expression allow more original thinking. These findings can be interpreted with reference to studies of creativity in Greek schooling. Creativity has not been at the centre of attention of the Greek curriculum. According to Kampylis, Berki, & Saariluoma (2009), the curriculum "does not offer a substantiated working definition of the term neither does it give explicit instructions on how creativity might be developed or how one would know when this ambitious target has been achieved" (p. 19). Their study of Greek serving and prospective teachers demonstrated that these teachers hold contradictory conceptions of creativity and strive to formulate consistent implicit theories, that the Greek core subjects do not offer them enough opportunities for creativity, and that the great majority of teachers do not feel well trained to facilitate students' creativity (Kampylis et al., 2009).
As far as practical thinking is concerned, it seems that Greek children can use school knowledge for practical everyday purposes, namely for changing their own eating habits and for offering advice to others to do so. The finding that Greek students appeared to be much more fluent in providing advice to others than in changing their own behaviors may just be the result of the test tasks that measured practical thinking, or it might represent pupils' difficulty with self-perception, self-assessment and self-regulation for change. Although Greek children face serious problems with their weight (Karayiannis, Yannakoulia, Terzidou, Sidossis, & Kokkevi, 2003), according to the findings of this study they do not see much to change in their own lifestyles related to personal health. Greek children appear to know much better what should be done by others than by themselves. Such findings reveal one of the major problems of the Greek school: it focuses on teaching what is right and wrong, but it does not manage to contribute effectively to changing attitudes and behaviors.
Perhaps the most important finding of this study was produced by the structural validation of the test through confirmatory factor analysis. The fact that students appeared to have distinct, moderately correlated analytical, creative, and practical abilities, in accordance with the theory, provides support for the test as well as for the theory. This finding has implications for everyday school practices in both teaching and assessment. Teachers may construct tests for their everyday practice based on the theory of successful intelligence, in the format of the present assessment, and draw useful information for the decisions that concern their students. Traditional teacher-made tests in Greece, which are based solely or mainly on memorization or, at best, on analytical thinking tasks, do not allow creative or practical thinkers to express their abilities. Using assessments based on the theory of successful intelligence would allow students who have these abilities to express them and have them appreciated, enabling a more integrated teaching and assessment of those students.
This study has certain limitations that need to be taken into account both in interpreting the results and in any subsequent research efforts. The first is related to the experience of the members of the panel of experts. In the present study, as mentioned earlier, the panel was a convenience group consisting of 13 teachers and two psychologists who, at the time of the study, were doing a master's in Psychology of Education and Teaching under the supervision of their tutor, an experienced university teacher and researcher. Although most of the members were experienced in teaching, they were not as experienced in devising and performing innovative assessments. This may have resulted in decisions of less value than those of a hypothetical panel composed of people highly experienced in the area. A counter-argument, however, is that expert knowledge does not necessarily lead to creative thinking (Weisberg, 1999), which is needed for innovative test decisions. A mixed group of educators with more and less experience in test construction might be the best composition for the panel of experts.
The second is related to the sample size for the implementation of a test for research purposes. The sample used for the implementation of the present assessment was large in terms of both participating students and teachers. Large samples are considered to be better for generalizable results in general and for factor analysis in particular (MacCallum, Widaman, Zhang, & Hong, 1999). In striving, however, to administer the present assessment to a large population, full control of the testing situation was lost. Although the research team visited a sample of classes while the test was being taken, it is not fully known what happened in the remaining classes. The researchers' presence may have ensured adequate testing conditions in the classes visited, but there is a chance that some teachers elsewhere deviated from the directions given, which may have affected students' performance.
Participating students' motivation in taking the test raises some questions too. Most studies on test-taking motivation suggest a link between test-taking motivation and the stakes of the test: the higher the stakes of the test (the impact that the test has on students' later lives), the higher the motivation reported by students for that test (Wise & DeMars, 2005; Wolf & Smith, 1995; Wolf, Smith, & Birnbaum, 1995). In the present study, the test was not only a low-stakes test but actually a "no-stakes test", which would have no impact at all on any future decisions concerning the students. Although the teachers who administered the test stressed to the students that they should try to do their best, and those who did not wish to take it could opt out, there is still a chance that students' motivation was not very high and that their performance suffered as a result. Finally, the present test, following the tradition of teacher-made tests, was based on linguistic expression of the three dimensions of successful intelligence. This may have posed difficulties, especially for children with dyslexia and other language-related learning difficulties.
In conclusion, studies on the development and implementation of school assessments based on psychological theories in general, and on the theory of successful intelligence in particular, are not very common in the international literature and are certainly uncommon in Greek educational research. Therefore, any further research in the area would not only contribute to academic knowledge but would also provide directions for teaching practice, with a view to changing Greek teachers' testing practices.

Table 1. Numbers of participants per geographical area and school type

Table 5. Means for Divergent and Convergent Creativity Questions
Note. ** = p < .001. Standard deviations appear in parentheses below means.

Table 6. Means for Questions Practical 1 and Practical 2
Note. ** = p < .001. Standard deviations appear in parentheses below means.

Figures 1 and 2. Frequency distributions of responses: Analytical 1 and Analytical 2

Figures 3 and 4. Frequency distributions of responses: Divergent and Convergent Creativity