Comparative Coh-Metrix Analysis of Reading Comprehension Texts: Unified (Russian) State Exam in English vs Cambridge First Certificate in English

The article summarizes the results of a comparative study of reading comprehension texts used in two B2-level tests: the Unified (Russia) State Exam in English (EGE) and the Cambridge First Certificate in English (FCE). The research focused mainly on six parameters measured with Coh-Metrix, a computational tool producing indices of the linguistic and discourse representations of a text: narrativity, syntactic simplicity, word concreteness, referential cohesion, deep cohesion, and Flesch Reading Ease. The research shows that the complexity that lower cohesion adds to EGE texts is balanced by their simpler syntax and higher narrativity relative to FCE texts, so that the two sets of texts are of about the same overall complexity. EGE and FCE texts both correspond to grade six and have very similar mean Flesch Reading Ease scores (FCE mean 71.06; EGE mean 78.25), which fall in the FAIRLY EASY band.


Introduction
Reading as a key element in EFL testing research is typically studied with the focus on one of the three interacting factors that make the reading comprehension process more or less challenging: reader characteristics, text characteristics, and characteristics of the activity of reading.
Characteristics of the texts used in standardized tests remain a fairly neglected area in language assessment research and publication. Responding to this gap in the literature, the present research addresses the lack of a systematic approach to the parameters of texts used in language assessment.
The main objective of the research is to find out how the characteristics of the texts used in the multiple choice reading parts of the Unified (Russian) State Exam in English (EGE) differ from those of the Cambridge First Certificate in English (FCE) by comparing their narrativity, syntactic simplicity, word concreteness, referential cohesion, deep cohesion, and Flesch Reading Ease. Our side objective in highlighting such differences is to provide textual information that may facilitate the work of non-native English item writers and offer some direction for the selection and modification of reading materials.

Literature Review
The text characteristics include a wide variety of quantitative and qualitative parameters: length of the text, sentence and word length, frequency of unfamiliar words, vocabulary, language structures, text structure, genre, and background knowledge assumed in the text that students read. Although educators sometimes speak of students' "reading level," a given student might be quite successful at reading a text that is in a familiar format and about a favorite topic, but then struggle to read an academic text, even if it is at the same level as measured by a readability formula. Research shows that test takers considered 'below level' based on academic assessments can demonstrate high-level comprehension of sophisticated texts selected in other contexts (Moje, 2000).

Quantitative Text Characteristics
Modern quantitative measures calculate text complexity with a number of readability formulas, more than 40 of which have been developed over the years (Klare, 1974-1975). Readability formulas assign a grade level equivalent or a Lexile level to a text. Theories of quantitative text parameters rest on a number of assumptions, the most common of which, according to Zipf (1949), is that texts with longer words and lengthier sentences are more difficult to read. Longer words tend to be less frequent in the discourse, and infrequent words take more time to access and interpret during reading (Just & Carpenter, 1980). Longer sentences tend to place more demands on working memory and are therefore more difficult (Graesser et al., 2001).
The most frequently used formulas are the Flesch Reading Ease score and the Flesch-Kincaid Grade Level. The output of the Flesch Reading Ease formula is a number from 0 to 100, with a higher score indicating easier reading.
The Flesch-Kincaid Grade Level formula uses the same two inputs as the Reading Ease score but expresses the result as a US school grade level. The higher the number, the harder it is to read the text. Although readability formulas rely exclusively on word length and sentence length, they are widely used and have had a major influence on the textbook industry. However, readability formulas ignore dozens of language and discourse components that influence comprehension difficulty (Graesser et al., 2004).
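The article does not state the Flesch-Kincaid Grade Level formula itself. As a minimal sketch, assuming the commonly cited coefficients (0.39 for average sentence length, 11.8 for average syllables per word, and a constant of 15.59), the grade level can be computed from raw counts as follows. The function name and its parameters are illustrative and are not part of Coh-Metrix.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Return a US grade-level estimate from raw text counts.

    Assumption: uses the commonly cited Flesch-Kincaid coefficients,
    which are not given in the article itself.
    """
    average_sentence_length = words / sentences    # words per sentence
    average_syllables_per_word = syllables / words  # syllables per word
    return (0.39 * average_sentence_length
            + 11.8 * average_syllables_per_word
            - 15.59)


# Illustrative counts only (not taken from the EGE or FCE corpora):
# 100 words, 5 sentences, 130 syllables.
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=130)
print(round(grade, 2))
```

Because both terms are added, longer sentences and longer words each push the grade level up, which is the opposite direction to the Reading Ease score.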

Qualitative Text Characteristics
Of all possible qualitative text parameters, five used in the Coh-Metrix tool applied in the research are: narrativity, syntactic simplicity, word concreteness, referential cohesion, deep cohesion.
Narrative texts tell stories, with characters, events, places, and things that are familiar to the reader. Narrative is closely affiliated with everyday oral conversation. It is well documented that narrative is easier to read than informational texts (Bruner, 1986; Graesser, Olde, & Klettke, 2002; Haberlandt & Graesser, 1985).
Syntactic Simplicity reflects the degree to which the sentences in the text contain fewer words and use simple, familiar syntactic structures, which are less challenging to process by the reader. At the opposite end of the continuum are texts that contain sentences with more words and that use complex, unfamiliar syntactic structures (Graesser et al., 2004). Syntactic complexity is measured by the Coh-Metrix in three major ways: (1) NP density, the mean number of modifiers per NP; (2) the mean number of high-level constituents per word; (3) the incidence of word classes that signal logical or analytical difficulty (such as and, or, if-then, conditionals, and negations) (Graesser et al., 2004). Coh-Metrix provides an estimate of the number of sentences with similar syntactic structure. A high score for syntactic similarity indicates consistency in style and form.
Texts that contain content words that are concrete, meaningful, and evoke mental images are easier to process and understand than those texts which contain words that are abstract. Abstract words represent concepts that are difficult to represent visually, and as such, it is difficult for readers to generate a mental picture of what these words mean. Texts that contain more abstract words or phrases are more challenging to understand (McNamara & Graesser, 2012).
High cohesion texts contain words and ideas that overlap across sentences and the entire text, forming explicit threads that connect the text for the reader. Low cohesion text is typically more difficult to process because there are fewer threads that tie the ideas together for the reader. Deep cohesion reflects the degree to which the text contains causal, intentional, and temporal connectives. These connectives help the reader to form a more coherent and deeper understanding of the causal events, processes, and actions in the text. The cohesion components assess characteristics of the text that go beyond traditional readability (McNamara & Graesser, 2012).
According to Graesser et al. (1994), readers routinely attempt to construct coherent meanings and connections among text constituents unless the text is very poorly composed. McNamara and colleagues have discovered that cohesion gaps require the reader to make inferences using either world knowledge or previous textual information (McNamara, 2001; McNamara et al., 1996; McNamara & Kintsch, 1996). When inferences are generated, the reader makes more connections between ideas in the text and knowledge. This process results in a more coherent mental representation (Graesser et al., 2004).
Coh-Metrix has been successfully used by a number of researchers in language assessment (Best, Rowe, Ozuru, & McNamara, 2005).

Methods
A total of 12 EGE and FCE reading texts were analyzed along the six dimensions of the selected analytical framework, i.e. narrativity, syntactic simplicity, word concreteness, referential cohesion, deep cohesion, and Flesch Reading Ease. The data analysis tools used for the research are Coh-Metrix and the Flesch Reading Ease formula.
The free Web-based software tool called Coh-Metrix (version 3.0) (Coh-Metrix Text Easability Assessor, 2014) analyzes texts on cohesion, language, and readability. Coh-Metrix developers argue that 'its modules use lexicons, part-of-speech classifiers, syntactic parsers, templates, corpora, latent semantic analysis, and other components that are widely used in computational linguistics. After the user enters an English text, Coh-Metrix returns measures requested by the user' (Graesser et al., 2004).
A set of five Coh-Metrix qualitative measures was selected for this study: narrativity, syntactic simplicity, word concreteness, referential cohesion, and deep cohesion. After the text is uploaded and processed, the tool presents the indices in a graph (see Pictures 1 and 2 below) followed by a short descriptive text on the parameters computed.
The Flesch Reading Ease score is computed with the following formula:
Flesch Reading Ease = 206.835 - 1.015 × ASL - 84.6 × ASW    (1)
where ASL refers to the average sentence length, computed as the number of words in the text divided by the number of sentences, and ASW refers to the average number of syllables per word, computed as the number of syllables divided by the number of words.
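Formula (1) can be illustrated with a short sketch. The scoring function follows the formula as stated; the tokenisation and the syllable counter, however, are rough heuristics introduced here for illustration only (one syllable per group of consecutive vowel letters) and are not the method used by Coh-Metrix or by the authors. All function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Heuristic assumption: one syllable per run of vowel letters, at least one."""
    vowel_groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(vowel_groups))


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) using simple regex heuristics."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    syllables = sum(count_syllables(word) for word in words)
    return len(words), len(sentences), syllables


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Formula (1): 206.835 - 1.015 * ASL - 84.6 * ASW."""
    asl = words / sentences   # average sentence length
    asw = syllables / words   # average syllables per word
    return 206.835 - 1.015 * asl - 84.6 * asw


# Illustrative counts only (not taken from the EGE or FCE corpora).
print(round(flesch_reading_ease(words=100, sentences=5, syllables=130), 3))

# Scoring a short made-up passage with the heuristic counters.
sample = "The cat sat. The dog ran."
print(round(flesch_reading_ease(*text_counts(sample)), 2))
```

Very short texts made of monosyllabic words can score above 100, so the 0 to 100 range is best read as typical rather than strict. Real syllable counts depend on pronunciation, so scores from this heuristic may differ from those reported by dedicated tools.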

Studying the Question
The three-stage analysis was preceded by the authors' compiling two corpora of texts used in the EGE and FCE Reading multiple choice sections.
Stage One of the research, which is descriptive, aims at describing the two corpora and comparing the text parameters specified by the exam developers in the corresponding documents (Part 4.5). With evidence of the 'external' similarity of the exams and texts established, at Stage Two the authors address the main objective of the research, i.e. determining qualitative and quantitative indices of the two corpora with Coh-Metrix and the Flesch Reading Ease formula. The research finishes with a brief discussion of the results and their implications for EGE item writers.

The Unified (Russia) State Exam (EGE)
Until late 1990 reading skills had been the focus of EFL teaching and testing in the Soviet Union, but the new economic situation shifted the priority to communicative skills and demanded a new testing system (Ter-Minasova, 2000; Dzubenko, 2005). Language assessment became a burning issue when Russia introduced its Unified (Russia) State Exam (later referred to as EGE, 'Ediny Gosudarstvenny Ekzamen') in 2001. The exam is similar to Cambridge FCE and at the moment is a mandatory requirement for acceptance into higher education institutions. The introduction of EGE and the growing popularity of international exams in Russia meant that '...teachers need to reflect on…the nature of language assessment, assessment qualities and fitness for purpose, relationships between test content, test construct and teaching/learning aims, relationships between, test-wiseness: exploiting construct irrelevant aspects of test design, test familiarisation: learning about test content and format, test preparation: building tested skills and test success, how best to exploit the motivational effects of a test without sacrificing professionalism' (Green, 2012).
Thus, changes in the economy intensified the need to enhance assessment tools and methods and forced researchers to develop quality assurance systems.

Exam Texts Selection
One of the major and recurring problems that item writers of EFL tests face is the selection of appropriate readable texts. In Russia the numerous criteria which the designers of the Unified (Russia) State Exam in English (EGE), i.e. the Federal Institute of Pedagogical Measurement of the Russian Federation (FIPI), expect item writers to apply are focused on estimating the difficulty level of texts and identifying the text characteristics that will challenge exam candidates. Meeting the strict FIPI terms on the choice of texts is always a challenge, as real-world texts do not always adhere to strict topic and style conventions and/or grammar rules, and/or they show quantitative inconsistencies among passages. Thus, in English EGE Reading tests, authentic and copyrighted stories, essays, mass media texts and articles quite often appear not as they were originally published, but altered to the needs of the exam. Under the circumstances an item writer's main consideration is not only to create an item, but first and foremost to select a text and change it so that it does not detract from students' ability to understand, but instead specifies the knowledge, skills and competences acquired.

FIPI Criteria for Unified (Russian) State Exam (EGE) Reading Items
Content categories defined by the EGE Codifier (2014) include: recognizing the main idea of a text/paragraph, showing how details are related to the main idea, recognizing significant details, drawing conclusions from facts given, inferring cause-effect relationships, and inferring the main idea of a passage with more than one paragraph (fiction, popular science, pragmatic texts). Thus, the types of reading expected at EGE include careful reading at the local (understanding the sentence) and global (comprehending main ideas and the overall text) levels and expeditious reading at the local (scanning/searching for specifics) and global (skimming for gist, searching for main ideas and important details) levels (Kodifikator EGE, 2014).

EGE and FCE Multiple Choice Reading Tasks
The Reading texts corpus compiled for the study consists of a total of 12 texts used for EGE and FCE multiple choice tasks, in which candidates are given a set of several possible answers (four in this case) of which only one is correct. The total number of reading task items across the corpus is 78, comprising 42 EGE and 36 FCE items. Table 1 lists the task types identified, along with their place in the reading Section/Exam, focus, format, number of items and number of occurrences of task type in the corpus.
The texts studied were marked 1-6: EGE 1-EGE 6, FCE 1-FCE 6. The reading test samples were retrieved from two main sources: 1) the official sites of FIPI and Cambridge English (FIPI, 2003-2013; Cambridge English, 2014a); and 2) practice test material (EGE in the English Language, 2013; Variant No. 125958, 2014; Cambridge English, 2014b). There are a number of reasons for selecting the FIPI demo corpus other than the fact that the corpus is large and readily available. One important reason is that this corpus is representative of the texts that a typical senior in high school would have encountered in the 11th (final) grade while taking an EGE test. Another is that the texts are scaled on the Flesch index, which can approximately be translated into the B2 CEFR level.

EGE and FCE Reading Comprehension (Multiple Choice) Task Survey
Before the analyses, the authors reviewed a number of existing professional documents in educational assessment and language testing, including the EGE Specifications (Spetsyfikatsyya EGE, 2014 (3) the reader attempts to build up a 'macrostructure' on the basis of the majority of the information in the text. For these features careful reading is thought to be the most effective reading strategy, and many educationalists and psychologists recommend it.
In Part 3 of the Reading section of EGE (A15-A21), candidates are tested on their ability to recognize meaning from context and follow text organization features, such as exemplification, comparison and reference (Sample 1).

Sample 1
Read the text and do tasks A15-A21. Draw a circle around the number 1-4 you chose (In Russian).
When Suzanne had ever thought of New Orleans, it was always in connection with Hector Santien, because he was the only soul she knew who dwelt there. He had had no share in obtaining for her the position she had secured with one of the leading dry goods firms; yet it was to him she addressed herself when her arrangements to leave home were completed.
He did not wait for her train to reach the city, but crossed the river and met her at Gretna. The first thing he did was to kiss her, as he had done eight years before when he left Natchitoches parish. An hour later he would no more have thought of kissing Suzanne than he would have tendered an embrace to the Empress of China. For by that time he had realized that she was no longer twelve nor he twenty-four.
A15. Suzanne associated New Orleans with Hector Santien because
he had helped her to find a job at a dry-goods firm there
she used to address her letters to him when he lived there
she was not acquainted with anyone else there
he had arranged her visit to that city
A16. When Hector met Suzanne he kissed her
as such was his manner of greeting her
as he used to do when she came to New Orleans
Centr Resolventa, 2003-2014). The text is typically taken from a modern novel or an article; questions focus on the main ideas or details in the text, and on the attitudes or opinions expressed.
Candidates may also be asked to deduce the meaning of a word or phrase and to demonstrate understanding of references, such as pronouns, within the text.Additionally, questions may focus on the tone of the text or the writer's purpose, as well as the use of exemplification or comparison.These questions require candidates to infer the meaning from clues in the text, a skill which is an essential part of reading ability.
The 4-option multiple-choice questions are presented in the same order as the information in the text so that candidates can follow the development of the writer's ideas as they work through the questions.The final question may require candidates to interpret an aspect of the text as a whole (Cambridge English First Handbook for Teachers, 2014 & FCE for Schools, 2010) (see Sample 2).

Sample 2
You are going to read an article about a London tour guide. For questions 1-8, choose the answer (A, B, C or D) which you think fits best according to the text.
Mark your answers on the separate answer sheet.

The best kind of know-it-all
There is an art to being a good tour guide and Martin Priestly knows what it is.
It's obvious that the best way to explore a city is with a friend who is courteous, humorous, intelligent and - this is essential - extremely well-informed. Failing that, and if it is London you are visiting, then the next best thing may well be Martin Priestly, former university lecturer, now a guide, who seems to bring together most of the necessary virtues and who will probably become a friend as well.
Last spring, I took a trip around London with him, along with a party of Indian journalists. Accustomed to guides who are occasionally excellent but who often turn out to be arrogant, repetitive and sometimes bossy, I was so struck by Priestly's performance that I sought him out again to see, if I could, just how the trick was done.
This time the tour was for a party of foreign students, aged anything between 20 and 60, who were here to improve their English, which was already more than passable. As the 'tourists' gathered, Martin welcomed them with a kind of dazzled pleasure, as if he had been waiting for them with excitement and a touch of anxiety, now thankfully relieved. I have to say, all this seemed absolutely genuine.
There are several hundred other guides out there, all looking for a share of the work. I think, as we talk, that I am starting to understand why good guides are so rare. It's a great deal harder than it looks, and it demands, for every stretch of road, an even longer stretch of study and forethought.

Flesch Reading Ease
All the texts studied (EGE 1-6, FCE 1-6) were scaled on the Flesch Reading Ease formula, which measures text difficulty.
The results show that the B2 Reading test texts (EGE, FCE) fall between grades 4 and 8. The mean for EGE texts is 78.25 (Fairly easy = grade 6). The mean for FCE texts is 71.06 (Fairly easy = grade 6).
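The band labels used above can be looked up mechanically from a score. The sketch below assumes the conventional Flesch band boundaries (90, 80, 70, 60, 50 and 30), which are not listed in this article; the function and constant names are illustrative. Grade equivalents are deliberately left out, because published score-to-grade tables differ between sources.

```python
# Lower bound of each band, highest first.
# Assumption: conventional Flesch band boundaries, not taken from the article.
FLESCH_BANDS = [
    (90.0, "Very easy"),
    (80.0, "Easy"),
    (70.0, "Fairly easy"),
    (60.0, "Standard"),
    (50.0, "Fairly difficult"),
    (30.0, "Difficult"),
]


def flesch_band(score: float) -> str:
    """Return the descriptive band for a Flesch Reading Ease score."""
    for lower_bound, label in FLESCH_BANDS:
        if score >= lower_bound:
            return label
    return "Very difficult"


# The two corpus means reported in this section.
for corpus, mean_score in (("EGE", 78.25), ("FCE", 71.06)):
    print(corpus, mean_score, flesch_band(mean_score))
```

Under these boundaries both reported means fall in the same band, although the EGE mean lies close to its upper edge and the FCE mean close to its lower edge.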

Coh-Metrix
At the next stage of the research the texts were scaled on the Coh-Metrix measures: narrativity, syntactic simplicity, word concreteness, referential cohesion, and deep cohesion.
The routine procedure was to access the Web site (Coh-Metrix Text Easability Assessor) and enter the EGE (1-6) and FCE (1-6) texts with a copy-and-paste function from a text file. After that, Coh-Metrix returned the requested measures as a graph and a description. The Web facility is free and available at the moment. Table 2 shows the results of the comparative analyses conducted and lists all the texts in order of decreasing Flesch Reading Ease score.

Narrativity
The high indices of narrativity of the texts in the corpus studied (from 38 in FCE 2 to 97 in EGE 2) reflect the genres of the texts, i.e. fiction and mass media. They mainly convey stories or sequences of events with animate beings and are rich in verbs. Four EGE texts (1, 2, 3 and 5) are short stories and two (EGE 4 and 6) are extracts from novels; four FCE texts (1, 2, 4 and 6) are newspaper texts and two (FCE 3 and 5) are extracts from novels.

Syntactic Simplicity
The results suggest that Russian item writers produce sentences that are more similar in syntactic construction than those of British writers. A possible explanation is that Russians writing in English or modifying an English text may feel less confident expressing their ideas and prefer to stick to the structures they have heard or seen in native discourse, or even to those used in the text being modified.

Word Concreteness
Multiple choice reading tasks of both exams use texts with high and low indices of concreteness: word concreteness in the texts produced by Russians ranges from 31 to 85, while the spectrum of FCE texts is wider, i.e. between 22 and 87. Higher scores correspond to a larger share of words that are 'meaningful, evoking mental images' as opposed to abstract ones.
Cohesion (both referential and deep) varies widely over the texts studied, with a slightly lower percentile in EGE texts. Table 3 presents the means of the indices studied. The mean values were calculated by adding up all the values of the corresponding index and then dividing the sums by the counts.
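The averaging procedure described for Table 3 can be sketched as follows. The scores in the example are hypothetical placeholders chosen only to show the calculation; they are not the values measured in this study, and the function name is illustrative.

```python
from statistics import fmean


def index_means(scores_by_index: dict[str, list[float]]) -> dict[str, float]:
    """Mean of each index: sum of its values divided by their count."""
    return {name: fmean(values) for name, values in scores_by_index.items()}


# Hypothetical percentile scores for three texts; NOT data from this study.
example_scores = {
    "narrativity": [40.0, 60.0, 80.0],
    "deep cohesion": [30.0, 50.0, 70.0],
}

means = index_means(example_scores)
for name, value in means.items():
    print(f"{name}: {value:.2f}")
```

With only six texts per exam, a mean can be pulled noticeably by a single unusual text, so the per-text values in Table 2 are worth reading alongside the means.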
As we see, the data suggest some differences between the EGE and FCE texts and provide evidence that Russian EGE authors create items with higher indices of narrativity and syntactic simplicity. With regard to the comparison between EGE and FCE texts on deep and referential cohesion, our study provided evidence that Russian item writers use significantly fewer cohesion tools, so that candidates face more problems inferring meanings at the sentence level.

Conclusion
The research showed that the texts used to test reading comprehension vary a great deal across the EGE and FCE assessment systems. EGE texts have higher indices of narrativity and syntactic simplicity and lower indices of deep and referential cohesion. Word concreteness indices of both groups of texts are similar, with the mean amounting to 62. The higher complexity of FCE texts is caused by their higher syntactic complexity and lower narrativity compared with EGE texts. Higher indices of referential cohesion and deep cohesion in FCE texts suggest that a reader is provided with better tools to mentally connect parts of the text and infer meanings.
The Flesch Reading Ease means of both groups of texts, written (or modified) by Russian and English authors, are similar, with the EGE mean being 76.95 and the FCE mean 72.89. Though the corresponding band for both sets of texts is FAIRLY EASY and the grade is the same, i.e. six, EGE texts are slightly easier, mainly due to a lower variety of syntactic structures and a higher degree of narrativity.
As for the problem of selecting texts and modifying them for exam purposes, one implication is that EGE exam developers and item writers may use the tools highlighted in this study to measure the characteristics of the texts used for language assessment purposes.
because he was overwhelmed by her beauty
to show that she was still a little girl for him
...
A20. The phrase "He often treated them to the theatre … when business was brisk" implies that
Hector bought theatre tickets for them
Hector accompanied them to the theatre
Hector's business had something to do with the theatre
Hector was well connected in the theatrical world
A21. After her talk with Hector Suzanne realized that
his business must have been illegal
he was romantically involved with another woman
their relationship might break down
she had been exhausted by her work at the store
FCE Part 1 (2014) or Part 5 (in the new 2015 version) Multiple choice consists of a text, followed by eight (2014) or six (2015) 4-option multiple-choice questions which test the understanding of content and text organization (Uchebnyy

1. What do we learn about Martin in the first paragraph?
A He has two educational roles
B He is a colleague of the writer
C His job is an extension of his hobby
D His job suits his personality
2. The writer decided to meet Martin again to find out how he managed to
A win custom from other tour guides
B entertain large and varied tour groups
C avoid the failings of many other tour guides
D encourage people to go back to him for another tour
3. The writer notes that on meeting the tour group, Martin
A greeted everyone warmly
B seemed as nervous as everyone else
C praised everyone for their prompt arrival
D checked that everyone could understand him
…
8. In the last paragraph, the writer says he is impressed by
A the distances Martin covers on his tours
B the quantity of work available for tour guides
C the amount of preparation involved in Martin's job
D the variety of approaches taken to guiding

Table 1. EGE and FCE multiple choice reading tasks
Cambridge English First (2014), Cambridge English FCE (FCE for Schools, 2010). Reading (multiple choice) comprehension items of the two studied tests are of two general categories: referring and reasoning. Within these two categories there are content categories specifying the skills and knowledge assessed by each item. Referring items pose questions about material explicitly stated in a passage. Reasoning items assess proficiency at making appropriate inferences, understanding the text, and determining the specific meanings of difficult, unfamiliar, or ambiguous words based on the surrounding context.

Table 2. Coh-Metrix analyses of EGE and FCE multiple choice reading tasks

Table 3. Means of Coh-Metrix indices of EGE and FCE multiple choice reading tasks