The Effect of Group Work on English Vocabulary Learning

This study investigated the effectiveness of group work (GW) in EFL vocabulary learning by second year, non-English major, university students in Taiwan, in comparison with working individually (IW). The students (N=44) worked in mixed ability groups of 3-4 or in IW to complete vocabulary exercises following reading activities. The classroom intervention followed a repeated measures design with alternating sessions (one week IW, one week GW) for 12 weeks. In order to measure students’ word knowledge gains, the modified vocabulary knowledge scale was used in pre-, post- and delayed-post tests, and the scores from the tests were analyzed with paired t tests. Qualitative information about vocabulary discovery and retention was further obtained from interviews with 24 students conducted after the classroom intervention. Results showed that students’ overall improvement in vocabulary knowledge with group work was significantly higher than that with individual work on immediate post-tests, though both treatments had a beneficial effect. Later retention of word knowledge after GW was only 2% higher than that with IW. Interpretations and implications of these findings are discussed.


Introduction
Vocabulary development closely relates to learners' reading ability (Krashen, 2011; Nation, 2013). First, in order to understand and learn efficiently from unsimplified text, learners need to already possess a vocabulary of at least 5,000 word families, which provides coverage of 98% of authentic text (Laufer, 1992; Hirsh and Nation, 1992). This requisite coverage is for reading at a reasonable level (Laufer and Ravenhorst-Kalovski, 2010; Schmitt, Jiang and Grabe, 2011). Ways of improving students' word knowledge to this level include glosses (Watanabe, 1997; Myong, 2012), dictionary use (Scholfield, 1997; Jee and Hyeok, 2016), and direct vocabulary instruction (Schmitt, 1997; Maki and Thomas, 2015). Secondly, however, reading itself, especially extensive reading (Perfetti, 2007), is seen by some as an important method of learning vocabulary. Studies have, however, found limited vocabulary learning through meaning-focused reading (Horst, Cobb and Meara, 1998; Waring and Takaki, 2003; Rosszell, 2007). By contrast, positive effects on students' vocabulary learning have been found from explicit activities conducted after intensive reading (e.g. matching the target word with a definition or synonym, fill-in-the-blanks exercises with target words, inferring the meaning of new words from the sentence context) (Knight, 1994; Paribakht & Wesche, 1997; Laufer, 2000; Min, 2008).
Such vocabulary activities may be performed by students doing either individual work (IW) or group work (GW). Surprisingly, however, despite the effectiveness widely claimed for group work in language learning (Ohta, 2001; Swain, 2002; Storch, 2005; Jones, 2006; Lasito & Storch, 2013; Dobao, 2014), no studies have investigated the conduct of such vocabulary activities through group work, particularly considering the impact on the discovery and retention/consolidation of lexical information separately. This study therefore attempts to investigate the effect of incorporating collaborative group work into the above-mentioned activities to find out its precise influence on students' vocabulary learning, in particular considering vocabulary discovery and retention separately.

Noticing
Giving attention to at least some aspect of a new word is the first step in the process of vocabulary learning. In order to gain knowledge of a word, learners need to be aware of the word and recognize it as a useful language item (Ellis, 1991; Schmidt, 1990) removed from its sentence context rather than just "as a part of a message" (Nation, 2013, p.103). That is to say, the word needs to be mentally separated from its sentence context and become the focus of the learners' attention as a language item about which some information needs to be 'discovered' (Schmitt, 1997). If one does not notice that one does not know something (e.g. about a word), then one will not take steps to 'discover' the information that one does not know (Schmitt, 1997).
It has been recognized that reading individually is not entirely conducive to noticing new words. Often the learner's focus may be on understanding the meaning of the text, not on noticing and learning new words (Nation, 2013). Learners may pay some passing attention to new words, at a low level of consciousness, but they may skip them (as may be encouraged by the teacher in a reading class, e.g. for skimming), or give them some minimal possible meaning to support the continuation of figuring out the wider text (Grabe, 2009). Studies have therefore found low rates of vocabulary learning through individual meaning-focused reading (Horst, Cobb and Meara, 1998; Waring and Takaki, 2003; Rosszell, 2007).
According to Vygotsky's (1978, 1986) constructivist theory of learning, by contrast, an individual's learning occurs through communication with others in the social group. This would imply that more vocabulary might be noticed and learnt from reading if GW was in some way involved. Vygotsky further states that learning occurs when learners interact collaboratively in the zone of proximal development (ZPD), which is the space between what a learner can do without help and what a learner can do with help from a more competent member of the group (Vygotsky, 1978). Noticing and discovering new lexical information from reading could be seen as an activity within the ZPD, and so capable of being facilitated by a more able individual who prompts noticing and provides assistance to the novice through conversation in the group (Aljaafreh and Lantolf, 1994). Notably, GW allows for more knowledgeable 'others' additional to the teacher to be involved in this process. However, while the general GW literature seems to suggest benefits of GW for learning, the vocabulary literature reveals little specifically about GW benefits for vocabulary noticing (Ohta, 2001; Swain, 2002; Storch, 2005).

Retrieval
The second step in enabling a word to be learned, according to Nation, is retrieval. This primarily assists retention of what has been noticed, or what Schmitt (1997) calls 'consolidation'. When a word is retrieved, the mental connection between word form and meaning is strengthened (Baddeley, 1990; Nation, 2013): the learner subconsciously evaluates and compares the word with other words which they are able to recognize and then chooses the one most suitable for the present context, thus strengthening memory for the lexical information (Beheydt, 1987; Pavlenko, 2009).
Moreover, learning a word fully involves discovering and retaining a number of different types of lexical information (e.g. its precise spelling, sound, and part of speech). It is unlikely, however, that students will notice all of these facets of a word fully after only one exposure. Thus repeated retrieval has a second benefit: further inherent aspects of the word may be discovered and start to be memorized. Indeed, in order to equip learners with enough information about a word to use the word accurately in production, it is suggested that a minimum of twelve exposures is required (Beck, McKeown & Omanson, 1987).
Clearly, therefore, after the initial meeting, opportunities need to arise for learners to repeatedly meet or use a new word, so that retrieval occurs (Schmitt and McCarthy, 1997). While this may in part be determined by the teacher or the teaching materials, performing vocabulary tasks via GW is likely to engender more retrieval than the same tasks done as IW.
Exposing learners to a large number of words repeatedly purely in reading texts has been seen by some as a key way for them to learn vocabulary, since it requires repeated retrieval, but such reading is essentially an IW activity. Studies of the relationship between L2 reading and vocabulary learning have in fact shown that increasing the amount of extensive reading (where words recur, and so need to be repeatedly retrieved) does lead to noticeable vocabulary learning, but that it is slow in comparison with the amount that has to be read (Elley and Mangubhai, 1983; Ferris, 1988; Pitts, White and Krashen, 1989). Furthermore, many EFL learners, including ours who are non-English majors, do not have the time or motivation to read extensively in the FL.
Thus reading supplemented with another activity enhancing vocabulary learning has been considered in several studies (Knight, 1994; Paribakht and Wesche, 1997; Laufer, 2000, 2003). Min (2008) examined the effectiveness of reading plus vocabulary-enhancement activities (RV) versus narrow reading (NR) (repeated reading of thematically-related articles) for vocabulary acquisition and retention among EFL secondary-school students in Taiwan. The results showed that the RV group gained significantly more knowledge of target words than the NR group both on the post-test and on the delayed post-test. In these studies, however, the possible role of IW versus GW in performing the task was not considered.
In GW, learners are expected to correct others' errors and to give explanations to other group members. This should not only help the group member who lacks knowledge to discover more lexical information, but also prompt the explainer to retrieve word knowledge and reconsider the meaning of the target word in relation to other words, which should enhance their learning of further aspects of the word as well. GW should also help all learners retrieve through repeated speaking and listening, not just reading, and attention may be given not only to words targeted in an RV task, but also to non-target words. Once again, while we can deduce from the general literature the putative benefits of GW, the vocabulary literature says little about this in relation to retrieval.

Generative Use
The final step in learning a new word according to Nation is generative use. Generative use refers to meeting or using the previously-met word in a context that is different from the context in which the word was met previously. For example, if a student meets the word 'inherited' used as a verb in the sentence: 'Some scientists believe that a person's personality is mainly inherited', and then meets the word again in another sentence: 'He inherited a fortune from his grandmother', the learner will need to rethink the meaning and use of the word 'inherited' and this will help the student remember the word (Nation, 2013). Generative use then again enables further, more contextual, aspects of lexical information to be discovered about words (e.g. multiple meanings, collocation, associated grammatical structures), while at the same time promoting retention/consolidation. This is consistent with the 'levels of processing' hypothesis in cognitive psychology, which implies that word retention relates to the amount of attention given to, and the variety of types of manipulation applied to, a new word (Craik and Lockhart, 1972; Leow and Mercer, 2015; Baddeley and Hitch, 2017).
Once again, in performing vocabulary tasks through GW, there is expected to be much less predictability about what contexts words will be used and heard in, since each member of the group has their own history of prior exposure to and use of the word, and this will be reflected in their GW interactions. By contrast, in IW the learner is limited to the context provided in the learning materials and his/her own prior experience of the word. While in general it is recognized that the interaction in collaborative group work provides students with opportunities to build and practice their newly-constructed knowledge (Panitz, 1999), to our knowledge this has not been directly investigated in post-reading vocabulary tasks. The vocabulary literature again says very little about GW benefits for vocabulary generation, although the general GW literature seems to suggest such benefits of GW for learning (Ohta, 2001; Swain, 2002; Storch, 2005; Dobao, 2014).

Involvement Load
A somewhat distinct approach from the above is the 'involvement load hypothesis' about vocabulary learning. Laufer and Hulstijn (2001) have suggested that the involvement in processing target words, and hence the benefit for learning, is affected by three task-related features: need, search and evaluation. Need is the requirement for the target word in order to complete the task, such as needing a particular word to fill in the gap in a sentence correctly. "Need does not exist if the target vocabulary is not needed to complete the task. Need is moderate if the task requires the target vocabulary, and it is strong if the learner feels the need for the vocabulary, e.g. a genuine communicative task" (Nation, 2013, p.98). Clearly, when reading a text, the need for a new word that is encountered may vary depending on how far understanding it is crucial to adequately understanding the meaning of the text as a whole. In dedicated vocabulary tasks following reading, however, the need for the target words in the exercises will be much enhanced regardless of whether the task is performed as IW or GW. For the present study, the need for the target words in completing the tasks is moderate because the tasks require it, and need does not come from the learner. The need for words other than the target ones is not promoted by the task, however, and it is possible that such a need may be created where a fellow group member perceives the need to find out about a non-target word in the task which a particular learner working in IW would have overlooked.
Search is the attempt to find the target word information required by the tasks, for example, by using a dictionary to look up the meanings or forms of the target word. "Search does not exist if the word forms or their meanings are supplied as a part of the task. Search is moderate if learners have to search for the meaning of the item, and strong if learners have to search for the form to express a meaning" (ibid). In the tasks we envisage, the requisite information is not all supplied with the task. The search for the forms and meanings of target words is rated low because they are provided by the teacher in work prior to the task. However, for other aspects of target words and for non-target words search is moderate as the learner has to discover the information. Here the search resources will differ between GW and IW since, in the former, fellow group members can provide information additionally to a dictionary. Evaluation involves the comparison of target word information with the context required in the tasks and then deciding if the word selected fits the context appropriately. "Evaluation is moderate if the context is provided and is strong if the learner has to create a context" (ibid). In our study, evaluation is moderate since the context is provided for both parts of the tasks. However, we would argue that richer evaluation, and so more involvement, may occur in GW than IW since members of the group may collectively come up with more reasons for why a word does or does not fit a context than a learner working alone.
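The three ratings described above can be combined into a single task-level figure. The following is a minimal sketch only: the numeric coding (absent = 0, moderate = 1, strong = 2, summed across components) is assumed here for illustration and is not stated in this paper, the `TaskLoad` class is hypothetical, and treating the 'low' search rating for supplied forms and meanings as 'absent' is likewise an assumption.

```python
from dataclasses import dataclass

# Assumed numeric coding for each component (not given in this paper).
LEVELS = {"absent": 0, "moderate": 1, "strong": 2}


@dataclass(frozen=True)
class TaskLoad:
    """Hypothetical container for the need, search and evaluation ratings of a task."""
    need: str
    search: str
    evaluation: str

    def index(self) -> int:
        # Sum the three component ratings into one involvement figure.
        return LEVELS[self.need] + LEVELS[self.search] + LEVELS[self.evaluation]


# Forms and meanings of target words: need moderate, search treated as absent
# (supplied by the teacher beforehand), evaluation moderate.
target_form_meaning = TaskLoad(need="moderate", search="absent", evaluation="moderate")

# Other aspects of target words: the learner has to discover the information,
# so search is moderate.
other_aspects = TaskLoad(need="moderate", search="moderate", evaluation="moderate")

print(target_form_meaning.index(), other_aspects.index())
```

Note that a coding of this kind rates the task, not the mode of working, so it would assign the same figure to IW and GW; the argument above is that GW may enrich search and evaluation beyond what the task rating captures.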
Although some research has been done which supports the involvement load hypothesis (e.g. Laufer and Hulstijn, 2001; Hulstijn and Laufer, 2008; Kim, 2011), as far as we are aware the potential role in it of the mode of working on a task, by IW or GW, has not been considered.

Studies of IW or GW in Relation to Vocabulary
In studies of interaction in general classroom activities it is reported that learners tend to focus particularly on vocabulary and other linguistic forms (Williams, 1999, 2001; Leeser, 2004; Kim, 2008). More specifically, Kim (2008) examined the effectiveness of collaborative and individual tasks on the acquisition of L2 vocabulary by Korean-as-a-second-language (KSL) learners. Thirty-two intermediate-level KSL speakers in a preparatory Korean language program completed a pre-test, a dictogloss task, and two post-tests over a three-week period. The dictogloss was a procedure in which a text was read to a group of learners, and while the text was being read the second and third times, learners were asked to take notes, including noting words and phrases from the text. After listening to the text three times, learners in the individual work group were required to reconstruct the text individually while thinking aloud, whereas collaborative group learners reconstructed the text with a partner. The results showed that learners who worked in the collaborative groups performed significantly better on the vocabulary tests than the learners who worked individually, even though learners in both groups exhibited similar numbers of LREs. LREs are language-related episodes, defined by Swain and Lapkin (1998) as "any part of a dialogue where the students talk about the language they are producing, question their language use, or correct themselves or others" (p.326).
In addition, in GW activity in general, it has been shown that peers can be experts and novices at the same time: more proficient and less knowledgeable learners both may contribute knowledge to each other in order to increase the level of their performance (Donato, 1994; Anton and DiCamilla, 1998; Swain and Lapkin, 1998; Ohta, 2000, 2001). A study by Watanabe (2008) explored the interaction of adult ESL (English as a Second Language) learners with either a higher- or a lower-proficiency peer during pair problem solving, and investigated their perceptions of the interactions with their partners. The study showed that the three participants preferred working with group members who contributed their ideas, regardless of their proficiency level. The analysis of the interactive aspects of the LREs further indicated that the more proficient learners and the less proficient learner became resources for each other by repeating each other's utterances, which helped them to elaborate and to build up each other's understanding (DiCamilla and Anton, 1997). For this reason we will involve learners of unequal proficiency level in our groups for GW.

Purpose of This Study
In order to test general claims made about the benefits of GW in a specific context, this study therefore aimed to investigate the following questions:
1. Will students learn the target words better through group work than through individual work? Will that change over the period of the study?
2. What aspects of word knowledge do the students claim to discover through group work vs individual work?
3. What aspects of word knowledge do the students claim to retain better through group work vs individual work?

Participants
The participants were 44 second year, L1 Chinese, non-English major students from various departments attending the required English reading course to improve their basic reading ability and vocabulary size for further ESP courses.There were 28 male and 16 female students, aged from 18 to 20, with pre-intermediate English proficiency as measured with the national Taiwanese GEPT test.

Overall Procedure
All 44 participants took a pre-test of target word knowledge (see Appendix 3), carried out tasks in IW and GW, and took an immediate post-test and a delayed post-test; 12 also participated in interviews after the classroom intervention. For GW activities, the 44 students were arranged into groups of 3, including members with high, mid- and low prior knowledge of the target words in each group (Donato, 1994; Ohta, 2000). For IW activities in other weeks, the same 44 participants worked individually in class (see Appendix 1).
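One way such mixed groups could be formed from pre-test results is sketched below. This is an illustration under stated assumptions, not the procedure actually used in the study: the function name `make_mixed_groups`, the ranking-and-dealing method, and the scores are all hypothetical. Students are ranked by pre-test score and dealt out in rounds, so that each group receives one member from each successive band of the ranking; any students left over join existing groups, which gives some groups of four.

```python
def make_mixed_groups(pretest_scores, size=3):
    """Return groups that each contain high, mid and low pre-test scorers."""
    # Rank student IDs from highest to lowest pre-test score.
    ranked = sorted(pretest_scores, key=pretest_scores.get, reverse=True)
    n_groups = len(ranked) // size
    if n_groups == 0:
        raise ValueError("need at least `size` students")
    groups = [[] for _ in range(n_groups)]
    # Deal the ranked list out in rounds: each full round hands every group
    # one student from the same band of the ranking. Leftover students at the
    # bottom of the ranking are added to the first groups.
    for position, student in enumerate(ranked):
        groups[position % n_groups].append(student)
    return groups


# Hypothetical pre-test percentages for 44 students (illustrative only).
scores = {f"S{i:02d}": float(i) for i in range(1, 45)}
groups = make_mixed_groups(scores)
sizes = sorted(len(g) for g in groups)
print(len(groups), sizes)
```

With 44 students this dealing yields twelve groups of three and two groups of four.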
In each weekly lesson, first a reading text was introduced from the textbook Cover to Cover 1 (Day and Yamanaka, 2007, Oxford University Press). After the schema activating activities, the teacher explained the general meaning of the reading text, vocabulary, and the intended reading comprehension strategy to the students. Students then read the text and completed the reading comprehension exercise individually. This was followed by two vocabulary exercises provided in the textbook, which were completed either through IW or GW, and lasted for 15-20 minutes. Twelve texts were used over the 12 weeks of the intervention: seven pairs of vocabulary exercises were completed with IW, five with GW in alternating weeks (see Appendix 2).

Training Sessions
While IW was very familiar to students, and so not deemed to require training, this could not be assumed for the collaborative GW of our study. Students were introduced to the idea of 'collaborative' GW as GW where students are grouped intentionally rather than self-selected, so as to have varied levels of prior knowledge, but with each member individually responsible for interacting and contributing to the tasks performed, with no specific organization or procedure imposed on the group from outside. During the GW training session in week 3, handouts on disputational talk, cumulative talk, and exploratory talk were given and explained to students (Mercer, 1995, 1996; Wegerif and Mercer, 2000). Students observed and understood how to co-construct knowledge in groups through exploratory talk, while comparing this with disputational talk, and with cumulative talk. Later the students were given a vocabulary task and asked to discuss the target words by using exploratory talk. When students were practising, the teacher went round the classroom to check on them and to encourage them to use exploratory talk during discussions.

Test Instrument
A modified version of the Vocabulary Knowledge Scale (VKS) test (Paribakht & Wesche, 1993) was used for all tests of the targeted items (Appendix 4). In this kind of test, the testee answers a series of graded questions (categories) about each vocabulary item tested, revealing how many different kinds of information about the word they know. The modification took account of the fact that, besides word form, meaning, and synonym (tested by the existing VKS), the vocabulary tasks also practised antonyms, collocations, inflections, and parts of speech. In the modified VKS self-report categories, students were therefore given the additional option (category IV) of providing the antonym of the target word. It was also stipulated that when using the target words to make a sentence (category V), they should use sentences which they had not seen before and not read in the exercise, because the researcher wanted to preclude participants using the strategy of memorizing one of the example sentences from the tasks in order to answer this question.
The two pre-tests between them covered all the words to be targeted in the exercises. Immediate post-tests were given directly after each vocabulary task was finished (10-15 minutes) to check each student's short-term retention of vocabulary information for each of the six words targeted. The delayed post-test, two weeks after finishing the intervention, was to check each student's change in knowledge retention between pre- and post-test and covered 40 vocabulary items studied earlier in either IW or GW modes.
The derivation of scores from student test responses was again a slightly modified version of the original VKS system. Responses to the set of questions about each word are evaluated and mapped onto scores as shown in Appendix 4, yielding a score of 1-5 for each word tested. For example, if the testee chooses category II 'I have seen this word before, but I don't know what it means' they are awarded a score of two for at least being familiar with the written form of the word. However, they may also receive a score of two if they stated they had knowledge of the word in a sentence:_______', but showed that they did not know its meaning. A score of 4 was given if the word was used with the appropriate meaning and correct grammar, even if another part of the sentence contained errors (e.g. In many society, sleep deprivation is becoming part of the culture).

Interviews
Semi-structured interviews with 12 participants were conducted in weeks 14-17 (out of class time). The participants were interviewed in pairs for 30-80 minutes, and the information gathered was recorded on a digital voice recorder. They were asked about their experiences and opinions concerning vocabulary discovery and retention using GW (N=24), and IW (N=24) (see Appendix 5).

Data Analysis
The test scores were converted to percent, for ease of comprehension, and analyzed with SPSS (Version 16) to compare any gains in word knowledge which the students had made through IW and GW. Interview data were fully transcribed in the original Mandarin Chinese and then translated into English. Each audio file was checked several times to ensure there was no possibility of missing any data. The analysis procedure for coding, categorization, description and interpretation suggested by Bogdan and Biklen (2007) and Patton (2002) was used to analyze the student interview data. In order to ensure the reliability of the categorization and coding (Mackey and Gass, 2011), the researcher involved three of her colleagues as second judges. Codes on which the second judges differed from the researcher were discussed and changed if needed.
As seen in Tables 1 and 2, with both IW and GW the immediate post-test mean scores were always higher than the pre-test scores, as would be expected given the amount of instruction and learning in between in both treatments. The vocabulary scores in the pre-tests for IW ranged from 12.50% to 36.82%, while the vocabulary scores in the immediate post-tests ranged from 40.81% to 68.63%. The vocabulary scores in the pre-tests for GW ranged from 14.05% to 21.41%, while the vocabulary scores ranged from 48.96% to 71.48% in the immediate post-tests.

Changes Between Pre-Test and Immediate Post-Test
In both treatments, prior knowledge of the words as reflected in the pre-tests varied somewhat from week to week, as might be expected, and this variation is reflected to some extent in the post-test scores for IW but not for GW.
The GW post-test scores rose steadily from session to session, reaching 71.5% in the final session. The rise in post-test scores did show signs of leveling off, however, suggesting that scores would not reach 100% for a long time, if at all. This could be a sign that the richer amount of input in GW improves learners' knowledge up to a certain level regardless of their initial level of knowledge, while IW simply adds a certain amount of knowledge to what was already known.
Post-test scores rose over the period of the study for both IW and GW, though the pattern is more consistently progressive over time for the latter, where we see a curve possibly tending to eventually level off at a post-test score of around 75%. More importantly, improvement scores (post-test minus pre-test) rose for both treatments over sessions. This shows that both treatments, not just GW, had a progressive effect of increasing the pre-post improvement of knowledge over time, despite IW being the familiar mode of working and GW the innovation. This will be discussed further below.
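The arithmetic behind a percentage score and an improvement score can be sketched as follows. The paper does not state how raw scores were converted to percent, so the conversion used here (total awarded score divided by the maximum of 5 per word) is an assumption, and the word-level scores are invented for illustration; only the definition of improvement as post-test minus pre-test comes from the text.

```python
MAX_SCORE_PER_WORD = 5  # each tested word is scored 1-5 on the modified VKS


def to_percent(word_scores):
    """Convert one student's word-level scores to a percentage (assumed method)."""
    if not word_scores:
        raise ValueError("no scores given")
    return 100.0 * sum(word_scores) / (MAX_SCORE_PER_WORD * len(word_scores))


def improvement(pre_scores, post_scores):
    """Improvement score: post-test percentage minus pre-test percentage."""
    return to_percent(post_scores) - to_percent(pre_scores)


# Hypothetical scores for the six target words of one session.
pre = [2, 1, 1, 2, 1, 1]
post = [4, 3, 3, 4, 3, 4]
print(round(to_percent(pre), 2), round(to_percent(post), 2), round(improvement(pre, post), 2))
```

Because every word scores at least 1 on this scale, a percentage computed this way cannot fall below 20%; a conversion that rescales 1-5 onto 0-100 would give different figures.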
Crucially for our study, however, the trajectory of change in scores over time shows greater benefit from GW than from IW. The immediate post-test scores end up in the 70s for the former but not for IW. The significance of the difference was confirmed by a paired sample t test comparing overall improvement scores in GW with those in IW (t = 7.279, p < .001). The result was similar to that of Kim's dictogloss study (2008), described earlier, which showed that learners who worked in the collaborative groups performed significantly better on the vocabulary tests than the learners who worked individually.
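For readers unfamiliar with the test, a paired-samples t statistic is computed from the per-student differences between the two conditions: the mean difference divided by its standard error, with n - 1 degrees of freedom. The sketch below uses only the Python standard library and invented data; the function name `paired_t` is hypothetical, and the study's own analysis was run in SPSS, so this is an illustration of the calculation rather than a reproduction of it.

```python
import math
import statistics


def paired_t(gw_scores, iw_scores):
    """Return (t, degrees of freedom) for paired samples from the same students."""
    if len(gw_scores) != len(iw_scores) or len(gw_scores) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    # Per-student difference between the two conditions.
    diffs = [g - i for g, i in zip(gw_scores, iw_scores)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical improvement scores (percentage points) for five students.
gw_improvement = [50.0, 48.0, 55.0, 60.0, 52.0]
iw_improvement = [40.0, 45.0, 42.0, 50.0, 47.0]
t_value, df = paired_t(gw_improvement, iw_improvement)
print(round(t_value, 3), df)
```

The p value would then be read from the t distribution with the returned degrees of freedom, which a statistics package does automatically.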

Delaye
The mean more voca mean diffe GW, albei noticeable average im forgetting/ Attrition w be because students as after the d not include before if th post-test.

Aspect
While the retained, th how the in As seen in Tables 3 and 4, students reported that they discovered more types of word knowledge when working in GW than in IW. Both in IW and in GW, students reported that they found out synonyms, antonyms and meanings of target words, but since the tasks were explicitly about meanings, synonyms and antonyms of target words, this is not surprising. No other lexical information about words was required by the tasks, although some was by the tests (e.g. grammatical properties of words, collocation). Aspects not strictly required by either were affixes, compounds, phrases, and pronunciation. Spelling, though required to be written in both, was supplied in both. It is therefore interesting that quite a lot of these other types of information were reported as discovered.
Knowledge of prefixes and suffixes, which students discovered in both groups, was not required in the tasks except insofar as it might relate indirectly to antonyms, where the affix showed oppositeness (e.g. un-happy, dis-connect, care-less). Such knowledge is known to help with remembering words and meanings (the morphological decomposition strategy), even though the students did not say that. The precise spelling of target words (only mentioned as being discovered in IW but inevitably discovered in fact in both modes) was also knowledge needed in relation to the tests (e.g. making sentences with target words or writing down the synonym or antonym of the target words in the tests). The other types of word knowledge reported as discovered in GW included lexical grammatical features and phrases which were not specifically required in the tasks but were relevant to the tests (Category V, making sentences with target words).
While there were not great differences in the kinds of lexical information involved, clearly the sources, and the variety and quality, of information differed between IW and GW. In IW, the synonym, antonym and additional meaning of target words which students discovered were generally from the dictionary, since students depended on their own efforts to complete the tasks and the dictionary was the main source for word information (10A-S, 11A-S, 11B-S). In group work, word knowledge was obtained from the contribution of other group members (8B-S) or from the dictionary (8A-S, 8B-S, 9A-S). 8B-S showed more active engagement with the information when he said: 'The antonym of break down is operating according to my dictionary. Group members found out that working can also be the antonym of break down. The word, working is even easier than operating. I never thought that working can be the antonym of break down, but the group members do. So I use the word working as the antonym of break down.' 7A-S showed more sources for word information when he said: 'During group work, the group members use different types of dictionary; for example, one groupmate uses cell phone as dictionary, another groupmate uses paper dictionary, the other uses electronic dictionary. The information about the synonym or antonym of words the groupmates get from their dictionaries could be different. For example, the synonym of interrupt from groupmate A is disturb, stop from groupmate B and disconnected from groupmate C. So I learn disturb, stop, and disconnected are all the synonym of interrupt from the group members.' 9A-S importantly mentioned greater processing of word information through group discussion. His basic dictionary only provided meaning and pronunciation of words, and the meaning provided in the dictionary sometimes could not be applied to the sentence in the task: an example was the task of circling the word that did not belong in 'unload (par.5) unpack take down pack'. He then discussed the possible meanings of the word he got from the dictionary with group members and decided on a meaning that properly fitted the task.
In IW, students discovered the use of prefixes and suffixes from the teacher and the dictionary (10A-S, 10B-S). In group work, students again discovered prefixes and suffixes both from group discussion and from the dictionary (4B-S), but used as a shared resource. 8A-S discovered from the dictionary that the synonym of casually could be the word relaxed, formed simply by adding -ed to relax, before telling group members this. Knowledge of spelling of words (different from any homonyms), lexical grammatical features and phrases were targeted in the tests: students were required to make sentences with target words and to write down words correctly. It became apparent, however, that students were using knowledge of the test to guide what they did in the tasks, not just what the task required. Some lexical-grammatical features and the use of phrases were reported only in GW. 9A-S illuminated something of the thinking that could be promoted in GW when he reported that he discovered a lexical grammatical feature, using v-ing after 'confident of', when explaining the meaning of the word 'confident' to group members. In order to answer a group member's question, he re-read the question and checked the dictionary again; finally he made a sentence with the target word by using the lexical information he found. By contrast, 8B-S showed that thinking in individual work was less rich than in group work. He said: 'I will not know so much about the knowledge of words when I work on the task alone. What I do is to memorize as much vocabulary as I can without knowing the lexical grammatical use of the words clearly'.
Overall, students seemed to discover richer lexical information in GW than in IW. This seemed to be because, in GW, students had more stimulation to notice new aspects of words, and more opportunities to retrieve words and add to word knowledge from other group members during group discussions; they also had the chance to learn more pieces of word knowledge by giving help (e.g. explaining things) to their fellow group members. As can be seen from Table 5, the aspects of word knowledge students in both groups said they remembered, not surprisingly, spanned the same range of types as we saw above for discovery. Hence we will not go over each separately again here.

The Aspects of Word Knowledge Reported as Retained in IW vs GW
The factors which students reported as helping them memorize synonyms, antonyms and meanings of target words in IW were repeatedly going over target words (10A-S), having no interruption from group discussion, having to depend on oneself, and asking the teacher for help (10B-S). An example of teacher impact on retention in IW was remembering the grammatical use of words (10A-S, 10B-S) because of help from teacher correction rather than because they used a dictionary. The teacher's corrections seemed to make certain word knowledge more memorable. 10A-S, for example, said: 'I make a sentence with the new word, frustrate: I am frustrating. Then, the teacher corrects the mistake in the sentence; the sentence becomes I am frustrated. In this situation, I remember easily about the grammatical use of the word frustrate'.
In GW, the factors that helped students to memorize many aspects of target words were claimed to be learning this information from group discussion (8A-S) and contributing word knowledge to group members (7B-S). One way in which this occurred was given by 8A-S, who reported memorizing a synonym of the target word because group members mentioned a synonym that was easy to remember (e.g. the synonym of prevent could be stop) during group discussion. Again, 8B-S reported that he understood and remembered the phrases discussed in group discussions, since group members used better examples to help him understand: 'I can remember better about the phrases or words that are discussed in the group discussion. For example, the phrases pay off in the sentence: Sandy took out many loans for medical school, so it will take them a long time to pay off his debts. Besides, group members often tell me better examples to help me understand about it'. The implication here is that he sees better understanding as leading to better memory.
Furthermore, the very act of receiving information in discussion was claimed to help. As 8A-S said: 'A group member finds out that remote can also be remote control. After discussion, all of group members understand that remote can be distant or remote control. In order to answer this question in the task, the former is better. So I understand and remember a new meaning of the word now'.
Not only receiving but also giving information in GW seemed to help. 9A-S pointed out: 'I specially remember about certain words because groupmates asked me about the meaning or lexical grammatical use of those words. So I explain the meaning or grammatical use of words to group members which help me to have deep impression about them'.
Overall, once again the processes of GW were reported as providing richer support for retention than IW, both in terms of multiple retrievals and generative use, in Nation's terms, and through their interactive nature.
Nevertheless, as we have seen, this benefit was only weakly reflected in the delayed post-test score evidence (t=1.85, p=.027). (We again used paired t tests to compare post-test scores; note that these did not compare different groups but the same people on two sets of vocabulary, that learnt in GW and that learnt in IW.)
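To make the within-subject logic of this comparison concrete, the following is a minimal sketch of how a paired t statistic can be computed from per-student score pairs. It is an illustration only: the scores are invented placeholders, not data from this study, and the function name `paired_t` is hypothetical.

```python
import math
import statistics


def paired_t(scores_a, scores_b):
    """Return (t, df) for a paired t test on two equal-length score lists."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    if len(scores_a) < 2:
        raise ValueError("at least two pairs are needed")
    # Each student contributes one difference: score under A minus score under B.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical delayed post-test scores for the same five students
# (placeholders for illustration, NOT data from the study).
gw_scores = [14, 12, 15, 11, 13]  # words learnt in GW sessions
iw_scores = [13, 12, 13, 10, 13]  # words learnt in IW sessions

t_stat, df = paired_t(gw_scores, iw_scores)
print(f"t({df}) = {t_stat:.2f}")
```

Because the statistic is built from each student's own difference score, stable differences in ability between students cancel out, which is why the test suits a repeated measures design of this kind.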

Vocabulary Learning Through GW and IW
The test results showed that students in the very short term attained higher scores through GW than through IW. GW fulfilled the expectations from the general literature (e.g. Donato, 1994; Anton and DiCamilla, 1988; Lasito & Storch, 2013; Dobao, 2014), allowing more knowledgeable others, other than the teacher, to be involved. This increased the opportunities for the desirable processes identified by Nation (2013) to occur, such as lexical information being noticed, retrieved and generatively used. It enhanced the learning of different aspects of word knowledge through error correction, explanation, suggestion, and sharing of resources, as compared with learning in IW. Moreover, it enhanced task involvement, since doing any task with others in GW required more attention and richer searching than just talking to oneself in IW (Laufer and Hulstijn, 2001). Furthermore, in GW, students not only had more opportunities to receive lexical information from group members but also to benefit through giving help (e.g. explanation) to group members during group discussion.
An unexpected finding was the increase in effectiveness of IW over the period. One expects a new treatment (GW) to show an increasing pre-post effect over time as learners get used to it, but IW was familiar and would be expected to have a pre-post effect unchanging over the period of the study. However, its result in some way presents it as another 'new' treatment. This may be due to the fact that, unlike normal practice in such classes, there were regular vocabulary post-tests at the end of each class, administered for the researcher to gather data but actually also having an unanticipated pedagogical effect on both IW and GW learning (as claimed by Karpicke and Roediger, 2007). This is supported by the fact that the pre-post gain in the very first learning session of the study, which involved IW, was exceptionally low at only 11%, but increased considerably over ensuing sessions. This may be because that was the very first session and students did not yet really anticipate the immediate post-test, even though they had been told there would be one.
The major fall-off for both GW and IW at the delayed post-test was not entirely unexpected, as it is common in vocabulary learning studies (Jones, 2006; Kim, 2008). In our case this might have been enhanced because the vocabulary tested in the delayed post-test was not to be included in the final course examination (taken by the students as part of course requirements, but not part of the research). Nevertheless, this result is disappointing, particularly as in interview the GW students mentioned a range of benefits of GW for retention, yet remembered only a little more lexical information than IW students in the delayed post-test. Of course, our tests only included the targeted vocabulary items from the textbook exercises. It is possible that if non-targeted vocabulary had been tested, GW would have shown a more marked long-term benefit than IW, given the nature of the reported GW interactions. Nevertheless, this prompts the need for consideration of how to combat the rapidity of attrition of learnt vocabulary knowledge. Perhaps the message is that even the most effective method used at the initial vocabulary learning stage cannot overcome the widely recognized additional need for long-term recycling of vocabulary at regular intervals after the session where new information arises (Karpicke and Roediger, 2007; Nation, 2013).

Future Research
While the value of explicit vocabulary tasks following classroom reading is no longer really in question, there remain many facets of how such tasks are best implemented still to be investigated. We used collaborative GW to investigate vocabulary learning and retention in the present study. It would be interesting to test the effectiveness of other types of GW such as cooperative group work. Moreover, the independent part played by testing in combination with IW or GW needs to be further investigated, including the kinds of informal tests which are commonly given by teachers for pedagogical vocabulary recycling purposes. Finally, longer periods of delayed retention should be measured, with and without intervening revision/recycling in some form a week or a month later before a much delayed post-test, so as to simulate real learning conditions more closely.

Table 1.
Vocabulary scores from pre- and immediate post-tests for IW

Table 2.
Mean vocabulary scores from pre- and immediate post-tests for GW

Table 5.
Aspects of word knowledge retained in IW vs GW (N=24)