An Exploration of Vocabulary Knowledge in English Short Talks: A Corpus-Driven Approach

Yu-Chia Wang


Adopting a corpus-driven approach, this study explored the vocabulary knowledge in English short talks, including the word patterns, features, and usages that language users are most likely to encounter in real contexts. A specialized corpus, TED, was compiled from English talks shorter than 20 minutes collected from the TED Talks website. In addition, the existing BASE (British Academic Spoken English) corpus was included as a sample of talks longer than 20 minutes. Using three corpus tools, AntConc (Anthony, 2003), RANGE (Nation & Heatley, 2002), and KfNgram (Fletcher, 2007), the researcher compiled frequency-ordered word lists, concordance lines, vocabulary coverage figures, and lists of lexical bundles. The results showed that although the most frequently used words in the TED and BASE corpora were similar grammatical items, their rank order differed considerably. Moreover, chi-square tests showed significant differences in the use of the four pronouns I, you, we, and they between the two corpora and across different parts of the TED corpus. Finally, the concordance lines and lexical bundles revealed the “typical” and “frequent” word usages in the beginning, middle, and ending parts of English short talks. It is suggested that teachers build their own corpora to meet specific teaching purposes or learners’ needs, and turn the corpus results into classroom materials for teaching English short talks.


International Journal of English Linguistics   ISSN 1923-869X (Print)   ISSN 1923-8703 (Online)

Copyright © Canadian Center of Science and Education
