The American Dream Revisited: A Corpus-Driven Study

As a dominant ideology in the United States, the American Dream rests on the idea that, with hard work and personal determination, anyone, regardless of background, has an equal opportunity to achieve his or her aspirations. Given the importance of the American Dream to American national identity, and its considerable influence in shaping dominant ideologies, this study explores how this deeply held belief and mind-set is represented in media discourse about the American Dream. Combining corpus-driven discourse analysis with a sociocultural linguistic approach to identity and interaction, the article reports on a corpus-driven sociocultural discourse study. Through the analysis of frequent lexical and semantic patterns, the study aims to identify the discursive characteristics of media discourse related to the American Dream and to determine whether the relationship of the American Dream to American national identity and ideologies has changed across time and space.


Introduction
The term American Dream was first used by the American historian James Truslow Adams in his book The Epic of America, published in 1931. Later, Martin Luther King Jr. spoke about a dream of freedom, equality, and justice, which has since come to be widely regarded as the American way of life. As a great source of pride, the American Dream has been the central creed of the American nation since 1931, representing a basic belief in the power and capacity of the individual (Cullen, 2003; Schwarz, 1997). The seemingly egalitarian system of opportunity, in which each individual has an equal chance to prosper regardless of background, resonates throughout contemporary American society. As Johnson (2006, p. 21) notes, the American Dream is shared as the national ideology of meritocracy, a system "contingent upon a societal commitment to fair competition so that no individual or group is advantaged or disadvantaged by the positions or predicaments of their ancestors". In fact, however, many individuals as well as scholars believe that the American Dream is not equally distributed among ethnic groups, which ultimately makes the dream an "inchoate fantasy" that has severe racial antagonisms embedded within it (Hochschild, 1995). In the same vein, Devos et al. (2010) examine the exclusionary definition of American identity, which is more readily granted to members of the dominant ethnic group, while other ethnic groups are, at a minimum, not on an equal footing in their pursuit of the American Dream and their aspirations to acquire the national identity. Their findings show that some individuals are relegated to the margins of American identity because their group does not fit its prototypical definition, a result that contributes to a growing literature on the ramifications and consequences of defining a superordinate identity in a way that excludes some subgroups. identity and ideologies have been lacking.
An important motivation for investigating media representations of the American Dream is that the data already exist in electronic form, although care needs to be taken when assuming that a person who has posted a message actually possesses the identity they claim to have. Corpus-driven quantitative research can help to uncover how the American Dream is understood in modern society.

Framework and Methodology
Discourse and identity are closely connected. Identity is always defined via similarity and difference (e.g., Ricoeur, 1992; Wodak et al., 2009). In the process of identity formation, the news media play a crucial role: they not only mirror some kind of objective reality but also act as powerful social agents in their own right. Through media reports, journalists as social actors can constitute objects of knowledge, situations, and identities between different social groups and readers. Following Wodak et al. (1999, p. 22), identity is "constructed and conveyed in discourse, predominantly in narratives of national culture". The present study adopts Bucholtz & Hall's (2005) sociocultural linguistic perspective on identity. In this view, identity is produced in linguistic interaction according to the following principles: identity is best viewed as the emergent product rather than the pre-existing source of linguistic and other semiotic practice; identity relations emerge in interaction through several related indexical processes, such as the use of linguistic structures and systems that are ideologically associated with specific persons and groups; identities are never autonomous or independent but always acquire social meaning in relation to other available identity positions and other social actors (Bucholtz & Hall, 2005, pp. 585-614).
This study examines American media discourse related to the American Dream, with data collected from January 2012 to December 2016. One important consideration in choosing this time span is that it covers the Obama administration from the President's re-election in 2012 onward. The research questions are: What are the discursive characteristics of media discourse related to the American Dream? Did the media discourse related to the American Dream reflect American identity and dominant ideologies? If not, how has the relationship of the American Dream to American identity and ideologies changed over time? To address these questions, the American Dream Corpus (ADC), comprising 99,832 words in 112 news articles, was retrieved from the newspaper database EBSCOhost. All articles are drawn from the American media because these outlets present American, ideologically construed social and political positions to international readers. Prominent newspapers with higher circulation include The Washington Post, The New York Times, Wall Street Journal, USA Today Tribune, and Christian Science Monitor, with detailed descriptions in Table 1. The criterion for selecting articles is that the American Dream has to be the primary topic and appear in the title. This ensures that only articles in which the American Dream is discussed as the major topic are included, and that texts in which the two words American Dream are mentioned only in passing are excluded. ConcGram 1.0 (Greaves, 2009) and Wmatrix (Rayson, 2001) are used to retrieve two- and three-word concgrams, keywords, key semantic categories, and relevant concordances, on which the analyses below are based.

Two-word Concgrams
Just as keywords can indicate what a corpus is about, that is, the "aboutness" of a text or homogeneous corpus (Scott, 1999), the two-word concgrams in the study corpus offer "a first glimpse of the dominant theme and topic throughout the texts" (Cheng & Lam, 2013, p. 180). The top ten two-word concgrams in Table 2, excluding function/grammatical words, indicate the dominant themes of the American Dream. The most frequent two-word concgram, American/dream, the quotation-related concgram (said/who), and the people-related concgrams (class/middle, more/people and high/school) are prominent. They point to the individualistic value of the American Dream and suggest that the American middle class is probably more concerned about its American Dream.
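As a rough illustration of the idea behind two-word concgrams, the Python sketch below counts pairs of content words that co-occur within a fixed span, regardless of their order or of any intervening words. It is not the implementation used by ConcGram 1.0; the function name `two_word_concgrams`, the `span` parameter, the stopword set, and the toy sentence are all assumptions introduced only for illustration.

```python
from collections import Counter


def two_word_concgrams(tokens, span=5, stopwords=frozenset()):
    """Count unordered co-occurring word pairs within `span` tokens.

    Hypothetical helper: each pair is stored in alphabetical order so that
    "american ... dream" and "dream ... american" count as the same pair.
    """
    counts = Counter()
    for i, origin in enumerate(tokens):
        if origin in stopwords:
            continue  # skip function/grammatical words as the origin
        # Look only to the right so each co-occurrence is counted once.
        for j in range(i + 1, min(i + 1 + span, len(tokens))):
            other = tokens[j]
            if other in stopwords or other == origin:
                continue
            counts[tuple(sorted((origin, other)))] += 1
    return counts


# Toy input (invented text, not taken from the ADC).
text = "the american dream of the middle class is the dream of every american"
tokens = text.split()
function_words = frozenset({"the", "of", "is", "every"})

pairs = two_word_concgrams(tokens, span=4, stopwords=function_words)
for pair, freq in pairs.most_common(5):
    print("/".join(pair), freq)
```

Storing pairs alphabetically mirrors the way the concgrams are reported in the text (for example class/middle), where the listed order does not reflect word order in the articles.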

Keywords and Key Semantic Categories
Keywords are identified by comparing word frequencies in a corpus against a reference corpus that serves as a standard for normal frequencies; they reveal something of the "aboutness" of a particular corpus. For the purpose of this analysis, the ADC is compared against AmE06, a suitable reference corpus because it matches the data reasonably well in terms of variety of English. The top 20 keywords relative to AmE06, listed in Table 3, confirm the dominant theme and topic of the American Dream in the ADC, which further supports the two initial observations from the two-word concgram analysis.
The keywords also show a preoccupation with individualistic issues related to Americans and their "homes", "housing", "mortgage" and "family". Another noticeable keyword is "China", implying that the American Dream is not independent of China. Consistently, the keywords in the key semantic categories (Table 4) show that what Americans relate to or are interested in includes "home", "China" and "mortgage", among others.
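To make the keyword comparison concrete, the sketch below scores words with the log-likelihood statistic, one widely used keyness measure. The text does not state which statistic or cut-off was applied, so the choice of log-likelihood is an assumption here, as are the function names `log_likelihood` and `keywords`. All frequencies are invented toy values, not counts from the ADC or AmE06.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness for one word.

    a: frequency in the study corpus, b: frequency in the reference corpus,
    c: study corpus size, d: reference corpus size (all in tokens).
    """
    total = c + d
    expected_study = c * (a + b) / total
    expected_ref = d * (a + b) / total
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / expected_study)
    if b > 0:
        ll += b * math.log(b / expected_ref)
    return 2 * ll


def keywords(study, reference, top_n=20):
    """Rank words that are relatively more frequent in `study` (hypothetical helper)."""
    c = sum(study.values())
    d = sum(reference.values())
    scored = []
    for word, a in study.items():
        b = reference.get(word, 0)
        if a / c > b / d:  # keep only words overused in the study corpus
            scored.append((word, log_likelihood(a, b, c, d)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Invented toy frequency lists, for illustration only.
study_counts = Counter({"dream": 50, "mortgage": 20, "the": 400, "said": 30})
reference_counts = Counter(
    {"dream": 5, "mortgage": 2, "the": 4000, "said": 300, "of": 2000}
)

for word, score in keywords(study_counts, reference_counts, top_n=3):
    print(f"{word}\t{score:.1f}")
```

A word that occurs at the same relative rate in both corpora scores zero, so high scores single out words that are unusually frequent in the study corpus compared with the reference corpus.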
The above concgrams and its rela

The Am
Knowing