A Corpus-Based Study of Hillary Clinton’s and Donald Trump’s Linguistic Styles

Since the 2016 U.S. presidential election, research on Hillary Clinton’s and Donald Trump’s linguistic styles has grown rapidly, with a lopsided focus on Trump. This study compared Clinton’s and Trump’s campaign speeches during the general election using a corpus-based approach. Discourse analysis of the corpora was conducted using the textual analysis software AntConc 3.2.4. The results showed that Clinton used a more diverse vocabulary than Trump, and that both candidates stuck to their core campaign messages in their speeches. Three major differences between Clinton’s and Trump’s linguistic styles were identified: 1) Clinton was inclined towards rational discussions of public policy, while Trump was adept at appealing to voters’ emotions; 2) Clinton was more positive and focused on her vision of the future, while Trump was more negative and fixated on depicting a dystopian reality; 3) Clinton aimed to find commonalities with the American people, while Trump aimed to highlight differences between himself and his opponents. By putting Clinton’s rhetoric on a par with Trump’s, this study highlighted their linguistic style differences as part of their broader campaign strategies, which could contribute to current understanding of the two candidates’ rhetorical preferences, political beliefs and strategies in their 2016 campaigns.


Introduction
In 2016, the world witnessed a historic election unfold in the United States, an election that would carry either the first female president into the White House, or a president who had no prior experience in public office and who constantly broke long-standing political norms. The ultimate victory of Donald Trump over Hillary Clinton defied all popular expectations and sent shock waves around the world.
Widely considered the most stunning election result in modern U.S. history, Trump's victory sent people in all sectors of society scrambling for explanations. Among the myriad explanations was a focus on Trump's rhetoric in relation to his victory, as Trump not only behaved differently, but also spoke differently from previous political candidates. Thus, insights into his rhetoric might shed light on his triumph. For example, Lamont, Park and Ayala-Hurtado (2017) contended from a sociological perspective that Trump's speeches addressing the concerns of the white working class may have helped him secure the white vote. Montgomery (2016) also argued that Trump's populist appeals in his campaign speeches, while alienating some voters, may have won him support in other portions of the electorate. These studies provide some initial evidence that Trump's rhetoric, among other things, may have played a significant role in the 2016 election.
In an effort to better understand Trump's language and even his personality, many discourse studies have been conducted to extract the characteristics of Trump's rhetoric. A general agreement among these studies is that Trump's language was characterized by the use of simple vocabulary, basic sentence structures, an informal style of communication, and negative portrayals of people and events (Ahmadian, Azarshahi, & Paulhus, 2017; Kayam, 2018; Liu & Lei, 2018; Savoy, 2018; Wei, Yang, Chen, & Hu, 2018). Another prominent feature of his language is an appeal to people's emotions and beliefs, as found in Liu et al. (2018), Wang and Liu (2018), etc. Relevant research on political styles has also indicated that Trump showed a full populist style in his speeches, reaffirming his tendency to appeal to political sentiments (e.g., Schoor, 2018).
Obviously, most of these recent studies take Trump as the principal subject of inquiry, presupposing that his rhetoric represents a departure from existing norms and is thus worthy of special investigation. This inevitably pushes studies on his adversaries to the periphery of academic attention, and thus fails to highlight the specific differences between them. Besides, although Clinton was Trump's only and fiercest competitor during the 2016 general election, research on her linguistic styles following the election remains insufficient, with her rhetoric often taking a backseat to Trump's. To address these problems, this study puts Clinton's speeches on a par with Trump's using two publicly available corpora for analysis. By means of the textual analysis software AntConc 3.2.4, this study aims to identify Clinton's and Trump's linguistic features in their speeches, accentuate their campaign themes, and explore their underlying political beliefs.
Therefore, two research questions are formulated as follows: 1) What are the linguistic features of Clinton's and Trump's speeches, and what themes do they reflect in each candidate's campaign?
2) What are the differences between Clinton's and Trump's linguistic styles, and what political beliefs do they represent?

Comparison Between Clinton's and Trump's Campaign Rhetoric
Since the 2016 election, there has been a rapid growth of research on Clinton's and Trump's rhetoric during the campaign, with the focus mostly on Trump as he represents a deviation from existing norms. Many of the studies hitherto conducted have approached this topic from a psychological or sociological perspective. Linguistic studies, while taking language itself as the subject of inquiry, have often served as a basis for discussions of the candidates' personalities and political orientations due to the social nature of language.
Among these studies, many are based on the candidates' campaign speeches or presidential debates. For example, Schoor (2017) systematically studied the speeches of several politicians in the 2016 election, and found that Clinton showed an elitist tendency and obvious inclusiveness, while Trump showed a populist style. In agreement with this view, Nai and Maier (2018) added that Trump's style was not only populist, but also negative and based on fear appeals. Their results also rated Clinton as high in negativity, although she used a less populist rhetoric and made only average use of emotional appeals. In Aswad's (2019) study, Donald Trump was significantly more likely to use hyperbolic language as a fulmination against the status quo and emphasized a shared social identity and the pursuit of common goals. Hillary Clinton, on the other hand, employed an egalitarian rhetoric, though her ability to exploit the relevant rhetorical constructs was restrained by gender stereotypes. Some other studies, while also using speeches or debate transcripts as materials, took linguistic analysis as their focal point. Liu et al. (2018) asserted that Clinton used more descriptive vocabulary and cognitive vocabulary during the campaign, while Trump's vocabulary was predominantly negative. Building on previous research showing that political candidates' choice of different words in their speeches (verbs vs. adjectives; concrete vs. abstract message orientations) can lead to different persuasive effects on voters, Chou and Yeh (2017) demonstrated that Trump used more verbs than adjectives, while Clinton used more adjectives than verbs. Savoy's (2018) study also found a higher representation of exclusive terms in Trump's speeches compared to Clinton's. These studies provide unique insights into the candidates' rhetorical characteristics and serve as a complement to those studies of a more psychological or sociological nature.
Twitter, a powerful social platform in the 2016 election, has also attracted some academic attention as a medium of communication. For example, by examining their Twitter messages, Yaqub, Chun, Atluri and Vaidya (2017) observed that Trump conveyed a more optimistic and positive campaign message than Clinton, though this finding seems to contradict both popular perceptions and a general academic consensus that Trump was more negative. In an effort to better understand the agenda setting of the Clinton and Trump campaigns, Lee and Xu's (2018) study showed that Clinton adopted more visual elements such as pictures and videos, while Trump used more text. Besides, the use of visual elements proved to be more effective for Clinton than for Trump in eliciting voter reactions. A similar argument is made by Lee and Lim (2016), who identified Clinton as an active user of multimedia such as graphics, videos, and photos, or links to other webpages, which accounted for 58.3% of her Twitter messages in the aggregate. They also found that Trump paid greater attention to masculine issues while Clinton gave more weight to feminine issues.

Clinton's and Trump's Individual Linguistic Styles
Compared with the above studies, which are primarily concerned with highlighting differences between Clinton's and Trump's rhetoric on the campaign trail, other studies are less contrastive in nature and focus more on the linguistic features of each individual candidate.
Understandably, a large number of studies in recent years have dealt with Trump's unique linguistic characteristics. For example, assuming that Trump's outstanding performance in the Republican primaries may be attributed more to his communication style than to his campaign platform, Ahmadian et al. (2017) went on to demonstrate that Trump's speeches were characterized by grandiosity, the frequent use of first-person pronouns, great variation in tone, and informal communication, which resonated better with the electorate. Kayam's (2018) study, which measured the readability and simplicity of Trump's language in media interviews and debates, also found that Trump scored low in both metrics, as marked by his use of short and simple sentences. Similar results were also obtained in Wang et al. (2018), who undertook a study of Trump's language across genres. Their findings showed that in debates, Trump used less diverse vocabulary and simpler sentences, but in campaign speeches, he occasionally adopted a richer vocabulary and well-edited sentences. Coutanche and Paulus (2018) studied the evolution of Trump's linguistic features based on his media interviews from 2011 to 2017, with the results showing an increase in his use of filler words over time. The underlying assumption among these studies is that Trump's rhetoric is unconventional, simple and pompous, an assumption that seems appealing as it confirms our instinctive knowledge and anecdotal evidence. However, Jordan, Sterling, Pennebaker and Boyd (2019) made a compelling case against this assumption by studying political discourse in the United States and worldwide during the past 200 years. Their findings suggest that, contrary to popular perceptions of Trump's rhetoric as a deviation from long-standing norms, Trump's language actually reflected long-term trends in world politics, that is, a decline in analytic thinking and an increase in confidence.
Given Trump's late entry into the political arena and his break with political norms, it is not surprising that researchers have been particularly interested in his rhetoric in recent years, as evidenced by the aforementioned literature. In contrast, research on Clinton's rhetoric has a much longer history (from the early 1990s to the present day) and reflects a wider spectrum of perspectives. These studies, synchronic or diachronic, have generally proceeded on a timescale documenting Clinton's transformation from a controversial political spouse to an independent power player and decision maker.
A prominent feature of Clinton's rhetoric is her use of a masculine, instead of a feminine, style in presenting herself publicly. Campbell (1998) once argued that Clinton's low favorability among the public was largely attributable to her inability to feminize her rhetorical style, which was detrimental to her public persona. Using three linguistic features (confrontation, aggressiveness, and authority) for analysis, Manning (2006) countered Campbell's argument, stating that Clinton's masculine rhetorical style, instead of being a hindrance, actually challenged people to view her not as a woman, but as a professional politician, thus helping to redefine social perceptions of women in public life. Jones (2016) tracked the changes in Clinton's linguistic styles from 1992 to 2013, and found that, with her growing involvement in politics, Clinton's style became more masculine over time. This masculine style was especially apparent during the first year of Clinton's 2008 campaign as she waged a formidable race against then-candidate Barack Obama (Bligh, Merolla, Schroedel, & Gonzalez, 2010). Lockhart and Mollick's (2015) book on Clinton's rhetorical changes over her long public career also provides ample evidence for the presence of this masculine style. Occasionally, though, Clinton was able to adopt a more feminine style to soften her image, which may be more of a strategic choice than a natural expression of womanhood (Rhode & Dejmanee, 2016).
Clinton's inability to resonate with voters via emotionality is also conspicuous and well documented in the literature. Anderson (2002) noted that even at the beginning of her own Senate career, Clinton's rhetorical style was largely prosaic, uninspiring and devoid of emotions. In explaining such a style, Anderson hypothesized that this could be due to Clinton's professional training as a lawyer, which required rational thinking instead of spontaneous overflows of emotions, and the dilemma a woman faces in public life (appearing too tough or not tough enough). Drawing on the traditional framework of appeals based on ethos, pathos and logos, Bennister (2016) reached a similar conclusion, contending that Clinton played down both pathos and ethos and allowed logos to dominate. Other studies have shown that Clinton tends to utilize long and complex sentences as responses to voters' questions, mobilize analytical categories to reflect her political and ideological positions, and adopt personal pronouns and certain modality features to achieve strategic ends (e.g., Abdel-Moety, 2015; Chen & Hu, 2018; Hu & Wei, 2018). Overall, these studies suggest that despite the status of Hillary Clinton as a controversial political figure, her rhetorical styles have been less contested, with scholars agreeing on certain prominent aspects of her language use.

Materials
The materials in this study are two online corpora of Hillary Clinton's and Donald Trump's speeches selected
Table 1 shows the lexical information of the Clinton and Trump corpora. The Clinton speech corpus has a total of 135 714 words and 7 047 types. The type/token ratio is 5.19%, and the standardized type/token ratio is 39.91%. In contrast, the Trump corpus contains 481 919 words and 10 343 types. The type/token ratio is 2.15%, and the standardized type/token ratio is 36.17%.
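The raw type/token ratios reported above follow directly from the token and type counts. The short check below is only an illustrative sketch: the counts are those given in the text, while the variable and function names are assumptions introduced here.

```python
# Token and type counts as reported in the text for the two corpora.
corpora = {
    "Clinton": {"tokens": 135_714, "types": 7_047},
    "Trump": {"tokens": 481_919, "types": 10_343},
}


def type_token_ratio(types: int, tokens: int) -> float:
    """Return the raw type/token ratio as a percentage."""
    return 100.0 * types / tokens


ttr = {name: type_token_ratio(c["types"], c["tokens"]) for name, c in corpora.items()}

# Relative size of the two corpora, measured in tokens.
size_ratio = corpora["Trump"]["tokens"] / corpora["Clinton"]["tokens"]

for name, value in ttr.items():
    print(f"{name}: TTR = {value:.2f}%")
print(f"Trump/Clinton size ratio = {size_ratio:.2f}")
```

The check reproduces the reported 5.19% and 2.15%, and shows that the Trump corpus is about 3.55 times the size of the Clinton corpus in tokens.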

Lexical Diversity in Clinton's and Trump's Speeches
As the corpora used in this study are unbalanced, with the Trump corpus roughly 3.5 times as large as the Clinton corpus, using the type/token ratio as a measure of lexical diversity would be misleading, because the raw ratio tends to fall as a corpus grows. Therefore, the standardized type/token ratio (the mean type/token ratio computed over successive 1 000-word segments of a corpus) is introduced as a more accurate metric, as it compares the two corpora on the same scale. Specifically, this step is intended to reduce the influence of the function words (such as prepositions and articles) on the type/token ratio of the whole corpus as much as possible, and highlight the content words (such as nouns, verbs, adjectives) in the type/token ratio.
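A minimal sketch of how a standardized type/token ratio can be computed is given below. The text does not state which tool or tokenization rules produced the reported values, so the regular-expression tokenizer, the handling of the final incomplete segment (it is dropped), and the function names are all assumptions made for illustration.

```python
import re
from statistics import mean


def tokenize(text: str) -> list[str]:
    """Lowercase word tokenizer (an assumption, not the study's tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def standardized_ttr(tokens: list[str], segment_size: int = 1000) -> float:
    """Mean type/token ratio (in %) over consecutive full segments.

    A trailing segment shorter than segment_size is ignored.
    """
    n_full = len(tokens) // segment_size
    if n_full == 0:
        raise ValueError("corpus is shorter than one segment")
    ratios = []
    for i in range(n_full):
        segment = tokens[i * segment_size:(i + 1) * segment_size]
        ratios.append(100.0 * len(set(segment)) / segment_size)
    return mean(ratios)


# Toy demonstration with a tiny segment size so the result is easy to follow.
toy = tokenize("we will win we will win big and we will build")
print(standardized_ttr(toy, segment_size=4))
```

Because every segment has the same length, the resulting value does not shrink simply because one corpus is longer than the other, which is the property the comparison relies on.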

Thematic Information in the Clinton Corpus
The Keyword List can overcome the shortcomings of the word frequency list by providing more detailed information, especially thematic information at the discourse level, which is conducive to an in-depth analysis of the candidate's speech style and his/her campaign themes.
In statistics, a chi-square value (with one degree of freedom) greater than the critical value of 6.64 is significant at the 0.01 level. The chi-square value here is also the Keyness value (indicating topicality) as presented in Table 2. The larger the value, the stronger the thematic salience of the word. As AntConc generated a very large number of keywords in this study, the top 30 keywords were selected for analysis. Two conditions had to be met for selection: 1) the keywords were selected in descending order of Keyness value; 2) contracted forms such as "are" in "we're", prepositions that mainly serve as connectives such as "to" and "of", and conjunctions such as "that" and "which" were excluded. The final keywords are shown in Table 2.
According to Table 2, the keywords can be broadly classified into two categories: 1) referential vocabulary: Clinton, he, his, Donald, my, young, president, Trump, America, someone, dad, everyone, Scranton, etc.; 2) topic vocabulary: college, economy, families, rights, kids, women, campaign, etc.
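The chi-square test behind the Keyness values can be sketched as a test on a 2 × 2 contingency table (word vs. all other words, target corpus vs. reference corpus). The code below is an illustration under stated assumptions rather than a reproduction of AntConc's internals: it uses the plain Pearson statistic without continuity correction, and the word frequencies in the example are hypothetical. Only the corpus sizes and the 6.64 threshold come from the text.

```python
CRITICAL_VALUE_P01_DF1 = 6.64  # threshold quoted in the text (df = 1, p = 0.01)


def chi_square_keyness(freq_target: int, size_target: int,
                       freq_ref: int, size_ref: int) -> float:
    """Pearson chi-square for a word's frequency in two corpora.

    a, b: occurrences of the word in the target and reference corpus.
    c, d: all remaining tokens in the target and reference corpus.
    """
    a, b = freq_target, freq_ref
    c, d = size_target - a, size_ref - b
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical frequencies for one word; corpus sizes are those in Table 1.
score = chi_square_keyness(freq_target=300, size_target=135_714,
                           freq_ref=200, size_ref=481_919)
is_significant = score > CRITICAL_VALUE_P01_DF1
print(f"keyness = {score:.2f}, significant at 0.01: {is_significant}")
```

Note that the statistic itself is non-directional: a word counts as a keyword of the target corpus only if its relative frequency there is also higher than in the reference corpus.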

Thematic Information in the Trump Corpus
The extraction of the keywords in the Trump corpus followed the same procedure as above, except that the order of importing the two corpora was reversed, with the Clinton corpus used as the reference corpus. Likewise, the keywords that failed to meet the above two criteria were excluded. The final results are shown in Table 3.
According to Table 3, the keywords can be broadly classified into three categories: 1) referential vocabulary: Hillary, Clinton, they, she, media, politicians, etc.; 2) topic vocabulary: trade, Obamacare, borders, money, Mexico, NAFTA, wall, etc.; 3) affective vocabulary: very, bad, OK, great, illegal, disaster, incredible, etc.

Clinton's and Trump's Linguistic Features and Campaign Themes
With both the type/token ratio and the standardized type/token ratio of the Clinton corpus exceeding those of the Trump corpus, the results in Table 1 suggest that Clinton adopted a more diverse vocabulary during the campaign than Trump. This is not surprising considering that both scientific and anecdotal evidence have pointed to Donald Trump as a less complex language user than other presidential candidates (e.g., Savoy, 2018). Viewed in the context of his victory, this is consistent with the common view that politicians who speak in an accessible manner tend to be better received by the public.
The keywords selected from the Clinton corpus are testimony to her campaign strategies and themes in 2016. For example, many of the referential keywords, when put in context, referred to Bill Clinton, Donald Trump, and the American people. Frequent reference to Bill Clinton may arise from two conflicting considerations. First, as a successful former president and as Hillary Clinton's husband, he was considered an asset to the campaign because of his political legacy and his name recognition. Second, he was frequently mentioned to answer people's doubts about whether Hillary Clinton was running for herself, or for her husband's third term, a perception common among a large proportion of the American electorate (Mandziuk, 2017). Reference to Trump was mostly negative, and intended to draw a contrast between Clinton and Trump, a strategy commonly used in political campaigns (Schwartzman, 2017). Reference to the American people relates to Clinton's campaign message, that is, a campaign for the people and aimed at addressing people's immediate concerns. The topic keywords present a clearer picture of the core themes of the Clinton campaign, such as college tuition, student debt, women's rights, and issues of social justice. These keywords also indicate that Clinton had a clear vision for the future and was more focused on discussing matters of public policy, which was consistent with her definition of herself as a policy wonk (Kaufer & Parry-Giles, 2017).
In the Trump corpus, many of the keywords referred to Hillary Clinton and Barack Obama. Running as the standard bearer of the Republican Party, Trump was a fierce critic of the Democratic Party, which was led by Clinton and Obama. As is common practice in political campaigns, the opposition party exploited dissatisfaction with the ruling party to highlight their differences and to present its own vision (Schwartzman, 2017). The themes of Trump's campaign are also clear from the topic vocabulary, such as trade issues, national health care, border security, and illegal immigration. These issues represent the platform on which Trump ran his campaign. It bears noting, though, that these issues were also points of constant criticism of the Obama administration. Another prominent feature in the keywords relates to Trump's use of vocabulary with negative valence. This linguistic strategy plays into voters' emotions and beliefs, especially those of Trump's core base of supporters.

Differences Between Clinton's and Trump's Linguistic Styles
Based on the previous results and analysis, three major differences between Clinton's and Trump's linguistic styles can be identified. First, Clinton was inclined towards rational discussions of public policy, while Trump was adept at appealing to voters' emotions. This finding is congruent with those of many other scholars who reached similar conclusions. For example, after a systematic analysis of the rise of populism in the 2016 presidential election, Lakoff (2017) concluded that Trump was an adept user of "appeals to emotion and personal belief" (p. 604). Kaufer et al. (2017) studied Clinton's two memoirs, Living History and Hard Choices, and found that as Clinton grew in political influence and stature, she was more inclined towards discussions of public policy. Ibarra and Obodaru's (2009) research on women's leadership also showed that Clinton was not good at inspiring voters' emotions, but instead had a good grasp of policy details.
Second, Clinton was more positive and focused on her vision of the future, while Trump was more negative and fixated on depicting a dystopian reality. From the corpus findings, it is clear that Clinton often integrated discussions of policy into her vision for America's future (e.g., college debt, women's rights, and social justice), and used more positive vocabulary (e.g., help, kind, can, etc.). In contrast, Trump's speeches were peppered with criticisms of the current administration, and contained more negative vocabulary (e.g., bad, illegal, disaster, etc.). This finding, too, is supported by recent research that identified Trump's predisposition toward negative descriptions of reality (e.g., Chen, Zhang, Wei, & Hu, 2019; Liu et al., 2018; Savoy, 2018).
Third, Clinton aimed to find commonalities with the American people, while Trump aimed to highlight differences between himself and his opponents. The results showed that Clinton stuck to her core messages in her speeches and appeared to address voters' concerns by constantly using such keywords as my, young, women, help, can, kids, everyone, etc. With these words, she was trying to find commonalities with ordinary people by showing a sense of empathy. In contrast, Trump concentrated on launching attacks on his opponents (e.g., Hillary, Clinton, bad, she), or criticizing the policies of the current administration (e.g., Obama, Mexico, border, etc.). In so doing, he was positioning himself as an anti-establishment force that could bring new vigor to Washington politics.
These differences between Clinton and Trump were not confined to the linguistic dimension, however. They also provide a quick overview of the campaign themes and strategies each candidate adopted to present themselves in public. Even more so, they serve as a window into their underlying political beliefs and personalities, with Clinton being a more rational and pragmatic decision maker, and Trump a more sentimental demagogue.

Conclusion
This study investigated Clinton's and Trump's linguistic styles during the 2016 U.S. general election using a corpus-based approach. Two corpora of Clinton's and Trump's campaign speeches from July 2016 to November 2016 were analyzed using AntConc 3.2.4. Linguistic analyses using word frequency information and the Keyword List function showed that Clinton adopted a more diverse vocabulary than Trump, and that both candidates' speeches reflected their core campaign themes. Three major differences between Clinton's and Trump's linguistic styles were identified. First, Clinton was inclined towards rational discussions of public policy, while Trump was adept at appealing to voters' emotions. Second, Clinton was more positive and focused on her vision of the future, while Trump was more negative and fixated on depicting a dystopian reality. Third, Clinton aimed to find commonalities with the American people, while Trump aimed to highlight differences between himself and his opponents. These findings shed light on each candidate's rhetorical preferences as part of a grand campaign strategy, and could contribute to our current understanding of their personalities, political beliefs, and even the 2016 election result.
Nonetheless, the limitations of this study must also be addressed. First, the corpora used in this study are unbalanced, with the Trump corpus roughly 3.5 times as large as the Clinton corpus. This may have inadvertently influenced the results to some extent. Future studies may choose more balanced materials to minimize such influence. Second, campaign speeches may not serve as the most sensitive index of the candidates' linguistic styles because they may be pre-edited by their staff. A combination of speeches, debates, interviews, congressional records, etc. in future studies may give a fuller account of the linguistic styles of political candidates.

Table 1. Lexical profile of the Clinton and Trump corpora

Table 2. Top 30 keywords selected in the Clinton corpus (in order of Keyness values)

Table 3. Top 30 keywords selected in the Trump corpus (in order of Keyness values)