Gender-Sensitive Language in German Annual Reports

Gender equality in business has gained worldwide attention recently. This study examines whether firms address female individuals (e.g., in salutations) in annual reports and if so, whether this kind of gender-sensitive language is related to the firms’ market value. The study is based on the German setting, as the German language has separate nouns for female and male individuals that do not exist in other languages (e.g., English, Chinese). Using a sample of HDAX listed firms between 2007 and 2015, we find, surprisingly, that few firms address women throughout their annual reports and the more frequently women are addressed, the lower the firms’ market value. Results remain robust using three different proxies for the firms’ market value. The findings may be interesting for German firms that wish to forge a positive relationship with (female) board members and also male and female investors. The findings are more generally important for the international market and firms in other countries, because giving greater visibility to gender policies and gender equality in business language may help to increase the number of women in higher management positions.


Introduction
This study examines whether German listed firms address female annual report readers as often as they address male readers and, if so, how this relates to the firms' market value. The language used in financial disclosure or other types of business-related documents could be the reason why women are few and far between in upper management positions (Demarmels & Schaffner, 2011). Indeed, over the last decades, gender equality has been introduced into international law and the role of women has changed (Connell, 2005). In Germany, gender equality is even anchored in the German Corporate Governance Codex, which requires firms to publish a report about their (future) measures to promote women. With the adoption of Germany's Equality Act regarding women and men in higher management positions in 2015, gender equality has been achieved at the supervisory board level in Germany and has gained more attention in firms' recruitment policies (Note 1).
Nevertheless, the lack of equal representation of women in upper management positions and, more generally, female participation in economic activity and in society is still the most debated diversity issue worldwide (Campbell & Mínguez-Vera, 2008). Current surveys show there are more women in part-time jobs or lower positions who earn on average 21 percent less than men (FAZ, 2017; Spitzer & Wieser, 2017). Comparing women and men who work at the same level in the hierarchy and the same number of hours, women earn 5.2 percent less than men (FAZ, 2017).
Prior gender equality literature points out that women have quietly rejected general male descriptions and definitions in written and spoken language for decades (Hoffmann, 1996). Earlier studies looked at how women were presented in photographs and how they were considered in annual reports decades ago (Bujaki & McConomy, 2010a, 2010b; Newsom, 1988). Indeed, the photographs show that men are considered to be in more powerful positions than women. Newsom (1988) concludes that pictures show "white male territories", and with few exceptions, if women are in the photographs at all, they are almost invisible. Modern examples of gender discrimination or unequal language can be found in job advertisements that call for a "Projektleiter" or "Geschäftsführer" (English translation: "project leader" and "managing director") or annual reports or newsletters that address the reader as an "Aktionär" (English translation: "investor"). These examples exclude the female version in their salutations (German: "Projektleiterin", "Geschäftsführerin", and "Aktionärin"). A growing body of literature acknowledges the informative value of qualitative characteristics in text documents (especially the textual sentiment, tone, or tone changes) and effects on stock return predictability, volatility, or trading volume (Bannier et al., 2018; Kearney & Liu, 2014; Loughran & McDonald, 2011). However, none of these studies have examined gender equality in business documents, such as annual reports. While some earlier studies use general word lists to analyze business documents (Tetlock, 2007; Tetlock et al., 2008), recent studies demonstrate that specific financial word lists are appropriate for the finance context (Huang et al., 2013; Jegadeesh & Wu, 2013; Loughran & McDonald, 2011; Mayew & Venkatachalam, 2013). However, prior studies on the effects of sentiments on a firm's performance and market value have produced inconsistent results. In line with earlier studies, this study focuses on the language in annual reports, because this type of financial disclosure is essentially a PR document that "(…) has the greatest involvement of all members of the organization; and it is approved at the highest level, the board of directors" (Newsom, 1988, p. 15). We prefer a computer-based method over manual content analysis as it overcomes the issue of error-prone hand-collecting techniques and small sample sizes. In contrast to prior studies, this study sheds more light on gender sensitivity of language and the firm's market value. Annual reports are appropriate to analyze gender-equal language and the effect on the firm's market value for the following reasons. Firstly, annual reports are instruments that provide financial data and other relevant information to investors and other stakeholders. Secondly, an annual report is a way for the firm to present its corporate identity (Benschop & Meihuizen, 2002; Preston et al., 1996) and is one of the most powerful instruments with which an organization can interact with its stakeholders (Neimark, 1992). An annual report is the dominant and pervasive communication medium to demonstrate diversity to stakeholders (Bernardi et al., 2005).
Dictionary-based analyses have been applied to various kinds of texts such as financial disclosures, analyst reports, earnings press releases, IPO prospectuses, internet board postings, or newspaper articles (Kearney & Liu, 2014). Our study mainly differs from other studies on business documents in that we do not consider the tone (positive, neutral, or negative) but instead focus on the level and change of gender-sensitive language in annual reports and its effect on the firm's market valuation, a link that has not been analyzed before. We focus on German annual reports because the consideration of female stakeholders is more observable in German than in English. Also, when analyzing effects of annual reports on stock reactions in Germany, the German version of the annual report is more appropriate in general, because the English translation is often published later than the original documents (Bannier et al., 2018). The study is based on German annual reports for HDAX-listed firms between 2007 and 2015.
Tobin's Q is used as a proxy for the firms' market value-related financial performance. We use the firms' market value and price-to-book ratio as alternative measures for market value. In addition, we use a novel wordlist consisting of nouns that refer to male or female stakeholders in the financial context. In line with most other studies on computer-aided text analysis, we use the rule-based dictionary approach including the "bag-of-words" technique to count the words in our pre-created female and male word list regardless of where they appear in the text.
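The counting step described above can be illustrated with a minimal sketch. The two word lists below are hypothetical placeholders chosen for illustration and are not the dictionary compiled for this study; the function names and the "female share" ratio are likewise assumptions rather than the measures used in the empirical analysis.

```python
import re
from collections import Counter

# Hypothetical mini word lists for illustration only; the study's actual
# dictionary is not reproduced here.
FEMALE_TERMS = {"aktionärin", "aktionärinnen", "geschäftsführerin"}
MALE_TERMS = {"aktionär", "aktionäre", "geschäftsführer"}

def tokenize(text):
    """Lowercase the text and split it into word tokens (umlauts and ß included)."""
    return re.findall(r"[a-zäöüß]+", text.lower())

def count_gendered_terms(text):
    """Count word-list hits regardless of where they appear in the text."""
    counts = Counter(tokenize(text))
    female = sum(counts[w] for w in FEMALE_TERMS)
    male = sum(counts[w] for w in MALE_TERMS)
    total = female + male
    # Share of female forms among all gendered hits (0.0 if there are none).
    share = female / total if total else 0.0
    return {"female": female, "male": male, "female_share": share}

sample = ("Sehr geehrte Aktionärinnen und Aktionäre, "
          "der Geschäftsführer dankt jedem Aktionär.")
result = count_gendered_terms(sample)
print(result)
```

Because tokens are matched exactly, "Aktionärinnen" is counted only as a female form and not additionally as a hit for "Aktionär". As in any bag-of-words count, the context in which a word appears is ignored.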
Surprisingly, the results indicate that the use of gender-equal language is decreasing over time, and that firms that address women and men in annual reports equally have lower market values. The results may be interesting for politicians and other market players that aim to introduce gender equality to the business environment.
This study contributes to prior literature as follows. First, as dictionary-based approaches have become one of the most commonly used tools to investigate text documents, this new wordlist offers a tool for context-specific evaluation of gender-equal language in German business documents (Bannier et al., 2018). Second, the study provides new evidence that the more often female annual report readers are addressed in the firms' annual report language, the lower the firm's market value. However, the interpretation of results is limited for two reasons. First, the text analysis program does not consider the context of the analyzed words. Second, significant negative effects on the firm's market value could be caused by other omitted variables. Thus, the study suggests other methods that could be used to address this research question, for instance by disentangling investors' reactions to the language in annual reports from other effects that empirical studies often fail to control for, such as the underlying firm value or the firm's environment.
The paper is structured as follows. After this introduction, Section 2 discusses the history of gender equality in Germany. Section 3 summarizes prior literature on text analysis and provides theoretical considerations, and Section 4 presents our main hypothesis. Section 5 presents the methodology, data, software program, and variable definitions. This is followed by the results in Section 6, a separate discussion in Section 7, and a brief conclusion in Section 8.

German Setting
In order to understand the relevance of gender-equal language, it is important to consider the origin and historical movement of political correctness and feminism around the world in general and in Germany in particular. In general, political correctness aims to protect minorities against discrimination (Hoffmann, 1996). The term "minorities" stands, amongst other things, for black, disabled, or homosexual individuals as well as women who represent a minority in their professional or social environment. Despite the fact that the term "minorities" is broadly defined, this study focuses on gender equality, which is a doctrine in international law and is part of the Universal Declaration of Human Rights of 1948 (Connell, 2005). The 1975-85 United Nations Decade for Women was a key element in the story of global feminism (Bulbeck, 1988; Rodi, 2017). It influenced the adoption of the Convention on the Elimination of all Forms of Discrimination Against Women (CEDAW) in 1979 (Note 2), an international bill of rights for women that was adopted by almost all states all over the world. Contemporary western feminism focuses strongly on women's equal participation in work and education, reproductive rights, and sexual freedom (Bulbeck, 1998). Nevertheless, there are many reasons why women miss out on career steps, are still the minority in upper management positions, and earn and work less than men (FAZ, 2017; Spitzer & Wieser, 2017). One significant barrier to women's careers is the responsibility for childcare, which is traditionally assigned to women (Haas, 2008). However, Germany and other EU countries have paid more attention to these work-family related issues recently. For example, in Germany both parents can either share parental leave or take it at the same time (Haas, 2008). Another important milestone is the Equality Act, enacted in 2015, which requires a certain proportion of women and men on supervisory boards. This act requires listed or unlisted firms with 1,000 or more employees to include at least 30 percent women and 30 percent men on the supervisory board. It is the supervisory board's main function, as an independent body, to monitor management (Wagner, 2011) and approve the annual financial reports (§ 171 AktG). Additionally, the German Corporate Governance Codex provides that firms publish a report about their selected (future) measures to promote women.
Another important factor is how firms deal with those inequalities and how they address women and men, for instance in their business documents. Most nouns in German differ according to whether they designate males or females. However, it is questionable whether listed German firms also address both women and men in their communication with investors, e.g., via their annual report. There are no fixed language codices or sanctions for firms that do not specifically address female individuals through the language they use.
Our study focuses on the language used in German annual reports that were published between 2007 and 2015. On the one hand, we expect the market's reaction to the use of gender-sensitive language to be similar if the sample was extended until the end of 2018. On the other hand, it is possible that firms address women in annual reports more often since the adoption of the quota of women in upper management positions at the beginning of 2016. In this context, prior literature has shown that communication barriers are reduced and the minority voice becomes more assertive when the number of female board members increases (Konrad et al., 2008; Kramer et al., 2006). As it is the supervisory board's duty and right to monitor management's actions (Denis & McConnell, 2003), future research may detect a change in language as women and men have been legally and equally represented at board level. Nevertheless, the effect of gender-equal language on market value would not change if we extended the sample by three more years, because the reason(s) why and how the market reacts to gender-equal language is not expected to change immediately. This paper not only shows the development of gender-sensitive language in annual reports over time, but also discusses the results in the context of the glass ceiling issue, which obviously still exists (see Section 7). Therefore, we expect firms to maintain the way women and men are addressed in the language of annual reports in the short run. The next two sections provide theoretical considerations on why and how the use of gender-equal language in annual reports is expected to affect a firm's market value.

Gender Equality in Business Organizations
In earlier studies that examined how women were involved in business organizations, some researchers focus on their presence in annual reports, i.e., the way women are presented in images or photographs (Bujaki & McConomy, 2010a, 2010b; Preston et al., 1996) or on the combination of photos, gender-related statistics, and language (Benschop & Meihuizen, 2002; Newsom, 1988). In a more theoretical study, Preston et al. (1996) focus on visual images in US annual reports during the late 1980s and early 1990s. Overall, most studies that analyze the presence of women in photographs in annual reports conclude that women are underrepresented, and that stereotypical images of women are dominant (Benschop & Meihuizen, 2002; Bujaki & McConomy, 2010a; Newsom, 1988). Specifically, photographs show women in less powerful positions. In a similar study with a Canadian sample, Bujaki and McConomy (2010b) provide empirical evidence that women are underrepresented in annual report photographs and on boards and that a higher frequency of depicting women in annual reports is positively related to higher return on equity (ROE). However, they do not find a direct link between gender diversity at board level and ROE. Focusing on language in annual reports, Benschop and Meihuizen (2002) use English-language annual reports of 30 Dutch corporations to examine the gender of nouns and pronouns that were used. Their findings indicate that the majority of pronouns are either neutral or masculine. Nevertheless, this does not necessarily indicate that women are discriminated against linguistically. Newsom (1988) examines 26 annual reports of firms from the U.S. but does not find any evidence of discrimination against women in annual reports either. In sum, prior evidence does not conclusively show whether firms discriminate against women. Additionally, the interpretation and implication of results is limited because most studies rely on relatively small sample sizes or analyze text and images manually (Bernardi et al., 2005). Moreover, most studies base the analysis on English business documents where, unlike most other languages, firms are not able to distinguish between female and male annual report readers as English makes no gender distinction in nouns.
As Rose (2007) suggests, the presence of women on corporate boards is an important aspect of good governance and has the potential to diversify the opinions and perspectives explored during decision-making. Research indicates that good corporate governance is favorable because it is associated with better financial performance (Bernardi et al., 2005; Catalyst, 2004; Rose, 2007) and investor protection (La Porta et al., 2000). Recent studies have examined how gender diversity at the board level is related to the firm's performance. Some find no significant relationship between gender diversity at the board level and financial performance (Denmark: Rose, 2007; U.S.: Shrader et al., 1997) or stock market reactions (Fortune 500 companies: Farrell and Hersch, 2005; Denmark, Sweden, Norway: Randøy et al., 2006). Other studies find a positive relationship between gender diversity at the board level and the firm's ROE and shareholder returns (U.S.: Catalyst, 2004), the firm's Tobin's Q (Spain: Campbell and Mínguez-Vera, 2008; Fortune 500: Carter et al., 2003), or return on assets and return on investment (Erhardt et al., 2003). In their Norwegian study, Bøhren and Strøm (2007) find evidence of a significant negative relationship between women on boards and the firm's Tobin's Q. Overall, the results of prior studies remain inconsistent because studies relate to different countries and time periods and use different estimation methods (Campbell & Mínguez-Vera, 2008).
In contrast to most prior studies on gender equality, this study does not focus on the visual representation of women in annual reports or gender diversity at the board level. However, the use of gender-sensitive language in annual reports may be an important component of gender diversity in a firm as it may indicate or relate to the extent to which strong women are integrated in firms and upper management (Demarmels & Schaffner, 2011), indicating whether there is still some degree of gender discrimination in Western economies. Indeed, annual reports serve as appropriate instruments for analyzing this for two reasons. First, German annual reports are published earlier than the translated version (Bannier et al., 2018), and second, the German language allows content creators to use either the general male version of a noun or both the female and male versions. Prior studies show that gender inequality in organizations is persistent (Benschop & Doorewaard, 1998) due to a so-called gender subtext: concealed processes that subtly and latently (re)produce gender distinctions (Acker, 1992; De Bruin et al., 2007; Fraser, 1989; Smith, 1990). Our study is theoretically based on this socialist feminist framework that emphasizes concealed gendered power processes. These processes (re)produce gender inequalities as normal social practices that are inherent to organizational routines (Benschop & Meihuizen, 2002). This is in line with the study of Benschop and Meihuizen (2002), who build their study on Hagemann-White's (1989) layered notion of gender as well as Acker's (1992) sets of gendered processes, which distinguish four interrelated sets of arrangements (organizational principles, measures, and practices), indicating distinct gendering processes which become visible in culture, interactions, and the firm's identity. These sets of gendered process presentations obscure underlying (gender) subtexts in organizations. According to Acker (1992), the disembodied worker (full-time available, highly qualified, work-oriented) is an example of abstract and neutral social relations that are characterized by dominant textual presentations or leading discourses within organizations. This disembodied worker corresponds in day-to-day reality to the characteristics of male rather than female workers. Transferring this theory to the research question, firms may be willing to address women more consistently. However, it is unclear whether firms intentionally focus on the potential differences between female and male workers in business texts.

Textual Analysis
Prior studies on the economic implications of corporate disclosure have generally focused on the transparency, the level, or the tone of the text and are based on two general approaches that can be used to measure qualitative characteristics of text documents: the statistical approach including learning algorithms (machine learning) and the rule-based (dictionary) approach (Li, 2010a). The first method includes vector distance, Naïve Bayes classifications, likelihood ratios, or other classification algorithms. The machine-learning algorithm works with a "training set" as a proportion of the complete text (Kearney & Liu, 2014). The training set needs to be manually classified as, e.g., "positive," "negative," or other dimensions of sentiment. The sentiment classification rules learned from the training set are then applied to the whole text to derive textual sentiment scores. For instance, each sentence of the text needs to be assigned to a specific category from a set of all possible categories (tones). For a more detailed explanation, we refer to Li (2010a). According to Kearney and Liu (2014), implementing a machine-learning algorithm is more specific, but it is also time- and cost-intensive because it requires a manual classification of the training data set. The second approach involves a computer program reading the text and sorting words into different predefined categories (using dictionaries or wordlists) (Li, 2010a, 2010b). Documents are considered to be a "bag of words", where the presence of one word in the bag is independent of another; thus the context is ignored (Kearney & Liu, 2014; Manning & Schütze, 1999). The dictionary-based approach is probably the easiest for business, economic, and financial analysts to handle because the well-established programs presented in Section 2 are readily available and are most frequently used in the literature (Kearney & Liu, 2014).
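To make the statistical approach more concrete, the following minimal sketch trains a Naïve Bayes classifier on a manually labeled training set and then assigns a tone category to an unseen sentence. It is an illustration under stated assumptions, not the procedure of any cited study: the four training sentences, the two labels, the whitespace tokenization, and the add-one smoothing are all choices made for this example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(labeled_sentences):
    """Estimate class frequencies and per-class word counts from a labeled training set."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for sentence, label in labeled_sentences:
        tokens = sentence.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary

def classify(sentence, model):
    """Return the label with the highest log posterior, using add-one smoothing."""
    class_counts, word_counts, vocabulary = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / n_docs)  # log prior
        total = sum(word_counts[label].values())
        for token in sentence.lower().split():
            if token not in vocabulary:
                continue  # words never seen in training carry no information
            smoothed = (word_counts[label][token] + 1) / (total + len(vocabulary))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical, manually classified training set (illustration only).
training_set = [
    ("revenue growth was strong", "positive"),
    ("profit improved and margins increased", "positive"),
    ("losses widened and demand declined", "negative"),
    ("impairment charges caused weak results", "negative"),
]
model = train_naive_bayes(training_set)
prediction = classify("demand was weak and losses increased", model)
print(prediction)
```

The manual labeling of `training_set` is the step that makes this approach time- and cost-intensive in practice, since realistic applications need far more labeled sentences than this toy example.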

General English Dictionaries
Most earlier studies using the rule-based approach perform a text analysis based on general dictionaries such as DICTION 7.0 (hereafter DICTION) or the Harvard University's General Inquirer IV-4 (hereafter HARVARD) (e.g., Li, 2008; Tetlock, 2007; Tetlock et al., 2008). Li (2008) questions whether the readability (transparency) of annual reports is related to firm performance and persistence. He employs the Fog index from computational linguistic literature, where the number of words per sentence and the number of syllables per word are combined to create a measure of readability. In a sample of 55,719 firm-years with annual report filing dates between 1994 and 2004, he finds evidence that annual reports that refer to lower fundamental earnings are more difficult to read and that firms whose annual reports are easier to read have positive earnings in the longer term. Examining the daily content of the "Abreast of the Market" column in the Wall Street Journal from 1984 to 1999, Tetlock (2007) finds evidence that a large number of pessimistic words in the description of the stocks in the Dow Jones Index precedes lower returns of the stock indices the next day, and that high or low pessimism predicts high market trading volume (Note 3). In a subsequent study, Tetlock et al. (2008) provide evidence that a negative tone in Wall Street Journal (WSJ) and Dow Jones News Service (DJNS) stories predicts individual firms' accounting earnings and stock returns (Note 4). Feldman et al. (2010) analyze tone changes in the management discussion and analysis (MD&A) sections of Forms 10-Q and 10-K and find a positive relationship with short-window contemporaneous returns around SEC filing dates, as well as drift excess returns. Specifically, if managers' view of future prospects becomes more negative (positive), more negative (positive) words are used in disclosures (Note 5).
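As an illustration of how a readability measure of this kind is computed, the sketch below implements the commonly cited Gunning formulation of the Fog index, 0.4 × (average words per sentence + percentage of words with three or more syllables). The syllable counter is a crude vowel-group heuristic assumed for this example, and the sample text is invented; neither is taken from Li (2008).

```python
import re

def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels (an assumption, not a linguistic rule)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fog_index(text):
    """Fog = 0.4 * (words per sentence + percent of words with >= 3 syllables)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    complex_words = [w for w in words if count_syllables(w) >= 3]
    words_per_sentence = len(words) / len(sentences)
    percent_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (words_per_sentence + percent_complex)

# Invented two-sentence sample: 10 words, one of which ("reported") counts as complex.
sample = "The firm reported higher sales this year. Costs fell again."
score = fog_index(sample)
print(round(score, 2))
```

A higher score indicates text that is harder to read. The sketch assumes non-empty English input, and a production implementation would need a more reliable syllable counter than this heuristic.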

Context-Specific English Dictionaries
A growing body of research uses context-specific text analysis that enables researchers to focus on the information value of textual sentiment (Bannier et al., 2018; Kearney & Liu, 2014). However, most researchers doubt the usefulness of common dictionaries such as the GI/Harvard dictionary when analyzing business texts, arguing that common dictionaries misclassify words (Bannier et al., 2018; Henry, 2006, 2008; Loughran & McDonald, 2011). Therefore, some researchers have created more context-specific wordlists. On the one hand, general wordlists would categorize words like rise or increase as positive, which would falsify the tone of a sentence about costs. On the other hand, words classified as negative in common dictionaries include terms such as taxes or liabilities, which are not typically negative in a finance context. Words such as cancer or capital are linked to specific industries but add noise to the tonal measure. Henry (2006, 2008) composes a dictionary explicitly designed to examine the tone of earnings press releases. In her second study, Henry (2008) finds evidence that positive sentiment in earnings press releases leads to a positive subsequent market reaction. As it is customized for one specific text type, her dictionary contains only 85 negative and 105 positive words. Hence, its applicability is limited by the small number of words it contains (Bannier et al., 2018; Loughran & McDonald, 2011). The finance-specific dictionary of Loughran and McDonald (2011) (hereafter L&M) includes categories such as negative, positive, uncertainty, litigious, strong modal, and weak modal wordlists and has been widely used by subsequent studies to measure the tone in business documents. Using a sample of 50,115 SEC 10-K filings of 8,341 unique firms from 1994 to 2008, Loughran and McDonald (2011) demonstrate that their negative wordlist is better suited to financial text than the general one. Their tone measure of 10-Ks based on their negative word list is significantly associated with 10-K file date excess returns, while the tone measure based on the negative Harvard word list is not. Some previous studies that have used the L&M wordlist to analyze the tone in earnings conference calls find that tone significantly influences subsequent stock returns and trading volume (Davis et al., 2015; Doran et al., 2012). Moreover, Huang et al. (2013) find that the normal tone component is associated with firm fundamentals, but that the abnormal positive tone predicts a negative future performance and a positive market reaction. Mayew and Venkatachalam (2013) find a statistically significant positive (negative) relationship between positive words (negative words) and contemporaneous stock returns. According to Arslan-Ayaydin et al. (2016), the tone in earnings conference calls is more positive when the managerial portfolio value is more closely tied to the firm's stock price, especially when their compensation is equity-based. Additionally, Davis et al. (2015) document that the mean for the optimistic tone measured by the L&M word list is lower in relation to the other two measures, due to its significantly higher number of negative (2,337) than positive words (353), compared with DICTION (914 negative, 697 positive) and Henry's (2006) wordlist (98 negative, 188 positive). Examining CEO letters, Boudt and Thewissen (2016) show that CEOs present negative and positive words strategically in their letters in order to create a more positive perception by the reader. In their current working paper, Bannier et al. (2017) analyze market reactions to the sentiment of CEO speeches held at companies' annual general meetings and show that investors react significantly to the speeches' textual sentiment in terms of abnormal stock returns and trading volume. Twedt and Rees (2012) find that analyst report complexity (one dimension of report detail) helps explain cross-sectional variation in the market's response to the reports' recommendations and that tone contains significant information content. Other studies analyze the tone in IPO prospectuses and find that cautionary language can be used to predict post-IPO performance and is inversely related to post-IPO abnormal stock returns (Ferris et al., 2012; Jegadeesh & Wu, 2013). Jegadeesh and Wu (2013) also find evidence for a significant relationship between positive and negative tone and market returns of filings around 10-K filing dates. In their working paper, Ammann and Schaub (2016) use their own dictionary of positive and negative words to investigate whether the tone of comments posted by traders can predict the future performance of investment strategies. Unlike the English version of Loughran and McDonald (2011), their wordlist is an ad-hoc context-dependent and sample-specific dictionary, consisting of 129 positive and 134 negative words. Instead of translating the English L&M dictionary, Ammann and Schaub (2016) asked two individuals to mark and categorize words as positive and negative by hand. They find that posting comments and the tone of said comments do affect investment decisions of followers, but that they do not seem to have predictive power for the trading strategies' future performance. However, the manual categorization limits the wordlists' applicability to other samples.

German Dictionaries
The number of publications in business- and finance-related literature using the keywords "text analysis," "textual analysis," or "content analysis" has increased enormously, from two publications in 2006 to 56 publications in 2016 (Bannier et al., 2018). Foreign-language data has only received minor attention (Kearney & Liu, 2014). Some researchers have tried to overcome the lack of business-specific dictionaries in non-English languages, since examining English texts only is likely to bias statistical analyses because the translated versions are often published later than the initial documents. In line with most English dictionaries, all of the existing German dictionaries have been created for sentiment analysis in various domains (Ammann & Schaub, 2016; Bannier et al., 2017; Mengelkamp et al., 2016; Remus et al., 2010). Some researchers have translated the general dictionaries. For example, the "SentimentWortschatz" dictionary was translated by Remus et al. (2010) and is based on and extends the General Inquirer lexicon by Stone et al. (1966). It is mostly used to study the sentiment in fields such as political communication (Haselmayer & Jenny, 2017; Rill et al., 2014) and social media (Momtazi, 2012), or for opinion mining in news articles (Scholz & Conrad, 2013). Another general language dictionary is the German version of the Linguistic Inquiry Word Count translated by Wolf et al. (2008). It is used for analyzing the tone in essays in the context of expressive writing experiments and is mostly applied to political texts (Caton et al., 2015; Jacobi et al., 2016; Stieglitz & Dang-Xuan, 2012). Ammann et al. (2014), the creators of the first German business-specific dictionary, examine whether newspaper content of a leading German financial newspaper ("Handelsblatt") can predict aggregate future stock returns. In doing so, they assign 236 words that were frequently used in the articles to positive, negative, and other sentiment and create word-count indices as a quantitative language measure. They find that these indices do indeed predict future DAX returns. In line with prior research, Mengelkamp et al. (2016) create their own domain-dependent sentiment dictionaries based on parts of a manually classified corpus from Twitter for corporate credit risk analysis. They conclude that sample-independent context-specific dictionaries are more accurate for content analyses compared to the ad-hoc dictionaries that are sample-dependent. Bannier et al. (2018) provide the first German study that translates the positivity, negativity, and uncertainty wordlists of the dictionary compiled by Loughran and McDonald (2011) to the German language. They demonstrate that the accuracy of their translated dictionary is similar to that of the English L&M wordlist version and is better suited for capturing the sentiment of the text compared to the general, non-business-specific German dictionaries that were mentioned above. We contribute to this strand of literature and create a dictionary that captures the level of gender sensitivity in German annual reports.

Hypothesis Development
Annual reports not only provide investors with financial data and other relevant information, they also serve as an instrument for presenting firms' corporate identity (Benschop & Meihuizen, 2002) and for interaction between the organization and its stakeholders (Neimark, 1992). Supporting this notion, some researchers document that stakeholders consider not just quantitative information, but also qualitative linguistic information when they evaluate the situation of a firm (e.g., Antweiler & Frank, 2004; Bannier et al., 2017; Tetlock, 2007). Therefore, we expect the extent of gender equality or gender sensitivity in language to have an effect on a firm's market value.
There are two theoretical lines of argumentation on why and how the market reacts to gender-sensitive language in annual reports. On the one hand, the use of gender-sensitive language can appeal to annual report readers. It may reflect the firm's attitude to gender diversity as one aspect of corporate governance which diversifies the decision-making process at board level (Bujaki & McConomy, 2010b; Rose, 2007). Addressing women and men separately in annual reports may also mean better integration of women and men in business and in upper management (Demarmels & Schaffner, 2011). From the perspective of human relations theories (e.g., Hertzberg, 1959; Maslow, 1943; McGregor, 1960), the workforce is an organizational asset which can create substantial value, with employee satisfaction improving retention and motivation, to the benefit of shareholders. Consistent with human capital-centered theories, empirical literature on workforce treatment and financial performance shows that employee satisfaction is positively correlated with long-run shareholder returns (Edmans, 2011). Consequently, gender-sensitive language may help current or potential female employees to identify with the firm more easily (Acker, 1992; De Bruin et al., 2007; Fraser, 1989; Smith, 1990). Thus, investors could perceive gender-sensitive language in annual reports as a sign of higher employee satisfaction.
On the other hand, female annual report readers could interpret the absence of gender-sensitive language as discrimination against women in business. The socialist feminist framework predicts that concealed gendered power processes turn gender inequalities into normal social practices and part of regular organizational routines (Hagemann-White, 1989; Acker, 1992). Not addressing women and men equally in annual reports may therefore be (1) tolerated by annual report readers and (2) shorter and more readable from the reader's perspective. In addition, gender-sensitive salutations in business documents require more words and longer sentences, which may reduce their readability. Li (2008) shows that lower readability is related to lower firm performance and earnings persistence.
To our knowledge, no prior study has analyzed whether gender-sensitive language affects investors and thus the firm's market value. A related area of research concerns the effects of CSR or gender diversity on firm value. Using cross-sectional return regressions and buy-and-hold abnormal returns, Dorfleitner et al. (2018) show that firms with strong CSR significantly outperform firms with weak CSR in certain areas in the mid and long run. Following the human relations theories mentioned above, it could be argued that gender-sensitive language is an indicator of gender diversity and good CSR performance, which in turn motivates employees and appears to attract investors. However, analyses of the relationship between gender diversity and the firm's market value have produced inconsistent results. While some researchers find no significant relationship between gender diversity and the firm's market value (Farrell & Hersch, 2005; Randøy et al., 2006), other studies document a positive relationship (Campbell & Mínguez-Vera, 2008; Carter et al., 2003; Catalyst, 2004) or a negative relationship (Bøhren & Strøm, 2007). The inconclusive evidence may be due to different countries of origin, time periods, and estimation methods (Campbell & Mínguez-Vera, 2008). As prior literature mainly focuses on the effect of gender diversity, CSR performance, or the tone of certain business documents on the stock market, we believe we contribute considerably to this strand of literature by analyzing the effect of gender-sensitive language in annual reports.
For our hypothesis, we follow the predictions of the socialist feminist framework (Hagemann-White, 1989; Acker, 1992). Indeed, we would interpret the absence of gender-sensitive language as a sign of gender inequality. However, we expect investors to perceive neutral language (e.g., lack of female salutations) as normal social practice and routine. In fact, we expect investors to intuitively prefer more concise and traditional language in annual reports as they may find it more readable. In line with Li (2008), we expect that, from the perspective of the majority of investors, gender-sensitive language decreases the annual reports' readability. Lower readability in turn is associated with hidden problems and complex structures within the firm. Therefore, gender-sensitive language is expected to deter investors, which results in a lower market value. This leads to the following hypothesis:
H1: Addressing female and male annual report readers separately is negatively related to the firm's market value.

Sample Selection and Data
Our sample selection process starts with HDAX-listed firms from 2007 to 2015, with 110 firms per year. The HDAX is selected as the base because it comprises all 110 stocks in the DAX, MDAX, and TecDAX indices. The DAX index has been calculated by Deutsche Börse AG since 1988 and covers all sectors (Deutsche Börse Group, 2004). It measures the share performance of the 30 largest companies listed in the Prime Standard segment (Deutsche Börse Group, 2016). Companies qualify for admission to the DAX based on the following two main criteria: (1) order book turnover on Xetra and the Frankfurt trading floor (in the preceding twelve months) and (2) free float market capitalization on a specific date (last trading day of a month). Besides the requirement for a Prime Standard listing, a company is only included in a ranking list if its shares have a minimum free float of ten percent and the company is headquartered in Germany. Its shares must also trade continuously on Xetra, and a minimum period must have passed since the first listing. The MDAX comprises the shares of 50 companies from traditional sectors that, in terms of the same key indicators as the DAX, rank immediately below the companies included in the DAX. Thus, the MDAX mainly comprises medium-sized companies from the pharmaceutical, chemical, machinery, and financial sectors. The TecDAX was launched in 2003 and reflects the share performance of the 30 largest companies in the technology sector ranking below those included in the DAX. The composition of the DAX (MDAX/TecDAX) is reviewed on an annual (semi-annual) basis. The firms in our sample are included in the HDAX as they meet certain requirements, which are briefly presented below.
A company is not admitted to the HDAX if it has a rank lower than 40 in DAX, 60 in MDAX, 110 in SDAX, and 40 in TecDAX for either criterion, provided there is a company that has a rank equal to or better than 35 in DAX, 55 in MDAX, 105 in SDAX, and 35 in TecDAX for both criteria (Deutsche Börse Group, 2016). If no alternate candidate can be determined, there is no change. According to Deutsche Börse Group, a company is rapidly excluded from the index if it ranks below 45 in DAX, 65 in MDAX, 115 in SDAX, and 45 in TecDAX in terms of either free float market capitalization or order book volume. It is replaced by a company with a rank equal to or better than 35 in DAX, 55 in MDAX, 105 in SDAX, and 35 in TecDAX for both criteria. Since September 2016, the determination of the index composition has been fully automated, which has improved the transparency of the index rules and the visibility of changes. Even before Deutsche Börse Group changed the way it determines changes in index composition in 2016, the decision on the composition was completely rule-based, although there was room for discretion in some special situations.
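The rank-based rules described above can be illustrated with a short sketch. It is not Deutsche Börse's implementation: the function name, the tuple layout, and the example ranks are hypothetical, and only the DAX thresholds (40, 45, and 35) are taken from the text.

```python
def is_replaced(incumbent, candidates, exit_rank, entry_rank):
    """Apply a rank-based exclusion rule of the kind described above.

    incumbent and each candidate are (market_cap_rank, turnover_rank)
    tuples, where rank 1 is the best. The incumbent is replaced only if it
    ranks worse than exit_rank on either criterion AND some candidate
    ranks at entry_rank or better on both criteria.
    """
    fails_either = max(incumbent) > exit_rank
    candidate_available = any(max(c) <= entry_rank for c in candidates)
    return fails_either and candidate_available


# Hypothetical ranks; DAX thresholds as quoted in the text:
# 40 (regular rule), 45 (rapid exclusion), 35 (replacement candidate).
candidates = [(33, 34), (36, 20)]
regular = is_replaced((42, 30), candidates, exit_rank=40, entry_rank=35)
rapid = is_replaced((42, 30), candidates, exit_rank=45, entry_rank=35)
no_candidate = is_replaced((42, 30), [(36, 20)], exit_rank=40, entry_rank=35)
```

In this example the incumbent falls under the regular rule but not under the rapid-exclusion rule, and it stays in the index when no candidate ranks 35 or better on both criteria.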
As firms from the financial sector are subject to different financial accounting regulations, we removed banks and real estate companies from the sample. We also excluded observations for which annual reports or financial data were not available. Our final sample contains 760 firm-year observations of 84 firms. Table 1 summarizes the sample distribution per industry and year.
Note. Panel A of Table 1 shows the sample distribution over industry sectors of HDAX-listed firms between 2007 and 2015. The industry sectors are based on the Fama and French (1997) 12-industry classification. For completeness, Panel B shows the distribution of HDAX-listed firms in our sample over time (year).
The annual reports are analyzed using the text analysis program DICTION 7.0. We use our own context-specific keyword lists for German words, presented in detail in Section 5.2. Financial data is derived from COMPUSTAT, and market data is from the DATASTREAM database.

Keyword List and Weighting
The novel keyword list created for this study contains words referring to either female or male annual report readers. Table 2 presents all words included to count and weigh the use of words referring to female persons compared to those referring to male persons. Column (1) presents the wordlist with the German versions of salutations (noun + "in") addressed to female persons, which are translated into English in Column (2). Column (3) shows the salutations addressed to male individuals, which are translated into English in Column (4). In order to test our hypothesis, we weigh the number of words referring to women (Column (1)) against words referring to men (Column (3)) using the bag-of-words technique as presented in Equation (1).
In Equation (1), N represents the total number of German annual reports in the sample, df i the number of documents containing at least one occurrence of the i-th word, tf i,j the raw count of the i-th word in the j-th document, and a j the average word count in the document. Based on these quantities, we define the weighted measure (FEM_WORDS i,j).
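The symbols defined above coincide with those of the term-weighting scheme in Loughran and McDonald (2011). As a hedged sketch only, and assuming that Equation (1) follows that scheme (which the text here does not confirm), the weight of word i in document j would read:

```latex
w_{i,j} =
\begin{cases}
\dfrac{1 + \log(tf_{i,j})}{1 + \log(a_j)} \, \log\dfrac{N}{df_i} & \text{if } tf_{i,j} \geq 1 \\[2ex]
0 & \text{otherwise}
\end{cases}
```

The first factor dampens the raw count of a word relative to document length, and the second factor gives less weight to words that occur in many of the N documents. The symbol w i,j is an assumed name for the per-word weight.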
Unlike English (Column 2 of Table 2), German (Column 1 of Table 2) includes inflections that are more explicit (+"in") (Bannier et al., 2018; Hawkins, 2015). When looking at nouns that are important for this study, both languages distinguish between men and women with respect to singular and plural. German also distinguishes four cases in noun phrases (nominative, accusative, genitive, and dative), while English only has a separate genitive case. However, case is not relevant to our study. Because the analyzed documents are considered to be a "bag of words," where the presence of one word in the bag is independent of another, the context is ignored (Kearney & Liu, 2014; Manning & Schütze, 1999). Our novel wordlist accounts for gender equality in the German language. The "female" wordlist consists of terms referring to female employees, representatives, shareholders, etc. We use term weighting to measure the relationship between gender-equal wording and the total number of words in the document (Jegadeesh & Wu, 2013). In line with prior studies, proportional weighting means that we calculate simple frequencies of the listed words appearing in the text (Kearney & Liu, 2014). It is assumed that all words in the predetermined dictionary are equally informative and that other words are uninformative (Antweiler & Frank, 2004).
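A minimal sketch of the bag-of-words count may make the procedure concrete. It computes the ratio of female-form to male-form words in a text, ignoring context. The two word lists, the function names, and the sample sentence are hypothetical illustrations; the study's actual lists are those in Table 2, and the study itself uses DICTION 7.0 rather than this code.

```python
import re
from collections import Counter

# Hypothetical mini word lists (assumption); the real lists are in Table 2.
FEMALE_TERMS = {"aktionärin", "aktionärinnen", "mitarbeiterin", "mitarbeiterinnen"}
MALE_TERMS = {"aktionär", "aktionäre", "mitarbeiter"}


def tokenize(text):
    """Lowercase the text and split it into word tokens; word order is discarded."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def fem_words_ratio(text):
    """Return the female-term count divided by the male-term count.

    Returns None when the text contains no male-form term, because the
    ratio is undefined in that case.
    """
    counts = Counter(tokenize(text))
    female = sum(counts[term] for term in FEMALE_TERMS)
    male = sum(counts[term] for term in MALE_TERMS)
    return female / male if male else None


sample = ("Sehr geehrte Aktionärinnen und Aktionäre, unsere Mitarbeiter "
          "und die Aktionäre haben viel erreicht.")
ratio = fem_words_ratio(sample)  # one female form against three male forms
```

Tokens are matched exactly, so a female form such as "Aktionärinnen" is not also counted as the male form "Aktionär" that it contains.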

Model Specification
This section presents the model used to test our hypothesis, namely whether addressing female and male annual report readers separately is related to the firms' market value. In line with prior studies, we run a cross-sectional multivariate regression with three different proxies for market value. The model is based on the model of Yermack (1995):

$$\text{TOBINSQ}_{i,t} = \beta_0 + \beta_1 \text{FEM\_WORDS}_{i,j} + \beta_2 \text{ROA}_{i,t} + \beta_3 \text{ROA}_{i,t-1} + \beta_4 \text{ROA}_{i,t-2} + \beta_5 \text{SIZE}_{i,t} + \beta_6 \text{GROWTH}_{i,t} + \beta_7 \text{CAPEX}_{i,t} + \beta_8 \text{LEVERAGE}_{i,t} + \sum \text{INDUSTRY}_{i,t} + \sum \text{TIME}_{i,t} + \varepsilon_{i,t} \quad (2)$$

where the dependent variable is a measure for Tobin's Q, a proxy for market-related firm performance that has also been used by prior studies. Tobin's Q (TOBINSQ) is calculated as the ratio of the market value of equity plus the book value of assets minus the book value of equity over the book value of assets. Our measure is based on that used in Doidge et al. (2004) and Lins (2003). A similar measure has also been used by Aggarwal and Samwick (2003), Fama and French (2002), and La Porta et al. (2002).

$$\text{Tobin's Q} = \frac{\text{MV of equity} + (\text{BV of assets} - \text{BV of equity})}{\text{BV of assets}} \quad (3)$$

In total, three different measures are used as proxies for a firm's market-related performance or market value: TOBINSQ i,t, LOGMV i,t, and PTBV i,t (see sensitivity tests in Section 6.2.2). Our variable of interest (FEM_WORDS i,j) represents the number of words that refer to female annual report readers relative to words that refer to male annual report readers. Specifically, it represents the ratio of the absolute number of words referring to a female to the number of words referring to a male in the annual report j of firm i in time t. On the one hand, this enables us to test whether addressing female readers is related to the firm's market value, as it could indicate the firm's awareness of gender equality in written language and a wish to attract female as well as male investors. On the other hand, addressing female readers would lengthen the text and decrease readability, which may be less helpful for annual report readers. We include a number of control variables because they are expected to capture other determinants and measures of currently available fundamental information, growth opportunities, and investment policies (Fauver and Fuerst, 2006; Lang and Stulz, 1994; Ofek, 1993; Servaes and Tamayo, 2013). These are: the ratio of operating profits to total assets of the current and previous two years (ROA i,t, ROA i,t-1, ROA i,t-2), the natural logarithm of total assets (SIZE i,t), sales growth from time t-1 to time t in percent (GROWTH i,t), CAPEX i,t, and LEVERAGE i,t. CAPEX i,t is the ratio of long-term investments (in property, plant, and equipment) to total assets, and LEVERAGE i,t is the ratio of long-term liabilities to total assets. We also include year and industry fixed effects, which are based on the Fama and French (1997) 12-industry classification. All variables are defined in Table 3.
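As a small worked example of the Tobin's Q definition in Equation (3), the following sketch computes the measure for a single firm-year. The function name and the input figures are hypothetical and are not taken from the sample.

```python
def tobins_q(mv_equity, bv_assets, bv_equity):
    """Compute (MV of equity + BV of assets - BV of equity) / BV of assets."""
    if bv_assets <= 0:
        raise ValueError("book value of assets must be positive")
    return (mv_equity + bv_assets - bv_equity) / bv_assets


# Hypothetical firm-year (all amounts in the same currency unit).
q = tobins_q(mv_equity=1200.0, bv_assets=2000.0, bv_equity=800.0)
# (1200 + 2000 - 800) / 2000 = 1.2
```

A value above one, as in this example, means that the market values the firm above the book value of its assets.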

| Variable | Definition | Source |
| FEM_WORDS i,j | The number of female salutations scaled by the number of male salutations in annual reports | Annual reports and text analysis program using the bag-of-words method as presented in Table 2 |
| FEM_WORDSsq i,j | Quadratic value of the ratio FEM_WORDS i,j | Annual reports and text analysis program using the bag-of-words method as presented in Table 2 |
| LEVERAGE i,t | Long-term debt scaled by total assets (dltt/at) | COMPUSTAT |
| LOGMV i,t | Natural logarithm of market value | DATASTREAM |
| PTBV i,t | Price-to-book value per share | DATASTREAM |
| ROA i,t | Net income scaled by total assets of year t = 0 (nicon/at) | COMPUSTAT |
| SIZE i,t | Natural logarithm of total assets | COMPUSTAT |
| TOBINSQ i,t | Ratio of the market value of equity plus the book value of assets minus the book value of equity over the book value of assets. The measure has been used by Doidge et al. (2004) and Lins (2003); a similar measure has also been used by Aggarwal and Samwick (2003), Fama and French (2002), and La Porta et al. (2002). | Market data from DATASTREAM; financial data from COMPUSTAT |

Note. This table presents the definitions of all variables used in our main regressions.

Note. Table 4 presents the descriptive statistics. Variables are defined in Table 3.

Descriptive statistics
The descriptive statistics of all variables are presented in Table 4. On average, Tobin's Q is 1.71, indicating that the market value of most firms in our sample is above the assets' replacement costs. In line with this, the price-to-book value of equity (PTBV i,t) is on average 2.63, which is above one and thus represents a higher market value compared to the book value of equity. The logarithm of market value has a mean (median) of 7.91 (7.80). FEMALE i,t is a dummy variable showing that 56 percent of all firms in our sample distinguish between female and male persons (e.g., in salutations) in annual reports. That said, they do not maintain this language throughout the whole annual report. FEM_WORDS i,j is on average 4 percent, which indicates that only 4 percent of all gender-related nouns or salutations address female as well as male individuals, while the remaining 96 percent address only male individuals. The quadratic value of FEM_WORDS i,j is 0.0016 (non-tabulated). ROA i,t is on average 4 percent, which shows that most of the firms in our sample have a positive operating income. The mean (median) of SIZE i,t, the natural logarithm of total assets, is 8.32 (8.04). The mean (median) of CAPEX i,t is 0.15 (0.09), which shows that long-term investments amount to 15 percent of total assets on average. Firms' sales growth is 7 percent on average (GROWTH i,t), and their leverage is 18 percent.
In Table 5, we present the full sample's Pearson correlations between our main variable of interest FEM_WORDS i,j and market value. The majority of control variables show significant correlations with the dependent variables (TOBINSQ i,t, PTBV i,t, LOGMV i,t).

Main Regression
In this section we test our hypothesis (H1), which suggests that considering female persons in salutations in German annual reports is related to a firm's market value. The results are presented in Table 6.
Note. Table 6 reports the results of Equation (2). We also use PTBV or LOGMV as dependent variables. The variables are defined in Table 3. The results in Columns (4), (5), and (6) are obtained when we use the same equation but include a variable (FEM_WORDS_sq i,j) that accounts for a potential non-linear relationship between FEM_WORDS i,j and MV i,t. Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1.
When TOBINSQ i,t is the dependent variable, the results of Equation (2) show that the coefficient on FEM_WORDS i,j is negative and significant at the 10 percent level. This indicates that firms which address female and male readers separately in their annual reports have a lower market value. In line with other studies in this area, ROA in t = 0, t-1, and t-2 is positively and significantly related to TOBINSQ i,t at the 1 percent level. The coefficients on SIZE i,t and LEVERAGE i,t are negative and significant at the 1 percent level. The coefficient on GROWTH i,t is positive and the coefficient on CAPEX i,t is negative; however, neither is significant. Overall, the model's explanatory power is 62 percent. In this context, it is interesting that only 4 percent of all salutations on average refer to female annual report readers (Table 4).
The results can be interpreted from two perspectives. Firstly, (potential) shareholders or investors, managers, and marketing departments may find that using both the female and the male version of nouns impairs readability. Readability is important as it signals that firms do not hide information. This interpretation supports Li (2008), who uses a readability measure and documents that annual reports of firms with lower earnings are harder to read, since managers may choose lower readability to hide adverse information from investors. Lehavy et al. (2011) document that less readable 10-K filings are associated with greater dispersion, lower accuracy, and greater overall uncertainty in analyst earnings forecasts. If gender-sensitive language reflects a firm's attitude to gender diversity in business and its stance against gender discrimination, our findings contrast with those of prior studies. For instance, prior studies show a positive market reaction to better CSR performance (Dorfleitner et al., 2018), employee satisfaction (e.g., Edmans, 2011), and gender diversity (Campbell and Mínguez-Vera, 2008; Carter et al., 2003; Catalyst, 2004). Nevertheless, our results support the findings of Bøhren and Strøm (2010), who show that firms create more value when gender diversity at board level is low.
Secondly, the results indicate that annual report readers may perceive the absence of gender-sensitive language in German annual reports as more readable. According to the socialist feminist framework, gendered power processes produce gender inequalities, which become normal social practice and appear inherent in organizational routines (Hagemann-White, 1989; Acker, 1992). As a consequence, inequalities in language could be interpreted as a form of routine rather than intentional discrimination. Hence, annual report readers may be more familiar with neutral (non-gender-sensitive) language. However, according to human relations theories, addressing both male and female readers equally may also increase female managers' and employees' satisfaction and self-esteem (e.g., Hertzberg, 1959; Maslow, 1943; McGregor, 1960). Section 7 contains a discussion of our findings against the background of gender equality and discrimination in business.

Sensitivity Tests
For sensitivity, two more variables serve as proxies for the firm's market value, namely the natural logarithm of the firm's market value (LOGMV i,t) and the price-to-book ratio per share (PTBV i,t); both are derived from the DATASTREAM database. As presented in Column (2), the effect of FEM_WORDS i,j on the price-to-book ratio is negative and significant at the 5 percent level. The coefficients of the control variables point in the same direction as in Column (1), except for LEVERAGE i,t, which has a positive effect on PTBV i,t. This is not surprising, as more risk-taking can result in higher profits. When the price-to-book ratio (PTBV i,t) is the dependent variable, the explanatory power is 53.3 percent. Column (3) presents a negative relationship between FEM_WORDS i,j and the logarithm of market value that is significant at the 10 percent level. The effects of the control variables are similar to the main regression in Column (1), and the model's explanatory power is 90 percent, which could be caused by the similarity of our proxy for firm size (SIZE i,t) and the logarithm of market value (LOGMV i,t). Nevertheless, the coefficient of our variable of interest is negative and significant, which is in line with the expectation that gender-sensitive language affects firm value. Moreover, we test whether there is a nonlinear relationship between the variable of interest and the firms' market value. As presented in Columns (4) to (6) of Table 6, the coefficient on the quadratic value of FEM_WORDS i,j is insignificant for all dependent variables. Hence, the negative relationship between addressing both women and men in annual reports and firms' market value appears to be linear.
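The non-linearity check can be written out as follows. This is a sketch in assumed notation: the coefficients β and γ and the vector X i,t (standing for the control variables) are introduced here for illustration and are not taken from the study.

```latex
MV_{i,t} = \beta_0 + \beta_1\,FEM\_WORDS_{i,j} + \beta_2\,FEM\_WORDS_{i,j}^{2}
         + \gamma' X_{i,t} + \sum INDUSTRY_{i,t} + \sum TIME_{i,t} + \varepsilon_{i,t}

\frac{\partial MV_{i,t}}{\partial FEM\_WORDS_{i,j}} = \beta_1 + 2\,\beta_2\,FEM\_WORDS_{i,j}
```

If β2 cannot be distinguished from zero, the marginal effect reduces to the constant β1, which is what a linear relationship means in this setting.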

Discussion
In this section, we discuss our findings against the background of the phenomena that determine gender equality in the business environment. Despite the upward trend in female presence in the workplace, prior literature shows that women remain underrepresented in executive and board positions (e.g., Baumgartner & Schneider, 2010; Metz & Kulik, 2014), leading to the broad consensus that there is a glass ceiling issue (Cotter et al., 2001). This implies invisible barriers to women's promotion to upper management positions (Seo et al., 2017). By definition, a glass ceiling is "the unseen, yet unreachable barrier that keeps minorities and women from rising to the upper rungs of the corporate ladder, regardless of their qualifications or achievements" (Federal Glass to explain why women have not reached top-level positions. In fact, the absence of gender-sensitive language could be seen as an institutional mechanism that explains one dimension of the glass ceiling issue. On the one hand, studies on individual mechanisms consider a certain perception of women, namely that women who attain top leadership positions are exceptional. According to this perception, potential female leaders even lack key qualities, such as assertiveness (Babcock & Laschever, 2009; Sandberg, 2010). Eagly and Carli (2003) claim that women who do occupy leadership roles are typically the survivors of discriminatory processes and therefore tend to be very competent. On the other hand, Cook and Glass (2014) argue that such individual explanations largely fail to consider the range of institutional factors that shape appointment decisions. The so-called glass cliff is a metaphor for a phenomenon whereby women are more likely than men to be appointed to top leadership positions in organizations that are struggling, in crisis, and/or at risk of failure (Ashby et al., 2006; Ryan & Haslam, 2007). While research on the glass cliff focuses on the likelihood of women being appointed or promoted, the so-called savior effect considers the mechanisms
that shape their post-promotion tenure (Cook & Glass, 2014). Cook and Glass (2014) focus on institutional mechanisms including the glass cliff, decision-maker diversity, and the savior effect. Interestingly, and contrary to the predictions of the glass cliff and the savior effect, they find that it is diversity among decision makers which significantly increases women's likelihood of being promoted to top leadership positions.
Although there is general agreement that women face more barriers to becoming leaders than men, especially for roles that are male-dominated (Eagly & Karau, 2002), there is less agreement about the differences between women's and men's leadership styles once they attain these positions (Eagly & Johannesen-Schmidt, 2001). When it comes to leadership style, women have several advantages, but also suffer some disadvantages stemming from prejudicial evaluations of their competence as leaders, especially in male-dominated organizational contexts (Eagly & Carli, 2003). When looking at managerial role requirements and managerial motive patterns within hierarchical organizations (Miner, 2008, p. 11), in addition to other desires (such as the desire to exercise power or to compete), a leader also desires to assert themselves. The "be assertive" role requirement was referred to as the "masculine role" until the early 1970s, which shows that language can be misleading, since women had to possess the same desires in higher management positions as men. Many theories of leadership have focused mainly on stereotypically masculine qualities (e.g., Miner, 2008). Women tend to be viewed as lacking the skills to lead a large organization, which is underlined by substantial evidence of an implicit bias against women leaders generally (Eagly & Karau, 2002; Schein, 1973, 2001).
However, prior research suggests that women are oriented towards achieving group coordination and maintaining human relations (Crutchfield, 1955; Karau & Williams, 1993). An interesting phenomenon in this context is social loafing, where individuals working together in a group put in less effort than when they work alone (Latané et al., 1979). Kugihara (1999) explores how women behave when working in groups and finds that women engage less in social loafing than men. Hence, the leadership qualities of women depend on the situation at hand. According to Eagly (2007), it is reasonable to believe that stereotypically female qualities such as cooperation, mentoring, and collaboration are important to leadership, certainly in some contexts and perhaps increasingly in contemporary organizations.
When analyzing our findings against the background of the phenomena that determine gender equality in business environments, namely the glass ceiling, social loafing, leadership styles, and motivational theories, we expect that the absence of gender-sensitive language in annual reports can still be seen as an instance of negative bias or underlying discrimination against women. Following up on the theories presented above, we suggest that the use of gender-sensitive language may help to overcome invisible barriers such as gender inequality, which may still exist between the lines in business texts.

Conclusion
In contrast to most existing German content analyses that focus on market reactions to the sentiment in business texts, this study is the first to focus on whether women and men are addressed equally in annual reports and if so, whether this is related to a firm's market value. First, the results indicate that the number of firms using gender-sensitive language in annual reports (that is, firms that address women in annual reports as often as they do men) has decreased in recent years. Second, the results suggest that gender-sensitive language is negatively related to a firm's market value. We interpret the negative relationship between gender-sensitive language and market value as follows. On the one hand, annual report readers may unintentionally perceive gender-sensitive language in annual reports as less readable. On the other hand, the presence of gender-sensitive language has not yet become normal social practice. In showing that the market's response even to the few firms that do use gender-sensitive language is negative, this study makes a considerable contribution to the glass ceiling debate. Indeed, gender-sensitive language in annual reports and other relevant text documents may be used to overcome invisible barriers for women in business. Also, gender-sensitive language may help to increase employee satisfaction and female shareholders' and employees' identification with the firm. In addition, our results are particularly interesting against the background of the Gender Equality Act concerning women and men on supervisory boards that was adopted in 2015. The low number of firms that use gender-sensitive language is somewhat at odds with this Act, which aims to increase the number of women in upper management positions.
Nevertheless, the interpretation of the results is limited in two respects. First, during the period under review, women were still in the minority at board level; the number of firms that use gender-sensitive language may have increased since 2017. Second, the bag-of-words technique used in this study does not consider the words' context (Kearney & Liu, 2014; Manning & Schütze, 1999). Thus, firms whose annual reports are not written in gender-sensitive language are not necessarily less diversified or more gender-discriminating in other dimensions (e.g., at the board level or in the workforce). Also, the results do not indicate whether firms treat women differently than men. Future research may consider additional methods to study investors' reactions to gender-sensitive language in business documents. Experimental studies are particularly suitable for answering such questions because researchers can control the risk of omitted variables by manipulating the independent variable and treatment conditions (Libby & Seybert, 2009).
the New York Stock Exchange (NYSE).
Note 4. Evidence is supported by previous studies that use a context-specific dictionary to measure the tone in news articles (Ammann et al., 2014; Garcia, 2013).

Table 1.
Sample distribution

Table 2.
Wordlist and bag of words
Note. This table represents the two German wordlists we use to analyze the level of gender-sensitive language in annual reports.

Table 3.
Variable definition

Table 5
This table presents the Pearson correlation matrix.All variables are defined in Table 3. Bold indicates significance at the 1% level.

Table 6.
Regression results
+ ∑ INDUSTRY i,t + ∑ TIME i,t + ε i,t