Scrub Typhus and Comparisons of Four Main Ethnic Communities in Taiwan in 2004 versus 2008 Using Geographically Weighted Regression

Purpose: On the main island of Taiwan, a higher risk of scrub typhus infection has been reported in endemic clusters in Southeastern Taiwan and in mountainous townships. However, research on health care problems associated with scrub typhus among Taiwanese ethnic groups is limited. This study uses spatial analysis of areal data to determine spatial features relating scrub typhus to the four main Taiwanese ethnic communities: Hoklo, Hakka, Mainlander, and aboriginal. Methods: We used geographically weighted regression (GWR) to analyze the local relationships between scrub typhus incidence and the percentage of each ethnic community in 349 townships in Taiwan. The resulting local parameter estimates for 2004 and 2008 were then compared using kappa statistics. Results: In the GWR models, the relationship between scrub typhus incidence and the Hoklo communities showed significant negative parameter estimates in numerous locations, with clusters in Southeastern and Southwestern Taiwan and in the central and southern mountainous townships. The Hakka and Mainlander communities in the mountainous townships showed less pronounced clusters of association with scrub typhus prevalence. However, clusters of aboriginal populations were positively correlated with scrub typhus in highly infected mountainous areas and in Southeastern Taiwan. Comparing the local parameter estimates for the 349 townships between 2004 and 2008, the kappa values indicated that agreement was substantial for the Hoklo communities, fair for the Hakka communities, slight for the Mainlander communities, and moderate for the aboriginal communities. Conclusion: The aboriginal communities have been closely associated with higher risks of scrub typhus in the mountainous townships and in the southeastern portion of Taiwan.


Introduction
Scrub typhus is a vector-borne zoonotic disease caused by Orientia tsutsugamushi, an obligate intracellular parasite that lives within the cells of other animals. O. tsutsugamushi lives primarily in mites of the species Leptotrombidium (Trombicula) akamushi and Leptotrombidium deliense (Beers & Berkow, 2004). Scrub typhus infects approximately 1 million people annually, and a billion more are estimated to be at risk (Kawamura et al., 1995; Rosenberg, 1997). Because the disease is limited to Eastern and Southeastern Asia, India, Northern Australia, and adjacent islands, it is also commonly referred to as tropical typhus (Beers & Berkow, 2004; Devine, 2003). The infection is transmitted to humans and rodents by various species of infective trombiculid mites that feed on lymph and tissue fluid rather than blood. The mites have a four-stage life cycle: egg, larva, nymph, and adult. The larval stage is the only stage that transmits the disease to humans and other vertebrates. In regions where scrub typhus is a constant threat, a natural cycle of O. tsutsugamushi transmission occurs between mite larvae and small mammals (e.g., field mice and rats). Humans enter the cycle of rickettsial infection only accidentally (Suputtamongkol et al., 2009). The seasonal occurrence of scrub typhus varies with the climate in different countries, and the disease occurs more frequently during the rainy season. Forest clearings, riverbanks, and grassy regions provide optimal conditions for the infected mites to thrive. These small geographic regions are high-risk areas for humans, and have been called scrub-typhus islands (Beers & Berkow, 2004

Data Collection and Management
The percentages of the four major Taiwanese ethnic communities in each township were obtained from official reports of the Council for Hakka Affairs (2004 and 2008; Council for Hakka Affairs, 2012). According to self-reports in official governmental statistics, Han Chinese constitute 98% of the Taiwan population, whereas Taiwanese aborigines constitute the remaining 2%. The composite category of the "Taiwanese resident" is generally considered to comprise at least four constituent ethnic groups: the Hoklo (71.3%), the Hakka (13.8%), the Mainlander (8.5%), and the Taiwanese aborigines (1.9%; Council for Hakka Affairs, 2012).
Data for confirmed cases of scrub typhus were obtained from the Notifiable Infectious Diseases Statistics System and Infectious Diseases Database at the Taiwan Centers for Disease Control (Department of Health, 2012). The Ministry of the Interior provided the demographic data used in this study (Ministry of the Interior, 2012). Age-adjusted incidence rates for 2000-2010 were calculated by direct adjustment, using the world standard population for 2000 as the standard population (Ahmad et al., 2001). The standardized incidence ratio (SIR) of scrub typhus was calculated for each township and then used as the response variable in the GWR model. The GWR model used the following explanatory variables: the percentages of the Hoklo, Hakka, Mainlander, and aboriginal communities.
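The direct age adjustment described above can be sketched as a weighted sum of age-specific rates. This is a minimal illustration only: the function name, the three age groups, the case counts, and the standard weights are hypothetical and are not the study data or the actual weights of Ahmad et al. (2001).

```python
def direct_age_adjusted_rate(cases, population, standard_weights, per=100_000):
    """Directly age-adjusted incidence rate.

    cases, population: counts per age group in the study area.
    standard_weights: standard-population share (or count) per age group.
    per: reporting base, e.g. cases per 100,000 persons.
    """
    if not (len(cases) == len(population) == len(standard_weights)):
        raise ValueError("all inputs must have one entry per age group")
    total_weight = sum(standard_weights)
    rate = 0.0
    for c, n, w in zip(cases, population, standard_weights):
        # age-specific rate weighted by that group's share of the standard
        rate += (c / n) * (w / total_weight)
    return rate * per


# Hypothetical township with three age groups (illustrative numbers only).
cases = [2, 5, 3]
population = [10_000, 20_000, 5_000]
weights = [0.35, 0.50, 0.15]  # assumed standard-population shares
rate = direct_age_adjusted_rate(cases, population, weights)
print(round(rate, 2))
```

Because the weights come from a fixed standard population, rates computed this way can be compared across townships and years with different age structures.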

Geographically Weighted Regression
The GWR model extends the traditional regression framework by estimating local, rather than global, parameters (Fotheringham et al., 1998). The model is a type of local statistic that produces a set of local parameter estimates showing how a relationship varies over space. This makes it possible to examine the spatial pattern of the local estimates and gain a better understanding of possible hidden causes of that pattern (Fotheringham et al., 2002). Conversely, a traditional regression method, such as ordinary least squares (OLS), is a type of global statistic that assumes that the relationship under study is constant over space, and therefore assumes that each parameter is the same for the entire study area.

An OLS model can be defined as follows:
$$y = \beta_0 + \sum_{i=1}^{p} \beta_i x_i + \varepsilon$$
where $y$ is the response variable, $\beta_0$ is the intercept, $\beta_i$ is the parameter estimate (coefficient) for the explanatory variable $x_i$, $p$ is the number of explanatory variables, and $\varepsilon$ is the error term.
The GWR model allows local, rather than global, parameters to be estimated for the study area. Thus, the GWR model rewrites the OLS model as follows:
$$y_j = \beta_0(u_j, v_j) + \sum_{i=1}^{p} \beta_i(u_j, v_j)\, x_{ij} + \varepsilon_j$$
where $u_j$ and $v_j$ are the coordinates for each location $j$, $\beta_0(u_j, v_j)$ is the intercept for location $j$, and $\beta_i(u_j, v_j)$ is the local parameter estimate for the explanatory variable $x_i$ at location $j$.
The weight assigned to each observation is based on a distance-decay function centered on the regression location j.
The estimator for the GWR model is similar to the weighted least squares (WLS) global model, except that the weights are conditioned on the location $u$ relative to the observations in the data set, and hence, they change for each location. The estimator takes the following form:
$$\hat{\beta}(u) = \left(X^{T} W(u) X\right)^{-1} X^{T} W(u)\, y$$
where $W(u)$ is the square matrix of weights relative to the position $u$. A particular location can be indexed $(u_j, v_j)$ in the study area. $X^{T} W(u) X$ is the geographically weighted variance-covariance matrix, and $y$ is the vector of the values of the response variable.
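As a rough illustration of the local estimator, the sketch below solves the weighted normal equations (X^T W X) beta = X^T W y for a single location, with W(u) given as a list holding its diagonal. It is a dependency-free sketch under stated assumptions, not the ArcMap implementation used in the study; the helper names `solve_linear` and `gwr_local_estimate` and the toy data are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("weighted design matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def gwr_local_estimate(X, y, w):
    """Local estimate (X^T W X)^{-1} X^T W y, where w is the diagonal of W(u)."""
    k = len(X[0])
    xtwx = [[sum(wi * row[a] * row[b] for row, wi in zip(X, w))
             for b in range(k)] for a in range(k)]
    xtwy = [sum(wi * row[a] * yi for row, yi, wi in zip(X, y, w))
            for a in range(k)]
    return solve_linear(xtwx, xtwy)


# Toy data (hypothetical): intercept column plus one explanatory variable,
# with a response that follows y = 1 + 2x exactly.
X = [[1, 0.0], [1, 1.0], [1, 2.0], [1, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
w = [1.0, 0.8, 0.3, 0.0]  # assumed kernel weights for one location
beta = gwr_local_estimate(X, y, w)
print([round(b, 6) for b in beta])
```

Repeating this solve with a different weight vector for every township is what yields one set of local parameter estimates per location.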
In the area in which this study was conducted, the sample points produced by the polygon centroids were not regularly placed, but were clustered. A convenient way to implement the adaptive bandwidth specification is to select a kernel that includes the same number of sample points in each local estimation. The weight can then be calculated using the specified kernel, with the weight set to zero for any observation whose distance exceeds the bandwidth. The bi-square function is as follows:
$$w_i(u_j, v_j) = \begin{cases} \left[1 - \left(\dfrac{d_i(u_j, v_j)}{h}\right)^{2}\right]^{2} & \text{if } d_i(u_j, v_j) \le h \\ 0 & \text{if } d_i(u_j, v_j) > h \end{cases}$$
where $d_i(u_j, v_j)$ is the distance between observation $i$ and location $(u_j, v_j)$, and $h$ is a quantity known as the bandwidth. This is a near-Gaussian function with the useful property of the weight being zero at a finite distance.
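The adaptive bi-square kernel can be sketched as follows. As an assumption made for illustration, the bandwidth h at each location is taken to be the distance to its k-th nearest neighbour, so that (absent ties) the same number of points receive non-zero weight everywhere. The function name and the coordinates are hypothetical.

```python
import math


def adaptive_bisquare_weights(coords, j, k):
    """Bi-square weights of all observations for a regression centred on coords[j].

    The bandwidth h is the distance from coords[j] to its k-th nearest
    neighbour (an assumed way of fixing the number of local sample points).
    """
    if not 1 <= k < len(coords):
        raise ValueError("k must be between 1 and len(coords) - 1")
    uj, vj = coords[j]
    d = [math.hypot(u - uj, v - vj) for u, v in coords]
    h = sorted(d)[k]  # sorted(d)[0] is the point itself at distance 0
    if h == 0:
        raise ValueError("bandwidth is zero; too many coincident points")
    # weight decays smoothly to zero at distance h and stays zero beyond it
    return [(1 - (di / h) ** 2) ** 2 if di < h else 0.0 for di in d]


# Hypothetical centroids on a line (illustrative only).
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
weights = adaptive_bisquare_weights(coords, j=0, k=2)
print(weights)
```

In densely clustered areas the k-th neighbour is close, so h shrinks; in sparse areas it grows, which is the motivation for an adaptive rather than fixed bandwidth.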
The bandwidth was chosen by minimizing the corrected Akaike information criterion (AIC) score, calculated as follows (Fotheringham et al., 2002):
$$AIC_c = 2n \log_e(\hat{\sigma}) + n \log_e(2\pi) + n \left\{ \frac{n + \mathrm{tr}(S)}{n - 2 - \mathrm{tr}(S)} \right\}$$
where $n$ is the number of observations, $\hat{\sigma}$ is the estimated standard deviation of the error term, and $\mathrm{tr}(S)$ is the trace of the hat matrix. The AIC method has the advantage of accounting for the fact that the degrees of freedom may vary among models centered on different observations. The GWR models produce a set of local regression results, including local parameter estimates and local residuals, which can be mapped to show their spatial variability. Geographically weighted regressions were run and mapped using ArcMap 9.3.
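A bandwidth search by corrected AIC can be sketched as below. The sketch assumes the corrected AIC form given by Fotheringham et al. (2002), AICc = 2n ln(sigma) + n ln(2 pi) + n (n + tr(S)) / (n - 2 - tr(S)), with sigma taken as sqrt(RSS / n). The function name and the candidate values of the residual sum of squares and tr(S) are hypothetical and do not come from the study.

```python
import math


def gwr_aicc(rss, n, trace_s):
    """Corrected AIC for a GWR fit (assumed form, see lead-in).

    rss: residual sum of squares of the fitted model.
    n: number of observations.
    trace_s: trace of the hat matrix S.
    """
    denom = n - 2 - trace_s
    if denom <= 0:
        raise ValueError("n - 2 - tr(S) must be positive")
    sigma_hat = math.sqrt(rss / n)  # assumed ML estimate of the error s.d.
    return (2 * n * math.log(sigma_hat)
            + n * math.log(2 * math.pi)
            + n * (n + trace_s) / denom)


# Hypothetical candidates: neighbours k -> (RSS, tr(S)) from fitted models.
candidates = {30: (120.0, 60.0), 60: (150.0, 35.0), 120: (210.0, 18.0)}
n_obs = 349
scores = {k: gwr_aicc(rss, n_obs, tr) for k, (rss, tr) in candidates.items()}
best_k = min(scores, key=scores.get)
print(best_k)
```

Smaller bandwidths lower the residual sum of squares but raise tr(S), and the last term penalizes that extra flexibility, so the minimum reflects a trade-off between fit and effective number of parameters.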
We used the Benjamini-Hochberg (B-H) procedure to control the false discovery rate (FDR) by adjusting the significance level for each test in sequence. This procedure was used to determine the significance of the parameter estimates produced by the GWR model. Thissen et al. (2002) proposed a quick and easy method for carrying out the B-H procedure using Microsoft Excel. The B-H approach controls the FDR by sequentially comparing the observed p value for each of a family of multiple test statistics (from largest to smallest) to a list of computed B-H critical values [pB-H(i)]. The critical value for each test statistic, indexed by i, is determined by linear interpolation between α/2 (for the largest observed p value) and (α/2)/m (for the smallest), where m is the family size. Because the latter is the Bonferroni critical value, the reason for the power gain of B-H relative to the Bonferroni approach is clear: the B-H approach compares only the smallest of the m observed p values to the Bonferroni critical value, and all other p values are compared against less stringent criteria. A local parameter estimate is deemed significant if its p value is less than the B-H critical value; otherwise, it is deemed non-significant (Thissen et al., 2002). The resulting significance determinations were also used to assess the spatial similarity between the 2004 and 2008 periods.
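The B-H comparison can be sketched in a few lines. This sketch uses the standard step-up form of the procedure, with the critical value for the i-th smallest p value set to (i/m) times a base level; passing α/2 as that base level follows the two-tailed description above. The function name and the p values are hypothetical, not results from the study.

```python
def benjamini_hochberg(p_values, base_level=0.025):
    """Return a list of booleans marking which tests are significant.

    base_level: critical value for the largest p value
    (alpha / 2 = 0.025 for a two-tailed alpha of 0.05, as assumed here).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda idx: p_values[idx])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # critical values rise linearly from base_level / m to base_level
        if p_values[idx] <= rank / m * base_level:
            cutoff_rank = rank  # keep the largest rank that passes
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical p values for five local parameter estimates.
p_values = [0.001, 0.30, 0.012, 0.004, 0.021]
flags = benjamini_hochberg(p_values)
print(flags)
```

In this example a Bonferroni cutoff of 0.025/5 = 0.005 would keep only two of the five tests, whereas the B-H critical values keep three, which illustrates the power gain described above.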

Detecting Spatial-Pattern Consistency
Kappa statistics for map comparison have been developed to provide parametric tests of the similarity of spatial patterns across pairs of variables (Monserud & Leemans, 1992; Hagen, 2003). We used the Kappa statistic (Fleiss, 1981), which reflects the consistency between two classifications (i.e., the significance determinations for local parameter estimates in the 2004 and 2008 periods). A value close to 1 represents nearly perfect agreement, whereas values close to or below 0 represent poor agreement. Landis and Koch (1977) developed a useful scale for interpreting the Kappa estimate: 0.81-1.00 (almost perfect); 0.61-0.80 (substantial); 0.41-0.60 (moderate); 0.21-0.40 (fair); 0.00-0.20 (slight); and < 0.00 (poor) agreement.
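The map comparison can be illustrated with a small sketch that computes the kappa statistic for two categorical maps and labels it on the Landis and Koch (1977) scale. The function names and the two ten-township maps are hypothetical (1 = significant local estimate, 0 = not significant) and are not results from the study.

```python
from collections import Counter


def cohen_kappa(map_a, map_b):
    """Kappa agreement between two equally long categorical maps."""
    if len(map_a) != len(map_b) or not map_a:
        raise ValueError("maps must be non-empty and of equal length")
    n = len(map_a)
    observed = sum(a == b for a, b in zip(map_a, map_b)) / n
    freq_a, freq_b = Counter(map_a), Counter(map_b)
    # agreement expected by chance from the marginal category frequencies
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both maps constant and identical
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Verbal label for a kappa value on the Landis and Koch scale."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical significance maps for ten townships in two periods.
map_2004 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
map_2008 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(map_2004, map_2008)
label = landis_koch_label(kappa)
print(round(kappa, 3), label)
```

Here eight of ten townships agree, but because about half of that agreement is expected by chance, kappa is noticeably lower than the raw 80% agreement.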

Results
Table 1 presents a summary of the age-adjusted incidence rates between 2000 and 2010 on the main island of Taiwan, showing that incidence rates for men were higher than those for women in every year. Gender ratios, defined as the ratio of the rate in men to that in women, generally ranged from 1 to 2, but increased to 2.29 in 2008.
Figure (f)
Vol. 5, No. 3; 2013
... concerned about its increased incidence (Department of Health, 2012). A higher risk of scrub typhus infection is endemic not only to Southeastern Taiwan and the mountainous townships, but also to the Pescadores Islands, Kinmen Islands, and Matsu Islands (Centers for Disease Control, 2011). The GWR method is a specific type of spatial regression that generates parameters disaggregated by the spatial units of analysis. We analyzed contiguity-based spatial units (i.e., the 349 administrative areas on the main island of Taiwan) using the GWR method. However, the method was unsuitable for examining isolated regions (e.g., the Pescadores, Kinmen, and Matsu islands). Therefore, the main island of Taiwan was taken as the study area.
The geographical profile of O. tsutsugamushi (Hayashi) density shows that seropositive results have been observed in captured small rodents and the chigger mites they carry throughout the main island of Taiwan. The main reservoir hosts include Apodemus agrarius, Bandicota indica, and Rattus losea, and the key vector chigger mite is Leptotrombidium imphalum (Kuo et al., 2011a, 2011b). O. tsutsugamushi was clustered in less developed areas with a relatively low population density, namely the mountainous townships and Southeastern Taiwan; in these areas, a higher incidence of scrub typhus was reported (Kuo et al., 2011a; Kuo et al., 2011b). Frequent human visitation to an endemic area is a critical factor that increases the probability of scrub typhus infection, and such visits might also provide more food resources for small rodents. Activities such as farm work (Ogawa et al., 2002; Lee et al., 2006; Liu et al., 2009; Kuo et al., 2011a) and troop activity (Payne et al., 2009; Department of Health, 2012) can therefore raise the prevalence of scrub typhus. In this study, the settlements and mountain activities of aboriginal communities provide more opportunities for contact with the vector chigger mite, leading to many more infections than in the other ethnic groups. Therefore, the aboriginal communities were closely associated with the prevalence of scrub typhus on the main island of Taiwan. This information can improve the planning of health care policies and the implementation of effective health care services.

Conclusion
The combined method of GWR and kappa statistics is useful for directly analyzing the local relationships between two variables and for comparing the resulting patterns between 2004 and 2008: GWR calculated the local parameter estimates for 349 townships on the main island of Taiwan, and kappa statistics determined the extent of spatial consistency. Our results indicate that clusters of the aboriginal population are significantly and positively associated with a higher risk of scrub typhus in endemic mountain areas and in Southeastern Taiwan.