Development and Assessment of Localized Seasonal Rainfall Prediction Models: Mapping and Characterizing Rift Valley Fever Hotspot Areas in the Southern and Southeastern Ethiopia

Rift Valley Fever (RVF) has been recognized as one of the permanent threats to the sustainability of livestock production in Ethiopia, owing to shared borders with RVF-endemic countries in East Africa. Above-normal and widespread rainfall has been the predominant immediate risk factor facilitating historical outbreaks of the disease in East Africa. The objective of the present study was therefore to develop prospective localized seasonal rainfall anomaly prediction models and to assess their skill as early indicators for mapping localized high-risk RVF outbreak areas (hotspots) over the southern and southeastern parts of Ethiopia. Twenty-one years of daily rainfall data from five meteorological stations were used to diagnose anomalous rainfall patterns, along with a cumulative rainfall analysis to determine whether conditions were conducive to flooding. The results indicated that rainfall in the region is highly variable, with non-significant trends, and that this variability is attributable to large-scale climatic teleconnections. The moderate to strong positive correlations found between the regional average rainfall and large-scale teleconnection variables (r ≥ 0.48) indicated some potential for early prediction of seasonal rainfall patterns. Accordingly, models based on the regional average rainfall and emerging developments of the El Niño/Southern Oscillation and other regional climate forcings showed maximum skill (ROC scores ≥ 0.7) and moderate reliability. Deterministically, most of the positive rainfall anomaly patterns corresponding to El Niño years were reproduced with some skill. The study demonstrated that localized climate prediction models are valuable as early indicators for skillfully mapping areas with climatic potential for RVF outbreaks.


Introduction
The burden of vector-borne infectious diseases falls disproportionately on the least developed countries in Africa (LaBeaud, 2008; Hotez & Kamath, 2009). Rift Valley Fever (RVF) is one such mosquito-borne viral zoonotic disease. Its outbreaks have inflicted pronounced health and economic impacts in many countries in sub-Saharan Africa (Anyamba et al., 2010; Rich & Wanyoike, 2007). Since its first isolation in Kenya in 1930 (Daubney & Hudson, 1931), RVF outbreaks have been recorded in many countries in the region (Davies, 2010), and it has recently emerged in new geographical areas, with outbreaks reported in Yemen and Saudi Arabia in 2000 (Balkhy & Memish, 2003).
Despite the long-perceived high risk for the disease, many countries in Africa, including Ethiopia, have not reported RVF. One reason is that most RVF viral activity is cryptic, occurs at a low level, and is not associated with detectable disease in humans and animals (Davies, 2008). Another reason could be the lack of systematic surveillance for such epidemic diseases or, where surveillance exists, that it is not necessarily optimal in many of these countries (Davies, 2010). As a result, most of them remain unaware of the circulation of the virus within their territories. This is supported by the fact that many African countries have found significant seroprevalence of the RVF virus in sheep, goats and cattle without any clinical signs being reported in humans or animals (Davies, 2008).
However, despite the proven potential of such systems in identifying continental- and regional-scale eco-climatic conditions associated with potential vector-borne disease outbreaks (Linthicum et al., 1999; Anyamba et al., 2009), there are discrepancies between the spatial accuracy required by policy makers and actors at local levels and what has been delivered. This is mainly because such monitoring and prediction systems are based on interpretations of remotely sensed proxy variables as substitutes for the leading RVF indicator elements, such as the Normalized Difference Vegetation Index (NDVI) as a proxy for regional rainfall. Furthermore, several local factors, including topography and closeness to surrounding water bodies, determine how rainfall patterns evolve at such smaller spatial scales. Hence, such approaches might lead to overestimation or underestimation of the leading RVF outbreak indicator variables. For example, NDVI used as a proxy for rainfall may be misleading in areas maintained with irrigation or where there have been unprecedented changes in land use (Gikungu et al., 2016). There is, therefore, a real need to further improve the spatial and temporal accuracy of forecasts of the leading RVF indicator variables, and subsequently our skill in predicting RVF outbreaks. The use of high spatial resolution mapping to identify flooded regions in RVF-endemic areas at risk, or even the use of radar data, has been suggested as a possible approach to reducing such discrepancies (Anyamba et al., 2010).
Ethiopia started giving emphasis to RVF following a series of outbreaks of the disease in neighboring countries. The country has had an RVF Contingency and Preparedness Plan since June 2008 and has set active RVF surveillance in place over high-risk areas bordering RVF-endemic countries. The plan underlines the importance of RVF-specific Early Warning (EW) programmes and early outbreak prediction models in mounting appropriate, timely, and cost-effective responses against the disease, based on early predictions of potential above-normal seasonal rainfall over high-risk areas (MoARD, 2008). However, to date, no effort that the author is aware of has been made to analyze and explain the coupling between patterns of local climatic variability and the risk of establishment and spread of the RVF virus (RVFV), let alone to develop locally relevant RVF prediction models based on any of the leading RVF outbreak indicator variables.
Apart from that, several studies have previously investigated the influence of ENSO and other regional forcings on the patterns of rainfall in Ethiopia. Most of these studies, however, have focused on the variability and predictability of the June to September rainfall season, and only for areas over the central and northern half of the country (National Meteorological Service Agency (NMSA), 1996; Bekele, 1997; Shanko & Camberlin, 1998; Korecha & Barnston, 2007; Gissila, Black, Grimes, and Slingo, 2004). By contrast, there are no previous studies, except one by Degefu, Rowell, and Bewket (2017), regarding the influence of such large-scale teleconnections on patterns of rainfall for the (September) October-December ((S)OND) season in Ethiopia, which is known as the short rainy season for areas over the southern and southeastern parts of the country. Despite such limitations, the findings of Degefu et al. (2017) showed statistically significant positive correlations between Sea Surface Temperature (SST) and patterns of the short rainy season's rainfall over the southern and southeastern parts of the country. Furthermore, it has been indicated that there is skill in inferring patterns of seasonal rainfall in the country from those large-scale teleconnection variables (NMSA, 1996; Bekele, 1997; Shanko & Camberlin, 1998; Korecha & Barnston, 2007; Gissila et al., 2004; Degefu et al., 2017). Recognizing this possibility, the NMA of Ethiopia has recently started to provide seasonal rainfall predictions based on such teleconnection patterns, though their importance continues to be somewhat underweighted (Degefu et al., 2017).
Hence, in response to the need for better capability in predicting the seasonal pattern of rainfall for areas characterized as having high risk of RVF outbreaks in Ethiopia, and for improved early and rapid detection of possible RVF outbreaks at the local level (MoARD, 2008), this study argues and shows that simple empirical prediction models can be used efficiently. The approach can be one way to identify local-scale 'rainfall hotspots' (defined, in this study, as areas for which anomalously above-normal and widespread seasonal rainfall is forecast), which can serve as early indicators for areas with high potential risk of RVF outbreaks. Such local-scale rainfall prediction models could enhance efficiency in targeting zones, districts, or provinces for timely deployment of resources, and strengthen efforts for active surveillance, detection, and control of RVF. Furthermore, predicting the pattern of rainfall at such a fine local scale, at a reasonable lead time preceding potential RVF outbreak periods in neighboring countries, could help to prioritize targeted surveillance and mosquito control activities in accordance with available resources.
The paper begins with a short description of the study area, the data employed, and the methods used in the analysis. This is followed by the results and discussion section, which covers, in order of appearance, the results of the quality control process, seasonal and annual trends in the rainfall series, daily and monthly cumulative rainfall totals, and the patterns and strengths of seasonal correlations of rainfall with large-scale teleconnection indices. The results of subsequent analyses on the predictability of the regional rainfall pattern are also presented. Finally, the paper ends with the main conclusions drawn from the study.

Description of the Study Areas
The study areas were purposefully selected based on their geographical proximity and climatic similarity to RVF-endemic countries in the Horn of Africa (HoA) region (Kenya and Somalia) and on the nature of cross-border livestock movement (trade or seasonal migration). Accordingly, five districts were selected: Deghabur and Kebridhar (from the Somali regional state), and Dire, Moyalle and Yabello (from the Borena zone of … Gonzalez-Rouco, Luis, Quesada, and Valero, 2001; Štěpánek, Zahradníček, and Farda, 2013). Outliers (suspicious data) are observations that lie far from the bulk of a given time series and that can be due to measurement errors or to extreme meteorological events (Gonzalez-Hidalgo, Lopez-Bustins, Stepanek, Martin-Vide, and De-Luis, 2009; Göktürk, Bozkurt, Sen, and Karaca, 2008). Several approaches that focus on temporal and/or spatial variability can be applied to identify outliers and diagnose whether they are erroneous or not (Barnett & Lewis, 1994; Peterson et al., 1998). In this study, the Tukey fence outlined in Ngongondo, Yu-Xu, Gottschalk, and Alemaw (2011) was used to censor outliers in the rainfall datasets.
The primary objective of outlier trimming is to reduce the size of the distribution tails in order to make safer use of the nonresistant homogenization techniques applied later (Štěpánek et al., 2013). Hence, rather than rejecting extreme values in the data (suspicious data values), they were replaced by a threshold value that kept the information of an extreme event and yet did not have such an important influence on the nonresistant statistical techniques employed later in this research.
The Tukey fence is the data range
$$[\,Q_1 - 1.5\,IQR,\; Q_3 + 1.5\,IQR\,]$$
where $Q_1$ and $Q_3$ are the lower and upper quartiles, respectively, $IQR = Q_3 - Q_1$ is the interquartile range, and 1.5 is the multiplier applied to the IQR.
In this study, values beyond these limits were considered suspicious data points and were subjected to further evaluation to check whether they carried any physical meaning, and hence to suppress false alarms. If no plausible interpretation was found, outliers were set to the limit value corresponding to ±1.5×IQR. Otherwise, to keep the information from extreme events, suspicious outlier values of each monthly precipitation series were identified as those values exceeding a maximum threshold for each time series (Trenberth & Paolino, 1980; Peterson, Vose, Schmoyer, and Razuvaev, 1998), defined as
$$P_{out} = Q_3 + 3\,IQR$$
where $Q_3$ is the third quartile and IQR is the interquartile range. Subsequently, suspicious values were replaced by the corresponding unique $P_{out}$ values (Gonzalez-Rouco et al., 2001; Göktürk et al., 2008). The original values of the outliers were restored later for specific analyses concerning extreme values (e.g., potential flooding). Because it is based on quartiles, this method is more resistant to outliers (Gonzalez-Rouco et al., 2001).
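The two thresholds described above can be illustrated with a minimal sketch. The function names, the sample values, and the choice of the "inclusive" quartile method are assumptions made for illustration; the multiplier of 3 for $P_{out}$ is also an assumption (it is the form commonly attributed to Gonzalez-Rouco et al., 2001) and is exposed as a parameter.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q3); the 'inclusive' quantile method is an assumption."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3


def tukey_fence(values, k=1.5):
    """Lower and upper Tukey fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def trim_to_pout(values, k=3.0):
    """Replace values above P_out = Q3 + k*IQR by P_out (k=3 is assumed).

    The input list is left untouched so that the original extremes can be
    restored later for flood-related analyses.
    """
    q1, q3 = quartiles(values)
    p_out = q3 + k * (q3 - q1)
    return [min(v, p_out) for v in values], p_out


# Hypothetical monthly totals (mm) with one extreme month.
monthly = [10, 12, 14, 16, 18, 20, 22, 24, 200]
fence = tukey_fence(monthly)
trimmed, p_out = trim_to_pout(monthly)
```

Here the extreme month is first flagged by the Tukey fence and then capped at $P_{out}$ rather than discarded, which mirrors the trimming-not-rejecting choice described in the text.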

Homogeneity tests:
The second step of the quality control process involved homogeneity tests. A homogeneous climate series is defined as one where variations are caused only by changes in weather and climate (Conrad & Pollak, 1950). The presence of inhomogeneities is a common problem in climate time series. These irregularities can distort results and lead to wrong conclusions (Vicente-Serrano, Beguería, Lopez-Moreno, García-Verac, and Stepanek, 2010). Thus, for a meaningful climate analysis, the climate data must be homogeneous (Stepanek, Zahradnicek, and Skalak, 2009). Although several techniques have been developed for detecting irregularities at a site and adjusting for them, no single procedure is explicitly recommended.
In this study, three homogeneity tests were used for absolute testing of the rainfall series (i.e., using each station's own data): the Pettitt test, the Buishand Range (BR) test, and the Standard Normal Homogeneity Test (SNHT). They were chosen because of their lower demands in application and interpretation, and because the stations are randomly distributed, with poor correlations among them. For each of the daily rainfall series, two testing variables, the annual mean and the annual maximum, were considered. The following sections outline in detail the methodology for performing these tests.
Pettitt's test: This is a nonparametric test that is useful for evaluating the occurrence of abrupt changes in climatic records (Yesilırmak, Akçay, Dagdelen, Gürbüz, and Sezgin, 2008). One reason for using it is that it is more sensitive to breaks in the middle of the time series (Wijngaard, Klein Tank, and Können, 2003). The test statistic is computed as
$$U_k = 2\sum_{i=1}^{k} m_i - k(n+1), \qquad k = 1, 2, \ldots, n$$
where $m_i$ is the rank of the $i$th observation when the values $X_1, X_2, \ldots, X_n$ in the series are arranged in ascending order. The statistical break point (SBP) statistic is
$$K = \max_{1 \le k \le n} |U_k|$$
A change point occurs in the series at the value of $k$ for which $|U_k|$ attains its maximum, $K$. This value is then compared with the critical value given by Pettitt (1979).
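As a minimal sketch of the rank-based form of the statistic, assuming tied values share their mean rank (an assumption, since the text does not say how ties are handled), the computation might look as follows. Function names and the sample series are illustrative; comparison with Pettitt's critical values is left out.

```python
def average_ranks(series):
    """Ranks 1..n in ascending order; tied values share their mean rank."""
    order = sorted(range(len(series)), key=lambda i: series[i])
    ranks = [0.0] * len(series)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and series[order[end + 1]] == series[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pettitt(series):
    """Return (K, k_hat): K = max |U_k| and the 1-based k where it occurs."""
    n = len(series)
    ranks = average_ranks(series)
    u_values, running = [], 0.0
    for k in range(1, n + 1):
        running += ranks[k - 1]
        u_values.append(2 * running - k * (n + 1))
    k_hat = max(range(n), key=lambda idx: abs(u_values[idx])) + 1
    return abs(u_values[k_hat - 1]), k_hat


# Hypothetical series with an abrupt shift after the fifth value.
K, k_hat = pettitt([1, 2, 3, 4, 5, 11, 12, 13, 14, 15])
```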
Buishand Range (BR) test: Buishand (1982) noted that tests for homogeneity can be based on the adjusted partial sums, or cumulative deviations from the mean, given as
$$S_0^* = 0, \qquad S_k^* = \sum_{i=1}^{k} (Y_i - \bar{Y}), \qquad k = 1, 2, \ldots, n$$
The term $S_k^*$ is the adjusted partial sum of the given series, $Y_i$ is the $i$th value of the series, and $\bar{Y}$ is its mean. If there is no significant change in the mean, the partial sums of the differences between $Y_i$ and $\bar{Y}$ will fluctuate around zero. The significance of the change in the mean is assessed with the 'rescaled adjusted range', $R$, which is the difference between the maximum and the minimum of the $S_k^*$ values scaled by the sample standard deviation (SD):
$$R = \frac{\max_{0 \le k \le n} S_k^* - \min_{0 \le k \le n} S_k^*}{SD}$$
Critical values for $R/\sqrt{n}$ are given by Buishand (1982).
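A short sketch of the rescaled adjusted range follows. It uses the sample standard deviation (divisor n - 1) because the text refers to the sample SD; some formulations divide by n instead, so that choice is an assumption. The function name and data are illustrative.

```python
import math
import statistics


def buishand_range(series):
    """Return (R, R / sqrt(n)) from the adjusted partial sums S*_k."""
    n = len(series)
    mean = statistics.fmean(series)
    sd = statistics.stdev(series)  # sample SD (n - 1); an assumption
    partial_sums = [0.0]           # S*_0 = 0
    for value in series:
        partial_sums.append(partial_sums[-1] + (value - mean))
    r = (max(partial_sums) - min(partial_sums)) / sd
    return r, r / math.sqrt(n)


# Hypothetical series with a step change in the mean half-way through.
r_stat, r_scaled = buishand_range([1, 1, 1, 1, 3, 3, 3, 3])
```

The value `r_scaled` is the quantity that would be compared against the tabulated critical values.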

Standard Normal Homogeneity Test (SNHT):
A statistic $T(y)$ is used to compare the mean of the first $y$ years with that of the last $(n - y)$ years and can be written as
$$T(y) = y\,\bar{z}_1^{\,2} + (n - y)\,\bar{z}_2^{\,2}, \qquad y = 1, 2, \ldots, n - 1$$
where
$$\bar{z}_1 = \frac{1}{y}\sum_{i=1}^{y} \frac{Y_i - \bar{Y}}{s}, \qquad \bar{z}_2 = \frac{1}{n - y}\sum_{i=y+1}^{n} \frac{Y_i - \bar{Y}}{s}$$
with $Y_i$ the $i$th value of the series, $\bar{Y}$ its mean, and $s$ its standard deviation. A break is located at year $y$ if the value of $T(y)$ is a maximum there. To reject the null hypothesis, the test statistic, $T_0$, should be greater than the critical value, which depends on the sample size. The test statistic is given as
$$T_0 = \max_{1 \le y < n} T(y)$$
Classification of the results of the homogeneity tests: After testing the homogeneity of all the selected stations for the testing rainfall variables, the results of all three tests were evaluated. The results were classified following Schonwiese & Rapp (1997) and Amit & Mohammed (2013). This classification was based on the number of tests rejecting the null hypothesis. Three categories were identified: Class 1 ('useful'), one or zero of the tests rejected the null hypothesis; Class 2 ('doubtful'), two tests rejected the null hypothesis; and Class 3 ('suspect'), all three tests consistently rejected the null hypothesis.
The qualitative interpretations of the categories are as follows:
Class 1 ('useful'): No clear signal of an inhomogeneity in the series is apparent. The series seems to be sufficiently homogeneous for trend analysis and variability analysis.
Class 2 ('doubtful'): Indications are present of an inhomogeneity of a magnitude that exceeds the level expressed by the inter-annual standard deviation of the testing variable series. The results of trend analysis and variability analysis should be regarded very critically from the perspective of possible inhomogeneities.
Class 3 ('suspect'): It is likely that an inhomogeneity is present that exceeds the level expressed by the inter-annual standard deviation of the testing variable series. Marginal results of trend and variability analysis should be regarded as spurious. Only very large trends may be related to a climatic signal. Hence, series falling in Class 3 could not be taken as reliable and were removed from subsequent statistical analysis.
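The SNHT statistic and the three-class labelling described above can be sketched as follows. This is an illustration under stated assumptions: the series is standardized with the sample standard deviation, the function names are hypothetical, and the rejection count passed to `classify` would come from comparing each test statistic with its tabulated critical value, which is not reproduced here.

```python
import statistics


def snht(series):
    """Return (T0, y_break): the maximum of T(y) and the year index y (1-based)."""
    n = len(series)
    mean = statistics.fmean(series)
    sd = statistics.stdev(series)  # sample SD; an assumption
    z = [(value - mean) / sd for value in series]
    best_t, best_y = float("-inf"), None
    for y in range(1, n):
        z1 = sum(z[:y]) / y          # mean of the first y standardized values
        z2 = sum(z[y:]) / (n - y)    # mean of the last n - y standardized values
        t = y * z1 ** 2 + (n - y) * z2 ** 2
        if t > best_t:
            best_t, best_y = t, y
    return best_t, best_y


def classify(rejections):
    """Map the number of tests rejecting H0 (0 to 3) to the class label."""
    if rejections <= 1:
        return "useful"
    if rejections == 2:
        return "doubtful"
    return "suspect"


# Hypothetical series with a step change after the fourth value.
t0, y_break = snht([1, 1, 1, 1, 3, 3, 3, 3])
```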
Randomness and persistence analysis: One of the problems in the analysis and interpretation of trends in hydroclimatic data is the confounding effect of serial dependence (Partal & Kahya, 2006). Positive serial correlation in a time series can make a trend appear significant when it is in fact due to random effects in the data series. Negative serial correlation, on the other hand, can cause the probability of a significant trend to be underestimated (Yu, Yang, and Kuo, 2006). Thus, time series data used for trend analysis should be random and/or non-persistent (Ngongondo et al., 2011).
Hence, before proceeding to trend analysis, the rainfall time series were tested for randomness and independence using the lag-1 autocorrelation coefficient ($r_1$), as described in Box & Jenkins (1976):
$$r_1 = \frac{\sum_{i=1}^{n-1} (x_i - \bar{x})(x_{i+1} - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
where $x_i$ is an observation, $x_{i+1}$ is the following observation, $\bar{x}$ is the mean of the time series, and $n$ is the number of data points.
The autocorrelation coefficient provides a measure of temporal correlation between the data points in a series for different time lags (Brockwell & Davis, 1996). In this study, the serial correlation of the rainfall series, for all individual stations as well as for the regional average rainfall series, was used to assess the tendency of the series to remain in the same state from one observation to the next. Whenever significant serial correlation appeared within a given time series, the series was 'pre-whitened' prior to applying the subsequent trend test, following the procedure described in Box & Jenkins (1976). The corrected data series is thus obtained as:
$$x_2 - r_1 x_1,\; x_3 - r_1 x_2,\; \ldots,\; x_n - r_1 x_{n-1}$$
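A minimal sketch of the lag-1 autocorrelation and of a pre-whitening step is given below. The pre-whitened series is taken to be $x_{i+1} - r_1 x_i$, which is the commonly used form and is an assumption here; function names and data are illustrative, and the significance check on $r_1$ is omitted.

```python
def lag1_autocorrelation(x):
    """Lag-1 serial correlation coefficient r1 of the series x."""
    n = len(x)
    mean = sum(x) / n
    numerator = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    denominator = sum((value - mean) ** 2 for value in x)
    return numerator / denominator


def prewhiten(x, r1):
    """Remove lag-1 dependence: x[i+1] - r1 * x[i] (series shortens by one)."""
    return [x[i + 1] - r1 * x[i] for i in range(len(x) - 1)]


# Hypothetical, strongly trending series used only to exercise the functions.
series = [1, 2, 3, 4, 5]
r1 = lag1_autocorrelation(series)
whitened = prewhiten(series, r1)
```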

Trend Analysis
The Mann-Kendall test: This is a rank-based method (Mann, 1945; Kendall, 1975) that has been applied widely to identify significant trends in hydroclimatic variables (Yenigun, Gumus, and Bulut, 2008). The test checks the null hypothesis of no trend against the alternative hypothesis of an increasing or decreasing trend. Furthermore, it is a non-parametric method that is less sensitive to outliers (extreme values of the time series) and tests for a trend in a time series without specifying whether the trend is linear or nonlinear (Partal & Kahya, 2006; Yenigun, Gumus, and Bulut, 2008).
The Mann-Kendall test statistic is given as:
$$S = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \operatorname{sgn}(x_j - x_i)$$
where $S$ is the Mann-Kendall test statistic, $x_i$ and $x_j$ are the sequential data values of the time series in the years $i$ and $j$ ($j > i$), and $N$ is the length of the time series. A positive $S$ value indicates an increasing trend and a negative value indicates a decreasing trend in the data series.
The sign function is given as:
$$\operatorname{sgn}(x_j - x_i) = \begin{cases} +1 & \text{if } x_j - x_i > 0 \\ 0 & \text{if } x_j - x_i = 0 \\ -1 & \text{if } x_j - x_i < 0 \end{cases}$$
This statistic represents the number of positive differences minus the number of negative differences for all the differences considered. For large samples ($N > 10$), the test is conducted using a normal distribution, with the mean and the variance as follows:
$$E(S) = 0, \qquad \operatorname{Var}(S) = \frac{N(N-1)(2N+5) - \sum_{k=1}^{n} t_k (t_k - 1)(2 t_k + 5)}{18}$$
where $n$ is the number of tied groups (groups with zero difference between compared values) and $t_k$ is the number of data points in the $k$th tied group. For $N$ larger than 10, $Z_{MK}$ approximates the standard normal distribution (Partal & Kahya, 2006; Yenigun, Gumus, and Bulut, 2008), and the standard normal deviate (Z-statistic) is then computed as follows:
$$Z_{MK} = \begin{cases} \dfrac{S - 1}{\sqrt{\operatorname{Var}(S)}} & \text{if } S > 0 \\ 0 & \text{if } S = 0 \\ \dfrac{S + 1}{\sqrt{\operatorname{Var}(S)}} & \text{if } S < 0 \end{cases}$$
The presence of a statistically significant trend was evaluated using the $Z_{MK}$ value. In a two-sided test for trend, the null hypothesis $H_0$ is not rejected if $|Z_{MK}| < Z_{1-\alpha/2}$ at a given level of significance, where $Z_{1-\alpha/2}$ is the critical value of $Z_{MK}$ from the standard normal table. For example, for the 5% significance level, the value of $Z_{1-\alpha/2}$ is 1.96. All the trend results in this paper were evaluated at the 5% level of significance.
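The test described above can be sketched in a few lines. This is an illustrative implementation with hypothetical function names and data; it applies the tie-corrected variance and the continuity-corrected Z statistic, and uses 1.96 as the two-sided 5% critical value mentioned in the text.

```python
import math
from collections import Counter


def sign(value):
    """Return +1, 0 or -1 according to the sign of value."""
    return (value > 0) - (value < 0)


def mann_kendall(x):
    """Return (S, Var(S), Z_MK) for the series x."""
    n = len(x)
    s = sum(sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    # Untied values form groups of size 1 and contribute nothing to the sum.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(x).values())
    variance = (n * (n - 1) * (2 * n + 5) - tie_term) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(variance)
    elif s < 0:
        z = (s + 1) / math.sqrt(variance)
    else:
        z = 0.0
    return s, variance, z


def is_significant(z, critical=1.96):
    """Two-sided test at the 5% level by default."""
    return abs(z) > critical


# Hypothetical strictly increasing series of 11 values.
s_stat, var_s, z_mk = mann_kendall(list(range(1, 12)))
```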
Sen's estimator of slope: Sen's estimator (Sen, 1968) has been widely used for determining the magnitude of trends in hydro-meteorological time series, quantifying the change per unit time (Hadegu, Tesfaye, Mamo, and Kassa, 2013). In this method, the slopes ($T_i$) of all data pairs are first calculated as:
$$T_i = \frac{x_j - x_k}{j - k}, \qquad i = 1, 2, \ldots, N$$
where $x_j$ and $x_k$ are data values at times $j$ and $k$ ($j > k$), respectively. The median of these $N$ values of $T_i$ is Sen's estimator of slope, which is calculated as:
$$\beta = \begin{cases} T_{(N+1)/2} & \text{if } N \text{ is odd} \\ \dfrac{1}{2}\left(T_{N/2} + T_{(N+2)/2}\right) & \text{if } N \text{ is even} \end{cases}$$
A positive value of $\beta$ indicates an upward (increasing) trend and a negative value indicates a downward (decreasing) trend in the time series.
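A minimal sketch of the slope estimator follows, assuming equally spaced observations indexed 0, 1, 2, and so on; the function name and the data are illustrative.

```python
import statistics


def sens_slope(x):
    """Median of the pairwise slopes (x[j] - x[k]) / (j - k) for j > k."""
    slopes = [
        (x[j] - x[k]) / (j - k)
        for k in range(len(x) - 1)
        for j in range(k + 1, len(x))
    ]
    return statistics.median(slopes)


# Hypothetical series: steady rise of 2 units per step plus one extreme value.
beta = sens_slope([2, 4, 6, 8, 100])
```

Because the estimate is a median, the single extreme value in this example does not pull the slope away from the underlying rate of change.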

Cumulative Rainfall Analysis
Since Rift Valley fever outbreaks are known to follow periods of anomalously extended above-normal rainfall and associated potential flooding conditions, cumulative monthly and daily rainfall anomalies, corresponding to the short rainy seasons (OND) of two recent RVF outbreak periods (1996/97 and 2006/07), were calculated and plotted. The periods were purposefully selected to encompass the short rainy season for equatorial eastern Africa, which has historically preceded or been followed by RVF epidemics.
The cumulative rainfall anomaly index is calculated as:
$$C_n = \sum_{i=1}^{n} (R_i - M_i)$$
where $C_n$ is the cumulative rainfall anomaly value for time steps 1 to $n$, $R_i$ is the total rainfall at time step $i$ of the series, and $M_i$ is the average total rainfall for time step $i$.
It is known that the chances of flooding are enhanced if an El Niño event coincides with the short rainy season for East Africa (October-December) (De-Luís, González-Hidalgo, Raventos, Sanchez, and Cortina, 1999). This study illustrated the occurrence of potential flooding conditions by analyzing whether the calculated $C_n$ values, for the periods preceding and including the first reported RVF outbreak month for East Africa (December), were consistently positive or not. Accordingly, consistently positive $C_n$ values for two to three consecutive months were considered indicators of potential flooding situations.
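The cumulative anomaly and the "consistently positive" criterion can be sketched as below. The function names, the default run length of two steps, and all rainfall values are hypothetical and serve only to illustrate the calculation.

```python
def cumulative_anomaly(rainfall, climatology):
    """C_n for n = 1..N: running sum of (R_i - M_i)."""
    totals, running = [], 0.0
    for observed, mean in zip(rainfall, climatology):
        running += observed - mean
        totals.append(running)
    return totals


def potential_flooding(c_values, min_run=2):
    """True if C_n stays positive for at least min_run consecutive steps."""
    run = 0
    for value in c_values:
        run = run + 1 if value > 0 else 0
        if run >= min_run:
            return True
    return False


# Hypothetical October-December totals (mm) against long-term monthly means.
wet_season = cumulative_anomaly([120, 150, 90], [80, 100, 60])
mixed_season = cumulative_anomaly([50, 150, 20], [80, 100, 60])
```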

Precipitation Concentration Index (PCI)
The monthly Precipitation Concentration Index (PCI) was analyzed following De-Luís et al. (1999), which is a modified version of the one given by Oliver (1980). The PCI values are calculated as:
$$PCI = 100 \times \frac{\sum_{i=1}^{12} P_i^2}{\left(\sum_{i=1}^{12} P_i\right)^2}$$
where $P_i$ is the rainfall amount of the $i$th month. PCI values were interpreted based on the scale defined in De-Luís et al. (1999). The larger the calculated PCI values and the more consistently positive the $C_n$ values, the greater the potential for flooding conditions.
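As a quick illustration of how the index behaves at its two extremes, a minimal sketch with hypothetical monthly totals is shown here; the function name is illustrative.

```python
def precipitation_concentration_index(monthly_totals):
    """PCI = 100 * sum(P_i^2) / (sum(P_i))^2 over the 12 monthly totals."""
    total = sum(monthly_totals)
    return 100 * sum(p * p for p in monthly_totals) / total ** 2


# Perfectly uniform rainfall gives the minimum value, 100 / 12.
uniform = precipitation_concentration_index([50] * 12)
# All rainfall falling in a single month gives the maximum value, 100.
single_month = precipitation_concentration_index([0] * 11 + [300])
```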

Seasonal Teleconnection Patterns and Strengths
Correlation analyses were employed to investigate whether the pattern of rainfall in southern and southeastern Ethiopia had significant correlations with large-scale global and regional teleconnection indices. The correlation tests were set for the two to three month period corresponding to the RVF pre-epidemic and epidemic periods for East Africa. The analyses were based on the regional average SON rainfall totals, calculated as the average of SON rainfall totals for all five rainfall stations selected from southern and southeastern Ethiopia, and on two teleconnection indices, the Niño 3.4 SST and the Indian Ocean Dipole Mode Index (DMI), which are known to affect the pattern of seasonal rainfall in Ethiopia (Ogallo, 1988). Both positive and negative correlation values with a magnitude equal to or greater than 0.45 (|r| ≥ 0.45) were considered indicative of moderate to strong relationships.
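A minimal sketch of the correlation screening follows, assuming the threshold is 0.45 on the usual -1 to 1 correlation scale. The function names and the index and rainfall values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def moderate_to_strong(r, threshold=0.45):
    """Apply the magnitude threshold to positive and negative correlations."""
    return abs(r) >= threshold


# Hypothetical seasonal index values and regional rainfall anomalies.
index_values = [-1.0, -0.5, 0.0, 0.5, 1.0]
rain_anomaly = [-20.0, -5.0, 0.0, 15.0, 30.0]
r_value = pearson_r(index_values, rain_anomaly)
```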

Predictability of Seasonal Rainfall Patterns
To determine whether the regional average SON rainfall pattern could be predicted with a simple statistical model based on large-scale SST data, Multiple Linear Regression (MLR) models were developed and their skill in predicting the regional average SON rainfall anomaly patterns was investigated. Model equations were developed using the MLR option in the Climate Predictability Tool (CPT) of the IRI (http://iri.columbia.edu). The selection of predictor variables for the models was based on the strength of the linear correlations between historical records of Niño 3.4 SST and DMI values and the regional rainfall total anomaly, averaged for the SON period. Both cross-validation (Michaelsen, 1987) and retroactive (Barnston et al., 1994) approaches have been widely used in studies involving climate prediction (Korecha & Barnston, 2007; Gissila et al., 2004; Thiaw, Barnston, and Kumar, 1999). In cross-validation, a model is developed using all years except one, which is then predicted and verified; this is repeated for each year in turn. The retroactive method involves partitioning the time series into a training period and an independent verification period. Both forecasting approaches are described in detail on the IRI website (http://iri.columbia.edu). In this study, the retroactive calculation option in the CPT was used to fit an MLR model to an initial subset of the overall model training period, with the models first trained with information from 1987/88 up to and including 1996/97, which resulted in a first training set 10 years long. The seasonal rainfall of the next year (1997/98) was subsequently predicted using the trained models. This procedure continued, with the training period extended by one year at each step, until the 2006/07 regional SON rainfall anomaly was predicted using the models trained with data from 1987/88 to 2005/06, which resulted in 10 years (1997/98-2006/07) of independent forecast data. All the models developed for regional average SON rainfall anomaly forecasts were cross-validated (5-years-out window) over the 20-year period from 1987/88 to 2006/07.
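The retroactive procedure can be illustrated independently of the CPT with a small sketch. This is not the CPT implementation: it fits an ordinary least-squares MLR through the normal equations, all function names are hypothetical, and the predictor and rainfall values are synthetic. An expanding training window that always ends the year before the forecast year is assumed.

```python
def fit_mlr(predictors, target):
    """Least-squares coefficients [b0, b1, ...] via the normal equations."""
    rows = [[1.0] + list(row) for row in predictors]
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, target))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [u - factor * v for u, v in zip(a[r], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]


def predict(coefficients, row):
    """Evaluate the fitted regression for one year's predictor values."""
    return coefficients[0] + sum(b * v for b, v in zip(coefficients[1:], row))


def retroactive_forecasts(predictors, target, first_train=10):
    """Forecast year t from a model trained on years 0 .. t-1 only."""
    forecasts = []
    for t in range(first_train, len(target)):
        coefficients = fit_mlr(predictors[:t], target[:t])
        forecasts.append(predict(coefficients, predictors[t]))
    return forecasts


# Synthetic 20-year example: two predictors and an exactly linear response.
X = [(i % 5, (i * i) % 7) for i in range(20)]
y = [1 + 2 * a - 3 * b for a, b in X]
fc = retroactive_forecasts(X, y, first_train=10)
```

With 20 years of data and a 10-year initial training set, the loop yields 10 independent forecasts, which matches the structure described in the text.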

Forecast Verifications
In estimating the skill of the models in predicting the regional average SON rainfall total anomaly over southern and southeastern Ethiopia, the observed and predicted fields were separated into three categories defining above-normal, near-normal, and below-normal rainfall total anomalies. The three-category design employed in this study was based on threshold values defined by the 33rd (below-normal) and 67th (above-normal) percentile values of the climatological record.
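The three-category split can be sketched as follows. The percentile interpolation method ("inclusive") and the function names are assumptions, and the climatological record is a hypothetical placeholder.

```python
import statistics


def tercile_thresholds(climatology):
    """Return the 33rd and 67th percentile values of the record."""
    percentiles = statistics.quantiles(climatology, n=100, method="inclusive")
    return percentiles[32], percentiles[66]


def categorize(value, lower, upper):
    """Label a seasonal value as BN, N or AN relative to the thresholds."""
    if value < lower:
        return "BN"
    if value > upper:
        return "AN"
    return "N"


# Hypothetical climatological record chosen so the percentiles are easy to read.
record = list(range(101))
lower, upper = tercile_thresholds(record)
```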
Seasonal climate is inherently probabilistic; hence, two attributes of interest for probabilistic forecasts were considered: discrimination (can the forecasts successfully distinguish different outcomes?) and reliability (is the confidence communicated in the forecast appropriate?) (Landman, Beraki, DeWitt, and Lötter, 2014). These two attributes were analyzed using two of the outputs of the CPT MLR model fitting procedure as forecast verification measures: the Relative Operating Characteristic (ROC) (Mason & Graham, 2002) and the Reliability Diagram (Hamill & Colucci, 1997; Wilks, 2006). The ROC indicates how frequently the forecasts can successfully distinguish below-normal from normal and above-normal conditions, or above-normal from normal and below-normal conditions. Specific to this study, it can also be interpreted as indicating whether the forecast probability of wetter conditions for the average SON period was higher when an El Niño occurred than when it did not. ROC scores for the three rainfall categories (above normal (AN), normal (N), and below normal (BN)) represent the respective areas beneath the ROC curves that are produced by plotting the forecast hit rates against the false alarm rates. If the area is ≤ 0.5, the forecasts have no skill, and a maximum ROC score of 1.0 indicates perfect discrimination (Landman et al., 2014). The forecasts are considered reliable if there is consistency between the predicted probabilities of the defined rainfall categories and the observed relative frequencies of the observed rainfall being assigned to these categories.
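For a single category, the area under the ROC curve can be computed without drawing the curve, using its equivalence to the proportion of (event, non-event) pairs in which the event year received the higher forecast probability, with ties counted as one half. The sketch below uses that pairwise form rather than the CPT output; the function name and the probabilities are hypothetical.

```python
def roc_area(probabilities, occurred):
    """ROC area for one category from forecast probabilities and outcomes."""
    events = [p for p, o in zip(probabilities, occurred) if o]
    non_events = [p for p, o in zip(probabilities, occurred) if not o]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0   # event year correctly ranked higher
            elif e == n:
                score += 0.5   # tie counts as half
    return score / (len(events) * len(non_events))


# Hypothetical above-normal probabilities for four years.
perfect = roc_area([0.8, 0.6, 0.3, 0.2], [True, True, False, False])
partial = roc_area([0.8, 0.3, 0.6, 0.2], [True, True, False, False])
no_skill = roc_area([0.5, 0.5, 0.5, 0.5], [True, False, True, False])
```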

Quality Control
Outliers identified: The results of the outlier trimming process are given in Table 1, in which the $P_{out}$ values and the periods of extreme data values corrected for each station are tabulated. The outlier identification shows that the seasons in which maximum values occur correspond to the rainy seasons of the respective sites in the eastern and southeastern parts of the country. The number of outliers identified has a clear periodic cycle. For all of the observing stations, a higher number of outliers was detected for the months in the 'MAM' and 'OND' seasons, while only three extreme data values were detected during the 'JJAS' season, for Deghabur (1992) and Kebridhar (1992 & 1993). Furthermore, almost all of the months for which outliers were detected coincided with an El Niño episode; none occurred during a La Niña episode. Such coincidence of anomalous data points with ENSO is not a matter of chance, as the ENSO state is known to modulate the seasonal pattern of rainfall in some regions, particularly in the Tropics (Hastenrath, 1995). Specifically, for East Africa, it is well known that an El Niño episode during the OND period is associated with above-normal rainfall, with chances of flooding. The effects of ENSO on the SOND seasonal rainfall in the study areas are assessed more quantitatively later.
Thus, given such mechanisms as a possible explanation for anomalous weather patterns, specifically during the SOND periods, the overall agreement between the temporal distribution and seasonal characteristics of the rainfall and the timing of outliers, with extreme data points occurring concurrently with ENSO episodes, supports the existence of a plausible physical mechanism as the cause of the detected anomalous data, rather than treating such outliers merely as human-induced errors. Hence, in this study, outlier adjustments were carried out by trimming only very extreme data points (values > $P_{out}$), with the aim of reducing large distribution tails as a preprocessing step prior to testing the rainfall series for homogeneity and making subsequent corrections, if any. Although great care was taken to conserve as much information as possible about the extreme events, by trimming only data points greater than the corresponding $P_{out}$ values, the process might still risk dismissing interesting climatological information about extreme events. Hence, given the plausible physical mechanism explained above and the basic objective of the study, i.e., identifying RVF hotspots (areas with anomalously high rainfall), the original values of the extreme data points were restored later and employed in specific analyses concerning extreme values, for example in the cumulative monthly and daily rainfall analysis.
Homogeneity of rainfall series: Previous studies have noted that, because of the inherent noise in a given time series, statistical homogeneity tests yield results with some degree of uncertainty (Cheung, Senay, and Singh, 2008). However, the use of several statistical tests (Buishand Range, Pettitt's, and SNHT tests) and different testing variables (annual rainfall total and annual maximum rainfall total) in the current study made it possible to assess homogeneity more reliably. Table 2 lists the results of the homogeneity tests for the rainfall series and the comparative test statistics calculated by the three techniques. It is clear from the table that, of the five stations analyzed for homogeneity of the annual mean and maximum rainfall series, not a single station was categorized as Class 2 or Class 3. Accordingly, the rainfall data series of all the stations were deemed sufficiently homogeneous for the subsequent trend analysis.
Autocorrelation and persistence: All the autocorrelations fell within the 95% confidence limits, with standard errors of 0.1 (Figure 2), and no apparent pattern was observed (such as a sequence of positive autocorrelations followed by a sequence of negative autocorrelations). This means that the current value of the time series carried no information from which the next value could be inferred. Such non-association is the essence of randomness: adjacent observations did not correlate, and no dependency or periodicity was apparent in the time series from one year to the next.

Trends
Annual and seasonal rainfall trends: The Mann-Kendall trend test shows that there was no significant trend towards wetter or drier conditions in the southern and southeastern part of the country over more than two decades, for either the annual or the seasonal rainfall totals. This might be due to the large inter-annual fluctuation of rainfall in the region. However, in interpreting the results of the trend analysis, it should be noted that trends are present, although statistically insignificant, with, for example, a decreasing trend in annual rainfall at most of the stations. For the main rainy season in the region (MAM), a decreasing trend in rainfall totals was found at almost all of the stations studied, the exception being an insignificant increasing trend at Kebridahar station. Another interesting observation from the trend analysis was the consistent increasing trend in rainfall for the small rainy season (OND) at all of the stations studied (Table 3).
Although the trends are insignificant, the results indicate a general increase in the OND rainfall total over the southern and southeastern part of the country: the OND rainfall at Deghabur, Kebridahar, Mega, Moyale, and Yabello stations increased by 41.54, 41.78, 54.42, 25.90, and 73.31 mm, respectively, over the last two decades.

Monthly rainfall trends: Considering rainfall during the months of the small rainy season, encompassing the two months prior to the outbreaks of RVF in eastern Africa (October and November) and the first two months of the RVF epidemic periods (December and January), increasing trends were observed at most of the stations in the region for all the months, except for November, for which decreasing trends were found at Moyale and Mega stations, and December, for which a decreasing trend was observed at Kebridahar station (Table 4). Apart from this slight variability among stations in the magnitude and direction of trends, increasing, yet insignificant, rainfall trends were observed overall for the months preceding and concurrent with the reported RVF outbreak months for East Africa.
In general, there was a positive trend in rainfall of the small rainy season over the southern and southeastern part of the country. However, none of the trends was statistically significant. This result agrees with previous findings regarding the trends and spatial distribution of annual and seasonal rainfall in Ethiopia (NMSA, 1996; Cheung, Senay, and Singh, 2008). Cheung et al. (2008) argued that there are no significant changes or trends in annual rainfall at the national level in Ethiopia. It has also been confirmed that, between 1951 and 2006, no statistically significant trend in mean annual rainfall was observed in any season in the country (NMSA, 1996). However, several previous time-series studies of rainfall patterns in Ethiopia, carried out at various spatial (e.g., national, regional, local) and temporal (e.g., annual, seasonal, monthly) scales, have reported contradictory findings on annual and seasonal rainfall trends and climate extremes in the country. Both Seleshi & Zanke (2004) and Verdin, Funk, Senay, and Choularton (2005) reported a declining trend in annual rainfall totals in southern and southeastern Ethiopia. Seleshi & Zanke (2004) further indicated a decline in Kiremt rainfall in those areas. Despite such contrasting reports, at larger spatially aggregated scales the results of the rainfall trend analyses in the current study were found to be consistent with most previous similar studies, which concluded that there is no significant trend in annual and seasonal rainfall totals over the country.
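The trend statistics discussed in this section (the Mann-Kendall Z statistic and Sen's slope reported in Table 3) can be computed along the following lines. This is a minimal sketch of the standard test with a tie correction and a normal approximation; the function names and the sample series are illustrative, and the study's own software and settings are not stated.

```python
import math
from itertools import combinations
from statistics import NormalDist, median

def mann_kendall(series):
    """Return (S, Z, two-sided p-value) of the Mann-Kendall trend test."""
    n = len(series)
    # S counts later-minus-earlier sign agreements over all pairs (i < j).
    s = sum((xj > xi) - (xj < xi) for xi, xj in combinations(series, 2))
    counts = {}
    for x in series:
        counts[x] = counts.get(x, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in counts.values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return s, z, p_value

def sens_slope(series):
    """Median of all pairwise slopes (units per time step, e.g. mm/year)."""
    pairs = combinations(range(len(series)), 2)
    return median((series[j] - series[i]) / (j - i) for i, j in pairs)

# Hypothetical steadily increasing series, used only to exercise the functions.
rising = list(range(1, 11))
s_stat, z_stat, p_val = mann_kendall(rising)
slope = sens_slope(rising)
```

A trend would be called significant at the 95% level when the p-value falls below 0.05; for the station series described above, the corresponding p-values would all exceed that threshold.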

Monthly and Daily Cumulative Rainfall Totals
Historically, for East Africa, the first reported cases of Rift Valley fever outbreaks coincided with or followed the short rainy season (October-December). Such patterns were attributed to the influence of ENSO and other large-scale teleconnections, which are found to consistently result in anomalously above-normal rainfall and flooding in the region, creating conducive climatic conditions for the vectors that carry or transmit the RVFV. In this study, for all the stations studied, a similar pattern of consistent and above-normal rainfall is depicted in the monthly and daily rainfall accumulated over the three-month period (OND) of the recent RVF outbreak periods in East Africa. Figure 3 further indicated that rainfall during both the 1997/98 and 2006/07 short rainy seasons (SOND) was poorly distributed in the study areas, with a high concentration of the rains close to the actual reported RVF outbreak month (December) for the HoA region. The cumulative rainfall for September and early October was very low (less than 150 mm), reflecting the drier conditions that had previously prevailed in the region. Cumulative rainfall rose above 200 mm only closer to the month of December.
The ten-day cumulative rainfall anomaly for each of the months from October to December 1996/97 and 2006/07 (Figure 4) also showed that the OND seasonal rainfall was anomalously high, and that most of the individual months had received anomalously above-normal rainfall. Moreover, the results of the Precipitation Concentration Index (PCI) analyses for Deghabur, Kebridahar, and Yabello showed a high to very high concentration of the monthly rainfall distribution (PCI > 20%), in agreement with the findings of the cumulative rainfall analyses. However, the PCI values for the remaining two stations, Mega and Yabello, indicated a uniform monthly distribution of rainfall in those areas. The calculated mean PCI values for all stations are given in Table 4. Thus, the persistent tendency for wetter-than-normal conditions during the 1997/98 and 2006/07 small rainy seasons (ONDJ) was an ideal condition for potential flooding in the region. Furthermore, the probability of persistence of the RVF virus in the study areas, given the perceived high risk of RVFV infection entering the greater mosquito vector population of these areas, was much higher owing to the conducive climatic conditions depicted in the findings of this study. Hence, had the infection entered the greater mosquito population of the study areas, there would have been a significant risk of transmission of the disease to livestock raised on communal grazing lands and to those kept close to the flooded areas.
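For reference, the PCI can be computed as in the sketch below. It assumes the standard annual form of the index, PCI = 100 · Σp² / (Σp)² over the twelve monthly totals, which is not spelled out in this section, and it applies the class limits quoted in the study (below 10, 11 to 20, 21 and above); how values falling between those stated ranges are assigned is an assumption here.

```python
def precipitation_concentration_index(monthly_totals):
    """Annual PCI (%) from twelve monthly rainfall totals (assumed standard form)."""
    total = sum(monthly_totals)
    if total <= 0:
        raise ValueError("PCI is undefined for a year with no rainfall")
    return 100.0 * sum(p * p for p in monthly_totals) / total ** 2

def pci_class(pci):
    """Map a PCI value to the concentration classes quoted in the study."""
    # Assumption: boundary gaps (10-11, 20-21) are assigned to the lower class.
    if pci <= 10:
        return "uniform"
    if pci <= 20:
        return "high concentration"
    return "very high concentration"

# Hypothetical monthly totals (mm), not station data from the study.
even_year = [50.0] * 12                # rain spread evenly over the year
single_month = [0.0] * 11 + [600.0]    # all rain in one month
pci_even = precipitation_concentration_index(even_year)
pci_single = precipitation_concentration_index(single_month)
```

The two hypothetical years bracket the range of the index: evenly spread rainfall gives the minimum of about 8.3, and rainfall confined to one month gives the maximum of 100.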

Seasonal Teleconnection Patterns and Strengths
The seasonal rainfall variability for the selected study areas in southern and southeastern Ethiopia, and the associated modes of SSTs (Niño-3.4 and IOD), are shown in Figure 5, which depicts a similar pattern of variability in the seasonal (SON) average regional rainfall to that of the equatorial east Pacific SST anomaly and the Indian Ocean Dipole (IOD) mode index (MDI). The strengths of the linear relationships between the regional average rainfall total and the Niño-3.4 SST anomaly index and the Indian Ocean MDI, for all individual months and for the average of the SON period, are shown in Tables 5 and 6, respectively. The physical mechanisms through which the large-scale teleconnection signals from the Pacific Ocean, or from any other location, arrive in Ethiopia and subsequently influence the spatial and temporal pattern of rainfall in the country are outside the scope of this paper. As depicted in Tables 5 and 6, the association of the regional average SON rainfall totals with both ENSO and IOD modes is weak in the early months (January-July) and increases progressively as the time of the ENSO and IOD states approaches the beginning of the small rainy season (OND) in southern and southeastern Ethiopia. The correlations are moderate, 0.57 and 0.52 for the August and September ENSO modes and 0.51, 0.48, and 0.60 for the July, August, and September Indian Ocean DMI, respectively, suggesting some predictability of the average regional rainfall patterns two to three months in advance of the month of the first reported case of RVF in East Africa (December), based solely on the ENSO and IOD states of the respective individual months. Similarly, the moderate positive correlations (> 0.5) between the regional average SON rainfall totals and the concurrent (SON) Niño-3.4 SST and Indian Ocean mode indices also indicate the possibility of predicting the pattern of the average regional rainfall for SON, the period preceding the reported RVF epidemic month (December) in East Africa, provided that the large-scale ENSO and MDI indices for the respective SON months are predicted and made available two to three months in advance. The correlation results for individual stations also indicate a persistent and statistically significant positive correlation (significance level = 0.05) between the average SON rainfall total anomalies for each station and the average of the ENSO (Table 7) and IOD (Table 8) modes occurring nearly simultaneously (in SON). The stronger correlation for the regional average SON rainfall totals than for any of the individual stations could be due to the filtering effect of spatial aggregation on the random variability present in single-location rainfall.
The stronger relationships between the small rainy season rainfall in southern and southeastern Ethiopia and global SST indices (both ENSO and IOD) found in this study are similar to the findings of Degefu et al. (2017), except for the length of the season considered. Degefu et al. (2017) indicated that rainfall variations during October and November showed similar statistically significant positive correlations between the Niño-3.4 and IOD indices and gridded rainfall over southern Ethiopia. The result is also in agreement with the findings of similar studies reported for equatorial East Africa, mainly for Kenya and Tanzania (Black et al., 2003; Saji et al., 1999).
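The month-by-month correlations summarized in Tables 5 and 6 amount to correlating the regional SON rainfall anomaly with the index value of each calendar month across years. A minimal sketch is given below; the data layout (a dictionary of monthly index series) and all values are hypothetical and serve only to show the calculation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_correlations(index_by_month, son_rainfall):
    """Correlate each month's index series (one value per year) with SON rainfall."""
    return {month: pearson_r(values, son_rainfall)
            for month, values in index_by_month.items()}

# Hypothetical yearly values; not taken from the study.
son_rainfall = [0.2, -0.5, 1.4, -0.8, 0.1, 0.9]
index_by_month = {
    "Jul": [0.1, 0.3, 0.2, -0.1, 0.0, 0.1],
    "Sep": [0.3, -0.4, 1.1, -0.6, 0.2, 0.7],
}
correlations = lagged_correlations(index_by_month, son_rainfall)
```

Months whose correlation exceeds a chosen threshold would then be retained as candidate predictors, with the lead time given by the gap between that month and the target season.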

Predictability of the Regional Average SON Rainfall Total Anomaly
Predictive models developed: Different models were fitted for the regional average SON rainfall total anomaly. For example, the first model regresses the regional average SON rainfall total anomaly on both the Niño-3.4 SST anomaly index and the Indian Ocean MDI, averaged over the same period (SON), while the fifth model diagnoses the predictability of the regional SON rainfall anomaly by fitting an MLR model with Niño-3.4 SST anomalies and the Indian Ocean MDI at different monthly lags, expressed as combinations of atmospheric and oceanic predictors whose values are available upon completion of September. In general, the predictability of the regional average SON rainfall total anomaly is investigated using the Niño-3.4 SST and the MDI as potential predictors, either independently or in combination.
Historical records of the Niño-3.4 SST and Indian Ocean MDI indices over 1987-2006 that showed moderate to strong correlation (r ≥ 0.4) with the regional average SON rainfall total anomaly (Tables 5 and 6) were included in the respective models as predictors. The retroactive calculation procedure employed in this study resulted in 10 years (1997/98-2006/07) of independent forecast data. The MLR fitting procedure also provided stable estimates of the coefficients of the resulting model equations, for both the cross-validated and the retroactive hindcasts.
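The two hindcast designs mentioned above can be sketched for the simplest case of a single predictor. This is an illustration only: the study fitted its models in the IRI CPT, whereas the code below uses an ordinary least-squares line, leave-one-out cross-validation, and an expanding training window for the retroactive forecasts, all of which are assumptions about the general approach rather than a reproduction of it.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def cross_validated_hindcasts(x, y):
    """Leave-one-out: predict each year from a model fitted to all other years."""
    predictions = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(slope * x[i] + intercept)
    return predictions

def retroactive_hindcasts(x, y, first_train):
    """Predict each later year using only the years that precede it."""
    predictions = []
    for i in range(first_train, len(x)):
        slope, intercept = fit_line(x[:i], y[:i])
        predictions.append(slope * x[i] + intercept)
    return predictions

# Hypothetical, perfectly linear data used only to check the mechanics.
predictor = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
rainfall = [2.0 * v + 1.0 for v in predictor]
cv_pred = cross_validated_hindcasts(predictor, rainfall)
retro_pred = retroactive_hindcasts(predictor, rainfall, first_train=3)
```

With a 20-year record and an initial 10-year training window (`first_train=10`), the retroactive design yields 10 independent forecasts, matching the count reported in the text.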
The performance of the models over the 20-year period is presented here in terms of the Kendall rank correlation coefficient, commonly referred to as Kendall's tau (Thiaw, Barnston, and Kumar, 1999). Kendall's tau is a measure of rank correlation and is considered a robust (to deviation from linearity) and resistant (to outliers) alternative to Pearson's or "ordinary" correlation. Kendall's correlation measures the discrimination skill of the respective models (do the forecasts increase and decrease as the observations increase and decrease?). The model equations are presented in Table 9, along with the respective Kendall's tau correlation coefficients. As depicted in the table, the discrimination skills of the models improved from the first (Model#1) to the last model (Model#5), in response to the addition or removal of the large-scale predictor variables.
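Kendall's tau can be illustrated with a short sketch that counts concordant and discordant pairs of forecasts and observations. The version below is the simple tau-a form, which ignores any special treatment of ties; whether the CPT applies a tie-corrected variant is not stated here, so this is an assumption.

```python
from itertools import combinations

def kendall_tau(forecasts, observations):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    pairs = list(combinations(range(len(forecasts)), 2))
    score = 0
    for i, j in pairs:
        df = forecasts[j] - forecasts[i]
        do = observations[j] - observations[i]
        product = df * do
        if product > 0:
            score += 1      # forecast and observation move in the same direction
        elif product < 0:
            score -= 1      # they move in opposite directions
    return score / len(pairs)

# Hypothetical values used only to exercise the function.
tau_perfect = kendall_tau([1, 2, 3, 4], [10, 20, 30, 40])
tau_reversed = kendall_tau([1, 2, 3, 4], [40, 30, 20, 10])
tau_mixed = kendall_tau([1, 2, 3], [1, 3, 2])
```

A value near 1 means the forecasts rank the years in almost the same order as the observations, which is the sense of "discrimination" used in the text.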

Forecast Verifications
Retroactive forecast skills: The findings of the forecast verification process over the 10-year retroactive period, from 1997/98 to 2006/07, using the 'verification' option in the IRI's CPT, are outlined in the following sections. The coefficients and the overall "goodness of fit" of the models are statistically significant at the 95% level.

Relative operating characteristics: All the models showed their highest skill (ROC scores > 0.7) in distinguishing the above-normal and below-normal rainfall categories, as opposed to the normal category, for the SON period, as shown in the Relative Operating Characteristic (ROC) diagrams for the categorical forecast measures (Figure 6). Figure 6 also presents the ROC scores (in the bottom right panels of the diagrams) obtained by retroactively predicting the regional average SON rainfall total anomaly for southern and southeastern Ethiopia over the 10-year retroactive period from 1997/98 to 2006/07. The ROC scores for the normal category are less than or around 0.5 for all models except the second, which employed the average SON DMI as the only predictor of the regional average SON rainfall total anomaly.

Reliability: Among the five models constructed to forecast the regional average SON rainfall total anomaly over southern and southeastern Ethiopia, the highest ROC scores were obtained for Model#1. This section thus examines how reliably this specific model could discern the pattern of the SON rainfall anomaly in the region. Figure 7 shows the reliability diagrams for forecasts of the regional average SON rainfall total anomalies produced by Model#1.
Figure 7. Reliability diagrams and frequency histograms for the regional average SON rainfall total anomaly forecasts: (a) above-normal (>67th percentile) and (b) below-normal (<33rd percentile) categories. The predictors are the Indian Ocean MDI for the month of September and for the seasonal (SON) average. The thick red curves represent the observed relative frequencies for the above- and below-normal categories, and the thin red line is the weighted least squares regression line for the respective reliability curve.

The figure presents the reliability curves for the above- and below-normal categories along with weighted least squares regression lines for the two categories. The weighting is relative to how frequently forecasts are issued at a given confidence level (Landman, DeWitt, Lee, Beraki, and Lötter, 2012). Regression lines along the diagonal of the reliability diagram imply perfect reliability, while regression lines above (below) the diagonal imply that observed wet (dry) SON rainfall seasons tend to occur more (less) frequently than predicted. Furthermore, the histograms plotted within the reliability diagrams show the frequencies with which the two categorical forecasts occurred and reveal how strongly and how frequently the issued forecast probabilities departed from the climatological probabilities.
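The ROC scores used above to compare the models can be understood through a small sketch. It computes the area under the ROC curve for one forecast category as the fraction of (event year, non-event year) pairs in which the event year received the higher forecast probability, with ties counting one half. The probabilities and outcomes are hypothetical, and this pairwise form is an assumed equivalent of the area reported by the CPT, not its documented algorithm.

```python
def roc_area(forecast_probabilities, event_occurred):
    """Area under the ROC curve for one category (e.g. above-normal rainfall)."""
    event_probs = [p for p, e in zip(forecast_probabilities, event_occurred) if e]
    non_event_probs = [p for p, e in zip(forecast_probabilities, event_occurred) if not e]
    if not event_probs or not non_event_probs:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for pe in event_probs:
        for pn in non_event_probs:
            if pe > pn:
                wins += 1.0     # event year correctly given the higher probability
            elif pe == pn:
                wins += 0.5     # tie: no discrimination for this pair
    return wins / (len(event_probs) * len(non_event_probs))

# Hypothetical forecast probabilities for the above-normal category.
probs = [0.8, 0.6, 0.3, 0.2, 0.5, 0.1]
wet_year = [True, True, False, False, True, False]
area = roc_area(probs, wet_year)
```

An area of 0.5 corresponds to no skill, which is why scores around 0.5 for the normal category are read as an inability to discriminate that category.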
Deterministic skills: The previous section presented verification statistics for the probabilistic forecasts from the models developed. As in the discussion of reliability, the deterministic forecast performance of the model with the highest ROC score (i.e., Model#1) is presented here.
Figure 8 shows the observed regional average SON rainfall anomaly index and the retroactive and cross-validated predictions of the regional average rainfall anomaly for the same period, obtained by regressing the observed regional average SON rainfall anomaly on the Indian Ocean Dipole MDI values over the 10-year retroactive period, from 1997/98 to 2006/07. The model performed well in portraying most of the positive rainfall anomalies (wettest conditions) for the SON periods corresponding to the three El Niño years (1997/98, 2002/03, and 2006/07), and the driest condition for the 1999/00 La Niña event, with the signs of the rainfall anomaly during these years captured by the forecasts, underscoring the already demonstrated strong positive correlation between the average regional SON rainfall total anomaly and the teleconnection anomalies. These findings also agree with the results of the correlation analyses, which showed that the rainfall anomaly in the region is highly correlated with a positive Indian Ocean Dipole mode (Table 6). The results from both hindcasting techniques are depicted in Figure 8. The short-term (SON) rainfall anomaly prediction captured the sign of the rainfall anomaly for 8 out of the 10 years of the retroactive prediction period, corresponding to a "hit rate" of 80%. However, despite the skill of the model in predicting the sign of the SON rainfall anomaly correctly during both the El Niño and La Niña events, it underestimated the observed severity of the anomaly, especially for the 1997/98 SON period. Moreover, the amount of the overall variance of the regional average SON rainfall total anomaly pattern explained by the model indicates moderate skill, with R² values of 0.423 (R = 0.651) and 0.39 (R = 0.621) for the retroactive design and the cross-validation approach, respectively.
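The sign-based "hit rate" quoted above is simply the share of years in which the predicted and observed anomalies have the same sign. A minimal sketch follows; the anomaly values are hypothetical and were chosen only so that 8 of 10 signs agree, and the handling of exact zeros (counted as misses) is an assumption.

```python
def sign_hit_rate(observed, predicted):
    """Fraction of years in which the predicted anomaly has the observed sign."""
    # Assumption: a zero anomaly on either side is counted as a miss.
    hits = sum(1 for o, p in zip(observed, predicted) if o * p > 0)
    return hits / len(observed)

def mean_amplitude_ratio(observed, predicted):
    """Mean |predicted| / mean |observed|; values below 1 indicate damped forecasts."""
    mean_pred = sum(abs(p) for p in predicted) / len(predicted)
    mean_obs = sum(abs(o) for o in observed) / len(observed)
    return mean_pred / mean_obs

# Hypothetical standardized anomalies for a 10-year retroactive period.
observed = [1.8, -0.4, -0.9, 0.3, -0.2, 0.9, -0.5, 0.2, -0.6, 1.1]
predicted = [0.9, -0.2, -0.5, -0.1, -0.3, 0.6, -0.2, -0.1, -0.4, 0.7]
hit_rate = sign_hit_rate(observed, predicted)
ratio = mean_amplitude_ratio(observed, predicted)
```

The amplitude ratio is included as one simple way to express the underestimation of anomaly severity noted in the text; it is not a statistic reported by the study.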
One interesting observation arises when the relationship between SSTs and the regional rainfall anomaly is fitted concurrently in time. When the average SON MDI is employed as the sole predictor, the resulting MLR model (Model #2) explained 35.3% (r = 0.59) and 26.4% (r = 0.51) of the overall variance of the regional average SON rainfall anomaly for the cross-validated and retroactive approaches, respectively, far more than the other candidate predictor, Niño-3.4 SST, taken alone. Hence, a simple regression with the average SON MDI as the only predictor could also be sufficient (Figure 9).
Although a large share of the overall variance of the average SON rainfall pattern in the region remained unexplained, the fitted models can be considered adequate in view of the primary objective of this study, namely discriminating the anomalous pattern of rainfall during the three-month period preceding the reported outbreaks of RVF epidemics in East Africa, for which the model showed higher skill.
Figure 9. Observed and model-predicted standardized regional average SON rainfall anomalies. Black line: observed; deep red line: cross-validated forecast; deep blue line: retroactive forecast. The predictor is the average SON MDI, fitted to the regional average SON rainfall anomaly by simple linear regression.

Conclusions
This paper documents a potential operational prediction of 'rainfall hotspot' areas at a finer spatial scale, which can serve as early indicators of high risk of RVF epidemic outbreaks. Diagnosis of the pattern of the short rainy season (SOND) over southern and southeastern Ethiopia confirmed stronger correlations among the regional rainfall total anomaly, the convergence of ENSO conditions in the eastern Pacific, and the concurrent warming of SSTs in the western equatorial Indian Ocean region, similar to the rest of the RVF-endemic countries of equatorial East Africa. Examination of trends in annual and seasonal rainfall shows an absence of any systematic pattern of change in the long-term rainfall across the region. The observed seasonal rainfall trends over the study areas, although insignificant, are thus attributed to large-scale teleconnections and the associated atmospheric and oceanic driving forces. Analysis of the regional rainfall anomaly pattern concurrent with the late 1997/early 1998 and late 2006/early 2007 outbreak periods in the HoA region indicated that a similar RVF-conducive climatic situation had persisted over the southern and southeastern lowland parts of the country. Accordingly, had the RVFV entered the greater mosquito vector populations in this part of the country, it would not have taken long for RVF to become established, spread, and reach epidemic proportions.
The moderate to high correlations between the regional average rainfall anomaly and the large-scale teleconnection variables suggest some predictive skill two to three months ahead of outbreaks in neighboring RVF-endemic countries. The study demonstrates the capacity of simple localized climate prediction models to skillfully map climatically suitable RVF (rainfall) hotspot areas, based on emerging developments of ENSO and other regional climate indicators. Localized climate prediction models are thus invaluable as early indicators of areas at high risk of RVF outbreaks and could be taken as an important and integral part of disease surveillance. Such models are deemed critical in mounting an effective localized response against the disease, in accordance with available local resources, and can help reduce the impact of outbreaks of vector-borne diseases such as RVF.

Figure 2. Autocorrelation of long-term annual rainfall totals for (a) Deghabur (b) Kebridahar (c) Mega (d) Moyale (e) Yabello

(1997/98 and 2006/07) as well, which indicates ideal ecological conditions for the emergence and survival of Rift Valley fever mosquito vectors. The daily and monthly rainfall totals accumulated over the three months (September-November) preceding the reported outbreak period of RVF activity for East Africa (outbreaks in December 1997/98 and 2006/07), and including the actual outbreak month (December 1997/98 and 2006/07), are shown in Figure 3 for Deghabur, Mega, Yabello, and Moyale stations.

Figure 4. Cumulative ten-day rainfall anomaly for the 1996/97 and 2006/07 OND short rainy season at (a) Deghabur (b) Kebridahar (c) Mega (d) Moyale (e) Yabello stations. Black bars indicate the 10-day cumulative rainfall anomaly for each ten-day period of October to December, and deep red bars indicate the cumulative seasonal (OND) rainfall total anomalies for the years 1996/97 and 2006/07, respectively, in order of appearance within the figures.

Figure 5. Time series plots of standardized regional rainfall total and SST anomalies for (a) Deghabur (b) Mega (c) Moyale, and (d) Yabello. The figure depicts time series of the regional rainfall total anomaly (black line), Niño-3.4 SST anomaly (deep red lines), and Indian Ocean Dipole Mode Index (deep purple lines), all averaged for the SON season and for the period 1994-2006. The SST indices are computed for the same period as the rainfall indices.

Figure 8. Observed and model-predicted standardized regional average SON rainfall total anomalies: (a) retroactive hindcast technique and (b) cross-validated hindcast technique

PCI values below 10 indicate uniform monthly rainfall distribution; values between 11 and 20 indicate high concentration of monthly rainfall distribution; and values of 21 and above indicate very high concentration of monthly rainfall distribution. Subsequently, the results from the cumulative rainfall analysis and the calculated PCI values were interpreted together.

Table 1. Results of the outlier trimming process
* Pout = UQ + 3IQR, where UQ is the upper quartile and IQR is the interquartile range

Table 2. Comparison of results of the various homogeneity tests
* Indicates non-significant change points at the 0.05 significance level; ** indicates significant change points at the 0.05 significance level

Table 3. Trends of annual and seasonal rainfall totals in southern and southeastern Ethiopia for the period
* ZMK is the Mann-Kendall trend test statistic; slope (Sen's slope) is the change (mm)/year; bold values indicate statistical significance at the 95% confidence level according to the Mann-Kendall test

Table 4. Trends of monthly rainfall totals and mean PCI values for the five selected stations for the period 1987-2007

Table 5. Correlations between regional average SON rainfall total anomaly and the Niño-3.4 SST index for individual months from January to December, and for average SON SST

Table 6. Correlations between regional average SON rainfall total anomaly and the Indian Ocean Dipole mode index (DMI) for individual months from January to December, and for average SON DMI
* Values in bold are significant at a significance level of 0.05

Table 7. Correlations between average SON rainfall total anomalies and the Niño-3.4 SST index for the study areas, for the period

Table 8. Correlations between average SON rainfall total anomalies and the Indian Ocean Dipole Mode Index (DMI) for the study areas, for the period

Table 9. Model equations and the corresponding performance scores for each of the models