Discriminant Analysis to Predict the Hypertension in Women Aged 25 – 54 Years

Background: Hypertension is a well-established contributing risk factor for cardiovascular disease in adults. This research aims to create a prediction model of the incidence of hypertension in women aged 25 through 54 years. Material and Methods: The research subjects are 117 women whose ages range from 27 to 54 years living in villages in the central district of Bogor. Through the instruments and the Vo2 max measurement, information was gathered on the following aspects: a) socio-demographic status; b) abdominal girth; c) fasting blood glucose level; d) body mass index; e) blood lipids, including total cholesterol and triglycerides. The data were analyzed using discriminant analysis. Results: The multivariable discriminant analysis showed that the level of Vo2 max is the only variable that discriminates the incidence of hypertension, with the final equation model Zscore = -3.033 + 0.102*Vo2 max and a cut-off point of -0.00018. Conclusion: Concerted efforts from all concerned parties are needed to prevent hypertension, especially through physical activity as part of a healthier lifestyle.


Introduction
Hypertension is identified as a major risk factor for mortality all over the world (WHO, 2013). It used to be regarded as a problem mainly of high-income countries. It has since become a global issue that raises the risk of cardiovascular disease in rich and poor countries alike (Lloyd-Sherlock et al., 2014). More than 80% of deaths caused by cardiovascular disease take place in low- and middle-income countries (WHO, 2014). Mills et al. (2016) estimated that in 2010 the number of hypertensive patients worldwide was 1.39 billion, representing approximately 31% of the adult population. This means that the global prevalence of hypertension increased by 5.2% between 2000 and 2010. These data also illustrate the widening gap in the prevalence of hypertension between high-income countries and low- and middle-income countries. Between 2000 and 2010, the prevalence of hypertension decreased by 2.6% in high-income countries, while it increased by 7.7% in low- and middle-income countries. The number of people with hypertension living in low- and middle-income countries, around 1.04 billion, is almost twice as high as the number in high-income countries, which amounts to only 694 million people. The difference in the overall burden of hypertension between high-income countries and low- and middle-income countries is magnified by disparities in the level of awareness, treatment, and control of hypertension (Bloch, 2016).
The prevalence of hypertension is markedly on the rise, particularly in terms of the increasing proportion of hypertension among women. The proportion of hypertensive women in America increased twice as much as the proportion of hypertensive men in the period 1998-2004 (Wolz et al., 2000). The prevalence of hypertension among women worldwide is predicted to rise by 13% between 2000 and 2025 (Cutler et al., 2008). According to data from the US National Center for Health Statistics (NCHS) in 2015, the proportion of hypertension among women in America was lower than among men at ages 18-39 years (men 9.2%; women 5.6%) and 40-59 years (men 37.2%; women 29.4%). In the age group of 60 and above, however, more women than men have hypertension (men 58.5%; women 66.8%) (Fryar et al., 2015).
Few studies devote their attention to women as a risk group, despite the fact that this group has a greater risk of suffering from hypertension and obesity than men (Dogan, Toprak, & Demir, 2012). The emphasis is placed mostly on overweight and illnesses associated with cardiovascular disease, while much remains uninvestigated concerning other contributing risk factors such as physical activity, particularly the physical fitness by virtue of which blood pressure elevation can be prevented in women. Because population groups differ, it is important to identify the risk factors for high blood pressure specifically in women, in order to carry out specific intervention measures and minimize the likelihood of developing cardiovascular problems in the future.
The purpose of this research is to predict whether a respondent among women in Bogor in 2016 belongs to the hypertension or non-hypertension group, using discriminant analysis, which yields a discriminant model equation for predicting whether a woman is prone to hypertension. The resulting discriminant function classifies individuals into groups based on their independent variable scores and allows the classification accuracy to be evaluated. Identifying the occurrence of hypertension using discriminant analysis is a type of statistical classification technique (Filiz & Yaprak, 2009).

Materials and Method
This research is an analytical observational study with a cross-sectional design. Samples were purposively selected from the population of women aged 25 to 54 years in two villages, namely Paledang and Gudang Bogor, in the municipality of Bogor. This research was conducted from July to August 2016. As many as 117 women aged 25 to 54 years were recruited from these 2 villages, which were randomly selected, in the district of central Bogor.
All respondents were not currently pregnant, were able-bodied, and were able to sit upright for anthropometric measurement. Blood pressure was measured twice on the left arm using the digital blood pressure monitor "OMRON" HEM-7130. When the first and second measurements differed by 10 points, a third measurement was taken. Body weight was measured using an AND digital scale with a capacity of 150 kg and an accuracy of 50 grams. Stature was measured using a validated metal measuring instrument. To ensure the standard and validity of the blood tests, the examination was conducted by the Prodia clinical laboratory in Bogor. The glucose hexokinase II (GLUH) method was employed to determine the blood glucose level, an enzymatic method was used for total cholesterol, and the glycerol-3-phosphate oxidase (GPO) method for the triglyceride level. All blood tests were run on a Hitachi 747 automatic analyzer.
The fitness test, indicated by the level of Vo2max, was carried out using the ergocycle method. The respondents were required to take an ECG (electrocardiogram) test prior to pedaling. They were told to sit on the prepared Monarch bike and warm up for about five minutes by pedaling at a constant speed of 50 rpm with a gradual increase of load up to 300 kpm (1 kp). The workload assigned was 600 kpm (2 kp) for men and 450 kpm (1.5 kp) for women at a constant speed of 50 rpm for 5 minutes. When the heart rate reached 170 bpm, or when the respondent complained or showed a sign of pain such as chest pain or tightness, the test was stopped immediately.
The limitations of this study that may give rise to bias include the fact that not all covariates of hypertension were gathered, because some, such as dietary pattern, smoking behavior and education, are not continuous variables. Other covariates that constitute hypertension risk factors, such as emotional and mental stress and salt intake, were also not collected. The fitness test measuring Vo2max applied inclusion criteria, so that only selected respondents took part, which means that the respondents cannot be regarded as representing the general population. Lastly, the results cannot support a causal conclusion because of the cross-sectional design of the research. Despite these limitations, the researchers gathered information on the physical activity factor that sheds light on the level of fitness, an important variable for the incidence of hypertension.

Analysis
Discriminant analysis is a collection of multivariate analysis techniques that use statistical methods to characterize or separate two or more classes of objects or events (Jang, Anderson-Cook, & Kim, 2015). In many respects, discriminant analysis parallels multiple regression analysis such as linear regression. The primary difference between the two is that regression analysis deals with a continuous dependent variable, whereas discriminant analysis has a categorical dependent variable. The methodology employed to conduct discriminant analysis is similar to that of regression analysis. Discriminant analysis involves a variable selection phase to determine which independent variables are most influential. Moreover, a residual analysis is needed to further determine the accuracy of the discriminant equation (Qais Mustafa Abdul Qaser, 2015).
The first stage was a univariable analysis to check data normality using the Kolmogorov-Smirnov test, where a p value > 0.05 means that the data are normally distributed. Next, a bivariate analysis was conducted to find correlations between variables. Correlation between independent variables reveals signs of collinearity: if r > 0.8, multicollinearity among the variables is indicated. The equality of the group covariance matrices was assessed with Box's M test; a p value > 0.05 means that the group covariance matrices are relatively equal. The discriminant function was formed by conducting a stepwise discriminant analysis, in which the variables were entered one after another into the discriminant model. The significance of the discriminant function was assessed with the F test, where a p value < 0.05 signifies that the discriminant function reveals a clear difference between the two groups of the dependent variable. To determine the accuracy of the classification by the discriminant function, including the classification of individuals, Casewise Diagnostics was used (Nasution, Bangun, & Sitepu, 2018).
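The collinearity screen described above can be illustrated with a short sketch. It is not the authors' SPSS procedure: the function names, the variable names and the toy values are assumptions made up for illustration, and only the 0.8 threshold comes from the text.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair whose |r| exceeds the threshold."""
    flagged = []
    for (name_a, xa), (name_b, xb) in combinations(columns.items(), 2):
        r = pearson_r(xa, xb)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Made-up illustrative values, NOT the study data.
toy = {
    "bmi": [21.0, 24.5, 27.0, 30.2, 23.1, 26.4],
    "abdominal_girth": [70.0, 80.0, 88.0, 97.0, 76.0, 85.0],
    "vo2max": [31.0, 36.0, 27.0, 30.0, 25.0, 33.0],
}
flagged = collinear_pairs(toy)
```

In this toy data only the body mass index and abdominal girth columns are flagged. In a real analysis one variable of such a pair would usually be dropped or combined before the discriminant function is fitted.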
The discriminant analysis model is an equation expressing a linear combination of the independent variables:

$$Z_{jk} = \alpha + W_1 X_{1k} + W_2 X_{2k} + \dots + W_n X_{nk}$$

Description:
$Z_{jk}$ = discriminant score of discriminant function $j$ for object $k$
$\alpha$ = intercept
$W_n$ = discriminant weight (coefficient) for independent variable $n$
$X_{nk}$ = independent variable $n$ for object $k$
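As a minimal sketch of how such a linear combination is evaluated for one object, the helper below computes the intercept plus the weighted sum of the independent variables. The function name and the two-variable coefficients are hypothetical and serve only to show the arithmetic.

```python
def discriminant_score(intercept, weights, values):
    """Discriminant score for one object: intercept + sum of weight * value."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, values))


# Hypothetical two-variable function, for illustration only.
z = discriminant_score(-1.5, [0.05, 0.02], [30.0, 25.0])
```

With a single retained predictor, as in a stepwise analysis that keeps only one variable, the weight list simply has one entry.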

Ethical Approval
This research was conducted after obtaining ethical approval issued by the ethical committee of the health research and development agency, with the approval number LB.02.01/5.2/KE.257/2016. Written informed consent (approval after explanation) was requested from all candidate respondents and signed before they became respondents of the research.

Results
Table 2 shows that 117 respondents were successfully interviewed and their data gathered, with an average age of 41.76 years. The socio-economic status, gauged by the average monthly household expenditure, was Rp 1,878,698. The average body mass index was 25.15 and the average abdominal girth was 82.807. The average triglyceride value was 121.81, meaning that most of the respondents have a normal level of < 150 mg/dL (Cleemant, 2001). Fasting blood glucose was also normal, below 110 mg/dL (Cleemant, 2001). The last variable was the fitness level, indicated by a Vo2 max of approximately 29.78 ml/kg/minute. The Kolmogorov-Smirnov test for normality gave P values > 0.05 for almost all variables, encompassing age, socio-economy, body mass index, abdominal girth, total cholesterol, triglycerides, fasting blood glucose, and Vo2max, meaning that the data are normally distributed. The Pearson correlation test among the independent variables, conducted to detect collinearity, gave r values ≤ 0.8, signifying that there is no multicollinearity. The value of Wilks' Lambda ranges from 0 to 1: if the value is close to 0 the data for the groups tend to differ, whereas if the value is close to 1 the data for the groups tend to be similar (Bian, 2011). Body mass index, abdominal girth, and Vo2max had P values < 0.05, meaning that these variables differ between the respondents classified into the hypertension and non-hypertension groups. The significance values of the variables "age", "socio-economy", "total cholesterol" and "triglycerides" were 0.103, 0.436, 0.101 and 0.847 respectively, meaning that these four variables do not differ significantly between the hypertension and non-hypertension groups.
A cut-off score (limit value) was then determined for classification. Classification is carried out by comparing the Zscore with the critical cutting score of -0.00018: an observation is classified into one group when its Zscore is less than -0.00018, and into the second group when its Zscore is greater than -0.00018. Applied to the data above, a respondent is classified as non-hypertensive if Zscore ≥ -0.00018 and as hypertensive if Zscore < -0.00018.
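The classification rule can be written as a small sketch. The coefficients and the cut-off are the ones reported in the abstract and conclusion; the function names and group labels are hypothetical, and placing the lower-scoring group in the hypertension class follows the worked cases in the paper.

```python
INTERCEPT = -3.033   # constant of the reported discriminant function
WEIGHT = 0.102       # coefficient of Vo2max in the reported function
CUTOFF = -0.00018    # critical cutting score reported in the text


def z_score(vo2max):
    """Discriminant score for a Vo2max value in ml/kg/minute."""
    return INTERCEPT + WEIGHT * vo2max


def classify(vo2max, cutoff=CUTOFF):
    """Scores below the cut-off go to the hypertension group, others do not."""
    return "hypertension" if z_score(vo2max) < cutoff else "non-hypertension"


groups = [classify(v) for v in (20, 43)]
```

Here `groups` holds the predicted label for a low and a high Vo2max value respectively.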
The SPSS output shows that the model classifies the incidence of hypertension with an accuracy of 62.5%. To account for possible bias, the predictive power was also tested using the leave-one-out cross-validation method, and the same result, 62.5%, was obtained. Based on this accuracy, the classification yielded in this research is regarded as good, because the discriminant model has an accuracy > 50%, which is considered fairly high.
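The idea behind leave-one-out cross-validation can be sketched as follows. This is not the SPSS routine: as a stand-in classifier it uses a nearest-group-mean rule, which is what a one-predictor linear discriminant with equal priors reduces to, and the data, labels and function names are made up for illustration.

```python
def nearest_mean_label(x, train):
    """Classify x by the closer group mean; train is a list of (value, label)."""
    means = {}
    for label in sorted({lab for _, lab in train}):
        vals = [v for v, lab in train if lab == label]
        means[label] = sum(vals) / len(vals)
    return min(means, key=lambda lab: abs(x - means[lab]))


def loo_accuracy(data):
    """Hold out each case in turn, refit on the rest, and score the prediction."""
    hits = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if nearest_mean_label(x, train) == label:
            hits += 1
    return hits / len(data)


# Made-up (vo2max, group) pairs, NOT the study data.
toy = [(22, "ht"), (25, "ht"), (27, "ht"), (33, "ht"),
       (24, "non"), (30, "non"), (34, "non"), (38, "non")]
accuracy = loo_accuracy(toy)
```

Because each case is predicted by a model that never saw it, the resulting accuracy is a less optimistic estimate than the accuracy computed on the fitting data. When the two agree, as reported above, that suggests the model is not overfitted to individual cases.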

Discriminant Function Interpretation
The discriminant function obtained is: Z score = -3.033 + 0.102*Vo2max. As a sample case, if a person has a fitness level of Vo2max = 43 ml/kg/minute, entering this value into the discriminant function gives Z score = -3.033 + (0.102 x 43) = 1.35. Because the score of 1.35 is greater than the cut-off score Z CU = -0.00018, the person is predicted to belong to the non-hypertension group. If Vo2max = 20 ml/kg/minute, then Z score = -3.033 + (0.102 x 20) = -0.99. Because the score of -0.99 is less than the cut-off score Z CU = -0.00018, the person is predicted to belong to the hypertension group.
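Because the final model has a single predictor, the cut-off on the Z score can be restated as a threshold on Vo2max itself. The rearrangement below is a worked illustration that uses only the coefficients and the cut-off quoted in this paper; the threshold value is derived here and is not reported by the authors.

```latex
-3.033 + 0.102\,\mathrm{Vo2max} < -0.00018
\;\Longleftrightarrow\;
\mathrm{Vo2max} < \frac{3.033 - 0.00018}{0.102} \approx 29.73\ \text{ml/kg/minute}
```

Under this reading, a respondent with a Vo2max below roughly 29.7 ml/kg/minute would be assigned to the hypertension group, which is consistent with both sample cases (43 and 20 ml/kg/minute).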

Discussion
Hypertension is a silent killer that rarely shows symptoms. Increasing public awareness of this disease is the key to early detection. Increased blood pressure is a serious warning sign that indicates the need for significant lifestyle changes. People need to know that rising blood pressure is dangerous for them, so that steps are taken to control it. Controlling blood pressure requires a healthy lifestyle, such as eating a balanced diet, reducing salt intake, avoiding alcohol and tobacco use, and exercising regularly (WHO, 2013). Lack of physical activity is a global health hazard and a common and increasing problem in both developed and developing countries. Physical activity and fitness are important variables for the mortality and morbidity associated with overweight and obesity (Ministry of Health and Family Welfare, 2011). The trend of overweight and obesity as risk factors for hypertension in many developing countries has been widely studied and associated with nutritional transitions, urbanization, technological development, globalization of the food market, and rising income (Appiah, Steiner-Asiedu and Otoo, 2014).
This research is the first in Indonesia to relate fitness level (Vo2max) to hypertension alongside several other commonly known risk factors for hypertension. The results of the discriminant analysis reveal that the predictor that distinguishes the group of hypertension sufferers from the non-hypertension group is the fitness factor (Vo2 max), while age, socio-economy, body mass index, abdominal girth, total cholesterol, and triglycerides do not serve as predictor variables. There are several possible reasons why these variables are not discriminating factors. First, when prospective respondents were screened, a fitness test was conducted to measure Vo2max, which led to the selection of women with no history of cardiovascular disease or severe obesity, who do not represent the general population. The second reason relates to the small sample and the low exposure to the risk factors measured and investigated. This research indicates that age, socio-economy, body mass index, abdominal girth, total cholesterol, and triglycerides are not significantly related to the occurrence of hypertension. Finally, because of the low exposure to several of the factors investigated, several important factors are not included in the prediction model.
The health, social and emotional benefits obtained are clearly the main reasons for advocating physical activity for the entire community, although to date the level of physical activity recommended in many countries is the same for boys and girls, as well as for adults (Hands & Parker, 2016).
Research conducted by Stockie (2009) in Canada found a relationship between socio-economic status and the level of physical activity in adolescents. Income, employment, and education affect the level of physical activity, including access to fitness centers. Access involves registration fees, the selection of programs offered by the fitness center, and transportation from home to the center. Complex socio-economic factors contribute significantly to the number of physically active days (Miklankova, Gorny, & Klimesova, 2016). Research conducted in Turkey found that obesity is a major risk factor for the occurrence of hypertension. The risk for obese respondents was 1.68-4.94 times greater than for respondents with normal body weight. The same research also stated that the socio-economic factor is a crucial risk factor for hypertension, with an OR of 1.47. Low-income families consumed bread and meals made of flour or wheat that are filling yet fattening. In addition, most of the people lived in rural areas and did not have a permanent job. Hence, they cared little about their physical appearance, and their reluctance to change their lifestyle contributed to the high prevalence of hypertension. Moreover, most of the hypertension patients did not have access to health care for proper treatment because of their low income (Dogan, Toprak, & Demir, 2012). Another study indicates that low household income is related to a high prevalence of hypertension (Juraschek et al., 2014). Research conducted in 2014 found that a high level of fitness is related to a low occurrence of hypertension. Furthermore, among the respondents with hypertension, the level of fitness has a strong inverse relationship with the incidence of hypertension.
The research conducted by Cocks et al. (2013) portrays the cross-sectional relationship between the level of fitness and the occurrence of hypertension. Furthermore, research conducted by Shin & Ha (2016) on 1201 women aged 30 to 59 in South Korea showed that a person's health and blood pressure levels are related to fitness level. Systolic blood pressure is significantly correlated with muscle endurance and strength, while diastolic blood pressure is significantly correlated with muscle endurance. Several other studies have shown that people with hypertension are physically less active than those without hypertension. A high fitness level (Vo2max) has been shown to be a protective factor against the progression of pre-hypertension to hypertension (Faselis et al., 2012), against deaths from coronary heart disease, and against other risk factors for cardiovascular disease (Faselis et al., 2014). Sedentary behavior associated with low levels of Vo2max is common worldwide today and is associated with a cluster of cardiovascular risk factors, including high blood pressure, total cholesterol, body mass index, obesity, and low high-density lipoprotein cholesterol. Conversely, some argue that hypertension may directly cause low fitness (Sharman, La Gerche, & Coombes, 2015).
Studies conducted in several countries shed light on how fitness can affect the incidence of hypertension. Among others, Cocks et al. (2013) indicated that intensive exercise results in increased synthesis of endothelial nitric oxide, lower aortic stiffness, and increased whole-body insulin sensitivity. Other studies find that exercise lowers circulating noradrenaline and decreases vascular resistance. This is in line with research showing that a high level of fitness is associated with low body weight, and body weight is a crucial risk factor for the incidence of hypertension (Spartano et al., 2016).
Zhang, Chen and Hu (2011) argued that the physiological and biological mechanisms linking fitness to blood pressure are intricate and not yet fully understood. It is not completely clear whether the beneficial effect results from the physical activity itself or from other influential risk factors or pathophysiological mechanisms. One important mechanism is that weight loss lowers blood pressure. Another mechanism involves the total peripheral resistance. Arterial pressure depends largely on the cardiac output and the total peripheral resistance. Hence, every change in arterial pressure after exercise involves one or both of these variables (Pescatello et al., 2015). The etiology of hypertension is multifactorial, and it is unclear how these factors interact to contribute to the development of hypertension. Recent findings from animal studies suggest that aerobic exercise can prevent an increase in blood pressure through changes in insulin sensitivity and autonomic nervous system function (Moraes-Silva et al., 2013), while endurance training can prevent an increase in blood pressure through regulation of vasoconstriction (Araujo et al., 2013). Although the exact mechanism has not been fully explained, the available data provide sufficient information to establish biologically plausible mechanisms for the relationship between physical activity and hypertension (Diaz & Shimbo, 2013).
The relationship between physical activity and the prevention of hypertension is generally weaker in older women than in younger women. However, because the incidence of hypertension is higher at older ages, increased physical activity can theoretically prevent a number of hypertensive cases in both the young and the older population. Thus, the overall health benefits can be obtained if lifestyle changes for the prevention of hypertension are aimed at all ages (Diaz & Shimbo, 2013). Several other studies also reported that men and women with high blood pressure showed a significant reduction after 6 months of aerobic exercise and intensive training (Shin & Ha, 2016). This is in line with the results of this study, which showed a negative correlation between Vo2max and blood pressure in the respondents, supporting the finding that Vo2max levels and blood pressure are related.
Currently, hypertension is a major contributor to mortality and morbidity due to cardiovascular disease, and its prevalence continues to increase among women. The increasing number of hypertension cases cannot be ignored or regarded merely as an individual problem, because of its significant impact on individuals and on the health care system, which causes social and economic burdens. The burden of hypertension requires not only increased awareness, treatment, and control of blood pressure, but also a joint effort for primary prevention. Lifestyle changes are expected to result in a lower prevalence of hypertension. Hypertension can be prevented by a comprehensive strategy that targets the general population, both individuals and groups at high risk of hypertension. Interventions, especially lifestyle changes, have a greater chance of success when the target is the older age group compared to the younger age group.

Conclusion
There is a significant difference between the subjects with hypertension and those without, as shown by the Wilks' Lambda analysis with P value ≤ 0.05. The variable that distinguishes subjects with hypertension from subjects without hypertension is Vo2max, which serves as the strongest discriminant factor. The discriminant function model obtained is Z score = -3.033 + 0.102*Vo2max, with a cut-off point of -0.00018. The discriminant function model classifies subjects with an accuracy of 62.5%; this accuracy is considered high and useful for classifying a respondent into the hypertension or non-hypertension group.

Table 1.
Operational Definition: "... to distribute and use oxygen during the intense workout, measured in milliliters of oxygen used in one minute per kilogram of body weight". Vo2max was measured using the Astrand ergocycle method.

Table 2.
Descriptive statistics of all women respondents in the district of Central Bogor

Table 4.
To ensure that the group variances are equal, the group covariance matrices are tested with Box's M. If the p value is > 0.05, the group covariance matrices are relatively equal. The Box's M test gave a significance value of 0.485, which is greater than 0.05. This signifies that the population covariance matrices are equal, so the data fulfill the assumption of discriminant analysis and the analysis can proceed. The final stage is forming the discriminant function using the stepwise discriminant analysis method, where variables are entered into the discriminant model one after another. The significance of the discriminant function is tested with the F test; a p value < 0.05 shows that the discriminant function is capable of showing a clear difference between the two groups of the dependent variable. The Casewise Diagnostics test is carried out to test the accuracy of the discriminant function classification. If the discriminant function has a classification accuracy > 50%, the model accuracy is considered fairly high.
Matrix Structure Coefficient, Eigenvalues (canonical correlation), Wilks' Lambda, Variables Entered/Removed, Canonical discriminant function coefficient, Functions at Group Centroids and Final Discriminant model
The benefit of this model or function is to predict whether a respondent belongs to the hypertension or non-hypertension group. Functions at Group Centroids indicate the average discriminant score for subjects in the two groups. The table above shows that one group has a negative centroid (group mean) of -0.586, whereas the other group has a positive centroid of 0.207. Hence, there are two types of respondents, which is called a two-group discriminant.
The canonical correlation measures the closeness of the relationship between the discriminant score and the groups (the two respondent groups). Its value is 0.332, showing that the relation is fairly close on the associative scale between zero and one. The significance of the discriminant function formed is represented by Wilks' Lambda: the output above indicates a significance value of 0.000 (p < 0.05) for the Wilks' Lambda test, showing a significant difference between the hypertensive and non-hypertensive respondents. The canonical discriminant function coefficients show that the discriminant model or function formed is: Z score = -3.033 + 0.102*Vo2 max.
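A two-group analysis has a single discriminant function, and in that case the canonical correlation, the eigenvalue and Wilks' Lambda are tied together by standard identities. As a rough consistency check, assuming the value 0.332 quoted above is the canonical correlation $R_c$, the implied quantities (derived here, not reported in the text) would be:

```latex
\Lambda = 1 - R_c^2 = 1 - (0.332)^2 \approx 0.890,
\qquad
\lambda = \frac{R_c^2}{1 - R_c^2} \approx \frac{0.110}{0.890} \approx 0.124
```

A squared canonical correlation of about 0.11 would mean that group membership accounts for roughly 11% of the variance in the discriminant scores. This is worth keeping in mind when reading the 62.5% classification accuracy.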