Gaussian Model to Predict the Risk of Developing Type 2 Diabetes Mellitus in Mexican Population Taking as a Reference Risk Factors

In this research work, an ordinal Gaussian model is constructed to predict the degree of risk of developing type 2 diabetes mellitus (2DM), taking as a reference the risk factors present in the Mexican population. The Mexican population is estimated to have a hereditary susceptibility to 2DM; however, the probability increases with the following risk factors: area of residence, parental history of 2DM, tobacco consumption, alcohol consumption, physical inactivity, body mass index (BMI), and type of diet. Although these factors contribute to the onset of 2DM, they can be modified to produce the opposite effect. The social, economic and political context is also an important component for the population. Risk factors, as explanatory elements of the prevalence of 2DM, are of the utmost importance for delaying or controlling its early development, since some of them can be mitigated. The model was developed with the information published in the 2012 National Health and Nutrition Survey (ENSANUT), based on the adult population aged 20 years or older. Among the most notable results are the higher risk that women have with respect to men, and the fact that age is a fundamental factor in developing type 2 diabetes mellitus.


Introduction
Diabetes is a chronic disease that occurs either when the pancreas does not produce enough insulin (the hormone responsible for regulating blood sugar) or when the organism does not effectively use the insulin it produces (American Diabetes Association, 2017a).
There are three types of diabetes:
• Type 1 diabetes (1DM) is characterized by deficient insulin production in the organism.
• Type 2 diabetes (2DM) occurs when the organism does not use insulin efficiently.
• Gestational diabetes (3DM) is a transient disorder that arises during pregnancy and carries a risk of developing diabetes at a later time (American Diabetes Association, 2017a).
2DM is viewed as one of the biggest challenges for public health, not only in developed countries but also in low- and middle-income countries (Barquera, 2003). The Mexican population has been influenced by diverse factors over time, and the transitions experienced by the country's inhabitants have conditioned morbidity and mortality.
The main causes of mortality in Mexico are noncommunicable chronic diseases (NCD), for instance 2DM and ischemic heart disease. NCD are conditions that mainly affect the adult population and are characterized by being incurable. These diseases accompany the person for life, making him or her permanently ill. On the whole, these conditions are associated with a usually progressive degeneration of tissues and organs that continues until terminal failure of an affected organ (Trindade, Dos Santos, Dalva de Barros, & Silvia, 2014).
In 2000, diabetes was the third leading cause of mortality in the Mexican population. In 2010 it rose to second place, and in 2015 it remained the second cause of death in Mexico, with 98,521 deaths (Main causes of mortality, INEGI, 2000, 2010, 2015). DM has become a public health problem in Mexico, as the prevalence of this disease has risen alarmingly in recent decades (Medina, Tolentino-Mayo, López-Ridaura, & Barquera, 2017).
The risk factors considered for 2DM reflect the fact that the globalization of economies, changes in production systems and uncontrolled urbanization have disrupted consumption patterns and increased exposure to the risk factors for obesity and NCD, such as harmful diets, physical inactivity, and the consumption of tobacco and alcohol (Schmidhuber & Prakash, 2007).

Problem Statement
The objective of this research is to estimate the prevalence of type 2 diabetes mellitus, taking as a reference the risk factors that affect the health of the Mexican population, through the construction of a probability model based on the National Health and Nutrition Examination Survey (NHNES; ENSANUT by its Spanish acronym) 2012.
In line with this objective, the phenomenon of 2DM is treated as a function of processes related to rapid urbanization, economic growth and probable genetic tendencies. The evidence shows a greater influence of elements such as changes in consumption patterns and reduced physical activity (Hernández-Ávila, Gutiérrez, & Reynoso-Noverón, 2013). Diabetes originates from a combination of different factors, for instance age, obesity, a sedentary lifestyle, inadequate nutrition and family history (González-Villalpando, Dávila-Cervantes, Zamora-Macorra, Trejo-Valdivia, & González-Villalpando, 2014a). In Mexico, obesity and physical inactivity are modifiable risk factors for 2DM (Paternina-de la Ossa, Villaquirán-Hurtado, Jácome-Velasco, Galvis-Fernández, & Granados-Vidal, 2018). Tobacco use, unhealthy diets, physical inactivity and the harmful use of alcohol are consequences of the rapid urbanization and lifestyles of the twenty-first century, and they are reflected in NCD (World Health Organization, 2018).
The function for type 2 diabetes mellitus is:
P(Y) = F(X1, X2, X3, X4, X5, X6, X7, X8, X9)    (1)
Where:
- F is the function of equation (1).
- Y is the risk of developing 2DM (dependent variable).
- X1 is sex, X2 is age, X3 is residence zone, X4 is family history of 2DM, X6 is alcohol consumption, X7 is physical activity and X9 is type of food; X5 and X8 are the remaining risk factors considered, tobacco consumption and body mass index.
This information was gathered from the National Health and Nutrition Examination Survey (NHNES, 2012).

Methodology
In line with the objective, the construction and development of the Gaussian model follows the operations research methodology, which comprises five stages: problem statement, construction of the model, model solution, model validation and implementation (Taha, 2012).
1. Problem statement: the relation was outlined between 2DM in Mexico (dependent variable) and the independent variables: sex, age, residence zone, parental history of 2DM, tobacco consumption, alcohol consumption, physical activity and food.
2. Construction of the model: the mathematical statement was formulated from the relationships between the dependent variable and the independent variables.
3. Model solution: based on the mathematical statement, different operations were performed to construct the equation that indicates the level of risk of developing 2DM, taking the independent variables into account.
4. Model validation: in this stage, the stated assumptions of statistical inference were verified, which served to validate the model.
5. Implementation: once the assumptions of statistical inference were met, the parameters were interpreted, along with the possible scenarios of the 2DM phenomenon, taking its current dynamics as a reference.

Model Variability
The current model is ordinal in nature; therefore, it must satisfy two assumptions: first, there should be no collinearity among the independent variables and, second, the degree of fit must be adequate. Collinearity is a problem in regression analysis that arises when the model predictors are related through a linear combination (Peña, 1987).
This has important consequences for the regression model: if the predictors are linearly related, the influence of each one on the criterion cannot be distinguished because their effects overlap. In addition, the model does not provide an explanation of the phenomenon under study, and the forecasts are unreliable, since another combination of predictors introduced into the model in a different order could produce contradictory predictions of the criterion (Peña, 1987).
Among the approaches based on the correlation of the explanatory variables, one method consists of calculating the so-called variance inflation factors (VIF), defined as:
VIF_j = 1 / (1 − R_j^2),  j = 1, 2, ..., 9    (2)
Where: R_j^2 is the coefficient of determination of the regression of the j-th regressor on the rest.
The hypothesis test used to evaluate collinearity is the following:
Ho: there is collinearity vs. Ha: there is no collinearity
If the VIF of an independent variable is higher than 10 units, there is collinearity, which means that Ho is not rejected. According to the values in Table 1, the VIF of each of the independent variables used to explain the 2DM variable is below 10 units. Consequently, Ha is accepted; in other words, there is no collinearity among the independent variables.
With a confidence level of 0.95 and a significance level of 0.05, and with the assumption of no collinearity among the independent variables (X1woman, X2, X3, X4, X6, X7 and X9) fulfilled, equation 26 explains 19.37% of the dynamics of the risk of developing 2DM. X5 and X8 are not significant in the model.
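The VIF criterion can be illustrated with a minimal sketch. It is not the authors' code and does not use the survey data: the three synthetic columns `x1`, `x2`, `x3` and all function names are assumptions made for illustration, and the regressions are solved with a small hand-written least-squares routine so that the example needs only the standard library.

```python
import random

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def r_squared(y, predictors):
    """R^2 of an ordinary least-squares fit of y on the predictors plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + predictors
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(a * b for a, b in zip(ci, y)) for ci in cols]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(beta[k] * cols[k][i] for k in range(len(cols))) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def variance_inflation_factors(columns):
    """VIF_j = 1 / (1 - R_j^2), regressing each column on all the others."""
    return [1.0 / (1.0 - r_squared(col, columns[:j] + columns[j + 1:]))
            for j, col in enumerate(columns)]

# Synthetic data (hypothetical, for illustration only).
random.seed(0)
x1 = [random.gauss(0, 1) for _ in range(300)]
x2 = [random.gauss(0, 1) for _ in range(300)]
x3 = [a + 0.05 * random.gauss(0, 1) for a in x1]  # nearly a copy of x1

independent = variance_inflation_factors([x1, x2])
collinear = variance_inflation_factors([x1, x2, x3])
print("independent predictors:", [round(v, 2) for v in independent])
print("with a near-duplicate column:", [round(v, 2) for v in collinear])
```

With unrelated predictors the factors stay close to 1, while the near-duplicate column pushes the factors of `x1` and `x3` far above the threshold of 10 used in the text.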
The deviance of a GLM measures the variability of the data, and the fit is assessed by comparing the deviance of the null model with the residual deviance. This comparison indicates how much of the variability of the data the model retains:
• The deviance of the null model is 457.14 units.
• The residual deviance is 368.61 units.
Substituting into equation (3) gives:
1 − (residual deviance / null deviance) = 1 − 368.61 / 457.14 = 0.1937
In light of the foregoing, the equation that predicts 2DM retains 19.37% of the variability of the data.
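As a quick check of the figure above, the share of deviance retained by the model can be computed directly from the two reported deviances. The function name `deviance_explained` is an assumption introduced here; only the two numeric inputs come from the text.

```python
def deviance_explained(null_deviance, residual_deviance):
    """Share of the null deviance that the fitted model accounts for."""
    return 1.0 - residual_deviance / null_deviance

# Deviances reported in the text.
share = deviance_explained(null_deviance=457.14, residual_deviance=368.61)
print(f"{share:.2%}")  # prints 19.37%
```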

Results
The analysis of the results of the current model is based on the interpretation of the parameters of the fitted equation. If X1woman, X2, X3, X4, X6, X7 and X9 remain constant, the linear predictor reduces to its intercept and, in terms of the normal distribution:
P(Y) = Φ(−0.429) = 0.3339
The probability of developing 2DM is 0.3339; that is, if the explanatory variables do not intervene, any interviewee has a 33.39% risk of developing 2DM, with a confidence level of 0.95. Because this is the probability of developing 2DM when the independent variables do not intervene, it is interpreted as a genetic parameter.
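The phrase "in terms of normal distribution" can be made concrete with a short sketch. It assumes that the baseline probability is the standard normal cumulative distribution evaluated at the model intercept, and it takes the intercept to be −0.429, the constant that appears in the physical-activity expressions later in this section. Both the reading of the intercept and the helper name `standard_normal_cdf` are assumptions, not the authors' code.

```python
import math

def standard_normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

INTERCEPT = -0.429  # assumed value of the intercept (all explanatory variables at zero)

baseline = standard_normal_cdf(INTERCEPT)
print(f"baseline probability of 2DM: {baseline:.4f}")
```

Under these assumptions the result agrees with the reported 0.3339 to about three decimal places.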
The literature explains that there is a genetic factor in Mexicans: variants found in the indigenous population confer a greater predisposition to metabolic diseases such as diabetes (Bonilla, 2017). The genetic variant, a haplotype, is composed of five changes in a gene called SLC16A11. This haplotype was found to be associated with a predisposition to diabetes in the Mexican population and explains 20% of the prevalence of 2DM in that population (Bonilla, 2017).
If X1woman = 1 and X2, X3, X4, X6, X7 and X9 remain constant, the probability of developing 2DM is 0.3594; that is, if the interviewee is a woman, she has a 35.94% risk of developing 2DM, with a confidence level of 0.95.
Starting from the assumption that the probability of risk for the interviewees regardless of sex is 0.3339 and that for women is 0.3594, the risk of women relative to men is 7.63% higher:
Rr = 0.3594 / 0.3339 = 1.0763
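The relative risk can be reproduced from the two probabilities reported in the text. This is a sketch of the arithmetic only; the function name `relative_risk` is introduced here for illustration, and the reference group is the baseline probability of 0.3339, as in the text.

```python
def relative_risk(p_group, p_reference):
    """Ratio of the risk in a group to the risk in the reference group."""
    return p_group / p_reference

rr_women = relative_risk(p_group=0.3594, p_reference=0.3339)
excess_pct = (rr_women - 1.0) * 100.0
print(f"Rr = {rr_women:.4f}, excess risk = {excess_pct:.2f}%")
```

Computed from the rounded probabilities, the excess risk comes out at roughly 7.6%, in line with the 7.63% reported.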
As a result, women are more susceptible to developing 2DM, and the literature states the following. Obesity is the most visible risk factor for 2DM in women relative to men (Kautzky-Willer, Harreiter, & Pacini, 2016). Eating disorders are more common in women than in men; they are mainly characterized by periods in which women either eat without control or, on the contrary, do not eat for fear of gaining weight, and this increases the probability of developing 2DM (American Diabetes Association, 2017b).
If X2 = 1 and X1woman, X3, X4, X6, X7 and X9 remain constant, the probability of developing 2DM is 0.3372; in other words, age is an important factor in developing 2DM. The interviewees will have a risk level of 33.72%, with a confidence level of 0.95.
For each additional year of age, the interviewees' risk of developing 2DM increases by approximately 0.98%, that is, about 1%:
Rr = 0.3372 / 0.3339 = 1.0098
Regarding the relation between age and 2DM, the literature argues that the higher the age, the higher the risk of developing 2DM, heart disease and stroke (American Diabetes Association, 2017b). 2DM occurs in mature age, over 45 years old (American Diabetes Association, 2017b), although there are also diagnoses of 2DM at an early age associated with obesity and physical inactivity.
If X3 = 1, 2, or 3 and X1woman, X2, X4, X6, X7 and X9 remain constant, the probability of developing 2DM according to the residence zone is interpreted as follows. On the basis of Table 2, and taking the genetic risk as a reference, people whose place of origin is the metropolitan area have a higher risk of developing 2DM than those from rural and urban areas.
In the metropolitan area the risk of developing 2DM is slightly higher, but the discrepancy with the urban area is very small: the difference is only 0.55 percentage points (35.04% − 34.49%), less than 1%. However, it can be said that in urban areas the domain of refined and indu...
If X4 = 1, 2, or 3 and X1woman, X2, X3, X6, X7 and X9 remain constant, the probability of developing 2DM according to the parents' history of 2DM is interpreted as follows. On the basis of Table 3, and taking the genetic risk as a reference, people who have both parents with 2DM have a higher risk of developing 2DM than those with other parental backgrounds.
Regarding the parents' history of 2DM, the American Diabetes Association (2013), discussing the genetic aspects of diabetes, clarifies that diabetes is not a strictly hereditary disease but follows a pattern that involves some hereditary traits. Nevertheless, some people are more likely to develop diabetes than others. For instance, twin studies have shown that genetic factors play a relevant role in the emergence of 2DM, although they are not sufficient on their own: when one of two twins with identical genes has 2DM, the risk for the other is 3 in 4 (American Diabetes Association, 2013). For someone diagnosed with 2DM, it is difficult to know whether the condition was due to genetic susceptibility (family history) or to lifestyle factors; most likely it was both. Some studies have demonstrated that it is possible to delay or prevent 2DM when a person exercises and maintains a healthy body weight (American Diabetes Association, 2013).
Developing 2DM may depend on diverse risk factors, including a family history of 2DM; even so, there is an 11% probability of having 2DM by the age of 70 (American Diabetes Association, 2013). Researchers report that the risk of a child developing 2DM is higher when the mother has 2DM (American Diabetes Association, 2013).
If X6 = 1 or 2 and X1woman, X2, X3, X4, X7 and X9 remain constant, the probability of developing 2DM according to alcohol consumption is interpreted as follows. On the basis of Table 4, and taking the genetic risk as a reference, people who consume alcohol have a higher risk of developing 2DM than people who do not.
Regarding alcohol consumption and 2DM, the literature explains that alcohol consumption raises or lowers blood glucose levels, and alcohol also contains calories. It is recommended to drink only occasionally and only when blood sugar is under control (Howard, Arnsten, & Gourevitch, 2004). Excessive alcohol consumption increases the risk of pre-diabetes and 2DM in both women and men. On the one hand, high alcohol consumption increases the risk of abnormal glucose regulation in men. On the other hand, the associations are more complex, because the higher risk appears with high alcohol consumption (Cullmann, Hilding, & Östenson, 2012). Alcohol affects the development of diabetes in people with type 2 diabetes mellitus through aspects such as increasing obesity, inducing pancreatitis, and altering carbohydrate and glucose metabolism. Periods of hypoglycemia occur with high alcohol consumption and with the long fasting periods caused by the intake. Hyperglycemia, in turn, can occur when people consume lower amounts of food or when alcohol is taken with meals (Díaz-Martínez, et al., 2009).
If X7 = 1, 2, or 3 and X1woman, X2, X3, X4, X6 and X9 remain constant, the probability of developing 2DM according to physical activity (FA) is interpreted as follows:
• If X7 = 1 (FA, active), the probability of developing 2DM is 0.3519.
That means that if the interviewee is physically active, his/her level of risk of developing 2DM will be 35.19%, with a confidence level of 0.95. That is to say, with the remaining variables set to zero:
E[Ŷ] = −0.429 + 0.049(1) = −0.380
In terms of the normal distribution, this linear predictor corresponds to the probability of 0.3519.
• If X7 = 2 (FA, moderate), the probability of developing 2DM is 0.3707; that is, if the interviewee has moderate physical activity, his/her level of risk of developing 2DM will be 37.07%, with a confidence level of 0.95. That is to say, with the remaining variables set to zero:
E[Ŷ] = −0.429 + 0.049(2) = −0.331
In terms of the normal distribution, this linear predictor corresponds to the probability of 0.3707.
• If X7 = 3 (FA, non-active), the probability of developing 2DM is 0.3897; that is, if the interviewee is not physically active, his/her risk of developing 2DM will be 38.97%, with a confidence level of 0.95.
On the basis of Table 5, and taking the genetic risk as a reference, people who are not physically active have a higher risk of developing 2DM than those whose activity is active or moderate.
Regarding physical inactivity and 2DM, the literature states that physical inactivity is considered one of the most important risk factors for mortality in Mexico. It is associated with the onset and poor control of diverse chronic diseases, for instance obesity, hypertension, 2DM, dyslipidemias, osteoporosis and cancers (Haskell, et al., 2007). Nonetheless, physical inactivity is a modifiable risk factor for 2DM, because exercise raises glucose uptake through insulin-independent mechanisms and also increases insulin sensitivity (Paternina-de la Ossa, Villaquirán-Hurtado, Jácome-Velasco, Galvis-Fernández, & Granados-Vidal, 2018).
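The three physical-activity scenarios can be approximated with a short sketch. It assumes that the probability is the standard normal cumulative distribution of the linear predictor, with an intercept of −0.429 and a physical-activity coefficient of 0.049; both values are read from the expressions in this section and should be treated as assumptions, as should the helper names. All other variables are held at zero.

```python
import math

def standard_normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

INTERCEPT = -0.429      # assumed intercept
BETA_ACTIVITY = 0.049   # assumed coefficient of X7 (physical activity)

# Probabilities reported in the text for X7 = 1 (active), 2 (moderate), 3 (non-active).
reported = {1: 0.3519, 2: 0.3707, 3: 0.3897}

predicted = {
    level: standard_normal_cdf(INTERCEPT + BETA_ACTIVITY * level)
    for level in reported
}

for level in sorted(reported):
    print(f"X7 = {level}: predicted {predicted[level]:.4f}, reported {reported[level]:.4f}")
```

Because the coefficients are rounded to three decimals, the predicted values only match the reported ones to within about one tenth of a percentage point; the ordering (active < moderate < non-active) is preserved.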
If X9 = 1 or 2 and X1woman, X2, X3, X4, X6 and X7 remain constant, the probability of developing 2DM according to the type of food (TF) is interpreted as follows:
• If X9 = 1 (healthy eating), the probability of developing 2DM is 0.3631; that is, if the interviewee eats healthily, his/her level of risk of developing 2DM will be 36.31%, with a confidence level of 0.95.
On the basis of Table 6, and taking the genetic risk as a reference, people who eat unhealthily show a higher risk of developing 2DM than those who eat healthily.
Regarding unhealthy eating and 2DM, the literature explains that the prevalence of 2DM is associated with the westernized diet, rich in sugar, animal fats, refined starches, carbohydrates and meats (Oggioni, Lara, Wells, Soroka, & Siervo, 2014). Worldwide, high calorie intake and diets high in saturated animal fats increase the risk of developing diseases associated with metabolic disorders such as obesity and diabetes (Gómez & Latorre, 2010).

Argument
2DM is one of the major world health emergencies of the twenty-first century. Each year more people live with this condition, which can lead to complications throughout life (International Diabetes Federation, 2015). Globally, 415 million people with DM were registered, and it is projected that by 2040 there will be 642 million people with DM (International Diabetes Federation, 2015).
2DM and its complications are among the main causes of death in most countries (International Diabetes Federation, 2015). It is the most common type of diabetes and has increased along with cultural and social changes. In high-income countries, up to 91% of adults with diabetes have 2DM. The IDF estimates that 193 million people with 2DM are undiagnosed and, as a result, are at higher risk of complications.
Additionally, one in 15 adults has impaired glucose tolerance, a condition associated with a higher risk of developing 2DM later on (International Diabetes Federation, 2015). In 2040, for every hundred thousand men 7.7% will develop 2DM, and for every hundred thousand women 6.4% will develop DM; the rate will increase with respect to that of 2015 (International Diabetes Federation, 2015).
2DM is the most prevalent type of diabetes in the world. Adults are generally the most susceptible, but nowadays it also develops more frequently in children and teenagers. Some people with 2DM have not yet been diagnosed because the symptoms are milder than those of type 1 diabetes, and years may pass before they are diagnosed (International Diabetes Federation, 2015).

Conclusions
A Gaussian model based on risk factors was built to explain 2DM and illustrate the behavior of the phenomenon. Given the complexity of 2DM as a function of the risk factors, a Gaussian model was chosen because the response variable fitted a Gaussian probability model. Applying the technique of generalized linear models, the results obtained in the construction of the Gaussian model were the following. First, trial runs were carried out with the different distributions under which the response variable (2DM) can be modeled, namely the Logit, Probit and Gaussian models, in nominal and cardinal contexts, with a confidence level (1 − α) of 0.95 and a significance level (α) of 0.05. The ordinal Gaussian regression model gave the best fit and showed that the intercept, X1woman (sex), X2 (age), X3 (residence zone), X4 (family history of 2DM), X6 (current alcohol consumption), X7 (physical activity) and X9 (type of food) are significant within the model, since the p-value of each is below the significance level (0.05). The likelihood of developing 2DM is affected by the variables X1woman, X2, X3, X4, X6, X7 and X9. All the parameters are meaningful for predicting 2DM, since they lie within the confidence intervals, with a confidence level of 0.95 and a significance level of 0.05.
On the basis of these results, it can be concluded that, if the explanatory variables do not intervene for an interviewee, the likelihood of developing 2DM is 33.39%. If the interviewee is a woman, the probability of developing 2DM is 35.94%. Since the probability of risk for all interviewees regardless of sex is 33.39% and that of women is 35.94%, the risk of women relative to men is 7.63% higher. With age taken into account, the probability of developing 2DM is 33.72%; in other words, age is a factor in developing 2DM. For each additional year of age, the interviewees' risk of developing 2DM increases by approximately 0.98%, that is, about 1%. People whose place of origin is the metropolitan zone show a higher risk of developing 2DM than those from the rural and urban zones. People who have both parents with 2DM show a higher risk of developing 2DM than those with only a father or a mother with 2DM. People who consume alcohol have a higher risk of developing 2DM than people who do not. People who are not physically active have a higher risk of developing 2DM than those whose activity is active or moderate. People who eat unhealthily have a higher risk of developing 2DM than those who eat healthily. The risk factors, as explanatory elements of the prevalence of 2DM, are of the utmost importance for delaying or controlling the early development of 2DM, because some of them can be modified.

Final Considerations
The survey used, ENSANUT (2012), contains self-reported information on type 2 diabetes; the results would be more convincing if the data came from blood glucose tests of the respondents. The model is probabilistic, with a 95% confidence level and a 5% significance level; that is, according to the model, 95% of respondents should behave according to the equation and 5% fall outside that behavior. The work could be improved by adding more variables, which would give greater clarity and specificity to the topic, and by using more recent data. This work emphasizes the need for further, more in-depth studies.