Use of Quantile Regression in Determining Factors Associated with BMI Among Vulnerable Adolescents in Rivers State, Nigeria

Body mass index (BMI) has been investigated using traditional regression methods, which may not provide a complete picture of the effects of the independent variables when the outcome is continuous and skewed. Information on the nutritional status of vulnerable adolescents in Nigeria is scanty, which hinders appropriate intervention by policy decision-makers. We investigated the nutritional status of vulnerable adolescents by examining their BMI. A cross-sectional survey of vulnerable adolescents aged 10-17 years was conducted in three local government areas in Rivers State, Nigeria. A structured questionnaire was used to gather information on the economic status, means of livelihood and access to education, nutrition and health of the adolescents. Quantile regression models were fitted to the data. About 39% of the 494 adolescents were underweight, 49.8% had normal weight, 5.5% were overweight and 6.1% were obese. Age was a significant predictor of BMI for the males at the 50th quantile. Adolescent males who experienced food insecurity had lower BMI than those who were food secure. Age, sex, food insecurity and household economy were determinants of BMI among vulnerable adolescents.


Introduction
Multiple regression models assume that a linear relationship exists between a dependent variable Y and k independent variables $X_1, X_2, \ldots, X_k$. These independent variables are also referred to as explanatory variables because they are used to explain the variation in Y. Unbiased estimates of the model parameters are obtained by the method of least squares. The linear regression model focuses on modeling the conditional mean of a response variable without accounting for the full conditional distributional properties of the response variable. The assumption that the covariates affect only the location of the response distribution, but not its scale or shape, is often violated.
Just as linear regression methods based on minimizing sums of squared residuals enable one to estimate models for conditional mean functions, quantile regression methods offer a mechanism for estimating models for the conditional median function and the full range of other conditional quantile functions. The quantile regression model (QRM) explores potential effects on the shape of the distribution and facilitates the analysis of the full conditional distributional properties of the response variable. For a set of covariates, the linear regression model (LRM) specifies the conditional mean function whereas the QRM specifies the conditional quantile function.
The QRM and LRM are similar in certain respects, as both models deal with a continuous response variable that is linear in unknown parameters. A fundamental aspect of linear regression models is that they attempt to describe how the location of the conditional distribution behaves by using the mean of a distribution to represent its central tendency. The LRM also assumes homoscedasticity; that is, the conditional variance, $\mathrm{Var}(y \mid x) = \sigma^2$, is assumed to be constant for all values of the covariate. When homoscedasticity fails, it is possible to modify the LRM by allowing for simultaneous modeling of the conditional mean and the conditional scale. Quantile regression is necessary when the conditional quantile functions are of interest. The LRM also performs poorly when there are outliers in the data set, because outliers tend to have undue influence on the fitted regression line, whereas quantile regression estimates are more robust against outliers than ordinary least squares (OLS) estimates. The usual practice in the LRM is to identify outliers and eliminate them. This practice undermines research where the outcome of interest concerns social stratification or inequality, as outliers and their positions relative to other observations are important aspects of inquiry. In terms of modeling, it is necessary to model the relationship both for the majority of cases and for the outlier cases, a task the LRM cannot accomplish.
Quantile regression has been used in a broad range of health applications. In health research, it has been used especially for growth charts, where percentile curves are commonly used to screen for abnormal growth (Chen, 2005). It has also been used to examine BMI in Korean adolescents (Kim et al., 2015). A longitudinal analysis using quantile regression was conducted among men, and age and physical activity were found to be predictors (Bottai et al., 2014). Quantile regression was also used to investigate changes in the BMI distribution of Chinese adults, and the results showed that the effects of the covariates differed across the BMI distribution (Ouyang et al., 2015).
Adolescence is a period of bodily change in which weight change has a particular impact, and it is probably more complex than in adults. Body mass index is an indicator of nutritional status, defined as the ratio of weight (kg) to squared height (m²). Several studies on adolescent health in developing countries have emphasized reproduction, sexual health, HIV/AIDS and STIs, with few focusing on the BMI of adolescents (Kurz, 1996; Omigbodun et al., 2010).
In India, studies have shown that overall nutritional status was very poor among adolescent girls in poor rural groups; 79% suffered severe chronic energy deficiency (BMI < 16), 74% suffered anaemia and 44% had signs of vitamin B complex deficiency (Chaturvedi et al., 1996). Similarly, in urban Bangladesh, school girls aged 10-16 years reported inadequate food intake (Ahmed 1997).
Adolescents may be affected by severe under-nutrition during emergency situations and for longer periods. Stunting has been reported among adolescents in under-nourished populations: 27% in urban Guatemala and 65% in rural Philippines (Kurz and Johnson-Welch, 1994). A study in Port Harcourt, Nigeria, evaluated the nutritional status of adolescents using BMI and found the prevalence of underweight, overweight, obesity and stunting to be 6.4%, 6.3%, 1.8% and 5.4% respectively (Adesina et al., 2012).
Findings from other countries on the health and nutritional status (using BMI) of adolescents have been variable. Various factors have been found to be associated with the BMI of adolescents, such as socioeconomic status (Fokeena and Jeewon, 2012; Hayward et al., 2014), depression (Revah-Levy et al., 2011), and sedentary lifestyle and energy intake (Grujic et al., 2009). In addition, a study in Port Harcourt, Nigeria, found high socioeconomic status, high maternal education, spending more than 3 hours watching TV and frequent ingestion of snacks to be associated with being overweight in adolescents (Adesina et al., 2009).
Previous studies on body mass index and its determinants have used traditional linear regression and logistic regression analyses and were limited in their ability to capture or analyze changes in the BMI distribution. Such models do not account for heterogeneity in the association of the independent variables with BMI across the conditional BMI distribution. Applying quantile regression rather than traditional regression is important when the effect of the explanatory variables differs at different levels of the response variable.
The magnitude of the problem of nutrition among vulnerable adolescents has been underreported in Nigeria, and there are no documented studies on factors affecting the nutritional status of vulnerable adolescents. The results of this study will provide important insights and empirical evidence for programmatic efforts on issues concerning the nutrition of vulnerable children in Nigeria.
Therefore, we determined the factors associated with the body mass index of vulnerable adolescents across the entire conditional distribution of BMI using a multivariable quantile regression model.

Study Area
The study was conducted in three local government areas (LGAs) in Rivers State, namely Eleme, Port Harcourt and Obio/Akpor.

Research Design
The study was a descriptive cross-sectional survey of vulnerable households conducted from September 7 to 12, 2015 in Rivers State, Nigeria.

Sampling Procedures
The three LGAs were purposively selected based on the burden of HIV and the absence of CDC implementation partners in the LGAs. A national household assessment form was used to identify households with vulnerable children based on an index of vulnerability, which was determined using scores categorized as follows: most vulnerable, 21-28 points; more vulnerable, 14-20 points; and vulnerable, 7-13 points (Biemba et al., 2009; FMWASD, 2007). Most vulnerable and more vulnerable households were selected for interview. A total of 909 households were visited.
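The score bands described above can be written as a small lookup. The Python sketch below is illustrative only: the function names `vulnerability_category` and `selected_for_interview` are hypothetical, and it assumes integer scores between 7 and 28, since the text does not say how scores outside the bands were handled.

```python
def vulnerability_category(score: int) -> str:
    """Map a household vulnerability index score to its band."""
    if 21 <= score <= 28:
        return "most vulnerable"
    if 14 <= score <= 20:
        return "more vulnerable"
    if 7 <= score <= 13:
        return "vulnerable"
    # ASSUMPTION: scores outside 7-28 are treated as invalid input.
    raise ValueError(f"score {score} is outside the 7-28 range")


def selected_for_interview(score: int) -> bool:
    """Only most vulnerable and more vulnerable households were interviewed."""
    return vulnerability_category(score) in {"most vulnerable", "more vulnerable"}
```

Under this rule a household scoring 14 would be eligible for interview, while one scoring 13 would not.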

Data Collection
A semi-structured questionnaire adapted from Measure Evaluation (MIS 001) and the national child vulnerability index form was used to collect information on sociodemographic characteristics, as well as items relating to the 7 key service areas: economy, education/work, food security, shelter, health and protection, psychosocial, and care and support (Measure Evaluation, 2017; Biemba et al., 2009). Respondents were heads of households or caregivers and one vulnerable adolescent in the household. A section of the instrument was dedicated to adolescent children (10-17 years). The questionnaires were administered directly to the children by trained interviewers. Anthropometric measurements comprising height and weight were taken for each respondent. Weight was measured with shoes off using a well-calibrated bathroom scale. Height was measured using a meter rule while the respondent stood against a wall. At the end of each day of the study, the questionnaires were retrieved and checked for completeness and accuracy by project supervisors.

Sample Size
All the 494 children between the ages of 10-17 years found in these vulnerable households were interviewed.

Measurement of Independent Variables
Independent variables include household economy, food security, age, sex and type of caregiver of adolescent (parent versus non parent).
Household economy was assessed using 10 questions:
1. Do you usually work throughout the year, or do you work seasonally, or only once in a while?
2. Are you paid in cash or kind for this work or are you not paid at all?
3. Did your household incur any food-related expenses in the last four weeks?
4. Was your household able to pay for these expenses?
5. Did your household incur any school-related expenses in the last 12 months?
6. Was your household able to pay for these expenses?
7. Thinking about the last time you bought any food for eating or cooking, where did the money come from?
8. Thinking about the last time you had to pay for any school-related expenses, where did the money come from?
9. Did your household incur any unexpected household expenses, such as a house repair or urgent medical treatment, in the last 12 months?
10. Was your household able to pay for these expenses?
Responses to each of the 10 questions were scored as 0 or 1 for a "no" or "yes" answer respectively. For questions 7 and 8, a score of 1 was assigned if the respondent mentioned "income" and 0 otherwise. The categories of poor and good household economy were derived using the median score (7.0) as the cutoff point.
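The scoring rule can be sketched in Python as follows. This is a minimal illustration, not the study's own procedure: the names `score_item` and `household_economy` are hypothetical, answers are assumed to be recorded as simple strings, and, because the text does not say on which side of the cutoff a score equal to the median falls, the sketch assumes that a total at or above 7.0 counts as "good".

```python
INCOME_SOURCE_ITEMS = {7, 8}  # questions asking where the money came from


def score_item(question: int, answer: str) -> int:
    """Score one questionnaire item as 0 or 1."""
    answer = answer.strip().lower()
    if question in INCOME_SOURCE_ITEMS:
        return 1 if answer == "income" else 0
    return 1 if answer == "yes" else 0


def household_economy(answers: dict, cutoff: float = 7.0) -> tuple:
    """Return (total score, category) for a dict of question number -> answer."""
    total = sum(score_item(q, a) for q, a in answers.items())
    # ASSUMPTION: a total equal to the median cutoff is classed as "good".
    category = "good" if total >= cutoff else "poor"
    return total, category
```

A household answering "yes" to the eight yes/no items, "income" to question 7 and "gift" to question 8 would score 9 and be classed as "good" under this assumption.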

Food Security
Food security was determined using the question: "In the past 4 weeks, was there ever no food to eat of any kind in your household because of lack of resources to get food?" A "yes" response signified "food insecure" and a "no" response signified "food secure" (USDA, 2006).

Statistics and Data Analysis
The data were cleaned and entered into the computer using CSPro software (CSPro 2013). Descriptive statistics (proportions for categorical variables, and means and standard deviations for quantitative variables) were obtained for all independent variables. BMI was calculated using the formula weight/height squared (kg/m²). The categories were: underweight (<18.5), normal (18.5-24.99), overweight (25.00-29.99) and obese (≥30.00) (WHO 2004).
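As a worked illustration of the BMI formula and cutoffs above, the following Python sketch computes BMI from weight and height and assigns the category. The function names are hypothetical, and the category boundaries are treated as continuous (for example, any value from 18.5 up to but not including 25 is "normal"), which is an assumption about how values such as 24.995 would be handled.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight divided by height squared (kg/m^2)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Classify a BMI value using the cutoffs quoted in the text."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal"
    if value < 30.0:
        return "overweight"
    return "obese"
```

For instance, a respondent weighing 50 kg with a height of 1.60 m has a BMI of 50 / 2.56, about 19.5, which falls in the normal category.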
A normality test was performed on the outcome variable (BMI) to determine whether BMI in adolescents was normally distributed. Afterwards, quantile regression models were fitted to determine factors affecting the BMI of adolescents. Separate models were fitted for males and females. The quantile regression model is given as $Y_i = x_i^{\top}\beta_\tau + \varepsilon_i$, $i = 1, \ldots, n$, where $Y_i$ is the response variable Y with probability distribution $F(y) = \mathrm{Prob}(Y \le y)$, $x_i$ is the vector of independent variables for the ith adolescent, $\beta_\tau = (\beta_{1\tau}, \ldots, \beta_{p\tau})$ is the unknown p-dimensional vector of parameters at quantile $\tau$ and $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)$ is the n-dimensional vector of unknown errors. The regression coefficient at a given quantile ($\beta_\tau$) indicates the effect on the $\tau$th conditional quantile of Y of a unit change in X, assuming that the other factors are fixed. The models were fitted using the "qreg" command in Stata. The "qreg" command expresses the quantiles of the conditional distribution as linear functions of the independent variables. A null model was first fitted; thereafter the main explanatory variable (food security) was included in a second model, and subsequently the other explanatory variables were included in a final model. All analyses were conducted using Stata version 12.0 (Stata Corp, 2011).
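To illustrate what a quantile regression fit minimizes, the Python sketch below implements the standard check (pinball) loss and uses it to fit an intercept-only null model by brute force. It is not the algorithm used by Stata's "qreg", the function names are hypothetical, and the BMI values are made-up illustrative numbers, not data from this study.

```python
def check_loss(residual: float, tau: float) -> float:
    """Check function rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def null_model_quantile(y, tau: float) -> float:
    """Intercept-only quantile regression.

    The total check loss is piecewise linear in the intercept, so a
    minimizer can always be found among the observed values.
    """
    return min(y, key=lambda b: sum(check_loss(v - b, tau) for v in y))


# Hypothetical BMI values for illustration only.
example_bmi = [15.2, 17.0, 18.4, 19.4, 21.1, 23.9, 31.5]

if __name__ == "__main__":
    for tau in (0.10, 0.50, 0.90):
        print(tau, null_model_quantile(example_bmi, tau))
```

With tau set to 0.5 the loss is proportional to the sum of absolute residuals, so the fitted intercept is the sample median, and the single large value of 31.5 does not pull it upward as it would pull the mean. Adding covariates replaces the constant with a linear predictor, and the same loss is then minimized over the coefficient vector.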

Characteristics of Adolescents
There were 494 adolescent children (10-17 years) in the 909 vulnerable households in the three LGAs. There was a slight female preponderance (51.4%). The mean age of the males was 13.3 years (SD = 0.14) while that of the females was 13.4 years (SD = 0.14). The median BMI for the total sample was 19.37 (IQR = 5.35). About 43% of the males were underweight compared with 39% of the females. Similar frequencies were observed in all quantiles for both males and females. A higher proportion of males (90.5%) than females (89.6%) experienced food insecurity. A sizeable proportion of the female adolescents (88.3%) had caregivers who were not their biological parents, compared with 10.7% of the males (Table 1).

Distribution of Some Selected Factors Among Adolescents
For both males and females, food insecurity was very high among those who were underweight (92.2% versus 94.3%). In addition, for both males and females, the proportion underweight was also high among those living with parents (91.3% versus 92.0%) (Table 2).

Quantile Regression Results for the Total Sample
For both males and females, age was significantly associated with BMI in the lower quantiles. For a unit increase in age, BMI increased by 0.382 (95% CI: 0.153, 0.611), 0.430 (95% CI: 0.253, 0.606) and 0.407 (95% CI: 0.253, 0.562) at the 10th, 25th and 50th quantiles respectively. However, there was no significant association at the 75th and 90th quantiles. Results also showed that the difference in BMI between males and females was -0.71 at the 50th quantile. Food insecurity was associated with low BMI at the 25th quantile. However, there were no significant associations between poor economic status or status of carer and BMI (Table 3).

Quantile Regression on Factors Affecting BMI Status of Adolescent Females and Males
Table 4 shows the regression coefficients of selected factors associated with the BMI of adolescent females. The results suggest that a unit increase in age was associated with significant increases in BMI from the 10th to the 75th quantile; for example, BMI increased by 0.625 (95% CI: 0.134, 1.116) among females at the 75th quantile. Males, however, had a significant increase in BMI with age at the 50th quantile only, with a negative association at the other quantiles (10th, 25th, 75th and 90th).
Also, food insecurity had a negative association with BMI for males at the 10th and 90th quantiles: compared with food-secure males, the BMI of food-insecure males was lower by 1.180 (95% CI: -2.341, -0.019) and 6.014 (95% CI: -11.766, -0.260) at the 10th and 90th quantiles respectively. There was no such association among females. There was no association between poor economic status or type of caregiver and BMI in either males or females. Table 5 shows the regression coefficients of selected factors associated with the BMI of adolescent males.

Discussion
Quantile regression has been frequently used in medicine and public health to model non-normal, semi-continuous and bounded outcomes such as waiting times and length of hospital stay (GUSTO 1993; Yoon et al., 2003; Pourhoseingholi et al., 2008). Quantile regression is usually preferred when the data contain outliers or are heteroscedastic.
Unlike ordinary least squares, quantile regression estimates the conditional median and other conditional quantiles, and its estimates are more robust to outliers in the response measurements. In certain situations, extreme quantiles are of more interest than the mean. Quantile regression is therefore more useful in the analysis of BMI measurements, where interest centres on extremes such as underweight and obesity in adolescents. Modeling a set of percentiles provides a more complete picture of the effects of the independent variables, which may not be properly captured by models that average over the conditional distribution. Several studies on BMI have used the t test, ordinary least squares and other generalized linear models, and important features of the data may therefore have been missed (Omigbodun 2010, Awoyemi et
Our findings showed that the prevalence of underweight was 38.7% and was slightly higher in males than in females. This result is comparable with that of a previous study in South West Nigeria (Omigbodun, 2010). Similar gender differences have been reported elsewhere. For example, a study of teenagers in South Africa found that boys were more likely to be underweight than girls (Jinabhai et al., 2007). Several studies among adolescents in developing countries have also reported the same trend (Kurz, 1994; Kurz, 1996; Venkaiah et al., 2002). Furthermore, our prevalence of underweight is higher than that reported among normal children in Port Harcourt (Adesina et al., 2012). However, other studies show contrasting results. Our underweight results were lower than those of previous studies (Chaturvedi et al., 1996; Ahmed, 1998). Our results are also slightly higher than those from urban Guatemala but much lower than reports from rural Philippines (Kurz and Johnson-Welch, 1994). Our prevalence of underweight was also higher than the perceived weight reported among normal adolescents in Australia (Hayward et al., 2014). In general, our findings (of underweight, normal weight, overweight and obesity) are not surprising in a vulnerable population such as this.
Age was found to be associated with BMI at all quantiles except the 75th, and sex at the upper quantiles. This is consistent with findings from a rural community in Ethiopia, where younger age and number of meals were identified as factors affecting BMI (Yetubie et al., 2010). Overall, the influence of age appeared to be more important for females than for males. We found an association between age and BMI, with higher BMI in females than in males at all quantiles. A study in South West Nigeria also found higher BMI in adolescent females, although those adolescents were not vulnerable (Omigbodun et al., 2010).
Food insecurity was also a significant factor for BMI at the 25th quantile, and poor household economy at the 75th. The lack of association between food security and BMI at other quantiles may be explained partly by the fact that children's current weight or height may reflect past access to food or past inadequate food intake, which may not be current influences on BMI. This lack of association may also suggest that household characteristics do not always influence the current nutritional status of adolescents. Previous studies have reported similar findings on the factors associated with the nutritional status of adolescents.
Results of this study also indicate that household economy may not be a determinant of the BMI of vulnerable adolescents.
In both males and females, no association was found except for the total sample at the 75th quantile.This finding is slightly different from that of a recent study that reported an association between BMI and family socio economic status.
The study reported that females with good family economic status were more likely to have higher BMIs than those with average economic status, particularly at the upper quantiles of the conditional BMI distribution (Kim et al., 2015). Studies have suggested a positive association between socioeconomic status and BMI, indicating that people with higher socioeconomic status are more likely to consume higher-calorie foods than their counterparts with low socioeconomic status. Adolescents from poorer households have also been reported to be at higher risk of malnutrition (Brundtland, 1999; Zere and McIntyre, 2003; Schellenberg et al., 2003; Thuita et al., 2005).
Despite the strength of our methodology, there are limitations worth mentioning. Our dataset did not include variables such as parental BMI, fat consumption and calorie intake, and therefore we could not assess their effect on BMI.
Nevertheless, this study adds evidence to the current literature by identifying the factors associated with BMI in a vulnerable adolescent population. The findings indicate that policy measures that target vulnerable adolescents should include food security and household economic strengthening, with a focus on both male and female children.

Conclusions
Quantile regression allowed the investigation of a continuous bounded outcome such as BMI and its associated factors, which might have been missed had traditional regression methods been used. Quantile regression is robust because it makes no distributional assumption about the error term in the model. Investigators should therefore be careful when selecting traditional regression methods for modeling bounded outcomes such as BMI.

Table 1. Characteristics of vulnerable adolescents by sex in Rivers State

Table 2. Percentage distribution of nutritional variables among the BMI categories by sex

Table 3. Quantile regression estimates for the total sample

Table 4. Quantile regression estimates for the female adolescents