The Determinants of Mathematics and Statistics Achievement in Higher Education

This study uses a standard education production function to relate student grades in mathematics and statistics to three factors. The first factor includes teaching practice measures and the classroom learning environment. The second factor comprises teacher characteristics and class size. The third factor represents student control variables. The statistical analysis, which is based on mixed-effects modeling of student marks in mathematics and statistics courses, shows that incoming skills, classroom learning environment, support provided to students, and students' attitudes toward mathematics and statistics are the most significant predictors of achievement in mathematics. However, teaching practices were not found to be crucial for improving mathematics grades.


Background
Previous studies investigating the determinants of college student performance in mathematics indicate that incoming skills, measured by grades in high school mathematics, are among the most significant predictors of student success in math and science courses. In addition, many students have poor attitudes toward mathematics, and this may affect their achievement in required math and statistics courses, as suggested by Popham (2005) and Goodykoontz (2011). In 2006, Statistics Canada, in its study of the Program for International Student Assessment, conducted a preliminary analysis of the causes of higher levels of achievement in mathematics in the national assessment results, and identified the level of anxiety towards math, level of confidence, parental educational attainment, and family socioeconomic status as factors contributing to a higher score. Perry and McConney (2013) studied the strength of the relationship between school socioeconomic status and achievement in mathematics for Canada and Australia. They found that the relationship is significantly stronger in Australia and that student outcomes are more equitable in Canada than in Australia, a difference they attribute to the ways in which the two education systems are funded. In addition to the socioeconomic factor, the present study uses statistical methods to investigate what other factors can play a significant role in college mathematics achievement, in particular those related to attribution.

Literature Review
In the literature, a number of researchers (Tachie and Chireshe, 2013) have noted that a solid background in mathematics is crucial since it serves as a gateway to future professions in a variety of fields (Tella, 2008). Mathematics is also considered very important in our daily lives because it deals with real-life situations that are directly related to our daily activities (Ojose, 2011). One aim of our study is to test this claim and to see how students think about the importance of mathematics and statistics in their daily lives.
In this paper we apply mixed effects models to investigate which key factors play a significant role in improving mathematics and statistics grades. A survey was conducted to elicit students' preferences and reactions to mathematics and statistics. The survey questionnaire was given to UPEI (University of Prince Edward Island, Canada) students who took first- and second-year math courses between September 2012 and May 2013. Following the literature (Raudenbush and Bryk, 1989; Reynolds and Walberg, 1992), we consider three main determinants of student achievement scores in these courses. The first factor includes three teaching practice measures: the help and support provided to students, the classroom environment (student involvement in class meetings, interactive-style or lecture-style teaching), and whether the teaching method allows students to relate the material to their daily lives. The second factor comprises a class control variable that measures class size. The third factor accounts for student control variables, which include a study habit variable, student incoming skills (high school average grade in mathematics), and student attitude toward mathematics and statistics. These selected variables describe learning environment factors in a social, physical, psychological and pedagogical context which affect students' learning, their achievement, and their attitudes. Fraser (2007, 2012) showed that the quality of the classroom learning environment is a crucial determinant of student learning and that students do better if they have a positive perception of their classroom environment. Afari (2013) investigated the relationship between psychological features of the learning environment and students' attitudes towards mathematics. The structural equation modeling results of his study showed that teacher support and personal relevance were positively related to students' attitudes in learning mathematics, whereas classroom involvement was not significant. However, measuring students' attitudes and calculating their correlation with other factors may not be a simple task. A literature review of past research on attitudes toward mathematics (Larsen 2013) shows that more research is required to arrive at a more widely agreed definition of attitude, a more effective method of measurement, and a further classification of the experiences which modify attitudes.

Objectives of the Study
In this paper we investigate the following questions:
- Are learning environment factors and students' backgrounds significant determinants of achievement in mathematics and in statistics?
- Can we apply a statistical method that takes into account parameter multidimensionality and the correlation between the unobservable components in the error term and the independent variables, so that the factor estimators can be computed efficiently?
- Are there recommendations that can be implemented to help students achieve better results in mathematics and statistics?
To answer these questions, we test the significance of several factors related to the learning environment for college students' mathematics performance, using quantitative methods that yield accurate estimates of these factors. We use a mixed effects modeling approach to analyze the impact of learning environment factors on achievement scores in mathematics and in statistics. Under the model assumptions, this method yields unbiased and efficient estimators of the factors studied.

Theoretical Framework
Attribution theory, which explains how students interpret their achievements, provides an important method for examining and understanding motivation in academic achievement. We use this theory to investigate how the students' learning environment affects their achievement in mathematics. Specifically, if a student obtains a low mark in a mathematics or statistics subject, she will attribute her poor performance to a specific cause, such as lack of ability, lack of effort, poor instruction, or a negative learning environment. The selected attribution can affect her subsequent motivation to engage in similar learning activities.
The study is based on Weiner's attribution theory (Weiner 1984, 1992, 2005), which can be applied to the achievement of higher education students to describe the cognitive process by which a student perceives the cause of what has happened to her as either caused by herself or by others. Based on this concept, the causes of good or poor academic performance may be explained by internal or external factors. When students do well in a math subject, they might attribute the achievement to their own effort or ability. However, when they do poorly, they will probably attribute their failure to external factors. They might think that their performance is caused by the difficulty of the subject, by a weak math background resulting from mathematics not having been taught by competent teachers in their high school, by a university mathematics professor who does not engage students in the learning process, by their negative attitude towards mathematics, or by a belief that mathematics and statistics will not be helpful for their future career. In addition, it is important to understand the relationship between attribution and student behavior. If a student attributes failure in a math subject to a lack of effort, he will be more likely to put forth additional effort when preparing for another math exam in the future. However, if a student attributes failure on an examination to classroom learning environment factors, he will be less likely to exert effort for a subsequent examination. This impact of attribution on student behavior is crucial for faculty members and university executives seeking the right policies to help students in their learning and in their achievement goals (Ames 1992), and to provide them with the support they need to improve their academic performance.
The present study investigates the underlying theory by evaluating the statistical significance of the relationship between students' achievement in mathematics and the attribution factors.

Research Design
The literature on quantitative methods for educational indicators, and on studies that model student learning achievement as a function of the characteristics of their schools and their family background, is quite extensive (Glasman and Biniaminov, 1981; Kaplan and Elliott, 1997; Kaplan and Kreisman, 2000; Koller, Baumert, Clausen, and Hosenfeld, 1999). Yee (2010) investigated the relationship between attitudes toward mathematics and achievement in mathematics for Singapore students. The study suggests that students were extrinsically motivated to study mathematics, but the relationship between extrinsic motivation and achievement was weak.
This paper builds upon the ongoing research and existing literature and provides a detailed statistical analysis based on multilevel mixed-effects models in order to find unbiased and efficient estimators of the key factors that have an impact on student achievement in mathematics and in statistics.
Linear mixed models for multilevel analysis have attracted the interest of several researchers, including McCulloch, Searle, and Neuhaus (2008), Raudenbush (1998), Raudenbush and Bryk (2002), Laird and Ware (1982), and Young, Reynolds and Walberg (1996). One of the advantages of these models, also called hierarchical linear models, is their flexibility in handling fixed and random effects and their ability to deal with correlated errors, which occur when the error terms cluster by some grouping variable. In this case, standard regression analysis will produce wrong standard errors for the estimated parameters, and this in turn will lead to invalid statistical inference.
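Consider the following linear mixed model,
$$Y = X\beta + Zu + \varepsilon, \qquad (1)$$
where $Y$ is the $n$-dimensional vector of responses, $X$ is the covariate matrix for the fixed effects $\beta$, $Z$ is the covariate matrix for the random effects $u$, and $\varepsilon$ is the vector of error terms with $\varepsilon \sim N(0, R)$.
Also, we assume that the random effects and the error terms are independent, $\mathrm{Cov}(u, \varepsilon) = 0$.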
Model (1) may be estimated by the maximum likelihood (ML) method or by restricted maximum likelihood (REML), which has the advantage of correcting the downward bias of the ML estimators of the variance components. The idea is to maximize the likelihood of a set of linear contrasts of Y that do not depend on the fixed effects, so that the resulting criterion depends only on the variance components. This method will be adopted in the statistical analysis in order to identify the significant determinants of student achievement scores in mathematics and statistics at UPEI.
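For reference, the restricted likelihood criterion can be written out explicitly. The sketch below assumes the common notation $Y = X\beta + Zu + \varepsilon$ with $\mathrm{Var}(u) = G$ and $\mathrm{Var}(\varepsilon) = R$; the symbols $G$, $R$, $V$ and $\theta$ (the vector of variance parameters) are labels introduced here for illustration and are not taken from the original model statement.

```latex
\begin{align*}
V(\theta) &= Z G Z^{\top} + R, \\
\hat\beta(\theta) &= \left(X^{\top} V^{-1} X\right)^{-1} X^{\top} V^{-1} Y, \\
\ell_{R}(\theta) &= -\tfrac{1}{2}\Big[\log\lvert V\rvert
  + \log\lvert X^{\top} V^{-1} X\rvert
  + (Y - X\hat\beta)^{\top} V^{-1} (Y - X\hat\beta)\Big] + \text{const}.
\end{align*}
```

The term $\log\lvert X^{\top} V^{-1} X\rvert$ is what distinguishes this criterion from the ordinary profile log-likelihood: it accounts for the degrees of freedom used in estimating the fixed effects.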

Statistical Analysis and Discussion
In this section, we study alternative hierarchical linear models of student math scores in order to estimate the fixed effects and the variance components and to conduct statistical inference. Raudenbush et al. (2006) provide a list of statistical packages for estimating hierarchical linear and nonlinear models. Data were collected from 168 students who agreed to participate in the math survey; participants were recruited from the following courses: Math221 (Introductory Statistics 1), Math222 (Introductory Statistics 2), Math151 (Calculus 1), Math152 (Calculus 2), Math112 (Calculus for Social Science), and Math324 (Applied Regression Analysis).
Student scores are recorded on a scale of 100 points. The survey includes 18 questions, which form the factors investigated in this research. The questions are presented in the appendix.

Random Effects ANOVA Model
We start with a random effects ANOVA model in order to determine what proportion of the variance in mathematics scores is due to cross-class differences as compared to differences between students:
$$Y_{ij} = \beta_{0j} + \varepsilon_{ij}, \qquad \beta_{0j} = \gamma_{00} + u_{0j}. \qquad (2)$$
Model (2) defines a two-level hierarchical model where the first level is given by student $i$ and the second level is represented by class $j$.
$\beta_{0j}$ denotes the mean score in mathematics in class $j$ and $\gamma_{00}$ refers to the overall mean. Level-1 and level-2 error terms are denoted by $\varepsilon_{ij}$ and $u_{0j}$, with variances $\sigma_{\varepsilon}^2$ and $\sigma_{u}^2$, respectively.
The estimations were performed in Stata and the results are reported in Table 1 below. The likelihood ratio test supports the use of multilevel modeling instead of an ordinary regression model. In fact, we can compute an intra-class correlation coefficient given by $\rho = \sigma_{u}^2 / (\sigma_{u}^2 + \sigma_{\varepsilon}^2)$, which shows how much of the variance of math scores is due to differences across classes. The results indicate that 5% of the total variance is attributable to cross-class differences.
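The intra-class correlation can be checked directly from the two variance components quoted in the text for the random effects ANOVA model (11.95 between classes, 229.07 within classes). The sketch below also computes the standard Kish design effect to illustrate why ignoring clustering distorts standard errors; the equal class size (168 students spread evenly over 8 classes) is an assumption made only for this illustration, and the variable names are our own.

```python
# Variance components of the random effects ANOVA model, as quoted in the text.
var_between = 11.95   # level-2 (class) variance
var_within = 229.07   # level-1 (student) variance

# Intra-class correlation: share of total variance lying between classes.
icc = var_between / (var_between + var_within)

# Assumption for illustration: 168 students split evenly across 8 classes.
n_students, n_classes = 168, 8
avg_class_size = n_students / n_classes

# Kish design effect: factor by which the sampling variance of a mean is
# inflated relative to a simple random sample of the same size.
design_effect = 1 + (avg_class_size - 1) * icc

print(f"ICC = {icc:.4f}")
print(f"design effect = {design_effect:.2f}")
```

Under these assumptions the ICC is about 5%, and even this modest value gives a design effect close to 2, so naive standard errors for a mean score could be noticeably too small.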

Random Intercept Model
Next, level-1 variables are added to the model assuming fixed effects, but the intercept is allowed to vary across classes in order to account for cross-class differences in mathematics and statistics achievement scores and to provide an estimate of the variance accounted for by each data level (Snijders and Bosker, 1999).
We ran alternative regression models to identify the significant covariates. Our results show that these are: help and support provided to the students (W), student attitude toward mathematics and statistics (X), and incoming skills from high school (Z). Interestingly, we found that teaching practice (interactive or lecture-style teaching) was not a significant factor for math scores. In addition, we investigated whether demographic differences within the population may influence mathematics learning, but the results indicate that it did not matter whether students studied at PEI high schools or came from outside the island.
We start with the estimation of the mixed model and interpret the results reported in Table 2:
$$Y_{ij} = \beta_{0j} + \beta_1 W_{ij} + \beta_2 X_{ij} + \beta_3 Z_{ij} + \varepsilon_{ij}, \qquad \beta_{0j} = \gamma_{00} + u_{0j}. \qquad (3)$$
The table shows that students who think they were provided with help and support in their math classes obtained, on average, nearly 4.5 marks more than those who did not think there was enough help and support for them. Also, students who like mathematics and statistics score on average 4 points higher than those who do not like mathematics. In addition, it is estimated that students who obtained on average 10% higher marks in mathematics in high school will get nearly 7.6% higher grades in mathematics courses. The results in Table 2 also show evidence of variation in the intercept, with a high estimated variance of the level-2 error term (9.88). Furthermore, the value of the LR test for the random intercept model, with a p-value of nearly zero, leads to rejection of the null hypothesis of a homogeneous intercept across all math and statistics classes. In addition, the previous tables show that the total variances of Y in model (3) and model (2) are 9.88 + 135.51 = 145.39 and 11.95 + 229.07 = 241.02, respectively. Thus the three model covariates account for almost 40% of the total variation in math scores.
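The "almost 40%" figure follows from comparing the total variance of the random effects ANOVA model with that of the random intercept model. A minimal sketch of the arithmetic, using only the variance components quoted in the text (the helper name `proportion_reduction` is our own):

```python
def proportion_reduction(var_baseline: float, var_model: float) -> float:
    """Share of the baseline variance removed by the richer model."""
    return (var_baseline - var_model) / var_baseline

# Variance components quoted in the text (level-2 + level-1).
total_anova = 11.95 + 229.07    # random effects ANOVA model (2)
total_model3 = 9.88 + 135.51    # random intercept model (3) with W, X, Z

r2_total = proportion_reduction(total_anova, total_model3)
r2_level1 = proportion_reduction(229.07, 135.51)   # student-level variance only
r2_level2 = proportion_reduction(11.95, 9.88)      # class-level variance only

print(f"total variance explained:   {r2_total:.1%}")
print(f"level-1 variance explained: {r2_level1:.1%}")
print(f"level-2 variance explained: {r2_level2:.1%}")
```

Splitting the reduction by level suggests that the three student-level covariates mainly reduce the within-class variance, which is consistent with the intercept variance remaining sizeable in model (3).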

Variation in the Intercepts
We suggest a practical method to assess the variation in the intercepts. This variation can be accounted for by including the percentage of students who have a positive attitude towards mathematics and statistics (PATT) and the percentage of students who think mathematics and statistics will allow them to find a good job in the future (PJOB) as additional covariates in the model, where the class-level intercepts $\beta_{0j}$ are given by $\beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{PATT}_j + \gamma_{02}\,\mathrm{PJOB}_j + u_{0j}$.
First, we add only PATT to the model and report the statistical results in Table 3. The level-1 variables are still significant and the added level-2 covariate is also statistically significant. Math scores tend to be higher in classes with a higher percentage of students who like mathematics and statistics. For instance, with a ten percent increase in the share of students with a positive attitude toward mathematics, math scores are expected to rise on average by 1.4 percent. The results also show that the level-2 covariate has reduced the size of the level-2 variance component (from 9.88 to 4.28). The proportion of variance explained by the percentage of students with a positive attitude toward mathematics and statistics can be computed as $(9.88 - 4.28)/9.88$. Therefore the model shows that 56.6% of the variation in the intercepts is due to the percentage of students who like mathematics and statistics courses.
Next, we add both level-2 covariates and estimate model (4):
$$Y_{ij} = \beta_{0j} + \beta_1 W_{ij} + \beta_2 X_{ij} + \beta_3 Z_{ij} + \varepsilon_{ij}, \qquad \beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{PATT}_j + \gamma_{02}\,\mathrm{PJOB}_j + u_{0j}. \qquad (4)$$
The results of Table 4 show that all level-1 and level-2 variables included in the model are statistically significant. Specifically, math and statistics achievement scores are higher in classes with high percentages of students who believe that mathematics and statistics will enable them to find a good job in the future. The low LR test value of 0.04 shows that the level-2 variance component is no longer statistically significant when we include both level-2 covariates. Also, the proportion of variance in the intercepts explained jointly by the percentage of students with a positive attitude toward mathematics and statistics and the percentage of students who think that math will allow them to find a good job in the future has increased to 94 percent.
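The share of intercept variance attributed to PATT is a simple proportional reduction in the level-2 variance component. A minimal sketch using the two estimates quoted in the text (variable names are our own):

```python
# Level-2 (between-class) variance of the random intercept, as quoted in the text.
var_intercept_base = 9.88   # random intercept model without class-level covariates
var_intercept_patt = 4.28   # after adding PATT as a level-2 covariate

# Proportional reduction in intercept variance attributable to PATT.
explained = (var_intercept_base - var_intercept_patt) / var_intercept_base

print(f"share of intercept variance explained by PATT: {explained:.1%}")
```

With these rounded inputs the ratio comes out at roughly 57%, in line with the figure quoted in the text; small differences are to be expected from rounding of the variance estimates.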

Random Slope Model
The primary motivation for using mixed models with multilevel data is that the errors within each randomly sampled level-2 unit are likely to be correlated, which requires the estimation of a random effects model. After we account for error dependence, it becomes possible to make accurate inferences about the fixed effects of interest. In the following, we estimate a model with a random slope for the effect of students' perception of mathematics and statistics on their achievement scores.
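A random slope model of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the exact specification estimated: we assume that the covariates W (help and support), X (attitude) and Z (incoming skills) enter as before, that only the slope of X varies across classes, and that $\beta_{2j}$, $\gamma_{20}$ and $u_{2j}$ are labels introduced here.

```latex
\begin{align*}
Y_{ij} &= \beta_{0j} + \beta_1 W_{ij} + \beta_{2j} X_{ij} + \beta_3 Z_{ij} + \varepsilon_{ij}, \\
\beta_{0j} &= \gamma_{00} + u_{0j}, \qquad
\beta_{2j} = \gamma_{20} + u_{2j}.
\end{align*}
```

Here $\gamma_{20}$ is the average effect of attitude on scores and $u_{2j}$ is the class-specific deviation from it; a nonzero variance of $u_{2j}$ means that the effect of attitude differs from class to class.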

Conclusion
The purpose of this research is to conduct a statistical analysis to estimate the impact of potential factors affecting students' achievement scores in mathematics and in statistics, based on a two-level hierarchical model in which the first level is given by student characteristics and the second level is given by class characteristics.
Because of the correlation between the unobservable determinants in the error term and the independent variables of the model, the least squares method will produce biased standard errors for the model coefficient estimates. To overcome this issue and to cope with parameter multidimensionality, we use alternative linear mixed-effects models, which treat clustered data adequately and assume two sources of variation, within clusters and between clusters. The restricted maximum likelihood estimation results show a significant positive impact of student incoming skills, student attitudes toward mathematics and statistics, and support to students. However, our results show that teaching practice, teacher experience, and teacher credentials do not have an effect on achievement scores in mathematics and in statistics. In addition, students who thought that their math classes provided interactive teaching did not perform better than those who thought that their math classes were based on lecture-style teaching. We also find a significant correlation between the classroom learning environment and students' attitudes toward mathematics and statistics. This result implies that, regardless of the teaching method, students in a typical math course can do better in mathematics and in statistics if they think that their classroom learning environment is favorable. This will enhance their motivation to achieve their goals, and they will therefore be more willing to seek additional help from support centers to understand the material and obtain higher marks.
Also, since the statistical results show that teaching in high school is a key determinant of the success of college students in mathematics, it is important to identify the best ways in which high school math teachers can help students achieve better incoming skills and higher performance in college. One suggestion given in the literature is to train high school teachers to advocate skillfully for student achievement by employing practical mathematics learning activities and by developing appropriate curricula and education programs that focus on engaging students in solving mathematics problems. As argued by Riordan and Noyce (2001), the curriculum can make a significant contribution to improving student learning.
This study also shows a significant role for math help centers in providing the assistance and support that students need to understand class material and to perform better in statistics and in mathematics. Thus, it is recommended that mathematics departments encourage the creation of these support centers.
As a direction for future research, it would be interesting to collect more data from all students who are taking mathematics and statistics courses at UPEI, to monitor their performance over subsequent semesters, and to add a third level to the hierarchical models by surveying students from all Atlantic Canada universities.

Table 1.
Estimation of random effects ANOVA model. The data were collected from students who agreed to participate in the survey. The students were in 8 lower- and intermediate-level mathematics and statistics classes, including calculus, introductory statistics, and regression analysis. Table 1 shows that the grand mean grade estimate is 75.59, which is relatively high. The likelihood ratio test value of 3.46 suggests that we reject the null hypothesis of no cross-class variation.

Table 3.
Estimation of the variation in the intercepts

Table 5.
Estimation of random slope. We checked the validity of a random slope for X by computing the LR test statistic based on the difference between the log-likelihoods of the model with a random intercept only and the model with a random intercept and a random slope. The LR test value is 2(652.5311 - 650.1548) = 4.7526, which rejects the null hypothesis at the 5% level of significance. Thus, the variance component on the slope of student attitudes is significant, showing that the slope varies across classes.
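The likelihood ratio computation can be reproduced with the two log-likelihood magnitudes quoted above. The sketch below makes two assumptions that are not stated in the text: that the quoted values are the magnitudes of negative log-likelihoods, and that the statistic is referred to a chi-square distribution with 1 degree of freedom. The function names are our own.

```python
import math

def lr_statistic(loglik_restricted: float, loglik_full: float) -> float:
    """Likelihood ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (loglik_full - loglik_restricted)

def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Assumption: the values quoted in the text are magnitudes of negative log-likelihoods.
loglik_intercept_only = -652.5311   # random intercept only
loglik_random_slope = -650.1548     # random intercept and random slope

lr_stat = lr_statistic(loglik_intercept_only, loglik_random_slope)
p_value = chi2_sf_1df(lr_stat)      # assumes a 1-df chi-square reference

print(f"LR statistic = {lr_stat:.4f}")
print(f"p-value (1 df) = {p_value:.4f}")
```

Under the 1-df reference the p-value falls below 0.05, matching the conclusion in the text. The appropriate reference distribution depends on whether a covariance between the random intercept and the random slope is also estimated and on how the boundary constraint on the variance is handled, neither of which is specified here.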