Exploring Correlates of Business Undergraduates’ Closed Versus Open Grading Assessment Learning Perceptions

Motivated by a lack of scales for measuring business undergraduates’ grading assessment learning perceptions (GALP), this research created two three-item GALP scales, closed and open. Two separate samples of senior business undergraduates (fall, 2015, n = 220 and spring, 2016, n = 690) were used. Closed GALP and open GALP were identified via exploratory and confirmatory factor analyses. Subsequent stepwise regression analyses consistently showed that satisfaction/reputation had a positive impact and accounted for the most variance in these two GALP scales across both samples. Research limitations and future research issues are discussed.


Introduction
Grading has been defined as "the process of calculating or measuring a student's work and assigning a letter grade" (Speck & Jones, 1998, p. 18), and is an inherent part of a faculty member's job. Learning is also important in a course, and two types of learning are declarative knowledge and skill acquisition (Noe, 1986). Declarative knowledge is cognitive, while skill acquisition can involve application. Both components, grading assessment and learning outcomes, are typical components in a course syllabus (Smith & Razzouk, 1993). Different grading assessment items are typically detailed in a course syllabus, often including quizzes, examinations, individual papers, group papers, presentations, participation, and attendance (Holmes & Smith, 2003; Smith & Razzouk, 1993). Across all types of class delivery modes (e.g., face-to-face, hybrid or blended, online), graded assignments can take a variety of forms (e.g., in-class or take-home, individual or group), as well as involve discussion boards or blogs (Comer, Lenaghan, & Sengupta, 2015).

Review of the Literature: Grading Fairness Assessment
Holmes and Smith (2003) studied grading fairness perception using a single "open or creative" GALP item, an essay assignment, versus a more "closed or structured" GALP item, quantitative problem solving. For their study, the most common fairness complaint concerning essays was that instructors provided either minimal or no feedback with the essay grade, while for problem solving the highest frequency complaint was instructors not giving partial credit. Pepper and Pathak (2008) found that three aspects of grading, i.e., explicitness of grading criteria, frequency of feedback, and proactive instructor techniques (e.g., opportunities for discussion) positively influenced perceived fairness of classroom participation grading. Bacdayan and Geddes (2009) found that undergraduates' perceptions of quiz fairness were positively influenced by: unbiased grading practices, useful and quick instructor feedback, and students' preparation boosting quiz grades, while quiz fairness was negatively influenced by surprise questions.
Other grading fairness assessment research has been more general. Sabini and Monterosso (2003) used three different scenarios to explore college students' perception of grading fairness, where fairness allowed for deviating from grades being strictly based on performance. The first scenario showed that effort should also be rewarded, the second scenario showed students believed in an exam re-take opportunity if an unforeseen event interrupted study, and the third scenario revealed students' believing that learning disabilities should be accommodated. Stewart Wingfield and Black (2005) used a passive versus active course design to test impact on student outcomes. The passive course design was traditional lecture, while there were two active designs, participative and experiential. Grading evaluation in the passive course design was done via traditional multiple choice exams and in the experiential course design grading evaluation was done using case studies and self-assessments. In the participative course design, students had the greatest input into the grading evaluation criteria, which included group work, presentations, in-class discussions and in-class exercises. Results showed that while students perceived both active course designs to be more useful to their future careers than the passive design, there were no differences in student grades, course satisfaction or perceptions of how each course was conducted.
Across eight potential classroom situations using a sample of business undergraduates, Duplaga and Astani (2010) explored which classroom policy statement within each situation provided the fairest treatment for all students in class. The eight situations were: attendance, collection and grading of homework, extra credit, late assignments, make-up exams, make-up quizzes, sanctions imposed for cheating on a quiz, and sanctions imposed for cheating on an exam. Given the understandable lack of student agreement about fairest treatment within each policy, the authors invited readers to examine student fairness perceptions for each situation to compare to their own classroom policies.

Measuring GALP
Prior research has measured undergraduates' grading fairness perceptions of specific course components. An unanswered research question, however, is whether valid individual GALP items can be successfully combined into reliable scales. Finding factor-analyzed, reliable scales to measure GALP would give more confidence that such grading assessment learning perceptions are being accurately measured (Nunnally, 1978). The first research question asked if different valid GALP items could be successfully aggregated into reliable scales.

RQ1: Can reliable GALP scales be developed?

Explaining GALP Using Successively Added Variable Sets
Given the lack of research focusing on understanding antecedents of student GALP, prior research on general models of student outcomes, such as perceived learning (Arbaugh, 2005), persistence towards graduation (Reason, 2009, p. 661), and development (Blau & Snell, 2013; Sandoval-Lucero, 2014), was reviewed. These models generally propose an increasing impact of independent variable sets for explaining the dependent variable or outcome. Based on the distal (less impactful) to proximal (more impactful) ordering of variable sets across these models, four variable sets were tested for their successively increasing impact in explaining GALP: background, college-related, professional development, and motivational. The specific variable examples used within each variable set were primarily adapted from Blau and Snell (2013, p. 693). Based on their model, these were: (1) background variables (i.e., gender, race, highest parent education), then (2) college-related variables (i.e., grade point average or GPA, average hours worked/week, average hours on course work/week), followed by (3) professional development variables (i.e., joining a student professional organization or SPO, SPO meetings attended, internships completed) and finally (4) motivation variables (i.e., motivation to attend, satisfaction).
Given the exploratory nature of this study, three additional variables were measured: in-state resident; general type of major; and graduate in four years. Whether a student was an in- versus out-of-state resident was collected as a background variable. Since the university receives state funding, out-of-state students have a higher tuition rate, so could this affect GALP? General type of major was measured as either quantitative or non-quantitative (qualitative). This was explored for impact on GALP. Finally, whether the student graduated in four years (Yes/No) was included in the college-related variable set to see if this affected GALP. This leads to the second research question (RQ2):
RQ2: Will the four variable sets of background, then college-based, then professional development and finally motivational account for increasing variance in undergraduates' GALP scales?

Samples and Procedure
The sample of undergraduate business students came from the business school in a large state-supported urban university in the Mid-Atlantic region of the United States. This university had a fall, 2015 total student enrollment of more than 39,000, including 6,661 undergraduate business students. Senior business students were required, as part of their graduation application, to fill out an on-line Senior Student Satisfaction Survey (SSSS). Study measures were part of the SSSS and data were collected twice, first in fall of 2015 for 345 graduating seniors, and then in spring of 2016 for 770 graduating seniors. This difference in sample size is not unusual since most seniors graduate in the spring semester. However, given the linkage to graduation, response rates in each semester were over 85%. Across both samples, 94% of the respondents were full-time students (taking at least 12 credits), and 95% took their courses at the Main Campus, as opposed to smaller satellite campuses. The university institutional review board (IRB) approved the research. A general demographic breakdown of each sample is provided in Table 1. Overall there is consistency between the fall and spring samples. Not surprisingly, the spring sample has a higher "graduate in four years" percentage than the fall sample.
b Spring, 2016 demographic variables sum to n = 770, including missing data.

Measures
The independent variable measures are broken down into four variable sets: background; college-related; professional development; and motivation.

Student background variables.
Four variables were measured: gender; highest parent education level; race; and in-state resident. The race and in-state resident data were based on student records. The response categories and percentages are shown in Table 1 for each sample.

College-related variables.
Five variables were measured: accumulated grade point average (GPA); major; average hours worked per week; average hours per week spent on course work outside the classroom; and four-year graduation (Yes/No). The GPA, general type of major and four-year graduation data were based on student records. Each student's primary major was taken from student records. This was then recoded into one of two general types of major categories, either quantitative or non-quantitative based on business school guidelines. A quantitative major requires more courses using/applying formulas, statistics and mathematics, while a non-quantitative or qualitative major has fewer courses with this emphasis. Using this general distinction, the following six majors were classified as quantitative: Accounting, Actuarial Science, Economics, Finance, Management Information Systems, and Risk Management and Insurance. The other majors, i.e., Business Management, Entrepreneurship, Human Resource Management, International Business, Legal Studies, Marketing, and Real Estate, were classified as non-quantitative or qualitative.
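The recoding of primary major into a general type of major can be sketched as a simple lookup. This is only an illustration, not the procedure actually used on the student records: the function name `major_type` is hypothetical, and the numeric codes (1 = quantitative, 2 = non-quantitative) follow the coding reported in the regression table notes.

```python
# Majors classified as quantitative in the text above.
QUANTITATIVE_MAJORS = {
    "Accounting", "Actuarial Science", "Economics", "Finance",
    "Management Information Systems", "Risk Management and Insurance",
}

# Majors classified as non-quantitative (qualitative) in the text above.
NON_QUANTITATIVE_MAJORS = {
    "Business Management", "Entrepreneurship", "Human Resource Management",
    "International Business", "Legal Studies", "Marketing", "Real Estate",
}


def major_type(primary_major):
    """Return 1 for a quantitative major, 2 for a non-quantitative major.

    Hypothetical helper; unknown majors raise an error rather than being
    silently assigned to a category.
    """
    if primary_major in QUANTITATIVE_MAJORS:
        return 1
    if primary_major in NON_QUANTITATIVE_MAJORS:
        return 2
    raise ValueError(f"Unclassified major: {primary_major!r}")


codes = [major_type(m) for m in ["Finance", "Marketing", "Economics"]]
```

Raising an error for an unlisted major is a design choice made here for the sketch; the source does not say how unlisted or double majors were handled.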

Professional development variables.
Three variables were measured: when did a student first join a student professional organization or SPO (when join SPO); how many SPO meetings did a student attend on average during a semester (attended SPO meetings/semester); and how many formal internships or co-ops did a student complete while at the university (number of internships). When join SPO was measured from 1 = never to 5 = as senior, and the full response category breakdown for each sample is shown in Table 1. Attended SPO meetings/semester was measured as: 0 = none, 1 = 1-3 per semester; 2 = 4-6 per semester; 3 = 7-9 per semester; 4 = 10-12 per semester; and 5 = 13 or more per semester. Number of internships was measured as 0 = none, 1 = 1, 2 = 2, 3 = 3, 4 = 4, and 5 = 5 or more.

Motivation variables.
Two multi-item variables were measured, academic motivation to attend and satisfaction/reputation. Items were aggregated into a scale and divided by the number of items so that the scale mean reflected the response scale. Academic motivation to attend used the general referent, "rate the importance of the following items in regard to why you chose to attend this business school," on a 6-point response scale, where 1 = strongly unimportant to 6 = strongly important. The four items were: specific majors, professors, business school reputation, and job opportunities.
Given the lack of prior research on aggregating GALP items, to test RQ1 exploratory factor analysis (EFA) was done using the fall sample, followed by confirmatory factor analysis (CFA) using the spring sample. The goal of RQ1 was to create reliable GALP scales to then use for RQ2. Descriptive statistics (mean, standard deviation) and correlations were reported for continuously measured variables for each sample. Testing for RQ2 was done using stepwise regression analyses. Stepwise regression analyses are appropriate to test the significance of the incremental variance in dependent variables explained by each added independent variable set (Stevens, 1992). Based on prior general theory and research (Arbaugh, 2005; Blau & Snell, 2013; Reason, 2009), the background variables were entered as Step 1, followed by the college-related variables in Step 2, the professional development variables were added to the model in Step 3, and finally the motivation variables were added in Step 4.
Race was recoded into a binary variable, i.e., white versus non-white, for the regression analyses (Stevens, 1992). Only the final full regression models will be reported for the fall and spring samples. Regression models were checked for outliers (standardized residuals greater than three). In each sample, one outlier was deleted. It was determined that the assumptions of no multicollinearity, linearity, and homoscedasticity were satisfactorily met (Stevens, 1992).
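The ordered, set-by-set entry of predictors described above can be illustrated with a short sketch that reports the cumulative R² and the increment in R² at each step. This is not the SPSS procedure used in the study: the function names `ols_r2` and `incremental_r2` are hypothetical, the data are made-up toy values, and the ordinary least squares fit is a plain-Python solution of the normal equations chosen only to keep the example self-contained.

```python
def ols_r2(X, y):
    """R-squared of an ordinary least squares fit of y on the columns of X."""
    n = len(y)
    A = [[1.0] + list(row) for row in X]  # design matrix with intercept
    p = len(A[0])
    # Augmented normal equations: (A'A | A'y)
    M = [[sum(A[i][r] * A[i][c] for i in range(n)) for c in range(p)]
         + [sum(A[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    beta = [M[r][p] / M[r][r] for r in range(p)]
    fitted = [sum(b * a for b, a in zip(beta, row)) for row in A]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def incremental_r2(blocks, y):
    """blocks: ordered list of (set name, list of predictor columns).

    Returns (set name, cumulative R-squared, change in R-squared) per step.
    """
    results, cols, prev = [], [], 0.0
    for name, block_cols in blocks:
        cols = cols + block_cols
        X = [list(vals) for vals in zip(*cols)]
        r2 = ols_r2(X, y)
        results.append((name, r2, r2 - prev))
        prev = r2
    return results


# Toy data (hypothetical, eight respondents), one predictor per set.
race = [1, 2, 1, 2, 1, 2, 1, 2]
gpa = [3.1, 2.8, 3.5, 3.0, 3.9, 2.5, 3.3, 3.6]
satisfaction = [4, 5, 3, 5, 6, 2, 4, 5]
open_galp = [4.0, 4.3, 3.7, 4.7, 5.3, 2.7, 4.0, 4.7]

steps = incremental_r2(
    [("background", [race]),
     ("college-related", [gpa]),
     ("motivation", [satisfaction])],
    open_galp,
)
for name, r2, delta in steps:
    print(f"{name}: R2 = {r2:.3f}, change in R2 = {delta:.3f}")
```

Because each step's model nests the previous one, the cumulative R² cannot decrease, which is why the increment at each step can be read as the additional variance explained by that variable set. Testing whether each increment is statistically significant would require an F test on the change in R², which is omitted here.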

GALP Scale Development
The EFA results for the eight GALP items using the fall sample are reported in Table 2. Due to missing data, the sample size dropped to 257 respondents. Follow-up analysis did not show any significant relationships between student background variables and the eight GALP items, suggesting that the data were missing at random. Using a principal components analysis, along with a scree test (Stevens, 1992), two factors were indicated. There were three factors with eigenvalues over one. However, the third factor could not be interpreted. Using varimax rotation (to maximize factor independence) and the criterion of at least a .60 item loading on a factor, along with no double loading complications, three items cleanly loaded on each of the two factors. The two factors accounted for 54% of the total variance. Inspection of the three items loading on the first factor (i.e., multiple choice question exams/quizzes; in class exams and quizzes; take home exams/quizzes) suggested a "closed/structured" GALP factor. Inspection of the three items loading on the second factor (i.e., open-ended question exams/quizzes; written assignments, such as case analyses, essays, journals; presentations, including oral/visual, Power Point) suggested an "open/creative" GALP factor. However, two items, online message boards or blogs and participation/attendance points, did not load sufficiently on either factor and could not be used in further scale development. Based on these EFA results, using a three-item "closed GALP" scale, a coefficient alpha of .67 was found, while a coefficient alpha of .72 was found using a three-item "open GALP" scale.
a General referent for all items: "I find the following testing methods best reflect my course knowledge and skills", responses from 1 = strongly disagree to 6 = strongly agree.
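For readers less familiar with the reliability statistic reported above, the following sketch shows how coefficient alpha and a mean-based scale score can be computed for a three-item scale. The responses are hypothetical toy values on the 1 to 6 response scale, not study data, and the function names `cronbach_alpha` and `scale_score` are illustrative.

```python
from statistics import variance


def scale_score(item_responses):
    """Scale score as the mean of the items, so it stays on the 1-6 metric."""
    return sum(item_responses) / len(item_responses)


def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Variance of each item across respondents
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Variance of the summed scale across respondents
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses from five students to three items of one GALP scale.
toy_rows = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 6],
    [3, 2, 3],
    [6, 5, 5],
]

alpha = cronbach_alpha(toy_rows)
scores = [scale_score(r) for r in toy_rows]
print(f"alpha = {alpha:.2f}")
```

With only three items per scale, alpha is sensitive to each inter-item correlation, which is one reason short scales often fall below the conventional .70 benchmark.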
Given these EFA results, for the spring data, confirmatory factor analysis (CFA) was used for these six items, testing whether the same three items again loaded on each factor. Using CFA, the following fit statistics of the six items to the two GALP constructs, i.e., open and closed, were found: χ²(7, N = 770) = 116.67, p < .05; Adjusted Goodness of Fit (AGFI) = .92; Comparative Fit Index (CFI) = .94; Root Mean Square Residual (RMR) = .06; and Root Mean Square Error of Approximation (RMSEA) = .10. Thresholds for acceptable fit (Bentler, 1990) should be at least .90 (AGFI, CFI) and less than .08 for error measures (RMR, RMSEA). Three of the four indices (AGFI, CFI, and RMR) met these thresholds; RMSEA did not. The coefficient alpha for the three-item open GALP scale was .67, and it was .61 for the three-item closed GALP scale. Overall, the EFA and CFA results support creating three-item "open" and "closed" GALP scales.
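The comparison of the reported fit indices against the stated thresholds can be written out explicitly. This sketch only re-checks the values given in the text; the helper name `meets_threshold` is hypothetical.

```python
# Spring-sample CFA fit indices as reported in the text.
fit_indices = {"AGFI": 0.92, "CFI": 0.94, "RMR": 0.06, "RMSEA": 0.10}


def meets_threshold(name, value):
    """Apply the stated rules: at least .90 for AGFI and CFI,
    less than .08 for the error measures RMR and RMSEA."""
    if name in ("AGFI", "CFI"):
        return value >= 0.90
    return value < 0.08


results = {name: meets_threshold(name, v) for name, v in fit_indices.items()}
n_met = sum(results.values())

for name, ok in results.items():
    print(f"{name} = {fit_indices[name]:.2f}: {'met' if ok else 'not met'}")
```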

Descriptive and Correlation Results for Both Samples
Means, standard deviations and correlations for continuous variables for the fall and spring samples are shown in Table 3. Due to missing data across variables, the fall sample size decreased to N = 238. The variable means are fairly consistent between each sample. Looking at the correlations, the fall sample correlations are shown below the diagonal divider (----), while the spring correlations are above the divider. There are more statistically significant correlations for the spring sample since its sample size is almost three times larger than the fall (N = 691 for spring, N = 238 for fall). However, the magnitude of the correlations is generally consistent for the fall versus spring samples. For example, the correlation (r) between the closed GALP and open GALP scale is .24 for the fall and r = .25 for the spring sample. Although statistically significant, the overlap (r²) between the GALP open and closed scales is only 6% across both samples, further supporting each is distinct and can be used as separate dependent variables in the regression analyses.
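The 6% overlap follows directly from squaring the two reported correlations, as this short worked check shows. Only the two r values come from the text; the variable names are illustrative.

```python
# Reported correlations between the closed and open GALP scales.
r_fall = 0.24
r_spring = 0.25

# Shared variance is the squared correlation.
shared_fall = r_fall ** 2
shared_spring = r_spring ** 2

print(f"fall: {shared_fall:.1%} shared variance")
print(f"spring: {shared_spring:.1%} shared variance")
```

Both values round to 6%, so roughly 94% of the variance in each scale is not shared with the other.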

Final Regression Model Results for Both Samples
The final regression models for the fall and spring samples are shown separately in Tables 4 and 5, respectively. Additional missing data in the fall sample lowered the sample size to N = 221, and with one outlier deleted the sample size used was N = 220. Given that the sample size used for the fall regression analyses was less than one-third the size of the spring sample (N = 690), three levels of significance were used for interpreting the results, and indicated as: + p < .10, *p < .05, and **p < .01 (all two-tail).
a Gender, (1 = male, 2 = female); b Highest Parental Education, (1 = some high school, 2 = high school diploma, 3 = some college, 4 = associate degree, 5 = four year degree, 6 = graduate/professional degree); c Race (1 = non-white, 2 = white); d In-State Resident, (1 = no, 2 = yes); e General type of major, (1 = quantitative, 2 = non-quantitative); f Graduate in four years, (1 = Yes, 2 = No); g When Join SPO, (1 = never, 2 = as freshman, 3 = as sophomore, 4 = as junior, 5 = as senior).
Table 5. Final stepwise regression models for incrementally testing the contributions of variable sets for explaining closed versus open grading assessment learning perceptions (GALP), Spring, 2016 Sample

Highlighting key fall sample results in Table 4 first, the background variables set accounted for 5% of the variance, and within this set only race (b = -.29, non-white higher) was significantly related to closed GALP. The only other variable related to closed GALP was satisfaction/reputation (b = .24) within the motivation variables set. Overall, 12% of the variance in closed GALP was explained across all variables. For open GALP, record-based GPA (b = .28) had a marginally significant effect. The motivation variables set accounted for 12% of the open GALP variance; satisfaction/reputation (b = .29) was significant and academic motivation to attend (b = .12) was marginally significant. Overall, 17% of the variance was explained in open GALP.
Looking at the spring sample results in Table 5, some results consistent with the fall sample were found. The background variables set accounted for 4% of the variance and again, race (b = -.30, non-white higher) was significantly related to closed GALP. The professional development variables set accounted for an additional 2% of the variance and two variables within this set were negatively significant (attended SPO meetings/semester, b = -.05; and number of internships, b = -.11). The motivation variable set accounted for 15% of the variance in closed GALP and both variables were positively significant (academic motivation to attend, b = .08; satisfaction/reputation, b = .37). Overall, 22% of the variance in closed GALP was accounted for. For the spring sample, within the background variables set, gender (b = -.14, males higher) and race (b = -.16, non-white higher) were significantly related to open GALP. The college-related variables set accounted for 3% of the open GALP variance, and both record-based GPA (b = .29) and graduate in four years (b = .13, not graduate higher) were significant. Finally, the motivation variables set accounted for an additional 13% of the variance and both variables (academic motivation to attend, b = .09; satisfaction/reputation, b = .32) were significant. Overall, 18% of the variance was explained in open GALP.

Discussion
To our knowledge, this is the first empirical study of business undergraduate GALP. Two reliable three-item GALP scales were developed, i.e., open and closed, which supported the first research question. For the second research question, variable sets were tested using stepwise regression analyses in separate fall and spring samples. Separate samples allowed for testing whether results could be replicated. More support was found for the second research question using the spring sample, with the caveat that the larger spring sample size allowed for more power to detect relationships. Given the lack of prior research, this study and its findings are best regarded as "exploratory". The most consistent correlate across samples and GALP scales was satisfaction/reputation, such that undergraduates with higher perceived satisfaction/reputation had higher closed and open GALP. However, given the cross-sectional research design, causality cannot be determined, so it is also feasible that higher closed and open GALP lead to higher satisfaction/reputation. The positive relationships between satisfaction/reputation and both types of GALP may reflect a general "self-fulfilling prophecy", with happy (unhappy) students being more likely to perceive adequate (inadequate) assessment methods.
Race had a significant impact in three of the four GALP regression models (two scales across two samples), with white students having lower GALP than non-white students. However, given the discrepancy in sample sizes (the white samples were several times larger than any other racial group), along with the heterogeneous mix of non-white students, further study is needed.
For the spring sample, the findings that attending more SPO meetings per semester and completing more internships were each negatively related to closed GALP also need additional follow-up.

Study Limitations and Implications for Future Research
Although both types of factor analyses supported the "closed" versus "open" GALP scales, and correlation analyses showed their independence, the scale internal consistency reliability (coefficient alpha) of each three-item scale was less than ideal across the fall and spring samples, i.e., closed: .67 and .61; open: .72 and .67. Ideally, scales should have a reliability of at least .70 (Nunnally, 1978). In addition, two items had to be dropped in the initial EFA, due to insufficient factor loadings, i.e., online message boards or blogs, and participation/attendance. Going forward, separating out class participation, as an "open" GALP item, versus class attendance, as a "closed" GALP item, may be useful (Stewart Wingfield & Black, 2005). In addition, perhaps separating blogs as an open GALP item versus online message boards may be useful. However, vetted BBA core (required) course syllabi indicated that some message boards were graded pass/fail while others had a formal grading rubric.
In addition, team assignment GALP items (e.g., group paper, group presentation) were not included. Ideally, adding factor-supported items to both the closed and open GALP scales should strengthen their internal consistencies, and could allow for greater discriminant validity in explaining GALP scales. It is important to note that these GALP scales were generated based upon quantitative and qualitative (non-quantitative) BBA core (required) course syllabi. Sample size permitting, it would be interesting to compare GALP for other types of classes, including major and capstone. If available, student SAT scores might be linked to GALP, for example students with higher SAT math (verbal) scores might prefer closed (open) GALP. Both samples were composed of full-time undergraduate business students at an urban-based public university. Testing the generalizability of these scales using non-business and part-time college students in other college settings, e.g., private, rural, would be beneficial.

Practical Implications
Perceived fairness of an instructor's grading policies is a typical item asked for in an undergraduate teaching evaluation (Peterson, Berenson, Misra, & Radosevich, 2008). Nargundkar and Shrikhande (2012) studied over 100,000 student evaluations of teaching effectiveness over four years in the business school at a large public university. They found that two factors, grading assignments (fairness and objectivity of grading practices) and student motivation (the instructor's ability to motivate students), both superseded instructor presentation ability in relative importance as indicators of overall teaching effectiveness. The importance of perceived fair grading is consistent with organizational justice research (Colquitt, Greenberg, & Zapata-Phelan, 2005). For professors, an on-going sensitivity to students' closed versus open GALP can help them to improve/revise their course delivery. Offering students opportunities for mid-course GALP item feedback may allow an instructor to revise GALP items. Such revision may improve an instructor's final course teaching evaluation.
Teaching evaluations can also be an important part of student satisfaction with their BBA program (Holmes & Smith, 2003). The consistent positive relationship of satisfaction/reputation to both closed and open GALP scales supports an institution's business school continuing to assess undergraduate satisfaction with its programs. However, the consistency of this relationship needs to be tested in other school settings, e.g., engineering, liberal arts. In addition, the results support monitoring employer perceptions of student placements, as curriculum changes based on employer feedback may increase the market value of graduating students (Blau et al., 2016). Collectively, these efforts can also positively impact students' motivation to attend a particular business school, as well as an employer's motivation to recruit graduates from this school.

Conclusions
Individual business undergraduate GALP items have not been previously aggregated into scales. Marks, Haug and Huckabee (2016) recently suggested that business undergraduate perceptions of their curriculum can influence their satisfaction and have implications for a business school's strategic recruitment and retention efforts. Part of curriculum perception should involve undergraduates' GALP. This study found a consistent positive relationship between satisfaction/reputation and two GALP scales, open and closed. One goal of this study is to stimulate future efforts to measure GALP, and to further understand its antecedents as well as consequences.
a Fall, 2015 demographic variables sum to n = 345, including missing data.

Table 1. Descriptive statistics for background variables: Fall 2015 and Spring 2016

(Blau, Halbert, Atwater, Kershner, & Zuckerman, 2016) fall sample and .79 for the spring sample. Prior research (Blau, Halbert, Atwater, Kershner, & Zuckerman, 2016) supports this scale. Satisfaction/reputation was measured by aggregating two items: "overall I am satisfied with the Bachelor of Business Administration (BBA) program," and "the reputation of the business school influences your market value to potential employers". Both items used a 6-point response scale, where 1 = strongly disagree to 6 = strongly agree. The coefficient alpha for this scale was .82 for the fall sample and .79 for the spring sample.

Grading assessment learning perception (GALP) items.
Eight items were asked, using the following general referent: please indicate your level of agreement with the following statement: "I find the following testing methods best reflect my course knowledge and skills." A 6-point response scale was used, where 1 = strongly disagree to 6 = strongly agree. The eight items were: multiple choice exams/quizzes; open-ended question exams/quizzes; written assignments (case analyses, essays, journals, etc.); presentations (oral/visual communication, Power Point, etc.); in class exams and quizzes; take home exams/quizzes (online, open book, etc.); online message boards or blogs; and participation/attendance points. These eight GALP items were generated based upon a systematic review of the graded components within quantitative and qualitative (non-quantitative) BBA core (required) course syllabi, to generate full domain coverage of GALP items.

Data analyses.
All data analyses were done using SPSS-PC version 22 (SPSS, 2013).

Table 2. Exploratory factor analysis for grading assessment method item loadings with two-factor extraction and varimax rotation

Table 4. Final stepwise regression models for incrementally testing the contributions of variable sets for explaining closed versus open grading assessment learning perceptions (GALP)