Competency-Based Evaluation in Higher Education—Design and Use of Competence Rubrics by University Educators

Competency-based learning requires changes in the higher education model in response to current socio-educational demands. Rubrics are an innovative educational tool for competence evaluation, for both students and educators. Since their arrival in university systems, the application of rubrics in evaluation programs has grown progressively. However, there is not yet a solid body of knowledge regarding the use of rubrics as an evaluation tool. This study analyzes the use of rubrics by 150 teachers at 5 Spanish universities. The comparative analysis allows us to determine how these rubrics are being used to assess (or not) competencies. This study examines the educators' intentions and the pedagogical aspects that are considered in the design and application of rubrics. The results and conclusions may lead to suggestions for improvements and strategies that may be implemented by university professors when creating appropriate competency-based scoring rubrics.

Within the competency-based approach, a rubric is an evaluation scale used preferentially by teachers (and by students in self-assessment and peer-assessment tasks) to assess competence descriptors. It is based on a series of relevant dimensions that may be assessed quantitatively and qualitatively against a gradual and reasoned scale which, at the same time, should be shared with all participants.
As for the sense and scope of rubrics, their potential lies in their ability to offer accurate assessments of the quality of student work (Blanco, 2008), ensuring that each student is assessed with the same criteria as his or her peers, overcoming arbitrariness, inconsistencies or subjectivity in assessment and, thereby, decreasing the margin of error of the assessment (Chen, Hendricks, & Archibald, 2011; Raposo & Sarceda, 2010; Raposo & Martínez-Figueira, 2014).
Similarly, rubrics favor self-regulation in student learning, allowing students to reflect on the feedback provided, plan their tasks, check their progress and review their work prior to presentation. All of this helps to ensure improved performance and decreased anxiety (Eshun & Osei-Poku, 2013; Panadero & Jonsson, 2013). The assessment goes beyond a mere determination of results, also allowing students to identify their strengths and weaknesses.
Thus, the student's active implication in the self-regulation and development of their own learning has led to major transformations in the manner of planning, developing and evaluating different learning situations. It favors the development of the general and specific competencies included in the new degrees. For example, in the new models of competence development, the curriculum is not structured by theme-based units, but rather by learning activities (Mateo, 2006). For this reason, rubrics are currently used to measure a wide range of higher-order skills or to evaluate assignments such as a long-term project, an essay, an exhibit, lab work, an online course, a demonstration of problem solving, teamwork or a research report, which may vary across disciplines (Al-Zumor, 2015; Asari, Ma'rifah, & Arifani, 2016; Chujitarom & Piriyasurawong, 2017; Mairing, 2017; Lu & Zhang, 2013). Therefore, scoring rubrics are designed to evaluate the quality of a process, not just the quality of a final product. Cruz and Abreu (2014, pp. 41-42) found that scoring rubrics "have a greater impact on the student's education when the designed learning situations: a) involve the selection of tasks or activities that are relevant and significant, b) mobilize and integrate diverse knowledge and skills, and c) are developed in real contexts of professional practice".
In relation to rubric design, some authors, such as Jones, Allen, Dunn, and Brooker (2016), establish a five-step pedagogy to improve student understanding and utilization of marking criteria. These guidelines for action include: (1) deconstruction of the rubric and standardization of the marking method; (2) examples and exemplars; (3) peer review; (4) self-review; and (5) a reflective diary. These steps allow the rubric to be used uniformly by both students and professors to evaluate work quality.
Therefore, according to the new curricular structure based on the development of competency-based activities, it is relevant to ask: what learning activities are being evaluated with rubrics: the typical assimilative and reproductive tasks from more traditional approaches, or tasks that are more focused on simulation, work groups, etc.?
To offer a response to these questions, we have used the activities classification proposed by Marcelo et al. (2014) as a reference: assimilative, information management, application, communicative, productive, experiential and evaluative.
Until now, educational objectives have focused on the classical approach of specific competence acquisition for each discipline; therefore, their definition, development and assessment did not present any difficulties. But this is not the case with generic competencies of a transversal nature. According to Villa and Poblete (2011, p. 151), 'the difficulty in the assessment of competencies may differ based on the competencies themselves, since some of them are more saturated with knowledge, skills and values than others'.
Thus, when considering the classification of generic competencies in the Tuning Project (systematic, instrumental and interpersonal competencies) (González & Wagenaar, 2003), we must ask: what types of generic competencies are most commonly assessed with rubrics: instrumental and systematic competencies (more observable and measurable), or interpersonal skills (with higher levels of reflection)?
Regarding rubric type, Blanco (2008, p. 176) suggested that: The selection of one type of rubric or another depends mainly on the use that is desired from its results; that is, if there is greater emphasis being placed on formative or summative aspects. Other factors to be considered are: the amount of time required; the nature of the task itself; and the specific criteria of the activities being observed.
In this way, knowing the purposes of evaluation makes it possible to identify the conceptions and teaching models of the faculty in relation to the evaluation of competencies. These and other issues represent a changing educational paradigm (with respect to the new competence assessment approaches and tools). With this frame of reference, our study was developed to determine the goals of educators when designing a scoring rubric and to analyze the types of rubrics that are used to support and guide the teaching and learning processes.

Objectives
Knowledge of university educators' assessment practices, based on the analysis of rubric content, should permit the identification of how teachers evaluate competencies, or whether their assessment of disciplinary aspects continues to focus on psychometric principles and declarative contents.
The main objective of this research is to describe, analyze and assess the rubrics that are used by educators in order to determine the level of implementation of the competencies and the disciplinary content of said rubrics.
Therefore, specific objectives may be determined, such as identifying educators' goals in the design of scoring rubrics by exploring: a) the types of tasks that are the subject of assessment; b) the activities that predominate; and c) the generic competencies that are assessed with rubrics.

Participants
150 rubrics from 5 public universities in different regions of Spain were analyzed. The selected universities were chosen for their innovative experiences in the use of scoring rubrics. These evaluation tools were chosen from the subjects included in the virtual platforms of the universities, with the authorization of their authors. A letter was sent to the educator-authors of the scoring rubrics requesting participation in the research and the use of their evaluation tools. In the Spanish university panorama, public universities predominate over private universities. It should be noted that in 2015, 89.2% of university students were registered in public universities, as compared to 10.8% in private ones (MECD, 2015). Furthermore, until very recently, the Spanish university system was quite homogeneous and coordinated, and this tradition continues to dominate in many manners of acting (Mora, 2009). For both of these reasons, the sample of rubrics and universities is not considered a poor representation of the Spanish university panorama in this area.
Sampling was intentional, with rubrics selected according to the following criteria: public access and compliance with certain minimum requirements (identification data and the basic elements of rubrics), according to the specialized literature (Buján, Rekalde, & Aramendi, 2011; Goldberg, 2014; Popham, 1997; Wiggins, 1998).

Information Collection and Analysis Procedures and Techniques
This study combines quantitative and qualitative research methodology. The quantitative perspective is framed within the ex post-facto design studies (McMillan & Schumacher, 2014), whereas the qualitative perspective is established via procedures of inductive-deductive categorization of the analysis units (rubrics) (Miles, Huberman, & Saldaña, 2014).
In addition to this process, there was a qualitative analysis of the application of some categories over certain types of rubrics, highlighting some relevant results from a substantive point of view. So, the category system was created through the qualitative analysis of the rubric content (Denzin & Lincoln, 2012). During the first phase of the qualitative analysis, a reduction was made in the amount of information used to construct the category systems. This phase is described below since it is considered key to defining the categories that are quantitatively analyzed in the results section.
This phase consists of a series of processes that interact with one another:

• There was a separation of information units based on distinct topic-based criteria. These criteria arose from the proposals of authors who explained the sense and scope of the rubrics.

• The units were identified and classified based on a mixed classification model (Denzin & Lincoln, 2012). Based on this model, pre-defined (deductive) categories were derived from the specialized literature on rubrics, and ad hoc (inductive) categories were constructed from observation of the rubrics themselves.
• From the mixed (deductive-inductive) categorization process, a synthesis and grouping of the units was carried out, forming a system of categories for the collection of rubric content. Based on the assessment indicators observed in the rubrics, a category system was established; the system that was finally used in the rubric data collection is presented below.

Rubrics are didactic innovation tools for formative and summative assessment (Cebrián, 2014; Conde & Pozuelo, 2007), as well as for the orientation and evaluation of educational practice (Crotwell-Timmerman et al., 2011). Therefore, teachers consider them useful tools for evaluating the quality of student reports for a wide range of materials and activities (Blanco, 2008; Ion & Cano, 2011), as well as guides that help orient students when presenting reports or making final revisions prior to report completion (Buján et al., 2011).
Category 1. Student reports

Below is a classification of the types of rubrics included in the reports category (see Table 1).

Problems: Develop procedures for problem resolution (mathematical or calculation-based).
Work dynamics: Acquire skills such as collaboration, leadership, initiative, etc.
Simulation situations: Manage situations similar to professional contexts.
Audio-visual or graphic resources: Design bookquests, monographic posters, etc.
Others: Acquire specific skills for a discipline (e.g. clinical exploration).

Category 2. Activities
According to Fernández-March (2006, p. 53) activity is defined as 'the achievement of that which is intended of the students". This author suggests that activities are units of action within the teaching-learning process, including formative objectives such as the actions of both teachers and students.
On the other hand, Marcelo et al. (2014) proposed a classification of activities (assimilative, communicative, experiential, etc.) based on the analysis of learning sequences in university teaching. From this proposal, the activities category was created based on the types of activities assessed in the analyzed rubrics (see Table 2).

Information management: Seeking out, contrasting, analyzing and understanding the information related to a problem.
Application: Resolve problems by applying formulas and studied content, e.g. solving a practical case, a practical laboratory application, etc.
Communicative: Present and defend a work, argue and exchange information, group dynamics and teaching strategies, e.g. brainstorming.
Productive: Design and apply a device, document or resource (web, field work diary, project, trial, etc.).
Experiential: Operate in environments related to the future professional field (hospital, educational center, company, etc.).
Evaluative: Respond to questions following a class session, self-evaluate and peer-evaluate works, take a practical/theoretical test.

Category 3. Generic competencies
This category was created based on the indicators of the generic instrumental, interpersonal and systematic competencies in the Tuning Project (González & Wagenaar, 2003, p. 81). Table 3 presents the evaluated elements from each subcategory identified in the rubric analysis. Miguel (2006) suggests the importance of assessing the development of competencies by evaluating all of their components (knowledge, skills and abilities, attitudes and values) in an integrated form, without forgetting the assessment of those skills or procedures that may appear to be unrelated to professional performance.
Therefore, the concept of competency includes academic education and professional development, allowing future workers to be more than mere experts (with knowledge) in one or more specialties. The classifications presented in the previous tables (1 to 4), which were created based on the qualitative analysis, allow for the creation of register sheets based upon which the collected rubrics were categorized (n = 150).
According to this classification, frequency and percentage tables were created in order to determine which categories and components were most frequently used by teachers in the rubrics. In addition to frequencies and percentages, each table contains a χ² test to examine the extent to which the observed frequency distributions differ from those expected at random. Where they do, the greater (or lesser) frequency of some categories over others may be considered significant in the population that the sample represents.
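The goodness-of-fit contrast described above can be sketched in a few lines. The counts below are purely illustrative (they are not the study's data), and the uniform expected distribution is an assumption based on the "randomly anticipated" frequencies mentioned in the text:

```python
def chi2_gof(observed):
    """Chi-square goodness-of-fit statistic against a uniform expected distribution."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

# Hypothetical example: 150 rubrics spread unevenly over 3 subcategories.
stat = chi2_gof([90, 40, 20])
df = 3 - 1
# stat = 52.0 with df = 2, far above the p < 0.001 critical value (13.816),
# so such an uneven distribution would be judged non-random.
print(stat, df)
```

A perfectly even spread (e.g. 50/50/50) yields a statistic of 0, which is why large χ² values in the tables signal that teachers systematically favor some categories over others.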

Results
Below are the quantitative analyses of frequencies and percentages for each category, including the χ² tests, used to study the distributions of rubric characteristics and, therefore, to determine the objectives pursued by teachers in the design and application of said rubrics. As we shall see, the χ² tests for every category were significant (one contrast at p = 0.032, the rest at p < 0.001), suggesting that the observed differences between sub-categories were not random.

Student Reports
In Table 5, referring to the student reports, the most frequently evaluated tasks are written documents (36%) and oral presentations (14.7%), both individual and collective. On the other hand, the elements least frequently evaluated with the rubrics (2.7% each) are the capacity to construct data collection tools and the ability to design audiovisual and graphic resources.
These results suggest that educators may be using rubrics to evaluate the acquisition of declarative skills and abilities (Falchikov, 2005), such as the ability to write a report. Similarly, a review by Panadero and Jonsson (2013) revealed that most of the evaluated tasks were written documents, although oral presentations and projects were also assessed.
Thus, cognitive skills such as the interpretation of complex images, the analysis of diverse information, the integration of information sources, the development of strategies for managing information, tools and data, and other similar skills have been relegated to a secondary position, in favor of professional education-based teaching (Bartolomé & Grané, 2013). Note. χ² = 111.5, df = 8, p < 0.001.

Learning Activities
As seen in Table 6, productive activities are the most frequently evaluated (28%) as compared to those of an experiential nature (3.7%). Thus we find a potential lack of the principles of applicability and transferability, as promoted by the EHEA. On the other hand, the growing presence of information and communication technologies (ICT) is leading educators to use rubrics to evaluate a variety of activities related to information management (24.9%). Note. χ² = 217.22, df = 6, p < 0.001. The percentage (%) does not total 100 due to the lack of mutual exclusivity of the categories.

Generic Competencies

Table 7 reveals that rubrics which collectively contain the instrumental, interpersonal and systematic competencies are the most frequent (48.7%). Next are those rubrics that exclusively assess the systematic competencies (30%). Rubrics that assess the instrumental competencies in combination with the interpersonal ones are the least frequently used (4.7%). It should be noted that rubrics containing no generic competence are also included (16.7%).

Therefore, it may be concluded that educators use rubrics to evaluate generic competencies and do so in a combined manner; in other words, they consider the possibility of evaluating the three competence types (instrumental, interpersonal and systematic) in one and the same rubric.
On the other hand, the results also suggest that some educators may use these rubrics only to evaluate systematic competencies, and that interpersonal competencies are not being considered in the new study plans. These results further suggest that a considerable number of educators use rubrics to assess disciplinary content, without considering the relevance of the development and assessment of competencies. Note. χ² = 64.08, df = 7, p < 0.001.

Generic Competencies Descriptors
In this subsection we describe the results for each of the three types of generic competencies: instrumental, interpersonal and systematic.
First, as seen in Table 8, the most frequently evaluated instrumental competence descriptors are the basic skills of handling a computer (61.3%) and the information management skill (59.3%). On the other hand, the least frequently evaluated is that of knowledge of a second language (6.7%).
These results indicate that the growing presence of ICT is leading teachers to use rubrics to evaluate a variety of tasks related to the use of computerized applications and information management. They also highlight the scarce attention being paid to the learning of a second language, despite it being a basic tool for student mobility, as well as for access to and exchange of knowledge between countries. Note. χ² = 120.92, df = 10, p < 0.001. The percentage (%) does not total 100 since the categories are not mutually exclusive.
Second, in Table 9 we find that the most frequently evaluated interpersonal competence descriptor is the ability to work in a team (84.7%), and that the least frequently evaluated ones are the ability to communicate with experts from other areas, the appreciation of diversity and multiculturalism, and the ability to work in an international context (0%).
The results related to teamwork suggest that teachers find this competence essential for their students' future professional development.
On the other hand, the results regarding the less frequently evaluated competencies suggest that university teaching does not consider the development of the student's ability to communicate, relate and collaborate effectively with experts from different specialties and in diverse academic and professional contexts. Note. χ² = 539.98, df = 7, p < 0.001. The percentage (%) does not total 100 since the categories are not mutually exclusive.
Finally, as seen in Table 10, the most frequently evaluated systematic competence descriptors are the ability to learn (81.3%) and the motivation to succeed (78.7%). On the other hand, the least frequently evaluated descriptors are knowledge of cultures and customs from other countries (2%) and leadership (3.3%).
The results from this section indicate that educators consider capacities related to establishing and attaining goals and objectives to be relevant, as well as planning their achievement and monitoring their progress. The motivation to succeed is relevant since it allows students to actively confront situations that involve risks and decisions.
Another priority of educators for their students is the acquisition of the ability to learn, which implies, among other skills, being able to obtain, process and assimilate new knowledge and skills, all while considering their strengths and weaknesses in order to continue learning successfully. Note. χ² = 485.19, df = 11, p < 0.001. The percentage (%) does not total 100 since the categories are not mutually exclusive.

Components of the Competencies
As seen in Table 11, the knowledge and skills components are the most frequently evaluated in the rubrics (60.7%). On the other hand, rubrics evaluating skills and attitudes/values are the least frequent, both collectively (4%) and individually (1.3%). It should be noted that no rubrics collectively evaluated knowledge and attitudes/values (0%). These results also demonstrate that it is rare for a competence component to appear alone.
If a competence is defined as the integration and mobilization of diverse components (knowledge, abilities, attitudes/values), the previous results reveal that the 'competencies model' adopted by teachers in the rubrics takes a simplistic perspective. These results therefore warn that the development and evaluation of competencies is being carried out only in a partial and isolated manner, without taking into account the multi-dimensional nature of competencies. Note. χ² = 311.53, df = 6, p < 0.001.

Results of the Competencies Subcomponents
First, in Table 12 we find that the most frequently evaluated knowledge components are those that are linked to the subject (82%). In other words, those referring to the acquisition, understanding and systemization of the specific knowledge related to the subject matter.
On the other hand, the least frequently evaluated knowledge components are those referring to professional aspects (32.7%): in other words, skills related to applying and using knowledge to solve professional problems, knowledge application and transfer, general or specific procedures for practical situations, work organization and planning, etc.
Based on these results, we find a lack of focus on knowledge and educational experiences provided by the university system in terms of professional and labor practices. Universities therefore need to maintain a balance between professional training, as demanded by the labor market, and academic education, as required to ensure basic knowledge of the distinct disciplines. Note. χ² = 33.63, df = 2, p < 0.001. The percentage (%) does not total 100 since the categories are not mutually exclusive.
Second, in Table 13 we find that the most frequently evaluated skills are the intellectual ones (88.7%), referring to the capacity to creatively resolve problems by developing reflection, synthesis and evaluation strategies, and to generate, design and implement applied and instrumental knowledge that adjusts to the needs of the real world.
On the other hand, the least frequently evaluated skills are the interpersonal ones (21.3%), related to the ability to listen, argue and respect the ideas of others, to dialogue and work in a team, and to acquire individual and group responsibility.
According to these results, we may suggest that there is an absence of evaluation of interpersonal skills, which are vital for understanding the behavior and desires of others, interacting appropriately and establishing empathy in order to communicate effectively in all aspects of life (academic, professional and personal). Note. χ² = 75.94, df = 3, p < 0.001. The percentage (%) does not total 100 since the categories are not mutually exclusive.
Third, in Table 14 we find that the most frequently evaluated attitudes/values are those of personal commitment (26.7%), related to developing motivation, attention and effort to learn; developing autonomy; having initiative; assessing advantages and disadvantages; making decisions at a personal and group level; taking responsibility; being committed to social change and development; gaining self-confidence, and so on.
On the other hand, the least frequently evaluated attitudes/values are those of personal development (15.3%), referring to responsibility, rigor and systemization; the ability to express feelings, demonstrate appreciation and interact satisfactorily with individuals and groups; viewing the perspectives and contributions of others as learning opportunities; consistency; developing autonomy at work, with instrumental initiative (adjustment, tolerance, flexibility) applicable to a wide range of unpredictable situations; and developing skills related to lifelong learning.
According to these results, we find an imbalance between the two subcomponents, leading us to believe that educators consider it more important to evaluate personal commitment than personal development. These results may occur because the majority of the rubrics do not evaluate competencies and content that involve real or simulated situations of personal development; therefore, these types of attitudes/values are not highlighted. Note. χ² = 4.59, df = 1, p = 0.032. The percentage (%) does not total 100 since the categories are not mutually exclusive.

Conclusions
The analyses carried out and the results of the same have allowed us to identify how educators are carrying out competency-based evaluations. To achieve the general study objective, the degree of implementation of the competencies and disciplinary content of the rubrics was determined.
For the specific objectives, the purposes of the teachers in designing the scoring rubrics were identified, exploring the type of student reports that are the subject of evaluation, the predominant learning activities and the generic competencies that are being evaluated with rubrics.
As a general conclusion, it may be suggested that educators continue to be overly attached to the evaluation of disciplinary and declarative aspects. Below we expand upon these conclusions, providing some recommendations that should be considered in the design and application of scoring rubrics.
First, regarding the learning results (see Table 5), we find that educators should promote education based on competencies that imply a greater integration and mobilization of cognitive resources in real learning environments. Bartolomé and Grané (2013, p. 76) suggest that '[…] [we] should question the manner of teaching, especially of those professors who continue designing their curriculum in terms of acquiring (supposedly stable) knowledge, even at the risk of ignoring certain competencies needed by the students; content that is not only always insufficient, but may also be useless and even false'.
Thus, educators need to reformulate their understanding of 'knowing' and their functions as catalysts for the acquisition of knowledge which may be consistently used, created, duplicated, shared, etc., by students who seek to attain specific knowledge, in a specific time and context.
Second, regarding the types of activities (see Table 6), it is concluded that educators may be using the rubrics to evaluate tasks related to knowledge 'reproduction' (e.g. write a report or essay), as opposed to more experiential character-related tasks (e.g. developing practices in a real context). Thus, the principles of applicability and transferability (as established by the EHEA) may be lacking.
Third, regarding the generic competencies (see Table 7), it has been found that the interpersonal competencies, which are considered to have the same or greater importance in academic and professional education, are often omitted.
Fourth, the results obtained regarding the components and subcomponents of the competencies (see Tables 8-14) similarly show that educators place more value on knowledge linked to a discipline or scientific area, intellectual skills and intrapersonal attitudes/values. From this perspective, it is necessary to establish a balance between the components and subcomponents evaluated in the rubrics, with the goal of offering students an integral education. Educators should therefore focus their efforts on the development of competencies that integrate knowledge and attitudes/values for professional development, as well as the acquisition of interpersonal skills.
According to the above conclusions and recommendations, we can state that if university professors wish to design quality rubrics, that is, rubrics that appropriately evaluate general and specific competencies, they should focus on the pedagogical elements of these evaluation tools. The results and conclusions of this study suggest some of these elements, which are essential in the design and application of rubrics. If professors follow these types of recommendations, rubrics may be considered authentic tools for competency-based evaluation.
From the results, it can be concluded that the evaluation of competencies has implied, in addition to changes in the curriculum, transformations in institutional structures and organizational dynamics.
In this regard, Villa and Poblete (2011) and Ion and Cano (2011) highlight the importance of those responsible for each degree supervising and ensuring competence acquisition and competency-based evaluation in collaboration with educators. It is also essential that all these decisions be agreed upon by all members of the faculty, with each dean and/or degree management body and department being responsible for carrying out the orientation, follow-up and evaluation of the commitments acquired. Similarly, Villa and Poblete (2011, p. 148) add that "compliance with the requirements and quality standards required by agencies and the Administrations themselves must be demanded".
Finally, according to Cano (2008, p.6), taking into account the recommendations of the University Coordination Council, all these changes should be stimulated through "a series of institutional measures to promote (information, motivation, awareness), training and execution (pilot projects, guides, networks, ...)". This author also describes the political and structural initiatives most valued by the rectoral teams as well as by deaneries or university departments such as "the elaboration of a strategic plan; the identification, visualization and dissemination of good practices; the consolidation of training programs and the definition and revitalization of an educational model of their own " (Cano, 2008, p. 7).
As a future line of research, we expect to conduct interviews with the university educators who designed the rubrics in order to determine their impact on the evaluation of competencies, their usefulness in the development of teaching practice, the difficulties and limitations found in their design and application, and the level of satisfaction with their use, among other issues.