Introducing and Evaluating the Behavior of Non-verbal Features in the Virtual Learning



Introduction
Educational systems are looking to e-Learning programs to address challenges in the current traditional education system such as poor quality, high cost and limited access (Carol, 2005). E-Learning offers the possibility to learn irrespective of time and space. Potential contributions of e-Learning programs that can be used to overcome the problems of the real-world class are as follows:
• Shortage of teachers: High-quality teaching materials, such as videos, interactive software or information, can be provided from a "cloud" on the internet or a local computer
• Shortage of learning materials such as textbooks: The material can be made available on hand-held devices such as e-readers or mobile phones
• Shortage of space: No need to congregate at a specific place

• Transportation problems: Anyone can participate from any place (even when participants are in different countries)
Asynchronous and synchronous delivery are the two main ways of delivering knowledge in e-Learning. Asynchronous e-Learning can be effectively used for in-depth discussions that take place over time, role playing, application activities based on case-study scenarios, one-to-one interactions among students and activities that require more independent thinking time. Synchronous e-Learning can be effectively used for showcasing web or computer applications, explaining difficult concepts, delivering lectures via PowerPoint, structured group brainstorming, hosting guest speakers, introducing new topics, community building, and question-and-answer sessions (Johns Hopkins University, 2010). Synchronous e-Learning increases student commitment and motivation, as it provides quick responses through instant messages or by chatting with the learner (Leonard & Shawn, 2006).
The 3D virtual environment enables e-Learning participants to meet, interact and conduct synchronous e-Learning. An avatar is the artificial icon that represents the real user in the virtual environment. The avatar is activated by user commands given through the keyboard and/or mouse. Even though the avatar stands in for the real user in the virtual world, it is questionable whether the avatar makes a fair representation of that user. The appearance of the avatar can have a large effect on the viewer's impression (Theonas, Hobbs, & Rigas, 2007; Alotaibi & Rigas, 2009). Changing the appearance of the avatar to match the real user is one way of enhancing the effectiveness of the virtual class.
Besides, communication is critical in every field, including education. Not only verbal but also non-verbal communication has a huge impact on the communication process; indeed, according to Mehrabian's communication model, non-verbal communication covers the largest portion of the process (Rakesh, Bimal, & Yohannes, 2011). Non-verbal cues have a notable impact on student responses and behavior. Though non-verbal communication is an essential component of conventional learning (Bentivoglio, Bressman, & Cassetta, 1997), the virtual classroom lacks it due to the poor connection between the student and the avatar.
This study aims to overcome two drawbacks of virtual classroom learning, the lack of non-verbal communication and the unconvincing representation of an avatar that is not congruent with the real user, by introducing an avatar that conveys the real user's non-verbal communication. In addition, the variation of non-verbal behavior across virtual learning activities can be identified, which becomes an offshoot of this study. The objectives of this study are to establish a fair representation of the real user through the avatar by introducing non-verbal communication into the virtual learning environment, and to evaluate the behavior of non-verbal features during the virtual learning activities.

Background and Theoretical Framework
Non-verbal communication is a broad area that includes facial expressions, motion of the head, eye movement and gestures, and it is difficult to implement all of these non-verbal behaviors in the virtual class. The eye blink and the head pose were selected for introduction into the virtual class, since the eye blink is an indicator of cognitive load (Bentivoglio, Bressman, & Cassetta, 1997) and the focus of attention can be derived from head pose estimation (Erik & Mohan, 2009). Previous studies are examined and briefly described below (see 2.1 and 2.2) to identify the factors that affect eye blink rate variation and to recognize suitable methods for eye blink detection and head pose estimation.

Eye Blink Behavior and Detection Methods
The behavior of the eye is one of the most potent non-verbal gestures and has the ability to create impressions on people. Previous researchers have investigated the effects of eye contact, gaze and gaze avoidance on impressions.
Eye contact and eye gaze are frequently related to a positive impression, while their avoidance reflects a negative impression (Amy, Murray, & Melinda, 2002). According to previous findings, the mean eye blink rate was 17 blinks per minute at rest; during a conversation it increased to 26, and it was as low as 4.5 while reading. In comparison with rest, the blink rate decreased by 55.08% while reading and increased by 99.70% during conversation, and the most common blink rate pattern was conversation > rest > reading (Bentivoglio, Bressman, & Cassetta, 1997). When eye blink frequency increases, students become more nervous and more careless (Dharmawansa, Nakahira, & Fukumura, 2013). Another researcher reported a significant increase in eye blink frequency when subjects were required to solve anagrams compared with resting. Tasks involving speech or memory increase the blink rate, while those requiring visual fixation reduce it (Bentivoglio, Bressman, & Cassetta, 1997; Recarte, Perez, Conchillo, & Nunes, 2008). Daydreaming, which produces visual fixation, is associated with a low blink rate (Holland & Tarlow, 1975). Factors other than cognitive, visual and memory tasks also influence the blink rate: an increase in blink frequency can be noticed when people are conversing or being interviewed, and increased blink frequency generally reflects negative mood states such as nervousness, stress and fatigue. The blink rate thus depends on cognitive and emotional states (Ponder & Kennedy, 1928). Based on this historical evidence, the eye blink rate depends on the activity in which a person engages, and it can be utilized as a measurement of individual behavior in a group activity.
One of the most commonly used techniques to detect the eye blink is electromyography (EMG), whose readings are obtained using three small electrodes attached to the skin with micro-pore tape around the orbicularis oculi muscle (Anton, 2010). Although an EMG-based system is effective in detecting eye blinks through muscle signals, EMG signal quality is negatively affected by electrical noise. A web-camera-based system offers several advantages over EMG, such as ease of setup, flexible placement of the web-camera, and the fact that nothing needs to be attached to the person. The web-camera is also a cheap and commercially available piece of hardware, which is another advantage of the web-camera-based system (Dmitry, 2004).

Head Pose Estimation
In addition to the eye blink, head pose estimation is helpful in identifying a person's visual focus of attention. Head pose estimation is naturally linked to visual gaze estimation, the ability to characterize the direction and focus of a person's eyes (Erik & Mohan, 2009). Head pose estimation is required by applications such as video surveillance, intelligent environments and human interaction modeling, including the visualization of human characteristics in the virtual world (Gourier, Maisonnasse, Hall, & Crowley, 2007). Several approaches have been used to estimate the head pose, listed as follows (Erik & Mohan, 2009).
• Appearance Template Methods
• Detector Array Methods
• Nonlinear Regression Methods
• Manifold Embedding Methods
• Flexible Models
• Geometric Methods
• Tracking Methods
• Hybrid Methods
Geometric methods have not yet reached their full potential, but modern approaches can automatically and reliably detect facial feature locations, and these approaches should continue to improve (Erik & Mohan, 2009).
In this study, eye blink detection using a web-camera-based system and head pose estimation using a geometric method are employed to introduce the behaviors of non-verbal features into virtual learning via an avatar. In addition, analyses of the behavior of non-verbal features are helpful for observing e-Learner behavior during e-Learning activities.

Architecture of the System
The layout of the whole system is illustrated in Figure 1. The students and the teacher in the real world can enter the virtual world, where all of them get together to conduct e-Learning activities in a specified virtual class. The eye blink and head pose visualization systems are activated by each student when he or she begins to interact with the e-Learning activities. The behavior of the non-verbal features (head pose and eye blink) of the student is detected from a real-time video obtained using a web-camera. The acquired details of each student are transferred to the virtual world through a server and deposited in a database. The transferred information assists in changing the appearance of the avatar. When the students move their heads and/or blink their eyes, those behaviors are reflected in the virtual world. Each participant can view his or her own behavior, as well as that of the others, through their identical avatars in the virtual world even though they are not in the same place in the real world. The procedure to establish the eye blink and head movement of the students consists of several steps, as follows:
I. Detection of the eye blink
II. Estimation of the head pose
III. Modification of the avatar
IV. Creating a link between the virtual world and the real world
All four steps are needed to complete the establishment of the behavior of non-verbal features, and each step is described in the next sections.
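The per-student pipeline described above can be sketched as a small loop over the four steps. This is an illustrative skeleton only: the function names, the stubbed detectors and the dictionary standing in for the server-side database are assumptions, not the paper's actual implementation.

```python
def detect_eye_blink(frame):
    """Step I: classify the current frame as blink / no blink (stub)."""
    return frame.get("eyes_closed", False)

def estimate_head_pose(frame):
    """Step II: return head-pose information for the frame (stub)."""
    return frame.get("pose", "frontal")

def update_avatar(state, blink, pose):
    """Step III: record the new avatar appearance."""
    state["blink"] = blink
    state["pose"] = pose
    return state

def send_to_server(student_id, state, store):
    """Step IV: push the state to the server-side database (here a dict)."""
    store[student_id] = dict(state)

def process_frame(student_id, frame, store):
    """Run the four steps for one captured frame of one student."""
    blink = detect_eye_blink(frame)           # I.  eye blink detection
    pose = estimate_head_pose(frame)          # II. head pose estimation
    state = update_avatar({}, blink, pose)    # III. avatar modification
    send_to_server(student_id, state, store)  # IV. real-to-virtual link
    return state
```

In the real system this loop runs continuously against web-camera frames; here a frame is represented by a plain dictionary so the control flow stays visible.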

Eye Blink Detection System
The procedure to detect the eye blink is briefly illustrated in Figure 2. A real-time video of the e-Learner is obtained continuously using a web-camera. The video consists of a set of frames, and each frame is separated out so that image processing can be applied. The frame contains the upper part of the e-Learner's body if he/she is in front of the computer. The face of the e-Learner is then detected in the frame, using Haar-feature-based cascade classification, to identify whether the person is present. If the face is not detected, the person is considered unavailable in front of the computer and the process restarts from the beginning. When the face is detected, the process continuously detects the eyes. Eye detection is eased by the fact that the location of the eyes can be roughly identified within the face area; the same Haar-feature-based cascade classification is used to detect the eyes. If the eyes are not detected, the frame can be classified as a blink, since the face has already been detected and the eyes are not detected while they are closed. When the eyes are detected, the system must determine whether they are open. Two measurements are used to identify the status of the eyes: the height-to-width ratio of the eye region and the number of black and white pixels within it. The variation of these two measurements helps to identify whether the eyes are open. When the eyes are open, the frame is classified as "not blink"; a blink is identified if the eyes are closed. Having identified the status of the eyes, the procedure restarts from the beginning as a continuous process to detect the e-Learner's eye blinks. This system can identify 81% of eye blinks.
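The decision logic of Figure 2 can be sketched as follows. The threshold values are illustrative assumptions (the paper does not report them), and the Haar-cascade detection itself is abstracted away into boolean inputs so the classification rules stay in focus.

```python
def eye_status(height, width, dark_pixels, total_pixels,
               ratio_threshold=0.25, dark_threshold=0.15):
    """Classify a detected eye region as 'open' or 'closed'.

    Uses the two measurements described in the text:
    1. the height-to-width ratio of the eye region, and
    2. the proportion of dark (pupil/iris) pixels inside it.
    Both threshold values are hypothetical, chosen for illustration.
    """
    ratio = height / width
    dark_fraction = dark_pixels / total_pixels
    # An open eye is taller relative to its width and shows the dark
    # pupil; a closed eye is a thin region dominated by skin pixels.
    if ratio >= ratio_threshold and dark_fraction >= dark_threshold:
        return "open"
    return "closed"

def classify_frame(face_found, eyes_found, status=None):
    """Per-frame blink decision following the flow in Figure 2."""
    if not face_found:
        return "no person"   # restart the detection loop
    if not eyes_found:
        return "blink"       # face present but eyes not detected
    return "blink" if status == "closed" else "not blink"
```

A frame with a detected face but no detected eyes counts as a blink straight away, which is why the cascade's eye-detection failure is informative rather than an error here.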

Head Pose Estimation
The basic steps to estimate the head pose are illustrated in Figure 3. Head pose estimation is also based on the real-time video of the e-Learner. Initially, a web camera captures an image of the student, who marks the face components such as the eyes, nose and mouth; the marked areas are stored as source images for later use. At the same time, the system obtains a threshold value from the image for future use. The web camera then captures the student's face continuously, and the system tries to detect the face using Haar-feature-based cascade classification. Face detection indicates whether the student's view is frontal or not, and the head pose estimation is divided into two parts based on this view. The pitch, yaw and roll angles are derived in degrees in the frontal or near-frontal view; in this case the e-Learner is engaged with the computer and his or her head is directed towards the monitor. The basic head directions, such as up/down or left/right, are obtained from the non-frontal view when the e-Learner is not engaged with the computer.

Identify the Roll, Pitch and Yaw Angles
When the student is in the frontal view, the system identifies the roll, pitch and yaw angles by measuring the locations of the face components. The face components, the eyes, nose and mouth, must be detected in order to establish the relationships among them that identify the angles. Template matching is utilized to detect the face components in each frame, with the face components the user marked in the very first step of the head pose estimation serving as the source images.
The roll angle is identified as the angle between the horizontal line and the line connecting the left and right eyes. The eyes, nose and mouth are needed to identify the yaw angle. A line (the eye-mouth line) is drawn through the center point of the mouth and the midpoint of the line connecting the two eyes. When the nose point lies on the eye-mouth line, the yaw angle is zero; the positive or negative sign of the value identifies whether the yaw direction is right or left. The distance between the nose and the eye-mouth line is proportional to the yaw angle, with the proportionality identified through training images. The pitch angle is obtained from the lengths of the eye-mouth and nose-mouth lines: it is proportional to the ratio of the two values, the first being the length of the eye-mouth line and the second the length of the nose-mouth line. This relationship is recognized by analyzing images. The pitch, yaw and roll angles are identified by this procedure, and the methods are based entirely on the geometric method with a coordinate system. This method achieves 87% accuracy.
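The geometric relations above can be written down directly from the landmark coordinates. This is a minimal sketch under the assumption of 2D pixel coordinates; the paper maps the yaw offset and pitch ratio to degrees via training images, so only the raw geometric quantities are computed here.

```python
import math

def roll_angle(left_eye, right_eye):
    """Roll: angle between the horizontal and the line joining the eyes."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

def yaw_offset(left_eye, right_eye, nose, mouth):
    """Signed perpendicular distance of the nose from the eye-mouth line.

    Zero when the nose lies on the line; the sign gives left/right.
    The paper converts this distance to degrees using training images.
    """
    mid_eyes = ((left_eye[0] + right_eye[0]) / 2,
                (left_eye[1] + right_eye[1]) / 2)
    ax, ay = mid_eyes
    bx, by = mouth
    px, py = nose
    length = math.hypot(bx - ax, by - ay)
    # Cross product of (line vector, nose vector) over the line length.
    return ((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / length

def pitch_ratio(left_eye, right_eye, nose, mouth):
    """Ratio of eye-mouth length to nose-mouth length, which the paper
    maps to a pitch angle via analyzed images."""
    mid_eyes = ((left_eye[0] + right_eye[0]) / 2,
                (left_eye[1] + right_eye[1]) / 2)
    eye_mouth = math.hypot(mouth[0] - mid_eyes[0], mouth[1] - mid_eyes[1])
    nose_mouth = math.hypot(mouth[0] - nose[0], mouth[1] - nose[1])
    return eye_mouth / nose_mouth
```

For a symmetric frontal face the nose lies on the eye-mouth line, so the yaw offset is zero, matching the zero-yaw condition stated in the text.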

Identify the Basic Head Directions
The basic head direction needs to be derived when the user is in a non-frontal view. The image acquired from the real-time video is therefore smoothed and converted from the RGB (Red, Green and Blue) color format, in which the web-camera delivers it, to the HSV (Hue, Saturation and Value) color format, which helps to segment the image based on the variation of colors. A thresholding mechanism is utilized to segment the image based on the sharpness of its pixels, converting each pixel to white (above the threshold value) or black (below the threshold value) using the threshold value identified in the very first step of the head pose estimation. The image then consists of several contours, and the biggest contour covers an area containing the hair of the user's head, as shown in Figure 4(b). The system identifies the basic head direction of the user based on this biggest contour. A rectangle is drawn covering the biggest contour and divided into two parts, right and left; the part holding the larger portion of the white pixels of the biggest contour gives evidence of the head direction. When the left side of the rectangle contains the larger portion of white pixels, the person has turned his/her head towards the left, as shown in Figure 4(a). When the user turns his/her head towards the right, the right side of the rectangle has the larger portion of white pixels, as shown in Figure 4(c). The up and down head directions are identified through the same rectangle divided into upper and lower parts: when the biggest contour appears at the top of the rectangle and the contour size is very large, the user has turned the head downward (the "Down" direction), and the "Up" direction is identified when the biggest contour appears at the bottom part of the rectangle. This system achieves 84% accuracy.
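The left/right rule can be sketched on a thresholded image represented as a grid of 0/1 pixels. This is an illustrative simplification: it treats the whole image as the bounding rectangle and decides only left/right/frontal; the up/down rule based on where the contour sits in the rectangle would follow the same pattern.

```python
def head_direction(binary_image):
    """Infer the basic head direction from a thresholded image.

    `binary_image` is a list of rows of 0/1 pixels where 1 (white)
    marks the largest contour (the user's hair), as in Figure 4.
    The rectangle is split into left and right halves, and the half
    holding the larger share of white pixels indicates the direction.
    """
    rows = len(binary_image)
    cols = len(binary_image[0])
    left = sum(binary_image[r][c]
               for r in range(rows) for c in range(cols // 2))
    right = sum(binary_image[r][c]
                for r in range(rows) for c in range(cols // 2, cols))
    if left > right:
        return "left"
    if right > left:
        return "right"
    return "frontal"
```

In the real system the rectangle would be the bounding box of the largest contour found after thresholding, not the full frame.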
When the face is detected, the system identifies the pitch, yaw and roll angles of the head; otherwise it identifies the basic head directions: right, left, down and up. Having completed the above steps, the web camera obtains the real-time video and identifies the head pose of the user continuously until the user gives the stop command.

Avatar Modification
The eye blink and head movement driven by the user are not available through the standard avatar in the virtual learning environment. Therefore, a head model is created separately and attached to the avatar to represent the eye blink and head movements, as shown in Figure 5. Initially, the head model, comprising the skin template, eyes and eyelids, is constructed in the real world using Maya and exported to the virtual environment. The eyeballs and eyelids are prepared using objects in the virtual environment and attached to the head model, and the complete head model is then attached to the avatar. A rotation mechanism on the eyelids is used to represent the eye blink of the avatar, and a rotation mechanism on the complete head model is used to represent the movement of the head.

Link the Virtual World and the Real World
The behaviors of the non-verbal features of an e-Learner are detected, and the avatar in the virtual world is modified to represent them. Establishing a connection between the real world and the virtual world is the final task required to visualize the behavior of the e-Learner's non-verbal features in the virtual world through the avatar. The connection between the two worlds cannot be established directly; it is made through a server, which acts as an intermediary and stores the information. PHP and JavaScript are used to send the information, and an HTTP request is used in the virtual world to obtain the e-Learner information. The non-verbal details of the e-Learner are thus successfully transferred to the virtual world through the server.
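The record exchanged through the server can be sketched as a simple serialize/deserialize pair. The paper's server side uses PHP and JavaScript; the JSON format and field names below are illustrative assumptions, not the actual wire format.

```python
import json

def pack_nonverbal_state(student_id, blink, pose, timestamp):
    """Real-world side: serialize one student's non-verbal state
    before sending it to the intermediary server."""
    return json.dumps({
        "student": student_id,
        "blink": blink,
        "pose": pose,
        "time": timestamp,
    })

def unpack_nonverbal_state(payload):
    """Virtual-world side: decode the record fetched over HTTP."""
    return json.loads(payload)
```

The round trip through a text payload mirrors the store-and-fetch pattern in the paper: the detector writes, the server holds, and the virtual world polls with an HTTP request.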
Ultimately, the behaviors of the non-verbal features of the real e-Learner can be visualized in the virtual world through the avatar. When the e-Learner blinks and/or moves the head, those behaviors appear in the virtual world as shown in Figure 6.

Pros and Cons of the System
Several benefits can be gained through the implementation of the eye blink and head pose visualization system. The system acts as a tool that provides information regarding the behavior of non-verbal features to the teacher and peers. The teacher and students can view each other's real non-verbal behavior, which may be helpful in conducting the learning activities; on the teacher's side, relevant decisions can be taken to conduct the learning sessions effectively by using the student information. With this system, the connection between the teacher and the student is stronger than in a normal synchronous e-Learning class. Further, post-analysis of the students' non-verbal information is facilitated by the repository of information in the server; that information will be vital for the teacher in identifying the strengths and drawbacks of the class so as to plan the next e-Learning session more effectively. Since this system deals with two worlds, the information cannot be transferred instantly. The behavioral information is transferred from the real user in the real world to the avatar in the virtual world, and the time duration for each activity is illustrated in Table 1. Although the system processes continuously, it takes on average one second to acquire the information. The information about the behavior of non-verbal features is then transferred to the server, which takes around one second, and finally the server takes 0.5 seconds to transfer the information to the virtual world. The system therefore delivers information from the real world to the virtual world within 2.5 seconds. The transfer time in this study is comparatively low; for example, there is a system that transfers basic activity information through mobile phones, and it takes comparatively longer to transfer information from the real world to the virtual world (Musolesi et al., 2008).
Although the average time to transfer information is 2.5 seconds, the time required to transfer information through the server depends on the network condition. The variation in transfer time with network condition was measured, and Table 2 depicts the time required to transfer different data volumes. The transfer time increases with data volume: at least 0.1255 seconds is needed to visualize the behavior of non-verbal features in the virtual world, and the time gradually increases with the data volume.
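The end-to-end delay quoted in the text is simply the sum of the three stage timings. A quick check, using the per-stage values reported above (the stage names are paraphrased from the text):

```python
# Per-stage delays reported in the text, in seconds.
STAGES = {
    "acquire non-verbal information": 1.0,
    "transfer to server": 1.0,
    "server to virtual world": 0.5,
}

def total_latency(stages):
    """Sum the per-stage delays to get the real-to-virtual delay."""
    return sum(stages.values())
```

This confirms the 2.5-second figure is the arithmetic total of the stages in Table 1, not an independently measured quantity.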

Experiment
The experiment was conducted to identify the influence of the visualization system on the behavior of non-verbal features in virtual learning, especially with regard to the establishment of non-verbal communication and the representation of the e-Learner by the avatar. In addition, the behavior of non-verbal features during virtual learning needed to be evaluated.

Subjects
Twenty-four students aged 25-30 from different parts of Asia, namely Sri Lanka, India and Nepal, formed the sample of this study. They were following a postgraduate course, majoring in Management and Information Science Engineering. The subjects were instructed to attend two e-Learning sessions, and both sessions had the same conditions except for the visualization of the behavior of the e-Learner's non-verbal features. The sessions were conducted as problem-based learning sessions: the first with the visualization system of behavioral features and the other without it. Each student had a web-camera to detect the behavior of non-verbal features and a personal computer to engage in the e-Learning activities in the virtual environment. The problem for discussion was based on the subject of Operational Research, and the course materials were provided by the Department of Industrial Management of Wayamba University of Sri Lanka. Large screens were available to present a PowerPoint presentation indicating the details of the problem, together with a large Excel sheet that any e-Learner could edit to enter their answers. All subjects experienced both sessions (a within-subjects experiment), and each session took 20-25 minutes. A pre-formatted questionnaire was distributed to obtain the responses of the participants after each session, and the voices and non-verbal behavior of all participants were recorded along with the time.

Analyze the Student Response
The analysis of the questionnaire is illustrated in Figure 7. Almost all the variables included in the questionnaire were responded to more positively during the session with the visualization of non-verbal features than during the session without it, except for three variables: two were responded to equally and one negatively. The negatively responded factor was "Did you feel relaxed?" E-Learners may feel more relaxed when their non-verbal features are not being detected, since they know that when they use the non-verbal visualization system, a web-camera and lighting equipment are placed in front of them to capture their non-verbal behavior; they felt relaxed without the visualization system because their non-verbal behavior was not being observed. The other two variables, "Did you behave carefully?" and "Didn't you feel nervous?", were responded to equally. Both sessions were conducted as problem-based learning sessions, which may be the reason for the equal responses to these two variables concerning nervousness and behavior.
Figure 8. Eye blink frequencies of all students during relaxation and group discussion
All the other variables were responded to positively, and the most highly evaluated variables were "Attractiveness", "Avatar gives a fair representation of the real user", "Interesting", "Identify yourself", "All participate" and "Active participation", respectively. The largest gap between the two sessions appears under the variable "Attractiveness". When the non-verbal behavior of the real user appears through the avatar, the avatar's appearance becomes realistic, which may affect the viewer's impression of "Attractiveness", as the avatar's appearance changes the viewer's impression (Theonas, Hobbs, & Rigas, 2007; Alotaibi & Rigas, 2009). The variable "Avatar gives a fair representation of the real user" was evaluated as the second highest; this indicates that reflecting the real user's non-verbal information through the avatar is one way of making a live avatar that gives a reasonable representation of the real user. The variables "Interesting" and "Identify yourself" were evaluated highly due to the enhancement of the viewers' impression by the realistic avatar appearance with non-verbal behaviors. The other positively evaluated variables were "All participate" and "Active participation", which indicate improvements in virtual learning through the establishment of non-verbal communication in the virtual environment.

Analyze the Behavior of Non-Verbal Features
The non-verbal information of the e-Learners was analyzed to identify the behavior of non-verbal features during the e-Learning activities. All participants' head movements were limited to the frontal or near-frontal view, so there is no specific head-movement information to discuss. The average frequencies of eye blinks during the e-Learning session and during the participants' relaxing time are illustrated in Figure 8. The graph clearly shows that eye blinking during the e-Learning activities was very low compared with the relaxing time, reduced by 35%. The frequency of eye blinks may have decreased for several reasons. The e-Learners were dealing with the computer, and the virtual environment provided an attractive atmosphere for engaging in this experiment; that might be a reason for the decrease in eye blinking during the e-Learning activities, since participants keep their eyes open to catch critical, attractive visuals, reducing blinks to avoid losing them (Nakano, Yamamoto, Kitajo, Takahashi, & Kitazawa, 2009). Although the e-Learning session was problem-based, stress might have been decreased by the shared responsibility of a group activity. Low cognitive load or stress, rich critical or attractive visual information, and engagement with a computer or visual equipment are some reasons for a low eye blink frequency (Holland & Tarlow, 1975; Nakano et al., 2009). As shown in Figure 9, the highest eye blink rate was recorded by a single student, and the richest conversation was also performed by that same e-Learner, whereas the fifth student showed the lowest eye blink rate and a poor conversation in comparison with the other students. Although the eye blink frequencies were reduced during the experiment, the top eye blink frequencies appeared among the e-Learners who had a rich conversation.

Conclusion
The eye blink and head pose of the real user are detected from a web-camera using the geometric method, and the non-verbal information is transferred to the virtual learning environment. The behaviors of the non-verbal features of the e-Learner are thereby established in the virtual learning environment: when the real e-Learner blinks his/her eyes and/or moves the head, that information appears in the virtual environment through an avatar during the e-Learning activities as a real-time, continuous process.
The eye blink rate was 35% lower during the problem-based learning activity in the virtual class than during the relaxing time, for several reasons such as the attractiveness of the virtual environment, low stress and engagement in visually attractive activities. Although the eye blink frequency was reduced during the virtual learning activities, the e-Learner with the highest eye blink rate had the richest conversation during the e-Learning, which reflects that the eye blink depends on the internal state of the e-Learner.
Reflecting the real user's non-verbal behavior through the avatar can change the appearance of the avatar, and this has a notable impact on the viewer's impression. The avatar gives a reasonable representation of the real user by visualizing the non-verbal behavior in the virtual environment. Further, rich communication is established in the virtual environment through the successful introduction of the real user's non-verbal behavior.
The overall effectiveness of virtual learning is enhanced by the implementation of the real user's non-verbal information in the virtual environment.

Figure 1. Architecture of the whole system

Figure 2. Procedure for eye blink detection

Figure 3. Procedure for head pose estimation

Figure 4. The biggest white contour represents the hair of the user: (a) when the user turns to the left, the left side of the rectangle has the highest number of white pixels; (b) when the user looks forward; (c) when the user turns to the right, the right half of the rectangle has the highest number of white pixels

Figure 5. Avatar modification to represent the eye blink

Figure 7. Analysis of the responses of the e-Learners to the factors in the questionnaire

Figure 9. Eye blink frequencies of each student during relaxation and group discussion

Table 1. Time taken to transfer student information

Table 2. Time taken to transfer data under different network conditions