Statistical Evaluation of Face Recognition Techniques under Variable Environmental Constraints

Experiments have shown that even one- to three-day-old babies are able to distinguish between known faces (Chiara, Viola, Macchi, Cassia, & Leo, 2006). So how hard could it be for a computer? It has been established that face recognition is a dedicated process in the brain (Marqués, 2010). Imitating this inherently human skill with machines can therefore be very rewarding, although developing an intelligent, self-learning system may require supplying the machine with sufficient information. This study proposes a multivariate statistical evaluation of the recognition performance of Principal Component Analysis with Singular Value Decomposition (PCA/SVD) and Whitened Principal Component Analysis with Singular Value Decomposition (Whitened PCA/SVD) algorithms under varying environmental constraints. The Repeated Measures Design, Paired Comparison test, Box's M test and Profile Analysis were used to evaluate the algorithms on the merit of efficiency and consistency in recognizing face images with variable facial expressions. The results showed that PCA/SVD is consistent and computationally efficient when compared to Whitened PCA/SVD.


Introduction
Face recognition is an easy task for humans. Although the ability to infer intelligence or character from facial appearance is suspect, the human ability to recognize faces is remarkable (Turk & Pentland, 1991). According to Rahman (2013), the intricacy of facial features originates from continuous changes that take place in them over time. Regardless of these changes, we are able to recognize a person very easily.
In recent years, face recognition techniques have gained significant attention from researchers, partly because face recognition is non-invasive and serves as a primary means of identification. One of the main driving factors for face recognition is the ever-growing number of applications that an efficient and resilient recognition technique addresses; for example, security systems based on biometric data, criminal identification, missing children identification, passport and driver's license verification, voter identification and user-friendly human-machine interfaces. An example of the latter category is smart rooms, which use cameras and microphone arrays to detect the presence of humans, decide on their identity and then react according to a predefined set of preferences for each person.
Factorization (NMF). These are all dimensionality reduction algorithms that seek to reduce the high-dimensional face image data to a lower dimension for matching. Viola and Jones (2001) proposed a multi-stage classification procedure for face recognition that reduces the processing time substantially while achieving almost the same accuracy as a much slower and more complex single-stage classifier. Lienhart and Maydt (2002) extended the rapid object detection framework of Viola and Jones in two important ways. Firstly, the basic and over-complete set of Haar-like features was extended by an efficient set of 45° rotated features, which added domain knowledge to the learning framework. Secondly, they derived a new post-optimization procedure for a given boosted classifier that improves its performance significantly. Zhang, Ding and Liu (2015) also proposed an improved PCA-based facial expression recognition algorithm that uses the Fast Fourier Transform (FFT) during the preprocessing stage. They combined the amplitude spectrum of one image with the phase spectrum of another image to form a mixed image.
An important goal in image recognition is the ability to rate face recognition algorithms on the merit of efficiency and consistency in recognizing face images under variable environmental constraints. Until now, a face recognition algorithm's recognition rate, runtime, sensitivity and descriptive statistics have been the basic means of rating its performance. Delac and Grgic (2005) used descriptive statistics to measure the performance of face recognition algorithms. In their paper, they introduced measures of central tendency, measures of dispersion, skewness and kurtosis for some template-based recognition algorithms and subsequently analysed the probability distributions of these algorithms. Beveridge et al. (2001) also investigated only Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), in less detail, using descriptive statistics.
This work focuses on the statistical evaluation of the recognition performance of PCA/SVD and Whitened PCA/SVD under variable environmental constraints (variable facial expressions). The research explores and compares techniques for automatically recognizing facial actions in sequences of images, or detecting an "unknown" human face in input imagery, and recognizing the faces under various environmental constraints. The paper uses more intrinsic statistical methods (multivariate methods) to assess the performance of face recognition algorithms under these constraints. The research methods, results, discussion and conclusions are presented in subsequent sections.

Data Acquisition
A real-time face image database was created for the purpose of benchmarking the face recognition system. Two hundred and ninety-four labeled frontal facial images (42 individuals) were randomly acquired from the Cohn-Kanade database, the Japanese Female Facial Expressions database (JAFFE) at Labeled Faces in the Wild and a local facial database of Ghanaian students. Of the 294 images, 182 facial images from 26 individuals were collected from the Cohn-Kanade AU-Coded Facial Expression Database along the seven universally accepted principal emotions (Neutral, Angry, Happy, Fear, Disgust, Sad and Surprise). Subjects in the available portion of the database were 26 university students enrolled in introductory psychology classes. They ranged in age from 18 to 30 years.
Forty-two images (6 individuals) were taken from the local Ghanaian database. In the creation of the database, the observation room was equipped with a chair for the subject and one Canon camera. Only image data from the frontal camera were captured. Subjects were instructed by an experimenter to perform a series of 7 facial displays that included single action units. Subjects began and ended each display from a neutral face. Before each display, an experimenter described and modeled the desired display. Six of the displays were based on descriptions of prototypic basic emotions (happy, surprise, anger, fear, disgust and sadness). Image sequences from neutral to target display were digitized into 256 by 256 pixel arrays with 8-bit precision for grayscale values.
Seventy frontal face images (10 individuals) were collected from the Japanese Female Facial Expressions database (JAFFE) along the principal emotional constraints. All three databases were combined in the study, which helped to evaluate the face recognition algorithms on large and varied data. The newly created local Ghanaian database (GFD) accounted for the originality of the study database. The study database was divided into two subsets, a training database and a testing database. The training database comprised all 42 neutral poses, and the testing database comprised the remaining 252 expressions (Angry, Disgust, Fear, Happy, Sad and Surprise for each of the 42 individuals). Figure 1 shows a section of the study database.

Recognition Procedure
The study focused on running the PCA/SVD and Whitened PCA/SVD recognition algorithms on a created face database. The research evaluated the recognition performance of the algorithms and subsequently compared their results on this database.
Face image data were passed to the face recognition modules as input for the system. The images were first transformed into an operationally compatible format by resizing them to a uniform dimension. The data type of the image samples was also changed to double precision before preprocessing. The entire recognition exercise comprises a preprocessing stage, a feature extraction stage and a recognition stage. The adopted preprocessing procedures are mean centering and whitening. These help to reduce the noise level and make the estimation process simpler and better conditioned.
The selected template-based algorithms were trained on the created image database. In the extraction unit, unique face image features were extracted and stored for recognition. The extracted facial features were passed to the classifier unit, which classifies a given face query using the knowledge created from the available database.
For the implementation of the facial recognition system, a real-time database was created, and the database samples were used to train the system for knowledge creation and classification. During the training phase, when a new facial image was added to the system, its features were calculated according to the procedure of the particular recognition algorithm and aligned with the dataset information. The test face weight and the known weights in the database are compared by finding the norm of the difference between the test and known weights. A maximum difference signifies a poor match and a minimum difference a close match. Figure 2 shows the design of the entire face recognition process.
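The comparison of test and known weights described above can be sketched as follows. This is a minimal illustration, not the study's implementation; the function name `closest_match` and the toy weight vectors are assumptions introduced here.

```python
import numpy as np

def closest_match(test_weight, known_weights):
    """Return (index, distance) of the known weight vector nearest to test_weight.

    known_weights has shape (n_known, k), one weight vector per row.
    """
    # Euclidean norm of the difference between the test weight and each known weight
    distances = np.linalg.norm(known_weights - test_weight, axis=1)
    best = int(np.argmin(distances))  # the minimum difference is the closest match
    return best, float(distances[best])

# Hypothetical weights for three enrolled faces and one query face
known = np.array([[0.0, 0.0], [3.0, 4.0], [10.0, 10.0]])
query = np.array([3.0, 3.0])
index, distance = closest_match(query, known)
```

The same distances, collected per expression, are the quantities that the statistical evaluation later treats as variates.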

Preprocessing of Frontal Face Image
Before applying any template-based algorithm to the image data to be trained, it is useful to do some preprocessing. In this work, preprocessing consists of mean centering and whitening. As indicated earlier, this helps to reduce the noise level and make the estimation process simpler and better conditioned.
As an illustration of preprocessing, Figure 3 shows six images selected from the Japanese Female Facial Expressions database (JAFFE). From equation (1.0), each vectorised image $x_i$ is a column vector of dimension $N = r \times c$, obtained from the $r \times c$ image matrix by replacing its pixel entries position-wise with the entries of $x_i$.
The preprocessing steps are based on the sample $X = (x_1, x_2, \dots, x_n)$, whose elements are the vectorised forms of the individual images in the study.
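A minimal sketch of vectorisation followed by mean centering is given below. It assumes row-major flattening of each image (the position-wise ordering used in the study is not recoverable from the text), and the function name and random toy images are hypothetical.

```python
import numpy as np

def vectorise_and_center(images):
    """Stack images as columns of an N x n matrix and subtract the mean image."""
    # Each r x c image becomes a column vector of length N = r * c (row-major order assumed)
    X = np.stack([img.astype(np.float64).ravel() for img in images], axis=1)
    mean_face = X.mean(axis=1, keepdims=True)  # average image, shape (N, 1)
    W = X - mean_face                          # zero-mean observations
    return W, mean_face

rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(4, 5)) for _ in range(6)]  # six toy 4 x 5 "images"
W, mean_face = vectorise_and_center(images)
```

After this step every pixel position has zero mean across the sample, which is the input the whitening step expects.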

Whitening
Whitening is a preprocessing technique that removes the noise factors in the observed image data, $W$, so as to obtain a new image, $\vec{W}$, whose components are uncorrelated and have unit variance. That is, the covariance matrix of $\vec{W}$ is the identity matrix, $I$. A simple way to whiten images is to find the eigenvectors and eigenvalues of the covariance matrix of the observed images through eigenvalue decomposition (for a symmetric image matrix) or singular value decomposition (for an asymmetric image matrix). Suppose the covariance matrix, $\Sigma$, of the zero-mean observation $W$ is given by $\Sigma = \mathrm{E}[W W^T]$ (5.0). Define the matrix $E = (e_1, e_2, \dots, e_N)$, where $e_i$, $i = 1, 2, \dots, N$, is the $i$-th eigenvector of $\Sigma$. Let $D$ ($N \times N$) be the diagonal matrix whose entries $\lambda_i$, $i = 1, 2, \dots, N$, are the eigenvalues corresponding to the eigenvectors $e_i$, so that $\Sigma = E D E^T$. The whitened images are given by $\vec{W} = D^{-1/2} E^T W$ (6.0). The covariance matrix of $\vec{W}$ is then $D^{-1/2} E^T \Sigma E D^{-1/2} = D^{-1/2} E^T E D E^T E D^{-1/2} = I$. Figure 5 shows the whitened outcome of the six images shown in Figure 3. The whitened matrix, $\vec{W}$, built from the eigenvalue decomposition of the covariance matrix $\Sigma$ of the zero-mean observation, $W$, creates a set of uncorrelated image variables with unit variance.
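The whitening step can be illustrated on a small toy dataset. The sketch below assumes the sample covariance is computed with a 1/n factor and adds a small constant `eps` to guard against near-zero eigenvalues; both choices, and all names, are assumptions made for illustration. For real face data, where the number of pixels exceeds the number of images, the covariance matrix is rank-deficient and some such safeguard or truncation would be needed.

```python
import numpy as np

def whiten(W, eps=1e-12):
    """Whiten zero-mean data W (variables in rows, observations in columns)."""
    n = W.shape[1]
    sigma = W @ W.T / n                       # sample covariance of zero-mean data
    eigvals, E = np.linalg.eigh(sigma)        # sigma = E diag(eigvals) E^T
    D_inv_sqrt = np.diag(1.0 / np.sqrt(eigvals + eps))  # eps guards tiny eigenvalues
    return D_inv_sqrt @ E.T @ W               # D^{-1/2} E^T W

rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 0.7]])
X = mixing @ rng.standard_normal((3, 500))    # correlated toy data, 3 variables
W = X - X.mean(axis=1, keepdims=True)         # mean-center first
W_white = whiten(W)
cov_white = W_white @ W_white.T / W_white.shape[1]
```

Here `cov_white` should be numerically close to the identity matrix, mirroring the covariance result stated for the whitened images.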

Singular Value Decomposition (SVD)
Singular Value Decomposition (SVD) transforms correlated variables into a set of uncorrelated ones that better expose the various relationships among the original data items, while at the same time identifying and ordering the dimensions along which data points exhibit the most variation. Once SVD has identified where the most variation lies, it is possible to find the best approximation of the original data points using fewer dimensions (Baker, 2005). Hence, SVD can be seen as a method for data reduction or dimensionality reduction. Consider an arbitrary real $m \times n$ matrix $A$. Then there are orthogonal matrices $U$ and $V$ and a diagonal matrix $S$ such that $A = U S V^T$, where $U$ is an $m \times m$ matrix, $V$ is an $n \times n$ matrix and $S$ is an $m \times n$ diagonal matrix with diagonal entries $s_{ii} \geq 0$ for all $i = 1, 2, \dots, \min(m, n)$ and $s_{11} \geq s_{22} \geq \dots \geq s_{kk}$, where $k = \min(m, n)$. In practice, the components of the decomposition are unknown and are to be estimated. The columns of $U$ and $V$ are called the singular vectors corresponding to the positive values (singular values) in the diagonal matrix $S$. When these are used to represent vectors in the domain and range of the transformation, the transformation simply dilates and contracts some components according to the magnitude of the singular values, and possibly discards values and appends zeros as needed to account for a change in dimension. SVD therefore tells how to choose orthonormal bases so that the transformation is represented by a matrix with the simplest possible form.
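As a brief illustration of the decomposition and its use for dimensionality reduction, the sketch below applies NumPy's `numpy.linalg.svd` to a random matrix. The matrix and the choice of rank `r = 2` are arbitrary assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))             # arbitrary real m x n matrix

# full_matrices=True returns U (m x m) and V^T (n x n); s holds the singular values
U, s, Vt = np.linalg.svd(A, full_matrices=True)

S = np.zeros(A.shape)                       # m x n matrix with s on its diagonal
S[:len(s), :len(s)] = np.diag(s)
A_rebuilt = U @ S @ Vt                      # A = U S V^T

# Rank-r approximation: keep only the r largest singular values
r = 2
A_rank_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]
```

The singular values are returned in non-increasing order, so truncating after the first `r` of them keeps the directions of greatest variation.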

Principal Component Analysis (PCA)
PCA is concerned with elucidating the covariance structure of a set of variables. It seeks a set of basis images that are uncorrelated, that is, they cannot be linearly predicted from each other, and that yield projection directions maximizing the total scatter across all classes or across all face images. According to Bartlett et al. (2002), PCA can thus be seen as partially implementing Barlow's ideas: dependencies that show up in the joint distribution of pixels are separated out into the marginal distributions of PCA coefficients. Most of the successful representations for face recognition, such as eigenfaces and local feature analysis, are based on PCA.

Feature Extraction
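With these algorithms in mind, we now seek a set of $n$ orthonormal vectors, $u_k$, which best describe the distribution of the data. The $k$-th vector $u_k$ is chosen such that $\lambda_k = \sum_{i=1}^{n} (u_k^T w_i)^2$ is a maximum, subject to the orthonormality constraints $u_l^T u_k = \delta_{lk}$, where $\delta_{lk} = 1$ if $l = k$ and $0$ otherwise. Here $w_i$ is the $i$-th column of the zero-mean data matrix $W$.
The vectors $u_k$ and scalars $\lambda_k$ are the eigenvectors and eigenvalues, respectively, of the covariance matrix $C = W W^T$ (up to a constant scaling factor).
The size of $C$ ($N \times N$) could be enormous, and determining its eigenvectors and eigenvalues directly is an intractable task for typical image sizes. A known theorem in linear algebra states that the vectors $u_k$ and the scalars $\lambda_k$ can be obtained from the eigenvectors and eigenvalues of the much smaller $n \times n$ matrix $W^T W$.
This means that the first $n - 1$ eigenvectors, $u_k$, and eigenvalues, $\lambda_k$, of $W W^T$ are given by $W v_k$ and $\mu_k$ respectively, where $v_k$ and $\mu_k$ are the eigenvectors and eigenvalues of $W^T W$. $W v_k$ needs to be normalized in order to be equal to $u_k$. Hence, $u_k = W v_k / \lVert W v_k \rVert$, where $u_k$ and $v_k$ are the columns of $U$ and $V$ respectively. The principal components of the trained image set are determined by computing $\Omega = U^T W$ (9.0).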
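The small-matrix route to the eigenvectors can be sketched as below. This is an illustrative reading of the procedure rather than the study's code: the function name `eigenfaces`, the symbol `W` for the zero-mean data matrix (pixels in rows, images in columns) and the toy data are assumptions.

```python
import numpy as np

def eigenfaces(W, k):
    """Leading k eigenvectors of W W^T obtained from the small n x n matrix W^T W."""
    small = W.T @ W                           # n x n instead of N x N
    mu, V = np.linalg.eigh(small)             # eigenvalues in ascending order
    order = np.argsort(mu)[::-1][:k]          # indices of the k largest eigenvalues
    mu, V = mu[order], V[:, order]
    U = W @ V                                 # W v_k is an eigenvector of W W^T
    U /= np.linalg.norm(U, axis=0)            # normalise each column to unit length
    return U, mu

rng = np.random.default_rng(0)
X = rng.standard_normal((400, 8))             # 8 toy "images" of 400 pixels each
W = X - X.mean(axis=1, keepdims=True)         # mean-centered data
U, mu = eigenfaces(W, k=5)
weights = U.T @ W                             # projection of each image onto the eigenfaces
```

With 8 mean-centered images there are at most 7 non-zero eigenvalues, which is why only the leading few vectors are requested in this example.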

Results and Discussion
This section presents the statistical procedures used to evaluate the aforementioned recognition algorithms. The results of running these statistical tests on the study dataset are also presented and discussed.

Statistical Evaluation of the Face Recognition Algorithms
The recognition algorithms under study are PCA and SVD with mean centering as the preprocessing step (Algorithm 1) and PCA and SVD with mean centering and whitening as the preprocessing steps (Algorithm 2).
From the study database, six variates are collected for each algorithm: the Euclidean distances between each of the universally accepted principal emotions (Angry, Disgust, Fear, Happy, Sad and Surprise) and the corresponding neutral pose.
To assess multivariate normality, a chi-square plot is produced for each algorithm-specific dataset by plotting the ordered generalized squared distances against the chi-square quantiles. Figure 7 and Figure 8 show the chi-square plots for Algorithm 1 and Algorithm 2 respectively. The correlation values, $r$, are 0.91359 for Algorithm 1 and 0.95846 for Algorithm 2; both are close to 1, which is consistent with the unit slope expected of the chi-square plot under normality. Multivariate normality is therefore assumed in the subsequent statistical tests performed on the datasets.
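A chi-square plot correlation of this kind can be computed along the following lines. The sketch assumes the common convention of evaluating chi-square quantiles at probability levels (j − 0.5)/n; the study's exact convention is not stated, and the function name and the random 42 × 6 toy data are hypothetical.

```python
import numpy as np
from scipy.stats import chi2

def chi_square_plot_correlation(data):
    """Correlation between ordered squared distances and chi-square quantiles."""
    n, p = data.shape
    centered = data - data.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(data, rowvar=False))
    # Generalized squared distance of each observation from the sample mean
    d2 = np.einsum("ij,jk,ik->i", centered, S_inv, centered)
    d2_sorted = np.sort(d2)
    # Chi-square quantiles with p degrees of freedom at levels (j - 0.5) / n
    q = chi2.ppf((np.arange(1, n + 1) - 0.5) / n, df=p)
    return np.corrcoef(d2_sorted, q)[0, 1], d2_sorted, q

rng = np.random.default_rng(0)
toy = rng.standard_normal((42, 6))   # hypothetical 42 x 6 matrix of distances
r, d2_sorted, q = chi_square_plot_correlation(toy)
```

Plotting `d2_sorted` against `q` gives the chi-square plot itself; a correlation near 1 indicates that the points lie close to a straight line.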

Repeated Measures Design
The purpose of the test is to determine whether, for each of the recognition algorithms under study, there exist significant differences between the average distances of the various poses from their neutral pose.
The Bonferroni 95% simultaneous confidence intervals for the individual mean differences are given by $\bar{d}_i \pm t_{n-1}\left(\frac{\alpha}{2p}\right)\sqrt{\frac{s_{d_i}^2}{n}}$, where $\bar{d}_i$ is the $i$-th element of $\bar{d}$, $s_{d_i}^2$ is the $i$-th diagonal element of $S_d$ and $t_{n-1}(\alpha/2p)$ is the upper $100(\alpha/2p)$-th percentile of the t-distribution with $n - 1$ degrees of freedom.
These confidence intervals reveal specifically which constraints have significantly different Euclidean distances when the different face recognition algorithms are used. Table 2 shows the confidence intervals for the average differences in distances. For the two algorithms under study (Algorithm 1 and Algorithm 2), there are significant differences in the recognition of the Disgust, Fear, Happy, Sad and Surprise poses, but not of the Angry pose. The interval for the Angry pose, [−2764.2480, 971.3904], contains zero, which means there is no significant difference in the average recognition distance on the Angry pose between Algorithm 1 and Algorithm 2. It can therefore be inferred that, at the 5% level of significance, the two algorithms have significantly different average recognition distances for all poses except the Angry pose.
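The Bonferroni intervals for paired differences can be computed as in the sketch below. The data are simulated purely for illustration (they are not the study's distances), and the function and variable names are assumptions.

```python
import numpy as np
from scipy.stats import t

def bonferroni_paired_intervals(d1, d2, alpha=0.05):
    """Simultaneous intervals for the mean paired differences, one row per variate."""
    diff = d1 - d2                           # paired differences, shape (n, p)
    n, p = diff.shape
    d_bar = diff.mean(axis=0)
    s2 = diff.var(axis=0, ddof=1)            # diagonal of the covariance S_d
    t_crit = t.ppf(1 - alpha / (2 * p), df=n - 1)   # upper alpha/(2p) point
    half_width = t_crit * np.sqrt(s2 / n)
    return np.column_stack([d_bar - half_width, d_bar + half_width])

rng = np.random.default_rng(0)
alg1 = rng.normal(1000.0, 50.0, size=(42, 6))       # hypothetical distances, Algorithm 1
alg2 = alg1 + rng.normal(0.0, 50.0, size=(42, 6))   # hypothetical distances, Algorithm 2
alg2[:, 1:] += 500.0                                # shift every pose except the first
intervals = bonferroni_paired_intervals(alg1, alg2)
# An interval that excludes zero indicates a significant mean difference
significant = ~((intervals[:, 0] <= 0) & (intervals[:, 1] >= 0))
```

An interval that contains zero, as reported for the Angry pose, corresponds to a `False` entry in `significant`.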

Test of Equality of Covariance Matrices (Box's M-Test)
This test is used as a measure of consistency between the recognition algorithms. It reveals whether the variations in distances across algorithms in recognizing face images in the study database are equal or significantly different. The most consistent algorithm should have lower variation in recognition distances. Box's test is based on the $\chi^2$ approximation to the sampling distribution of $M$. Since the test statistic 2095.3 exceeds the critical value 32.671, the assertion of equality of covariance matrices is not tenable at the 5% level of significance.
We can therefore conclude that the covariance matrices of Algorithm 1 and Algorithm 2 are not equal. This means that the variations in the recognition distances of Algorithm 1 and Algorithm 2 are significantly different.
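For reference, Box's M statistic with its chi-square approximation can be sketched as follows, using the standard textbook form of the correction factor. The simulated groups and all names are assumptions for illustration. With six variates and two groups the approximation has p(p + 1)(g − 1)/2 = 21 degrees of freedom, whose upper 5% point is the critical value 32.671 quoted above.

```python
import numpy as np
from scipy.stats import chi2

def box_m_test(groups, alpha=0.05):
    """Box's M statistic, its degrees of freedom and the chi-square critical value."""
    g = len(groups)
    p = groups[0].shape[1]
    dof = np.array([x.shape[0] - 1 for x in groups], dtype=float)   # n_l - 1
    covs = [np.cov(x, rowvar=False) for x in groups]
    pooled = sum(d * S for d, S in zip(dof, covs)) / dof.sum()
    log_dets = np.array([np.linalg.slogdet(S)[1] for S in covs])
    M = dof.sum() * np.linalg.slogdet(pooled)[1] - np.sum(dof * log_dets)
    # Correction factor for the chi-square approximation
    u = (np.sum(1.0 / dof) - 1.0 / dof.sum()) * (2 * p**2 + 3 * p - 1) / (6 * (p + 1) * (g - 1))
    statistic = (1 - u) * M
    df = p * (p + 1) * (g - 1) // 2
    critical = chi2.ppf(1 - alpha, df)
    return statistic, df, critical

rng = np.random.default_rng(0)
group1 = rng.standard_normal((42, 6))          # hypothetical distances, Algorithm 1
group2 = 3.0 * rng.standard_normal((42, 6))    # same shape, three times the spread
statistic, df, critical = box_m_test([group1, group2])
reject_equal_covariances = statistic > critical
```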

Profile Analysis
For small sample sizes, profile analysis depends on the normality assumption (Johnson & Wichern, 2007). The datasets under study are multivariate normal; hence this assumption is satisfied. Profile analysis also works on the premise of equality of covariance matrices, in which case the pooled covariance is used as the common covariance for the populations under study. Box's M test revealed that the covariance matrices of the algorithms under study are unequal. According to Mettle, Yeboah and Asiedu (2014), profile analysis is still feasible when the assertion of equality of covariance matrices is not tenable. That is, profile analysis can continue when unequal covariances exist. In this case the separate covariance matrices are used in the computation.
In this study, two independent normal populations, one from each study algorithm, are compared for each pose. For example, the Angry pose data from Algorithm 1 are tested against the Angry pose data from Algorithm 2.
A 95% confidence interval for the ratio of variances $\sigma_1^2 / \sigma_2^2$ is given by $\left(\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{\alpha/2}(n_1 - 1,\, n_2 - 1)},\ \frac{s_1^2}{s_2^2} \cdot F_{\alpha/2}(n_2 - 1,\, n_1 - 1)\right)$, where $s_1^2$ and $s_2^2$ are the sample variances of the recognition distances for Algorithm 1 and Algorithm 2, $n_1$ and $n_2$ are the sample sizes and $F_{\alpha/2}(\cdot, \cdot)$ is the upper $100(\alpha/2)$-th percentile of the F-distribution with $\alpha = 0.05$. The 95% confidence intervals for the estimates of the ratio of variances are shown in Table 3. The confidence interval for the Angry pose, [0.8676, 3.0029], contains 1, and hence the assertion of equality of variance of the two algorithms is tenable at the 5% significance level. The remaining constraints (Disgust, Fear, Happy, Sad and Surprise) have confidence intervals that do not contain 1, so the assertion of equality of variance is not tenable for them. This means the variances of the recognition distances for these poses are not equal. For the constraints where equality of variance is not tenable (Disgust, Fear, Happy, Surprise and Sad), the estimates of the ratio of variances are 0.42141, 0.1479, 0.4697, 0.2283 and 0.2274 respectively. All these ratios are less than 1, and hence we conclude that the variations in Algorithm 2 are greater than those in Algorithm 1 in the recognition of these constraints. Consequently, Algorithm 1 is considered comparatively consistent in the recognition of the Disgust, Fear, Happy, Sad and Surprise poses.
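An F-based interval for a ratio of two variances can be sketched as below. The samples are simulated and the names are hypothetical; the sketch only illustrates the standard interval for two independent normal samples, which appears consistent with the width of the Angry pose interval reported above for samples of size 42.

```python
import numpy as np
from scipy.stats import f

def variance_ratio_interval(x1, x2, alpha=0.05):
    """Point estimate and confidence interval for var(x1) / var(x2)."""
    n1, n2 = len(x1), len(x2)
    ratio = np.var(x1, ddof=1) / np.var(x2, ddof=1)
    # Divide by the upper and lower alpha/2 points of F(n1 - 1, n2 - 1)
    lower = ratio / f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)
    upper = ratio / f.ppf(alpha / 2, n1 - 1, n2 - 1)
    return ratio, lower, upper

rng = np.random.default_rng(0)
x1 = rng.normal(0.0, 1.0, size=42)    # hypothetical distances, Algorithm 1
x2 = rng.normal(0.0, 5.0, size=42)    # hypothetical distances with larger spread
ratio, lower, upper = variance_ratio_interval(x1, x2)
equal_variance_tenable = lower <= 1.0 <= upper
```

A ratio below 1 with an interval that excludes 1 indicates that the second sample varies more, which is the pattern the study reports for five of the six poses.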

Conclusion
The runtimes of Algorithm 1 and Algorithm 2 in the recognition of the 252 images are 70.470 seconds and 191.79 seconds respectively. The time used by Algorithm 2 in the whitening process accounts for the difference in the algorithms' runtime (speed). The recognition rates of Algorithm 1 and Algorithm 2 are 92.86% and 88.10% respectively. It is evident from the above statistical methods that the algorithms considered are significantly different in recognizing all poses except the Angry pose. Although both algorithms are equally consistent in recognizing the Angry pose, Algorithm 1 (PCA with SVD and mean centering as the preprocessing step) is comparatively efficient (from its recognition rate) and consistent (from its variation) in recognizing all other constraints under study. Algorithm 1 is therefore adjudged comparatively better at recognizing face images under the variable environmental constraints.

Figure 1. Sample of Research Database

Figure 2. Research Design

Figure 4. Six mean-centered images from JAFFE

Figure 5. Whitened images

In equation (9.0), $U = [u_1, u_2, \dots, u_k]$. The large correlated image dimensions are finally reduced to smaller uncorrelated intrinsic dimensions which display important characteristics of the image set. An unknown input face is passed through the steps below before identification. Following the steps in the feature extraction stage, a new face from the test image database is transformed into its eigenface components. First, the input image is compared with the mean image (the mean of the trained images) in memory, and their difference is multiplied with each eigenvector from $U$. Each value represents a weight and is saved in a vector $\Omega$. Identification is done by looking for the face class $k$ that minimizes the Euclidean distance $\epsilon_k = \lVert \Omega - \Omega_k \rVert$. Figure 6 is a flow diagram of the study algorithms.

Figure 6. Flow diagram of study algorithms.

Table 1. Simultaneous Confidence Intervals.

Measurements are often recorded under different sets of experimental conditions to see whether the responses differ significantly over these sets. In this study, the Euclidean norms of the various poses (Angry, Disgust, Fear, Happy, Sad and Surprise) relative to their neutral pose are recorded using two different recognition algorithms. Specifically, 42 individuals were tested on the different recognition algorithms. The paired responses are analyzed by computing their differences, thereby eliminating much of the influence of extraneous unit-to-unit variation.

Table 3. Confidence intervals for the ratio of variances