Two-dimensional Heteroscedastic Discriminant Analysis for Facial Gender Classification

In this paper, a novel discriminant analysis method named two-dimensional Heteroscedastic Discriminant Analysis (2DHDA) is presented and applied to gender classification. In 2DHDA, the equal within-class covariance constraint is removed. Firstly, the criterion of 2DHDA is defined following that of 2DLDA. Secondly, the log of the 2DHDA criterion is taken and its terms are rearranged, and the optimal projection matrix is then solved by the gradient descent algorithm. Thirdly, face images are projected onto the optimal projection matrix to extract the 2DHDA features. Finally, a Nearest Neighbor classifier is used to perform gender classification. Experimental results show that 2DHDA achieves a higher recognition rate than 2DLDA and HDA.


Introduction
Gender classification using face images is challenging because male and female face images are similar. Thus, discriminant feature extraction is a key step for improving the recognition rate. Linear Discriminant Analysis (LDA) is a well-known approach for feature extraction and dimensionality reduction. However, it often encounters the Small Sample Size problem (S3 problem) when the number of samples is less than the dimensionality of the samples. Two-dimensional Linear Discriminant Analysis (2DLDA) was then proposed, in which discriminant features are extracted directly from 2-D images without a vectorization procedure, so that the computation cost is reduced and the S3 problem is overcome. However, both LDA and 2DLDA assume that the covariance matrices are equal for all sample classes. Thus, when the within-class covariances of the sample classes are significantly unequal, LDA and 2DLDA cannot achieve optimal performance.
Heteroscedastic Discriminant Analysis (HDA) is an extension of LDA in which the equal within-class covariance constraint is removed. HDA can be viewed as a constrained Maximum Likelihood (ML) projection for a model with a single full-covariance Gaussian per class, where the constraint is given by the maximization of the projected between-class covariance volume. HDA is widely used in speech recognition, where it yields a considerably higher recognition rate than LDA. But in 1D-based approaches, the transformation matrix is difficult to calculate due to the high dimensionality and extreme sparseness of the data.
In this paper, based on 2DLDA and HDA, two-dimensional Heteroscedastic Discriminant Analysis (2DHDA) is presented and used for gender classification. Firstly, the criterion of 2DHDA is defined, its log is taken and the terms are rearranged, and the optimal projection matrix is then solved by the gradient descent algorithm. Secondly, face images are projected onto the optimal projection matrix to extract their discriminant features. Finally, a Nearest Neighbor classifier is used to perform gender classification. Experimental results show the validity of the 2DHDA method.

Presented Approach
Suppose there are $C$ sample classes, represented by $A_1, A_2, A_3, \ldots, A_C$ respectively. The total number of samples is $N$ and each class includes $n$ samples, that is, $N = nC$. $A_j^i \in \mathbb{R}^{m \times l}$ denotes the $j$-th ($j = 1, 2, 3, \ldots, n$) sample belonging to the $i$-th ($i = 1, 2, 3, \ldots, C$) class. Thus, the mean of the $i$-th sample class is

$$\bar{A}^i = \frac{1}{n} \sum_{j=1}^{n} A_j^i$$

2DLDA Approach
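2DLDA's criterion is defined as

$$J_{2DLDA}(\theta) = \frac{\left|\theta^{T} S_b \theta\right|}{\left|\theta^{T} S_w \theta\right|} \tag{1}$$

where $S_w$ is called the within-class covariance matrix and $S_b$ is called the between-class covariance matrix of the training samples, expressed respectively as

$$S_w = \frac{1}{N} \sum_{i=1}^{C} \sum_{j=1}^{n} \left(A_j^i - \bar{A}^i\right)^{T} \left(A_j^i - \bar{A}^i\right) \tag{2}$$

$$S_b = \frac{1}{N} \sum_{i=1}^{C} n \left(\bar{A}^i - \bar{A}\right)^{T} \left(\bar{A}^i - \bar{A}\right) \tag{3}$$

Here $\bar{A}^i$ is the mean of the $i$-th sample class and $\bar{A}$ is the mean of all training samples. The transformation matrix $\theta_{2DLDA}$ is calculated by solving the eigenvalue and eigenvector problem of $S_w^{-1} S_b$.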
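The 2DLDA step can be illustrated with a minimal NumPy sketch. It assumes the scatter matrices take the form $S_w = \frac{1}{N}\sum_i\sum_j (A_j^i-\bar{A}^i)^T(A_j^i-\bar{A}^i)$ and $S_b = \frac{1}{N}\sum_i n_i(\bar{A}^i-\bar{A})^T(\bar{A}^i-\bar{A})$, and that the projection axes are the leading eigenvectors of $S_w^{-1}S_b$. The function names `scatter_matrices_2d` and `fit_2dlda` are hypothetical and not taken from the paper.

```python
import numpy as np

def scatter_matrices_2d(images, labels):
    """Return (S_w, S_b), both l x l, for images of shape (N, m, l)."""
    images = np.asarray(images, dtype=float)
    labels = np.asarray(labels)
    N, _, l = images.shape
    global_mean = images.mean(axis=0)
    S_w = np.zeros((l, l))
    S_b = np.zeros((l, l))
    for c in np.unique(labels):
        class_imgs = images[labels == c]
        class_mean = class_imgs.mean(axis=0)
        centered = class_imgs - class_mean
        # Sum over samples j of (A_j - mean)^T (A_j - mean).
        S_w += np.einsum('jml,jmk->lk', centered, centered)
        diff = class_mean - global_mean
        S_b += len(class_imgs) * (diff.T @ diff)
    return S_w / N, S_b / N

def fit_2dlda(images, labels, d):
    """Return the d leading eigenvectors of S_w^{-1} S_b as columns (l x d)."""
    S_w, S_b = scatter_matrices_2d(images, labels)
    # Assumption: S_w is nonsingular, which 2DLDA relies on.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(-eigvals.real)
    return eigvecs[:, order[:d]].real
```

Each image is then reduced from $m \times l$ to $m \times d$ by right-multiplying it with the returned matrix.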

2DHDA Approach
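2DHDA is the heteroscedastic extension of 2DLDA. In 2DHDA, the equal within-class covariance constraint is removed, and the criterion is defined so as to maximize class discrimination in the projected subspace. The criterion of 2DHDA is defined as

$$J_{2DHDA}(\theta) = \prod_{i=1}^{C} \left( \frac{\left|\theta^{T} S_b \theta\right|}{\left|\theta^{T} W_i \theta\right|} \right)^{n} \tag{4}$$

where

$$W_i = \frac{1}{n} \sum_{j=1}^{n} \left(A_j^i - \bar{A}^i\right)^{T} \left(A_j^i - \bar{A}^i\right)$$

denotes the covariance matrix of the $i$-th sample class. According to equations (1) and (4), if the covariance matrices $W_i$ of all sample classes are assumed equal, then $J_{2DHDA}(\theta) = \left(J_{2DLDA}(\theta)\right)^{N}$ is satisfied and 2DHDA reduces to 2DLDA. By taking the log and rearranging terms, we get

$$H(\theta) = \sum_{i=1}^{C} -n \log\left|\theta^{T} W_i \theta\right| + N \log\left|\theta^{T} S_b \theta\right| \tag{5}$$

$H$ has two useful invariance properties [5]. The first is that, for every nonsingular matrix $\varphi$, $H(\theta\varphi) = H(\theta)$. This means that subsequent feature space transformations of the range of $\theta$ will not affect the value of the criterion. The second is that the criterion is invariant to row or column scalings of $\theta_{2DHDA}$ or eigenvalue scalings of $\theta_{2DHDA}^{T}\theta_{2DHDA}$. Using matrix differentiation, the derivative of $H$ is given by

$$\frac{\partial H(\theta)}{\partial \theta} = \sum_{i=1}^{C} -2n\, W_i \theta \left(\theta^{T} W_i \theta\right)^{-1} + 2N\, S_b \theta \left(\theta^{T} S_b \theta\right)^{-1} \tag{6}$$

However, there is no closed-form solution for $\partial H(\theta) / \partial \theta = 0$.

In the Nearest Neighbor classifier, if $\mathrm{dist}(Y_{test}, Y_q^p) = \min_{i,j} \mathrm{dist}(Y_{test}, Y_j^i)$ is satisfied, the testing sample $A_{test}$ is classified to the $p$-th class, where $Y_q^p$ represents the feature matrix of training sample $A_q^p$, and $p$, $q$ are constants.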
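The log criterion and its derivative can be checked numerically with a short sketch. It assumes the log criterion has the form $H(\theta) = \sum_i -n\log|\theta^T W_i\theta| + N\log|\theta^T S_b\theta|$ with $N = nC$, and uses the standard identity $\partial \log|\theta^T M \theta| / \partial\theta = 2M\theta(\theta^T M\theta)^{-1}$ for symmetric $M$. The function names are hypothetical.

```python
import numpy as np

def log_criterion(theta, class_covs, n, S_b):
    """H(theta) = sum_i -n log|theta^T W_i theta| + N log|theta^T S_b theta|."""
    N = n * len(class_covs)
    # slogdet returns (sign, log|det|); index 1 is the log-determinant.
    value = N * np.linalg.slogdet(theta.T @ S_b @ theta)[1]
    for W_i in class_covs:
        value -= n * np.linalg.slogdet(theta.T @ W_i @ theta)[1]
    return value

def log_criterion_gradient(theta, class_covs, n, S_b):
    """Derivative of H with respect to theta, same shape as theta."""
    N = n * len(class_covs)
    grad = 2.0 * N * S_b @ theta @ np.linalg.inv(theta.T @ S_b @ theta)
    for W_i in class_covs:
        grad -= 2.0 * n * W_i @ theta @ np.linalg.inv(theta.T @ W_i @ theta)
    return grad
```

Under these assumptions, replacing `theta` with `theta @ phi` for any nonsingular `phi` leaves `log_criterion` unchanged, because the $\log|\varphi|^2$ terms cancel when $N = nC$.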

Experimental objects
Experiments are based on the Feret color face database and the face database from the University of Essex, UK. In the Feret color face database, the images vary in position, lighting and expression. We selected 10 male and 10 female individuals, with 20 face images per individual, giving 400 face images for the experiments. The images are cropped and resized to 100×90, then converted to gray-scale images, as shown in Fig. 1.
In the face database from the University of Essex, the images have a resolution of 200×180, and each individual has 20 face images that vary in position, rotation, expression and lighting. We selected 19 male and 19 female individuals, 760 face images in total, for the gender classification experiments. The original images are color images; we converted them to gray-scale images and cropped them to a resolution of 80×70, as shown in Fig. 2.
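The preprocessing described above might look like the following NumPy sketch. The paper does not state how the conversion, cropping or resizing were done, so the ITU-R BT.601 luminance weights, the center crop and the nearest-neighbor resampling are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an (H, W, 3) color image to gray-scale (assumed BT.601 weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3].astype(float) @ weights

def center_crop(img, height, width):
    """Cut a height x width window from the image center (assumed crop rule)."""
    top = (img.shape[0] - height) // 2
    left = (img.shape[1] - width) // 2
    return img[top:top + height, left:left + width]

def resize_nearest(img, height, width):
    """Resize by nearest-neighbor sampling (assumed; the paper names no method)."""
    rows = np.arange(height) * img.shape[0] // height
    cols = np.arange(width) * img.shape[1] // width
    return img[np.ix_(rows, cols)]
```

For example, a 200×180 color image from the Essex database would pass through `to_grayscale` and then `center_crop(gray, 80, 70)` to give an 80×70 matrix.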

Experimental results and analysis
In gender classification, there are only two classes, male and female; thus in equation (4), $C = 2$. On the Feret color face database, firstly, the first 5 male and the first 5 female individuals, 200 face images in total, are selected as training samples and the remainder as testing samples. Experimental results are shown in Fig. 4. Secondly, the first 4, 6 and 8 individuals of each gender, 160, 240 and 320 face images in total respectively, are selected as training samples; experimental results are listed in Table 1. Fig. 4 illustrates that, when 200 images are selected as training samples, the highest recognition rate of 2DHDA is 85.00%, which is 4.5% higher than that of 2DLDA. Table 1 shows that when 320 images are selected as training samples, the recognition rate of 2DHDA is 88.75%, whereas that of 2DLDA is only 83.75% and that of HDA is only 80.00%. When 160 and 240 images are selected as training samples, the recognition rates of 2DHDA are also higher than those of 2DLDA and HDA. In Table 1, when HDA is used for gender classification, PCA is used as a preprocessing step for dimensionality reduction.
On the face database from the University of Essex, firstly, 20 individuals (10 male and 10 female), 400 face images in total, are selected as training samples and the remainder as testing samples. The results for different numbers of feature dimensions are shown in Fig. 5. Then, 9, 11 and 13 individuals of each gender, 360, 440 and 520 face images in total respectively, are selected as training samples, and the remainder as testing samples. Experimental results are listed in Table 2. Fig. 5 demonstrates that, when 400 images are selected as training samples, the highest recognition rate of 2DHDA is 79.44%. The highest recognition rate of 2DLDA is 70.89%, which is 8.55% lower than that of 2DHDA. Table 2 shows that the recognition rate of 2DHDA is higher than those of 2DLDA and HDA when 360, 440 and 520 images are selected as training samples.

Conclusions and Future Work
In this paper, we presented the 2DHDA algorithm for gender classification using face images.

Instead, the gradient descent algorithm is used to optimize $H$ and solve for $\theta_{2DHDA}$. If face images are projected onto the whole of $\theta_{2DHDA}$, the most discriminant features cannot be extracted; thus, the first $d$ column vectors of $\theta_{2DHDA}$ are selected as projection axes, and the extracted features are expressed as

$$Y = A\,\theta_{2DHDA}^{d}$$

where $\theta_{2DHDA}^{d}$ denotes the first $d$ column vectors of $\theta_{2DHDA}$ and $Y$ represents the extracted feature matrix of sample $A$.

Nearest Neighbor Classifier
After the 2DHDA transformation, a Nearest Neighbor classifier is used to perform gender classification. Suppose $Y_{test}$ denotes the feature matrix of an arbitrary testing sample $A_{test}$, and $Y_j^i$ denotes the feature matrix of training sample $A_j^i$. The distance between them is defined as

$$\mathrm{dist}(Y_{test}, Y_j^i) = \sum_{k=1}^{d} \left\| y_k^{test} - y_k^{i,j} \right\|_2$$

where $y_k^{i,j}$ and $y_k^{test}$ are the $k$-th column vectors of $Y_j^i$ and $Y_{test}$ respectively.
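The feature extraction and Nearest Neighbor steps can be sketched as follows. The sketch assumes features are obtained as `image @ theta[:, :d]` and that the distance between two feature matrices is the sum of Euclidean distances between corresponding columns; that distance is an assumption here, and the function names are hypothetical.

```python
import numpy as np

def project(image, theta, d):
    """Project an m x l image onto the first d columns of theta (gives m x d)."""
    return image @ theta[:, :d]

def column_distance(Y_a, Y_b):
    """Sum over columns k of the Euclidean distance between the k-th columns."""
    return np.linalg.norm(Y_a - Y_b, axis=0).sum()

def nearest_neighbor_label(Y_test, train_features, train_labels):
    """Return the label of the training feature matrix closest to Y_test."""
    distances = [column_distance(Y_test, Y) for Y in train_features]
    return train_labels[int(np.argmin(distances))]
```

A testing image would be classified by calling `nearest_neighbor_label(project(A_test, theta, d), train_features, train_labels)`, where `train_features` holds the projected training images.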
The gradient descent algorithm is used for the optimization of $H$, with $\theta_{2DLDA}$ selected as the initial matrix of $\theta_{2DHDA}$ for the iterations. Finally, the Nearest Neighbor classifier is used for gender classification. The classification model is shown in Fig. 3.
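One way the iterative optimization could be organized is sketched below. Because $H$ is to be maximized, the sketch steps along the gradient of $H$ (equivalently, gradient descent on $-H$). The step size, the iteration count, the accept-or-shrink rule and the column renormalization are all assumptions for illustration; the paper does not specify them. The function names are hypothetical, and `theta_init` stands for the 2DLDA solution used as the starting point.

```python
import numpy as np

def log_criterion(theta, class_covs, n, S_b):
    """H(theta) = sum_i -n log|theta^T W_i theta| + N log|theta^T S_b theta|."""
    N = n * len(class_covs)
    value = N * np.linalg.slogdet(theta.T @ S_b @ theta)[1]
    for W_i in class_covs:
        value -= n * np.linalg.slogdet(theta.T @ W_i @ theta)[1]
    return value

def log_criterion_gradient(theta, class_covs, n, S_b):
    """Derivative of H with respect to theta."""
    N = n * len(class_covs)
    grad = 2.0 * N * S_b @ theta @ np.linalg.inv(theta.T @ S_b @ theta)
    for W_i in class_covs:
        grad -= 2.0 * n * W_i @ theta @ np.linalg.inv(theta.T @ W_i @ theta)
    return grad

def fit_2dhda(theta_init, class_covs, n, S_b, step=1e-3, iters=200):
    """Iteratively increase H, starting from theta_init (e.g. the 2DLDA solution)."""
    # Column scaling does not change H, so unit-norm columns are kept for stability.
    theta = theta_init / np.linalg.norm(theta_init, axis=0)
    best = log_criterion(theta, class_covs, n, S_b)
    for _ in range(iters):
        grad = log_criterion_gradient(theta, class_covs, n, S_b)
        candidate = theta + step * grad
        candidate = candidate / np.linalg.norm(candidate, axis=0)
        value = log_criterion(candidate, class_covs, n, S_b)
        if value > best:
            # Accept the step and try a slightly larger one next time.
            theta, best = candidate, value
            step *= 1.2
        else:
            # Reject the step and shrink the step size.
            step *= 0.5
    return theta
```

Accepting only steps that increase $H$ keeps the criterion from decreasing across iterations, at the cost of some rejected evaluations.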

Figure 1. Face images in the Feret color face database

Figure 2. Face images in the face database from the University of Essex

Figure 3.

Table 2. Correct recognition rates on the face database from the University of Essex when different numbers of training samples are selected