Three-Step Parameters Tuning Model for Time-Constrained Genetic Algorithms

In this paper, a three-step parameter tuning model for time-constrained Genetic Algorithms (GAs) was presented. The first step involved modeling the objective function using a multiple regression model in which the fitness value was the response variable and the GA parameters were the regressors. The second step involved constraint modeling, using the objective function found in the first step together with the upper and lower limits of the GA parameters and an upper limit on the execution time as constraints. The third step involved optimizing the constrained model found in the second step using a suitable deterministic optimization method to determine the optimal GA parameters, taking into consideration four aspects that affect GA performance. These aspects were: the problem under consideration, the GA parameters used, the execution time, and the power of the computer used. The model was validated using two capacitated lot sizing problems. The model was able to predict the fitness values and the optimal parameters of the GA for these problems to a high degree of precision. Moreover, the results showed that tuning the GA parameters using multiple regression along with a suitable deterministic optimization method was an effective and robust method that enhanced the performance of the GA. The statistical analysis showed that, in order to tune a given GA properly, the designer of the GA must take into consideration not only the type of problem but also the size of the problem, the allowable execution time, and the hardware used in executing the GA. Furthermore, the results agreed with the "No Free Lunch" theorem.


Introduction
Genetic algorithms are search algorithms based on the mechanism of natural selection. The most basic concept is that strong individuals tend to survive while weak individuals tend to die out. That is, optimization is based on evolution and the "survival of the fittest" concept. The basic algorithm is as follows:
Begin
- Initialize population with random candidate solutions
- Evaluate each candidate
- Do while termination condition is not satisfied
(1) Select parents
(2) Recombine parents to generate offspring
(3) Mutate the resulting offspring
(4) Evaluate new candidates
(5) Select individuals for the next generation
A GA can have many parameters that need to be tuned to optimize its performance. These parameters include, but are not limited to, population size, generation number, crossover rate, execution time, and mutation rate.
It is well known among GA users that different problems require different GA strategies. Moreover, for the same problem, the best set of parameters differs with the complexity of the problem itself. This phenomenon is known in the literature as the "No Free Lunch" theorem (NFLT). It means that the GA parameters cannot be isolated from the characteristics of the class of functions being optimized, which underscores that no fixed set of parameters can optimize the performance of a GA for a whole class of functions. Integrating knowledge about the problem into the evolutionary algorithm can considerably enhance the effectiveness of the approach and significantly improve the convergence rate and the quality of the results. The optimization method has shown good efficiency for some applications, but its performance depends on the values selected for the parameters of the algorithm. For example, in the Traveling Salesman Problem (TSP), the complexity of the problem increases as the number of nodes increases, which calls for a different set of GA parameters for each TSP instance according to its complexity. This is because different combinations of parameter values give a GA different exploration and exploitation power, which makes the underlying GA more or less suitable for the specific problem. Generally speaking, a high crossover rate (among other parameters) increases the exploration power of the GA, while a high mutation rate (among other factors) increases its exploitation power. Moreover, different parameter values may aggravate or alleviate certain inherent problems of the GA, such as premature convergence and high execution time. For example, a small population size may cause premature convergence, while a large population size may cause inefficient computation time. There is near agreement among GA users that no universal GA exists. This calls on GA designers to tune their GAs to properly suit a specific problem.
It is worth mentioning here that the parameter tuning problem differs significantly from the parameter control problem. In the first problem, the tuning of the parameters happens before the GA is used, and the values of the parameters stay fixed throughout the run of the GA, while in the second problem, the values of the parameters change during the run of the GA. For example, Ramadan used linearly adaptive crossover and mutation rates in a multiple objective GA to optimize reliability testing. These adaptive rates are considered parameter control, not parameter tuning, because the values of the parameters change according to the generation number.
Many attempts have been made to tackle the parameter tuning problem in GAs. A Meta-EA, together with another EA, has been used to search for the best set of parameter values. Results showed that the Meta-EA can find a set of parameter values that is better than the recommended default values found in the literature. Unfortunately, Meta-EA has some inherent problems. These problems can be summarized in three major points. First, not all parameters can be searched by EAs. For example, if one of the parameters in the underlying GA is the type of crossover, then this parameter cannot be searched by the Meta-EA, as there is no way to define a reasonable distance metric along this dimension. Second, there may be situations in which not all parameters are applicable to a certain GA. For example, if there is a parameter for the number of generations at which the GA terminates and the underlying GA does not use this criterion for termination, then the number of generations parameter is not applicable. Finally, this method is time consuming due to its stochastic nature, which renders it impractical.
Yuan and Gallagher used Meta-EA along with the Racing statistical method to deal with the tuning problem. The Racing search method is known for its time efficiency compared to full enumeration because it quickly identifies weak solutions and discards them from the sample space based on statistical tests. Unlike Meta-EA, the Racing search method does not need a distance metric; consequently, it can be used to compare different GAs.
In this paper, the GA parameter tuning problem is solved in three steps. The first step involves modeling the objective function using a multiple regression model in which the fitness value is the response variable and the GA parameters are the regressors. The second step involves constraint modeling, using the objective function found in the first step together with the upper and lower limits of the GA parameters and an upper limit on the execution time as constraints. The third step involves optimizing the constrained model found in the second step using a suitable deterministic optimization method to determine the optimal GA parameters, taking into consideration four aspects that affect GA performance.
The rest of the paper is organized as follows: Section 2 discusses the aspects affecting GA performance, Section 3 presents the proposed model, Section 4 describes the GA used for validation, Section 5 validates the model and discusses the results, and finally, Section 6 concludes the paper.

Aspects Affecting GA's Performance
The aspects affecting the GA's performance are divided into four main categories and discussed individually. These aspects are: the problem under consideration, the GA parameters, the execution time, and finally, the power of the computer used. Hypothesis tests were conducted to determine whether there is a significant difference between the average performances of the GA under each aspect.

The Problem Aspect
The problem type and size are considered the major elements of this aspect. The NFLT guarantees that there is no universal GA; therefore, each type of problem must have its own GA strategies to perform well.
The other element of the problem aspect is the size of the problem. To show whether there is a significant change in the average performance of the GA under different problem sizes, a TSP problem was solved using the same GA parameters, execution time, and computer power. The following hypothesis was tested:
H0.1: There is no significant difference between the mean performance of the GA when solving a small TSP problem and the mean performance of the GA when solving the same problem with a larger size, at a significance level of 0.05.
H1.1: There is a significant difference between the mean performance of the GA when solving a small TSP problem and the mean performance of the GA when solving the same problem with a larger size, at a significance level of 0.05.
One hundred simulations were performed on the standard Eil51 TSP problem and another 100 simulations were performed on the standard Eil101 TSP problem. The fitness values were standardized using a standardization equation in which the standardized fitness value for simulation i is obtained from the fitness value for simulation i and BFV, the best solution ever found in the literature for this problem.
T-Test of difference = 0 (vs not =): T-value = 33.14, P-value = 0.000, DF = 117.
The results show that the P-value is less than 0.05, which means that the null hypothesis is rejected and the alternative one is accepted. Therefore, the mean performance of the GA differs significantly when the size of the problem changes, which means that the size element of the problem aspect does indeed affect the performance of the GA.

The GA Parameters Aspect
To show whether there is a significant change in the average performance of the GA under different GA parameters, the Eil101 TSP problem was solved twice using different sets of GA parameters while keeping the other three aspects fixed. The following hypothesis was tested:
H0.2: There is no significant difference between the mean performances of the GA when changing the GA's parameters, at a significance level of 0.05.
H1.2: There is a significant difference between the mean performances of the GA when changing the GA's parameters, at a significance level of 0.05.
Two sets of 100 simulations were performed on the Eil101 problem. Each set used different parameters for the GA. The results show that the P-value is less than 0.05, which allowed us to reject the null hypothesis and accept the alternative one. This means that the mean performance of the GA differs significantly with respect to the GA parameters used. This result shows clearly that the GA parameters aspect has an effect on its performance.

The Execution Time Aspect
To show whether there is a significant change in the average performance of the GA under different allowed execution times, the Eil101 TSP problem was solved twice using 5 seconds and 60 seconds while keeping the other three aspects fixed. The following hypothesis was tested:
H0.3: There is no significant difference between the mean performance of the GA when changing the allowed execution time, at a significance level of 0.05.
H1.3: There is a significant difference between the mean performance of the GA when changing the allowed execution time, at a significance level of 0.05.
Two sets of 100 simulations were performed on the Eil101 problem. The first set was allowed 5 seconds and the other one 60 seconds. Table (3) gives a summary of the data. The results show that the P-value is less than 0.05, which means that the null hypothesis is rejected and the alternative one is accepted. Hence, the mean performance of the GA differs with respect to the execution time used, and therefore the execution time affects the performance of the GA.

The Power of the Computer Used Aspect
To show whether there is a significant change in the average performance of the GA under different computer powers, the Eil101 TSP problem was solved twice using two computers, one an i7 core computer and the other an AMD Phenom (tm) IIX6 computer, while keeping the other three aspects fixed. The following hypothesis was tested:
H0.4: There is no significant difference between the mean performances of the GA when changing the power of the computer used, at a significance level of 0.05.
H1.4: There is a significant difference between the mean performances of the GA when changing the power of the computer used, at a significance level of 0.05.
Two sets of 100 simulations were performed on the Eil101 problem, the first set using the AMD Phenom (tm) IIX6 computer and the other one using the i7 core computer. The results show that the P-value is less than 0.05, which means that the null hypothesis is rejected and the alternative one is accepted. This means that the mean performance of the GA differs with respect to the power of the computer used. This result shows clearly that the power of the computer used to run the GA affects the performance of the GA.
The overall result of this section suggests that, in order to tune a given GA properly, the designer must take into consideration not only the type of problem but also the size of the problem, the allowable execution time, and the hardware used in executing the GA. The next section proposes a tuning model that accounts for the four different aspects affecting the performance of any GA.

GA Parameters Tuning Model
The GA parameter tuning problem is the problem of tuning the GA's parameter values to optimize the performance of the GA under consideration. In this model, the objective function represents the estimated fitness function of the GA at a certain set of GA parameter values (decision variables), while the constraints are the upper and lower limits of these parameters along with an upper cap on the execution time. The objective function is found using multiple regression analysis in which the response variable is the fitness value of the GA and the regressors are the GA's parameter values. A "Maximum Execution Time" termination strategy is formulated and used as a constraint on the maximum allowable execution time for the underlying GA to account for the execution time aspect. The complete model can be expressed mathematically as follows:

Optimize $\hat{F} = \varphi(P)$ (1)
Subject to:
$\vartheta(P) \le T_{max}$ (2)
$L \le P \le U$ (3)
$P \ge 0$ (4)
Some parameters can be integers,

such that $\hat{F}$ is the estimated fitness, $\varphi(\cdot)$ and $\vartheta(\cdot)$ are two multiple regression models corresponding to $\hat{F}$ and the execution time respectively, $P$ is the set of GA parameters, $L$ and $U$ are two sets containing the lower and the upper limits of the GA parameters respectively, and $T_{max}$ is the maximum allowable execution time.
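As a minimal sketch of how such a constrained model could be solved, the following Python fragment enumerates a small bounded grid of candidate parameter sets, discards those whose predicted execution time exceeds the cap, and keeps the one with the lowest predicted fitness. The coefficients in `phi` and `theta`, the bounds, and the three-parameter layout are hypothetical placeholders rather than values from this paper, and exhaustive enumeration stands in here for a deterministic solver purely for illustration.

```python
from itertools import product

# Hypothetical regression models; the coefficients are illustrative only.
def phi(nchc, pm, ps):
    """Predicted fitness (lower is better) for one parameter set."""
    return 100.0 + 4.0 * nchc + 30.0 * pm - 0.5 * ps

def theta(nchc, pm, ps):
    """Predicted execution time in seconds for one parameter set."""
    return 0.2 * nchc + 1.0 * pm + 0.06 * ps

def tune(t_max):
    """Return (best_fitness, best_params) over a bounded grid, or None."""
    nchc_values = range(3, 10, 2)            # odd integers in [3, 9]
    pm_values = [0.1, 0.2, 0.3, 0.4, 0.5]    # assumed bounds [0.1, 0.5]
    ps_values = range(20, 101)               # integers in [20, 100]
    best = None
    for params in product(nchc_values, pm_values, ps_values):
        if theta(*params) > t_max:           # execution-time constraint
            continue
        candidate = (phi(*params), params)
        if best is None or candidate < best:
            best = candidate
    return best

best_fitness, best_params = tune(t_max=5.0)
print(best_params, best_fitness)
```

Because the population size lowers the predicted fitness but raises the predicted time in this toy setup, the optimum sits where the time constraint becomes binding, which mirrors the trade-off the model is meant to capture.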
The effectiveness of this model was demonstrated using a GA for the Capacitated Lot Sizing Problem (CLSP) with five parameters. In the CLSP, the quantity and timing of the orders are selected under the resource availability and minimum total cost constraints.
The lot size for a certain period depends on the demand for that period and the demands of the subsequent periods. In general, the objective of the CLSP is to find the production schedule that minimizes the total production cost given the unit production cost, the inventory holding cost, the customer demand, the setup cost, and the discrete and finite production horizon, under the constraint of production capacity. The lot sizing decision is made at the beginning of each period in the time horizon such that the customer needs are satisfied by the end of the time horizon.
A GA is used to solve this problem. In this GA, a binary chromosome representation is used in which a "1" in a certain period means that production in this period covers exactly the need for this period and the needs of all "0" periods to its right, up to the next period that has a "1". Figure 1 shows a demonstration of the chromosome representation for a CLSP with 3 products and 4 periods.
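The decoding rule described above can be illustrated with a short Python sketch for a single product. The function name `lot_sizes` and the demand figures are hypothetical and chosen only for illustration; capacity checks are deliberately left out.

```python
def lot_sizes(chromosome, demand):
    """Decode one product's binary chromosome into per-period lot sizes.

    A 1 in period t means production in t covers the demand of t and of
    every following 0 period, up to (not including) the next 1.
    """
    if len(chromosome) != len(demand):
        raise ValueError("chromosome and demand must have the same length")
    if chromosome[0] != 1:
        raise ValueError("first gene must be 1 so period 1 demand is covered")
    lots = [0] * len(demand)
    current = 0  # index of the most recent producing period
    for t, (gene, d) in enumerate(zip(chromosome, demand)):
        if gene == 1:
            current = t
        lots[current] += d
    return lots

# Hypothetical demands for four periods.
print(lot_sizes([1, 0, 1, 0], [10, 20, 30, 40]))
```

With the chromosome [1, 0, 1, 0], production happens in periods 1 and 3 only, and each producing period absorbs the demand of the idle period that follows it.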
Figure 1. Chromosome representation for 3 products and 4 periods

Moreover, a Mode crossover strategy is proposed and used, in which the mode of the corresponding genes for a certain number of randomly selected chromosomes (NCHC) is found (from the best 25% of chromosomes of that generation) and then copied to the corresponding gene in the offspring. Note that NCHC must be an odd integer that is less than or equal to the number of chromosomes involved in the crossover, to ensure that there is only one mode for each gene. Moreover, this crossover strategy guarantees that the offspring is feasible. The probability of performing crossover is governed by the crossover rate (Pc). Figure 2 shows how the Mode Crossover strategy works.

Figure 2. Example for the Mode Crossover strategy
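The Mode Crossover strategy can be sketched in Python as follows. This is a minimal illustration under stated assumptions: chromosomes are flat lists of 0/1 genes, lower fitness is better, and the function name `mode_crossover` and its arguments are hypothetical. The crossover rate Pc is not modeled here; the function only shows how a single offspring is formed once crossover is applied.

```python
import random

def mode_crossover(population, fitness, nchc, rng=random):
    """Build one offspring gene by gene from the mode of nchc elite parents.

    population: list of equal-length binary chromosomes (lists of 0/1)
    fitness:    list of fitness values, lower is better (cost minimization)
    nchc:       odd number of chromosomes taking part in the crossover
    """
    if nchc % 2 == 0:
        raise ValueError("nchc must be odd so every gene has a unique mode")
    # Rank by fitness and keep the best 25% of the generation.
    ranked = [c for _, c in sorted(zip(fitness, population), key=lambda p: p[0])]
    elite = ranked[:max(1, len(ranked) // 4)]
    if nchc > len(elite):
        raise ValueError("nchc exceeds the number of elite chromosomes")
    parents = rng.sample(elite, nchc)
    # With binary genes and an odd nchc, the mode is a strict majority vote.
    return [1 if sum(genes) * 2 > nchc else 0 for genes in zip(*parents)]
```

If every parent has a "1" in the first gene of each product, the majority vote also yields a "1" there, which is consistent with the requirement that the first period is always covered.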
A random mutation strategy is used in this GA, in which a random gene in the offspring is located and its value is changed such that if the original value is "1" it is changed to "0" and vice versa. Care must be taken that the first gene for each product is always "1" to guarantee that the demand for the first period is always covered. The probability of applying mutation to a certain offspring is governed by the offspring mutation rate (OffPm), while the probability of mutating a gene in that offspring is governed by the mutation rate (Pm). The population size used in this GA is Ps.
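One possible reading of this two-level mutation scheme is sketched below in Python. It assumes that an offspring is first selected for mutation with probability OffPm and that each eligible gene in a selected offspring is then flipped independently with probability Pm; this interpretation, the per-product list layout, and the name `mutate` are assumptions made for illustration only.

```python
import random

def mutate(offspring, off_pm, pm, rng=random):
    """Return a mutated copy of offspring (a list of per-product gene lists)."""
    mutated = [genes[:] for genes in offspring]
    if rng.random() >= off_pm:          # this offspring is not selected
        return mutated
    for genes in mutated:
        for t in range(1, len(genes)):  # skip gene 0: it must stay 1
            if rng.random() < pm:
                genes[t] = 1 - genes[t]
    return mutated
```

Skipping index 0 of every product keeps the first-period gene fixed at "1", so a mutated offspring remains feasible with respect to first-period demand.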
In real-life situations, the GA often has to run under a time constraint, so the GA parameters must be tuned to reach the parameter combination that gives the best fitness value within the allowed execution time. Therefore, this GA terminates when the maximum allowable execution time has elapsed.
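A time-based termination criterion of this kind can be sketched in Python as follows. The helper `run_time_limited` and its `step` callback are hypothetical names; `step` stands for one full generation (selection, crossover, mutation, evaluation) and is supplied by the caller.

```python
import time

def run_time_limited(step, state, t_max):
    """Call step(state) repeatedly until t_max seconds have elapsed.

    Returns the final state and the number of completed generations.
    """
    deadline = time.monotonic() + t_max
    generations = 0
    while time.monotonic() < deadline:
        state = step(state)
        generations += 1
    return state, generations

# Toy usage: the "generation" just increments a counter.
final_state, n_generations = run_time_limited(lambda s: s + 1, 0, 0.05)
```

The deadline is checked only between generations, so the last generation may overrun the limit slightly. The number of generations completed depends on how costly each generation is and on the machine, which is why the same parameter set can behave differently on different hardware.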
The fitness function for the CLSP is described in detail in Suer et al. [14] and is repeated here for convenience.
I_it: Inventory for product i in period t.
X_it: Binary variable equal to 0 if there is no production of product i in period t and 1 otherwise.

Results and Discussion
Two CLSPs from Suer et al. [14] are used to demonstrate the effectiveness of the tuning model, one with 5 products and 12 periods and the other with 15 products and 6 periods. The relevant data for the two CLSPs are summarized in Table (5). As this paper addresses the tuning of the GA parameters, the different GA strategies used are not discussed in detail, because we are only interested in the effect of the parameter values on the performance of the GA. Therefore, the GA is treated as a black box whose inputs are the GA parameter values and whose output is the fitness value. Thus, only the values of the parameters are presented. Table (6) gives the upper and lower limits for the GA parameters that were used in the experimentation.

Experimentation on Problem 1
One hundred random and distinct combinations (within the parameter limits) were generated for the parameter values. For each combination, the GA was simulated 100 times using the i7 core computer and the average fitness value at that combination was recorded.
A multiple linear regression model was used to model the fitness function φ of this problem. As the fitness function of the model and all its constraints are linear, the mixed linear programming (MLP) optimization method was used to optimize the model. Verification of the use of the multiple linear regression model for φ in this problem follows.
Minitab software was used to carry out the statistical analysis and the Microsoft Excel solver was used to optimize the model. Table (7) shows the relevant statistical analysis for the regression model.
The significance of the model was 0.000. This clearly shows that our model is significant at the 5% significance level; hence, the GA parameters can indeed predict the fitness value with high precision. The adjusted R-square value for the regression model was 84.6%, which indicates that our model accounts for 84.6% of the variation in the fitness values. Moreover, the table also shows that Pm, OffPm, and NCHC have the biggest impact on the model, as their absolute t-statistic values are large and their p-values are small. The results showed that the other parameters (Pc and Ps) have little impact on the fitness value, as their p-values are relatively high.
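For readers who want to reproduce this kind of analysis outside a statistics package, the following is a minimal pure-Python sketch of an ordinary least squares fit and the adjusted R-square computation. It is not the Minitab procedure used in this paper; the function names are hypothetical, and the normal equations are solved directly, which is adequate only for small, well-conditioned problems.

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations (intercept added).

    X: list of rows of regressor values; y: list of responses.
    Returns coefficients [b0, b1, ..., bk].
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Build the augmented system (A^T A | A^T y).
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def adjusted_r2(X, y, coef):
    """Adjusted R-square for a fitted model with len(coef) - 1 regressors."""
    n, p = len(y), len(coef) - 1
    pred = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - (ss_res / ss_tot) * (n - 1) / (n - p - 1)
```

In the setting of this paper, each row of `X` would hold one combination of GA parameter values and the matching entry of `y` would hold the average fitness recorded for that combination.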
The collinearity diagnostics show that model 1, which included only the Pm parameter, accounted for 59.2% of the variation in the fitness values (adjusted R-square = 0.592). The inclusion of OffPm in model 2 resulted in an additional 22.8% of the variation in the fitness values being explained, for a total adjusted R-square of 82.0%. Model 3 also included NCHC, and this model resulted in an additional 2.4% of the variation in the fitness values being explained, for a total of 84.4% of the variation in the fitness values being accounted for.
The fourth model, model 4, also included Ps and Pc together, and this model resulted in an additional 0.2% of the variation in the fitness values being explained, for a total of 84.6% of the variation in the fitness values being accounted for. Moreover, the collinearity diagnostics show that the parameters can be considered independent, as their VIF values are close to 1.
To test the linearity and homoscedasticity assumptions made in using the multiple regression model, the standardized residuals versus the fitted values are given in Figure 3. The figure shows that there is a linear relationship between the GA parameters and the fitness value. Moreover, the figure shows that the homoscedasticity assumption also holds; that is, the error variance is the same across all levels of the GA parameters. The Kolmogorov-Smirnov test of normality shows a value of 0.081, which indicates that the normality assumption is valid. These results, along with the high value of R2, justify the use of a multiple linear regression model to model the relationship between the GA parameters and the fitness value.
The objective function shows that as the values of three of the parameters increase, the average fitness value decreases (improves). Furthermore, as the values of the other two parameters increase, the average fitness value increases (deteriorates). This result can be understood in light of the termination criterion used, namely that the GA terminates when 5 seconds of execution time have elapsed. As the values of these two parameters increase, the computation time for a single generation increases; consequently, the number of generations that can be completed within the 5-second time frame decreases, which deteriorates the fitness value.
Table (8) shows the effectiveness of the multiple regression model in predicting the actual fitness value. The table shows the estimated fitness values and the actual fitness values for 10 randomly selected combinations of parameter values, along with their absolute residuals. It is clear from the table that the multiple linear regression equation predicts the fitness values precisely, with a maximum absolute residual of 0.166% and a minimum absolute residual of 0.001%.
By solving this model (14) using the Excel solver, the estimated optimal fitness value was 19015.7, with optimal parameter values of [3 0.1 1 0.1 66] respectively. These parameters were used in the underlying GA, which was simulated 100 times. The average fitness value was 19027.16 and the average number of generations was 33.1. This value is better than any of the other fitness values that were found in the 100 combinations evaluated earlier. This indicates that the MLP method is able to tune the underlying GA parameters to give the best performance within the 5-second time frame.
The performance of the tuned GA was compared with the performance of the GA when one parameter's value was changed at a time. Figure (5) shows a comparison between the average fitness value for the tuned GA and the average fitness value for the GA when the OffPm value was changed from its optimal value of 1 to 0.5. Figure (6) shows a comparison between the average fitness value for the tuned GA and the average fitness value for the GA when the Pm value was changed from its optimal value of 0.1 to 0.5. When the number of chromosomes used in the crossover changed from the optimal value of 3 to 7, the average number of generations dropped from 33.1 to 30.1, which deteriorated the fitness value because less exploration of the sample space took place. In the case where the offspring mutation rate was reduced from 1 to 0.5, the exploitation power of the GA decreased and at the same time the exploration power increased due to the increase in the number of generations from 33.1 to 42.4. The overall result was a deterioration in the fitness value.
In the case where the mutation rate increased from 0.1 to 0.5, the fitness function deteriorated. On the one hand, this increase in the mutation rate allowed for more exploitation (which should enhance the fitness function); on the other hand, it reduced the number of generations from 33.1 to 29.8, which reduced the exploration power of the GA, and thus the net effect was a deterioration in the fitness value. Finally, changing all the parameter values deteriorated the fitness value due to different effects on the exploration and exploitation powers of the GA.
To assess the sensitivity of the optimal parameter values to changes in the GA strategies, the mating strategy was changed from mating among the best 25% of the population to mating the best 10% of the population with the worst 10% of the population in the generation. The same problem was solved 100 times under the optimal conditions ([3 0.1 1 0.1 66]) found earlier, and the average fitness value was 19038 while the average number of generations was 30.8. This clearly shows that the optimal parameter values found under the old mating strategy are not optimal under the new mating strategy.
To assess the effect of the power of the computer used on the performance of the GA, 100 simulations were run on the AMD Phenom (tm) IIX6 computer using the same optimal conditions [3 0.1 1 0.1 66] found earlier, and the average fitness value was 19059.1 while the average number of generations was 9. This clearly shows that the optimal parameter values found using the i7 core computer are not optimal on the AMD Phenom (tm) IIX6 computer, as the average fitness value deteriorated due to the severe decrease in the average number of generations.

Experimentation on Problem 2
The same procedure used to optimize problem 1 was used to optimize problem 2, which contains 6 periods and 15 products. The only difference is that the best regression model for the fitness function was nonlinear. It is worth mentioning here that the only parameter involved in the fitness function regression model for this problem was OffPm. This shows clearly that as the size of the problem changes, the behavior of the GA differs significantly; consequently, the whole structure of the regression model changes. This agrees with what was concluded in Section 2 about the problem aspect and the NFLT. The resulting model includes the restriction that NCHC is an odd integer.
Solving this model using the Excel solver, the optimal values for the parameters were [3 0.1 1 0.1 63] respectively. Using these parameters in the underlying GA and simulating the GA 100 times gave an average fitness value of 32453.12 and a variance of 2190.7. The average number of generations was 26.1.
To see whether the optimal values obtained for problem 1 were also valid for this problem, problem 2 was evaluated using the optimal set of parameter values for problem 1, [3 0.1 1 0.1 66]. The average fitness value was 32491.1 with a variance of 1908.5. The average number of generations was 22.9. The following test of hypothesis was conducted. The results show that the P-value is less than 0.05, which allowed us to conclude that the mean performance of the GA under parameter values [3 0.1 1 0.1 63] is less than the mean performance of the GA under parameter values [3 0.1 1 0.1 66].
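As a rough check of this comparison, the test statistic can be recomputed from the reported means and variances. The Python sketch below assumes an unpooled (Welch) two-sample t-test and 100 runs per sample; neither the test variant nor the sample size of the second set is stated explicitly for this comparison, so both are assumptions. The p-value uses a normal approximation rather than the exact t distribution.

```python
import math

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    a, b = var1 / n1, var2 / n2
    t = (mean1 - mean2) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Means and variances reported above; n = 100 per sample is assumed.
t, df = welch_t(32453.12, 2190.7, 100, 32491.1, 1908.5, 100)
# One-sided (lower-tail) p-value from the normal approximation,
# which is reasonable when the degrees of freedom are large.
p_approx = 0.5 * math.erfc(-t / math.sqrt(2))
print(round(t, 2), round(df, 1), p_approx < 0.05)
```

Under these assumptions the statistic is strongly negative, which is consistent with the conclusion that the average fitness value (a cost) is lower under [3 0.1 1 0.1 63].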

Figure (4) shows a comparison between the average fitness value for the tuned GA and the average fitness value for the GA when the NCHC value changes from its optimal value of 3 to 7.

Figure 4. The performance of the GA when changing NCHC from 3 to 7

Figure 5. The performance of the GA when changing OffPm from 1 to 0.5

Figure 6. The performance of the GA when the Pm value was changed from 0.1 to 0.5

Figure (7) shows a comparison between the average fitness value for the tuned GA and the average fitness value for the GA when the parameter values were changed from their optimal values to the following values: [7 0.1 0.5 0.2 60].

Figure 7. The performance of the GA with the following parameters [7 0.1 0.5 0.2 60]

H0: The mean performance of the GA for problem 2 under parameter values [3 0.1 1 0.1 63] is the same as the mean performance of the GA for problem 2 under parameter values [3 0.1 1 0.1 66].
H1: The mean performance of the GA for problem 2 under parameter values [3 0.1 1 0.1 63] is less than the mean performance of the GA for problem 2 under parameter values [3 0.1 1 0.1 66].

Table (2) gives a summary of the data.

Table 2. Summary of data for hypothesis H0.2

Table (4) gives a summary of the data.

Table 4. Summary of data for hypothesis H0.4

Table 5. Relevant data for the two problems used in the experimentation

Table 8. Effectiveness of the multiple regression model for problem 1

MLP optimization can find the optimal combination of parameter values that optimizes the performance of the underlying GA. The complete MLP model is given as:

Table (10) gives a summary of the data.

Table 10. Summary of data for hypothesis H0