Memetic Elitist Pareto Evolutionary Algorithm for Virtual Network Embedding

Assigning virtual network resources to physical network components, called Virtual Network Embedding, is a major challenge in cloud computing platforms. In this paper, we propose a memetic elitist pareto evolutionary algorithm for virtual network embedding problem, which is called MEPE-VNE. MEPE-VNE applies a non-dominated sorting-based multi-objective evolutionary algorithm, called NSGA-II, to reduce computational complexity of constructing a hierarchy of non-dominated Pareto fronts and assign a rank value to each virtual network embedding solution based on its dominance level and crowding distance value. Local search is applied to enhance virtual network embedding solutions and speed up convergence of the proposed algorithm. To reduce loss of good solutions, MEPE-VNE ensures elitism by passing virtual network embedding solutions with best fitness values to next generation. Performance of the proposed algorithm is evaluated and compared with existing algorithms using extensive simulations, which show that the proposed algorithm improves virtual network embedding by increasing acceptance ratio and revenue while decreasing the cost incurred by substrate network.


Introduction
Network virtualization plays an important role in cloud data centers by allowing multiple simultaneous virtual networks (VNs) to share single substrate network (SN). Network virtualization increases revenue of cloud data centers and improves utilization of physical resources by increasing number of accommodated virtual networks.
However, mapping virtual networks' resources to physical network's resources is known to be nondeterministic polynomial-time hard (NP-hard) even if all incoming requests are previously known . This problem is usually referred to as the Virtual Network Embedding (VNE) problem . Figure 1 shows an example of VNE. VNE solution maps each virtual node to a substrate node and each virtual link to a loop free substrate path, which consists of substrate links. For convenience, in the rest of this paper, we do not differentiate between virtual network embedding and virtual network mapping.
In the last few years, many studies have proposed several optimal techniques to solve small instances of the problem in reasonable time (Lischka & Karl, 2009;Chowdhury et al., 2009;Chowdhury et al., 2012;Cheng et al., 2011). However, current cloud data centers contain thousands of servers and using traditional optimal techniques to find the optimal solution in cloud data centers becomes unaffordable.
Therefore, metaheuristic-based techniques have received more attention to be used for finding near optimal solution in short execution time Cheng et al., 2012;Yuan et al., 2013;Wang et al., 2013). Genetic algorithm (GA) is one of the widely used metaheuristic-based techniques. Genetic algorithms have been successfully applied to many optimization problems in different areas. However, genetic algorithms have some drawbacks like the possible premature convergence. Due to the loss of diversity within the population, GAs can converge even though the near optimal solution is not yet found.
A memetic algorithm is an extension of the traditional genetic algorithm to reduce the problem of premature convergence. Memetic algorithms are a combination of population-based global search and heuristic-based local search. Global search finds the near global optimal and local search finds the best solution. Another important factor that affects the acceptance ratio of virtual network requests is substrate resources fragmentation. Substrate resources become fragmented due to the dynamic arrival and departure of virtual network requests over time.
In this paper, we propose a memetic elitist pareto evolutionary algorithm for embedding virtual networks (MEPE-VNE). MEPE-VNE reduces the loss of good solutions by employing elitist selection. Elitist selection passes a limited number of individuals with the best fitness values to the next generation without applying crossover or mutation operations on them. MEPE-VNE combines Pareto-based multi-objective algorithms with local search to optimize virtual network embedding. The set of Pareto optimal front solutions is gained using non-dominated sorting genetic algorithm-II (NSGA-II), which is one of the most famous Pareto optimal solution algorithms (Deb et al., 2002). The performance of the proposed algorithm is evaluated and compared with some of existing algorithms using extensive simulations, which show that the proposed algorithm outperform existing algorithms by improving virtual networks' acceptance ratio and revenue.
The rest of this paper is organized as follows. Section 2 gives a brief explanation of the key basic concepts of multiobjective optimization. Section 3 gives a short overview of related work. Section 4 presents virtual network embedding model and problem formulation. Section 5 describes the proposed algorithm. Section 6 evaluates the proposed virtual network-embedding algorithm using extensive simulations. Finally, in Section 7 we conclude this paper.

Background
Multi-objective optimization: multi-objective optimization problem can be depicted as following (Reyes-sierra and Carlos, 2006 Where = , , . . . , ⊂ ℝ is a decision vector consists of decision variables, is a decision space, = , , . . , ! ⊂ ℝ , is an objective vector consists of " objective functions, ! is an objective space, : ℝ → ℝ, = 1,2, … , " are objective functions, : ℝ → ℝ, = 1,2, … , , ℎ : ℝ → ℝ, = 1,2, … , are inequality and equality constraints functions of the problem. Multi-objective optimization problem tries to find decision vector in decision space that will optimize objective vector . Definition 1 (Feasible Solution Set). Feasible solution set is the set of all decisions from a decision space that satisfy all inequality and equality constraints. Feasible solution set is denoted by . We say that a decision vector

Definition 4 (Pareto optimality). A decision vector
Definition 5 (Pareto optimal set). The set of all Pareto optimal decision vectors is called Pareto optimal set and is denoted by 3 * .

Related work
Solving VNE problem is NP-hard even if all incoming requests are previously known. Even if all virtual nodes are mapped, finding the optimal virtual links mapping is still NP-hard. Therefore, optimal solution can only be found for small problem instances (Fischer et al., 2013b). Consequently, several heuristic algorithms have been proposed to find a near optimal solution Sun et al., 2013;Ghribi et al., 2013;Beloglazov & Buyya, 2012;Nogueira et al., 2011). Some algorithms have been proposed to find optimal VNE solutions to be used as optimal bound for the heuristic based VNE solutions (Cheng et al., 2011;Botero et al., 2012). Shahin (2015b) proposed two VNE algorithms to find optimal VNE solutions. The proposed algorithms increase the acceptability of fragmented substrate networks by embedding virtual networks on best-fit sub-substrate networks. Substrate networks are fragmented due to dynamic arrival and departure of virtual network requests over time. Fragmentation of substrate networks reduces the possibility of accepting more virtual network requests in the future. The proposed algorithms are one stage, online, and backtracking algorithms. Both algorithms exploit the virtualization technology to consolidated more than one virtual node from the same virtual network to one substrate node. The second algorithm coarsens VNs using Heavy Edge Matching (HEM) technique to minimize the total required bandwidth. In (Shahin, 2015a), Shahin proposed another VNE technique, which coarsens virtual network using Heavy Clique matching technique and optimizes the coarsened virtual networks using a refined Kernighan-Lin algorithm. Zhu and Ammar (2006) proposed two VNE algorithms to find exact VNE solutions. In the first algorithm, allocated substrate resources are fixed throughout the VN lifetime. To reduce the required computational to embed VNs, Zhu and Ammar divided each VN into a number of connected sub-VNs. In the second algorithm, allocated substrate resources are reallocated to increase the utilization of the underlying substrate resources. Substrate resources reallocations are prioritized for the most important VNs. Selective scheme is proposed to select the required substrate resources reallocations after prioritizing the required reallocations for the most important VNs. However, the proposed algorithms deal only with offline version of VNE problem, where all VNRs are previously known and do not deal with VNRs that dynamically arrive over time. Lischka and Karl (2009) proposed online one stage VNE algorithm, which deals with dynamic arrival of VNRs and maps virtual nodes and virtual links during the same stage. The proposed algorithm maps VN to substrate network by finding an isomorphic subgraph inside the substrate network that achieves the CPU and connectivity requirements for the VN. During nodes mapping process, virtual nodes are sorted in descending order based on its required CPU and mapped sequentially to substrate nodes. Although, allowing multiple virtual nodes from the same www.ccsenet.org/cis Computer and Information Science Vol. 8, No. 2;2015 VN to be coexisted on the same substrate node leverages the advantage of ram data switch between virtual nodes coexisted on same substrate node instead of using substrate link bandwidth, the proposed algorithm prevents virtual nodes from the same VN to be co-resident on the same substrate node. Lischka and Karl incrementally increase the maximum hop limit to find the shortest substrate path that is required to embed each virtual link. However, the computational complexity of the proposed algorithm is high due to multiple operations.
To reduce complexity of the proposed algorithm in (Lischka & Karl, 2009), Di et al. (2010) sorted virtual nodes and specified the maximum hop limit based on virtual links mapping cost. Fischer et al. (2013a) proposed another improvement for the proposed algorithm in (Lischka & Karl, 2009). Fischer et al. considered energy efficiency during nodes and links mapping and allowed several virtual nodes of the same virtual network to be coexisted on the same substrate node. However, they did not take into account the mapping cost. Yuan et al. (2013) modeled the VNE problem as an integer linear programming and proposed VNE algorithm based on discrete particle swarm optimization to solve the proposed model. The proposed algorithm allows more the one virtual node to be coexist in the same substrate node to leverage the advantages of inter ram switch techniques. Wang et al. (2013) proposed VNE algorithm similar to the proposed algorithm in . Furthermore, Wang et al. presented an enhanced position assign method to make the solution space more narrow and increase the number of consolidated virtual nodes. Cheng et al. (2011) proposed topology-aware node ranking technique, called NodeRank, to reflect resources and quality of connections of a node. NodeRank technique is inspired by PageRank technique used by Google's search engine, which ranks web pages to measure its popularity by applying the Markov Random Walk (RW) model. Using the proposed NodeRank technique, Cheng et al. proposed two VNE algorithms: RW-MaxMatch, and RW-BFS. RW-MaxMatch is two-stage VN embedding algorithm, which maps virtual nodes according to their ranks and maps virtual links between the mapped nodes to shortest substrate paths. However, RW-MaxMatch algorithm maps nodes without taking into consideration its relation to the link mapping, which leads to high SN's resources consumption rate. This is due to mapping neighboring virtual nodes widely separated in the SN. RW-BFS algorithm is a backtracking one-stage VN embedding algorithm based on breadth-first Search. RW-BFS improves the coordination between nodes and links mapping by embedding nodes and links at the same stage.

Virtual network embedding model and problem formulation
Substrate network (SN): as in our previous work (Shahin, 2015a;Shahin, 2015b), SN is modeled as a graph 7 8 = 9 8 , : 8 , which is a weighted undirected graph. 9 8 is the set of all substrate nodes and : 8 is the set of all substrate links. Substrate nodes are weighted by available CPU, and substrate links are weighted by available bandwidth. Figure 1(b) shows substrate network example. The numbers in rectangles represent CPU capacities and the numbers over links represent bandwidth capacities.
Virtual network (VN): virtual network ;9 is represented as a weighted undirected graph 7 < = = 9 < = , : < = . The set 9 < = contains all virtual nodes and each virtual node is weighted by the required CPU. The set : < = contains all virtual links and each virtual link is weighted by required bandwidth. Figure 1(a) shows an example of virtual network, where numbers in rectangles represent required CPU and the numbers over links represent required bandwidth. Virtual Network Embedding Revenue: as in (Shahin, 2015a;Shahin, 2015b;Cheng et al., 2011;Zhang et al., 2013), where KEL < = represents the required CPU for the virtual node < = , PQ R < = is the required BW for the virtual link R < = , and : I > ? , A = 1 if > ? is in its lifetime, otherwise : I 7 < = , A = 0. Substrate network 7 8 is considered fragmented if there are at least two independent sub-substrate networks 7 8 = , 7 8 T ⊂ 7 8 , such that 9 8 = ∩ 9 8 T = ∅ and there is no loop-free substrate path W 8 satisfies all connectivity constraints and connects two substrate nodes from 9 8 = and 9 8 T .
To measure substrate network fragmentation (SNF) at time A, we use the following formula: where is the number of fragments in the substrate network, q is a positive integer number greater than 1 to reduce the influence of small negligible fragments as long as one large fragment exits, and @Ih ijkR 7 8 = , A is the total residual substrate resources in sub-substrate network 7 8 T at time A. @Ih ijkR 7 8 = , A is calculated as following: The substrate network fragmentation formula in equation (2) is inspired by the fragmentation measure proposed by Gehr and Schneider in (Gehr & Schneider, 2009).
Virtual Network Embedding Cost: as in (Shahin, 2015a;Shahin, 2015b), the cost of embedding > ? at time A is defined as total allocated substrate resources to > ? at time where :I Aℎ D S M = R < = is the length of substrate path that is allocated to the virtual link R < = .
Objectives: the main objectives are to increase the revenue of VNE by increasing virtual network acceptance ratio and decreasing the cost of embedding each virtual network. Virtual network acceptance ratio can be increased by decreasing substrate resources fragmentation. The following metrics are used to evaluate and compare VNE algorithms: -The long-term average revenue, which is defined by where } =∥ ;9@ ∥, and • is the total time.

78
-The VNR acceptance ratio, which is defined by where ;9@ 8 is the set of all accepted virtual network requests.
-The long term R/Cost ratio, which is defined by -The long-term average substrate network fragmentation, which is defined by

The proposed MEPDE-VNE algorithm
Algorithm 1 shows the steps of the proposed MEPDE-VNE algorithm. The first step (from line 1 to line 21) is creating the initial population E 0 , which is a set of mappings from 7 < to 7 8 . To collect these mappings, the MEPDE-VNE algorithm constructs breadth-first searching tree for the 7 < graph. The root of the constructed tree is the virtual node with the largest resources. Nodes in each level in the created breadth-first searching tree are sorted in descending order based on their required resources. Candidate substrate nodes list K 8 for the root virtual node is created by collecting all substrate nodes that have enough resources to embed the root virtual node. All substrate nodes in K 8 are sorted in descending order based on their available resources. MEPDE-VNE algorithm uses ‡ ˆIi function to collect mappings from 7 < to 7 8 . In order to maximize the number of found elements of the Pareto optimal set and to maximize spread of VNE solutions, MEPDE-VNE algorithm visits substrate nodes in the candidate substrate nodes list K 8 sequentially and incrementally increases the maximum hop limit in each iteration until the population E 0 is completed. If ‡ ˆIi function cannot generate the required number of individuals (due to lack of substrate resources), Es jRkA s … ‰\ is reset to the number of generated individuals.
In line 28, each individual in the constructed population is locally improved using Š A ‹I function. Š A ‹I function visits virtual nodes and virtual links in each individual in round-robin fashion and uses breadth-first search to find an enhanced remapping. Optimize() function terminates when no further improvements can be made.
In line 29, the proposed MEPDE-VNE algorithm uses 4khA_ s _is kAIi_hs?A function, which applies Fast Nondominated Sorting technique proposed in (Deb et al., 2002) to order the constructed population into a hierarchy of non-dominated Pareto fronts and assign a rank value to each individual based on its dominance level and crowding distance value. Crowding-distance values are computed after sorting population according to objective functions in equation 2, and equation 3.
In lines 31-35, the new offspring population • A is generated using @I ?sijŽA s function, and optimized using Š A ‹I function. A combined population @ A = • A ∪ E A is created and sorted according to nondomination using 4khA_ s _is kAIi_hs?A function. To ensure elitism, a new population E A + 1 is created by retrieving the best nondominated solutions from the combined population @ A using Pareto_optimal_front() function. The new population E A + 1 of size Es jRkA s … ‰\ is now used for the next iteration. After reaching }AI?kA s h •B' , the best individual in the Pareto optimal front of the final population is returned as a suggested solution.
MEPDE-VNE algorithm uses ‡ ˆIi function to generate individuals of the initial population. Algorithm 2 shows the steps of the ‡ ˆIi function. ‡ ˆIi function creates candidate substrate nodes list, which contains all substrate nodes that have enough CPU to embed current virtual node and have loop free substrate paths with enough bandwidths to embed virtual links between current virtual node and its previously mapped neighbors. The length of each substrate path is less than or equal the maximum substrate path length 's h. Current virtual node is sequentially mapped to substrate nodes in its candidate substrate node list. If there is no suitable substrate node in candidate nodes list, we backtrack to the previously mapped node, re-map it to the next candidate node, and continue to the next node. After mapping the last virtual node in the virtual network, if the generated individual already exists in the current population, ‡ ˆIi function backtrack to the previously mapped node, and re-map it to the next candidate substrate node to generate a new individual. Reproduction() function is used to generate new individuals from current population. In this paper, we represent a virtual nodes mapping solution as a chromosome. The number of genes in the chromosome equals to the number of virtual nodes in the virtual network. Each gene corresponds to a virtual node and contains its substrate node.
New individual is generated by performing single-point crossover for two parents, which are picked using roulette wheel selection from the current population. Crossover is performed by picking a random midpoint, picking the first part of the nodes mappings from the first parent, and picking the remaining nodes mappings from the second parent. If two connected virtual nodes are picked from the same parent, mapping of the virtual link that connects these two virtual nodes is also picked from the same parent.
If two connected virtual nodes are picked from different parents, virtual link between these virtual nodes is mapped to a shortest loop-free substrate path that satisfies the bandwidth constraint. Although this process does not grantee satisfaction of the CPU and connectivity constraints, Optimize() function will remap virtual nodes and virtual links to satisfy the CPU and connectivity constraints and to find the local optimal solution. If Optimize() function fails in improving the generated individual, the generated individual will be deleted during creation of the new population E A + 1 . The population E A + 1 is created by retrieving best nondominated solutions from the combined population @ A . Remapping process performed by Optimize() function maximizes solutions' spreading by moving generated VNE solution in substrate network to find valid solution located between its parents.
After generating new individual, mutation is applied on the new individual. Mutation is performed by remapping virtual node to a randomly selected substrate node from a set of all substrate nodes that are not used in the current population and have enough substrate resources to embed virtual node. Again, virtual links of the mutated virtual node are mapped to shortest loop-free substrate paths that satisfy bandwidth constraints. If no such substrate paths are found, mutation is rolled back. This mutation allows MEPDE-VNE algorithm to investigate new areas in substrate network, so that we can have distributed VNE solutions as smooth and uniform as possible.

Performance Evaluation
The proposed MEPDE-VNE algorithm is online one stage virtual network embedding algorithm, which embeds virtual nodes with considering the required connectivity constraints. MEPDE-VNE algorithm deals with online version of virtual network embedding problem and does not require any previous knowledge. Virtual links and virtual nodes are embedded at the same stage with high co-ordination to avoid embedding neighboring virtual nodes on widely separated substrate nodes.
Most of current algorithms search one place in solution spaces and return first solution without considering remaining solutions, while MEPDE concurrently searches different parts in the solution space to collect nondominated solutions and returns best Pareto optimal solution.
One of the main difficulties that faces virtual network embedding algorithms is finding sub-substrate network with topology same as virtual network topology. MEPDE-VNE algorithm overcomes the ossification of the current virtual network embedding algorithms by allowing multiple virtual nodes from the same virtual network to coexist on the same substrate node. Virtual network embedding cost is reduced by eliminating the cost of embedding virtual links between virtual nodes that share same substrate node.

Evaluation environment settings
Performance is evaluated using two substrate network topologies, which are generated using Waxman generator. The first SN topology is generated with 50 nodes and 250 links. BW of the substrate links are uniformly distributed between 50 and 100 with average 75. The second SN topology is configured with 200 nodes and 1000 links. BW of the substrate links are uniformly distributed between 50 and 150 with average 100.
We generated 1000 Virtual network topologies using Waxman generator with connectivity 50%. The number of www.ccsenet.org/cis Computer and Information Science Vol. 8, No. 2;2015 virtual nodes in each VN is variant from 2 to 20. Each virtual node is randomly assigned one of the following CPU: 2500 MIPS, 2000 MIPS, 1000 MIPS, and 500 MIPS, which are correspond to the CPU of Amazon EC2 instance types. Bandwidths of the virtual links are uniformly distributed between 1 and 50. Arrival times of the virtual network requests are generated randomly with arrival rate 10 VNs per 100 time units. VNR's lifetime are generated randomly between 300 and 700 time units with average 500. Generated SN and VNs topologies are stored in brite format and used as inputs for all algorithms. For all algorithms, we set the maximum allowed substrate path length ('s h •B' ) to 2, and the upper bound of backtracking process (Dk _ˆkŽ"A?kŽ") to 3n, where n is the number of nodes in each VNR. Iterations «q¬ and Population°n ±m of the MEPDE-VNE algorithm are set to 5 and 10. Finally, we compared the results from the implemented algorithms.

Evaluation results
The long-term average revenue is evaluated using equation 4. Figure 2 shows the long-term average revenue comparison between the proposed algorithm and existing algorithms using the first substrate network (50 nodes and 250 links), and figure 3 shows the long-term average revenue comparison using the second substrate network (200 nodes and 1000 links).
As shown in figure 2 and figure 3, the proposed MEPDE-VNE algorithm increases the revenue among other algorithms. This is expected because most of existing algorithms return the first solution while MEPDE-VNE algorithm tries to get a Pareto optimal set and returns the best one. Figure 4 and figure 5 show the VNR acceptance ratio, which is defined by equation 5. MEPDE-VNE algorithm increases number of accepted virtual network requests. Although, VNR acceptance ratio defined by equation 5 was used by many researchers to evaluate performance (Cheng et al., 2011;Cheng et al., 2012), it does not consider the VNR resource size. For example, in figure 4, algorithm vnmFlib accepted around 29% of the VNRs but it accepted only around 13% of the total VNR resources as shown in figure 6. This variation means that vnmFlib algorithm rejected virtual networks with large resources and accepted virtual networks with few resources. MEPDE-VNE algorithm accepted 30% and 80% of the VNRs (figure 4 and figure 5) and accepted 19% and 71% from the total resources of virtual network requests ( figure 6 and figure 7). To be more specific, MEPDE-VNE algorithm accepted 19% and 70% of the VNRs' CPU ( figure 8 and figure 9) and accepted 12% and 63% of the VNRs' BW (figure 10 and figure 11).
Rejecting virtual network requests is not only due to the lack of substrate resources but also due to the specified maximum number of node remapping operations, which is specified to avoid exponential explosion of node remapping operations. For example, RW-BFS rejected 94% and 92% of the VNR resources ( figure 12 and figure 13) while there are 76% and 92% of substrate resources are available ( figure 14 and figure 15).
Another reason behind rejecting virtual network requests is preventing more than one virtual node from the same virtual network to be accommodated by the same substrate node. This restriction makes virtual network embedding algorithm try to embed virtual network on sub-substrate network with the same network topology, which is very difficult to find especially in new substrate network topologies (e.g. DCell, DCube, Fat-Tree).
Virtual network requests maybe rejected due to substrate resources fragmentation. MEPDE-VNE algorithm does not fragment substrate resources, as some of existing algorithms do. By using MEPDE-VNE algorithm, long-term average substrate network fragmentation, which is defined by equation 7, stays zero while some of existing algorithms fragment substrate resources as shown in figure 16. Figure 17 and figure 18 show comparison of long-term R/Cost ratio, which is defined by equation 6. In both figures, long-term R/Cost ratio of the MEPDE-VNE algorithm exceeds 100% because cost of embedding virtual networks is less than the revenue gained from embedding them. Figures 19-22 compare virtual network embedding time consumed by different VNE algorithms. MEPDE-VNE algorithm consumes more time to find the best Pareto optimal solution due to the large number of solutions. As shown in figure 23 and figure 24, MEPDE-VNE algorithm increases substrate resources utilization by accommodating virtual resources as much as possible. Figures 25-28 show substrate resources utilization in more details.

Conclusion
Virtual network embedding problem is one of the major challenges that cloud-computing platforms currently face.
In the last few years, several optimal virtual network-embedding algorithms have been proposed to find optimal solutions in small instances. However, finding optimal virtual network embedding solution using traditional virtual network embedding algorithms in large-scale cloud computing platforms becomes unaffordable. This observation motivates us to propose memetic elitist pareto evolutionary algorithm for virtual network embedding, called MEPE-VNE. MEPE-VNE algorithm increases the acceptance ratio of virtual network requests by minimizing substrate resources fragmentation. The proposed algorithm reduces virtual network embedding cost by increasing number of virtual links that are achieved by inter ram switch techniques (virtual links between virtual nodes that coexist in the same substrate node). MEPE-VNE provides near optimal virtual network embedding solution in reasonable time in large-scale data centers by simultaneously explores different parts in Pareto optimal set. Each solution in the Pareto optimal solutions is enhanced using local search algorithm to increase Pareto optimal set quality. Non-dominated Pareto fronts are sorted using non-dominated sorting genetic algorithm-II (NSGA-II) after assigning a rank value to each individual based on its dominance level and crowding distance value. MEPE-VNE passes solutions with best fitness values to the offspring population to prevent random destruction of best solutions by crossover or mutation operators. Extensive simulations show that our algorithm outperforms existing algorithms in terms of the long-term average revenue and the acceptance ratio of the virtual network requests. In our future work, we intend to extend the proposed algorithm to solve virtual network embedding problem in distributed cloud computing platforms. In addition, migration of virtual nodes and virtual links will be employed to minimize the rejecting rate of virtual network requests.