Review on Reformulation of the Mean-Variance Model with Real-life Trading Restrictions

In this paper, we consider a class of real-life portfolio selection problems with cardinality and minimum buy-in threshold constraints, which can be formulated as mixed-integer quadratic programming (MIQP) problems. Two reformulation methods that generate the same tight continuous relaxation of the original problem are compared within the branch-and-bound framework: the perspective reformulation, implemented with perspective cuts (PC), and the lift-and-convexification reformulation (LCR). Computational results show that (PC) is more competitive than (LCR) in terms of computing time and number of nodes in the MIQP solver of CPLEX 12.7; moreover, this advantage becomes more pronounced as the size of the instances grows.


Introduction
Since Markowitz first proposed the mean-variance (MV) model (Markowitz, 1952), the portfolio selection problem has become a popular research topic in modern finance. Consider a market with $n$ risky assets whose expected return vector and covariance matrix are $\mu$ and $Q$, respectively. Suppose an investor has a given amount of money to invest over a certain period of time and must decide which risky assets to buy and how to allocate the capital so as to raise the return and reduce the risk as much as possible. The resulting MV model captures this trade-off between return and risk and can be formulated as

$$
\begin{aligned}
(\mathrm{MV})\quad \min\ & x^T Q x \\
\text{s.t.}\ & e^T x = 1, && (1) \\
& \mu^T x \ge \rho, && (2)
\end{aligned}
$$

where $x = (x_1, x_2, \dots, x_n)^T$, $x_i$ represents the proportion of the investment in risky asset $i$, $i = 1, \dots, n$, $e$ is the vector of all ones, and $\rho$ is the expected return level prescribed by the investor. The objective function $x^T Q x$ measures the associated risk, constraint (1) describes a feasible allocation of the available resources over the assets, and constraint (2) requires the expected return to reach the prescribed level.
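As a small illustration of the (MV) model, the following Python sketch evaluates the risk $x^T Q x$ of a given allocation and checks constraints (1) and (2). It assumes that constraint (2) takes the form $\mu^T x \ge \rho$; the function names and the two-asset data are hypothetical and serve only as an example.

```python
def portfolio_variance(Q, x):
    """Return the risk x^T Q x for a covariance matrix Q (list of rows)."""
    n = len(x)
    return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))


def is_mv_feasible(x, mu, rho, tol=1e-9):
    """Check the budget constraint (1) and the return constraint (2)."""
    budget_ok = abs(sum(x) - 1.0) <= tol                        # e^T x = 1
    return_ok = sum(m * w for m, w in zip(mu, x)) >= rho - tol  # mu^T x >= rho
    return budget_ok and return_ok


# Hypothetical two-asset data, for illustration only.
Q = [[0.04, 0.01],
     [0.01, 0.09]]
mu = [0.05, 0.10]
x = [0.5, 0.5]

risk = portfolio_variance(Q, x)
feasible = is_mv_feasible(x, mu, rho=0.07)
print(risk, feasible)
```

An equal-weight allocation is feasible here because its expected return, 0.075, reaches the assumed level of 0.07; raising the level to 0.08 would make it infeasible.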
Motivated by Markowitz's work, many researchers have extended the mean-variance model. Ogryczak and Ruszczynski (1997); Ouederni and Sullivan (1991); Green and Hollifield (1992); Mao (1970) measured the risk by semivariance and established a mean-semivariance model of portfolio selection. Markowitz (1959) also showed that the mean-semivariance model performs better than the MV model. Konno and Yamazaki (1991) proposed a mean absolute deviation (MAD) model, which removes most of the disadvantages of the classical MV model and can be solved by linear programming with greatly reduced computational effort. With the rapid development of computer technology, many scholars turned to skewness models, since skewness plays an important role when the distribution of asset returns is asymmetric around the mean. Konno, Shirakawa and Yamazaki (1993) introduced skewness into the mean-absolute deviation model. Then, Konno and Suzuki (1995) introduced skewness into the mean-variance model, established the mean-variance-skewness model and solved it quickly and efficiently by an integer programming method. Value-at-Risk (VaR) was proposed as a risk measure to evaluate the maximum loss of assets at a certain probability level over a specific period of time. When VaR is used, however, the original optimization problem is transformed into a nonconvex problem that is difficult to solve. To overcome this difficulty, Rockafellar and Uryasev proposed an improved risk measure based on VaR, the Conditional Value-at-Risk (CVaR) (Rockafellar & Uryasev, 2000; Uryasev, 2000). Since then, many scholars have demonstrated through empirical analysis that CVaR is superior to VaR as a risk measure in various financial optimization problems, such as Andersson, Mausser, Rosen and Uryasev (2001); Rockafellar and Uryasev (2002); Mansini, Ogryczak and Grazia Speranza (2003); Topaloglou, Vladimirou and Zenios (2002).
Compared with the above models, the standard MV problem is a convex quadratic program and is easy to solve. In many real cases, however, a rational investor would not completely follow the classical MV model because of practical considerations. For example, a large number of very small holdings incurs extra transaction and management costs, so minimum and maximum buy-in thresholds cannot be neglected in practice. They take the form

$$x_i \in \{0\} \cup [\alpha_i, \beta_i], \quad i = 1, \dots, n.$$

A variable of this type is mathematically called a semi-continuous variable. The same structure can be expressed with binary variables:

$$\alpha_i y_i \le x_i \le \beta_i y_i, \quad y_i \in \{0, 1\}, \quad i = 1, \dots, n.$$

Moreover, in many practical applications the MV model appears together with a cardinality constraint that limits the number of assets to invest in. The cardinality constraint can be formulated as

$$\mathrm{card}(x) \le K,$$

where $\mathrm{card}(x)$ denotes the number of nonzero variables $x_i$ and $K$ is an integer with $1 \le K \le n$. When modeling real-world optimization problems, the cardinality constraint is usually expressed as $e^T y \le K$ with binary variables $y_i \in \{0, 1\}$, $i = 1, \dots, n$, where $e$ is the vector of all ones. In this paper we therefore focus on mean-variance portfolio optimization problems with cardinality and minimum buy-in threshold constraints, which can be formulated as

$$
\begin{aligned}
(\mathrm{QP})\quad \min\ & f(x) = x^T Q x \\
\text{s.t.}\ & e^T x = 1, && (4) \\
& \mu^T x \ge \rho, && (5) \\
& \alpha_i y_i \le x_i \le \beta_i y_i,\ y_i \in \{0, 1\},\ i = 1, \dots, n, && (6) \\
& e^T y \le K, && (7)
\end{aligned}
$$

where $Q$ is an $n \times n$ symmetric positive semidefinite matrix. Constraint (6) is referred to as the minimum buy-in threshold constraint, which prevents investors from holding assets in very small amounts, and constraint (7) is the cardinality constraint, which limits the total number of different assets in the optimal portfolio. With cardinality and minimum threshold constraints added, the original quadratic programming problem becomes a mixed-integer quadratic programming (MIQP) problem with semi-continuous variables. Various solution methods for this class of portfolio optimization problems with cardinality and minimum threshold constraints have been investigated in the literature, such as Bertsimas and Shioda (2009); Bienstock (1996); Bonami and Lejeune (2009); Cesarone, Scozzari and Tardella (2009); Cui, Zheng, Zhu and Sun (2013); Gao and Li (2013); Li, Sun and Wang (2006). Many heuristic algorithms have also been used to solve such MV problems, such as genetic algorithms, tabu search, and simulated annealing (see, e.g., Chang, Meade, Beasley, & Sharaiha, 2000; Blog, Hoek, & Timmer, 1983; Jacob, 1974; Maringer & Kellerer, 2003; Mitra, Ellison, & Scowcroft, 2007). Despite their various virtues, heuristic algorithms cannot guarantee finding an optimal solution, or even a satisfactory solution, of (QP).
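The two trading restrictions can be checked directly on a candidate allocation. The sketch below is a minimal illustration, assuming the semi-continuous structure $x_i \in \{0\} \cup [\alpha_i, \beta_i]$ and the cardinality bound $\mathrm{card}(x) \le K$ described above; the function names, the tolerance and the sample data are hypothetical.

```python
def card(x, tol=1e-12):
    """Number of nonzero entries of x."""
    return sum(1 for xi in x if abs(xi) > tol)


def indicator(x, tol=1e-12):
    """Binary vector y with y_i = 1 exactly when asset i is held."""
    return [1 if abs(xi) > tol else 0 for xi in x]


def satisfies_trading_restrictions(x, alpha, beta, K, tol=1e-12):
    """Check x_i in {0} U [alpha_i, beta_i] for all i and card(x) <= K."""
    for xi, a, b in zip(x, alpha, beta):
        held = abs(xi) > tol
        if held and not (a - tol <= xi <= b + tol):
            return False  # holding is below the minimum or above the maximum
    return card(x, tol) <= K


# Hypothetical thresholds for four assets, for illustration only.
alpha = [0.1] * 4
beta = [0.6] * 4

ok = satisfies_trading_restrictions([0.5, 0.5, 0.0, 0.0], alpha, beta, K=2)
too_small = satisfies_trading_restrictions([0.05, 0.55, 0.4, 0.0], alpha, beta, K=3)
too_many = satisfies_trading_restrictions([0.4, 0.3, 0.3, 0.0], alpha, beta, K=2)
print(ok, too_small, too_many)
```

With `y = indicator(x)`, the same conditions read $\alpha_i y_i \le x_i \le \beta_i y_i$ and $e^T y \le K$, which is the binary-variable form used in MIQP models.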
In general, problem (QP) is NP-hard. The difficulty of solving this class of MIQP problems efficiently arises from the discrete structure induced by the semi-continuous variables in the model, which has made it an active area of research. Typical solution methods for this class of MIQP are based on the branch-and-bound algorithm, so obtaining a tighter lower bound, which can substantially improve the efficiency of branch-and-bound, is of great importance. In real applications, however, the lower bound generated by the standard continuous relaxation is too loose to improve computational efficiency. Therefore, many equivalent reformulations have been proposed to obtain tighter lower bounds.
Perspective reformulation is a well-performing reformulation method. It replaces the original convex function in the formulation with its so-called perspective function, which is related to the convex envelope of the objective function of the original problem. Frangioni and Gentile added perspective cuts to the reformulation for objective functions that are either separable (Frangioni & Gentile, 2006) or nonseparable (Frangioni & Gentile, 2007). Four types of tractable reformulations of the perspective relaxation have been proposed to handle the high nonlinearity of the objective function caused by the fractional term in the perspective function. Frangioni and Gentile (2009) compared two of them, the second-order cone programming (SOCP) reformulation and the perspective cut (PC) reformulation. A further method, called projected perspective relaxation (P$^2$R), was then put forward in (Frangioni, Gentile, Grande, & Pacifici, 2011) under three further restrictive assumptions, so that the perspective relaxation of the MIQP can be reformulated as a piecewise linear-quadratic problem and the resulting model has roughly the same size as the original standard continuous relaxation. However, this method has many limitations in application. To address them, Frangioni, Furini, & Gentile (2016) proposed another P$^2$R-based approach, the Approximated Projected Perspective Reformulation (AP$^2$R), which approximates the P$^2$R approach. Other contributions on perspective reformulation can be found in (Günlük & Linderoth, 2012; Zheng, Sun, & Li, 2014).
Besides the perspective reformulation, another type of reformulation method, called lift-and-convexification reformulation (LCR), has been put forward in (Wu, Sun, Li, & Zheng, 2015). The substance of this approach is to add a quadratic equivalent term multiplied by a parameter to the objective function and to convexify the objective function, so that the resulting formulation is equivalent to the original problem. Moreover, the continuous relaxation of the reformulation yields the same lower bound as that obtained from the perspective reformulation in (Zheng et al., 2014), with dramatically reduced computational time for the (SDP). This approach originates from the quadratic convex reformulation (QCR) method, which was first introduced for binary quadratic programming in (Hammer & Rubin, 1970) to show that, by adding a quadratic equivalent term to the objective function, the original problem becomes equivalent to a convex quadratic program with a positive semidefinite matrix. Billionnet, Elloumi and Plateau (2008, 2009) improved this method further by adding an equality constraint to the reformulation and found the best convex reformulation by semidefinite programming. After that, Billionnet, Elloumi and Lambert (2012, 2015) extended the QCR method to general mixed-integer programs.
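The idea behind QCR for binary quadratic programming can be illustrated on a tiny example. Since $x_i^2 = x_i$ for $x_i \in \{0, 1\}$, adding $\sum_i u_i (x_i^2 - x_i)$ to the objective leaves its value unchanged at every binary point while shifting the diagonal of the quadratic form. The sketch below is a hedged illustration of this principle only, not of the LCR method itself; the matrix, the uniform shift $u$ and the function names are hypothetical.

```python
from itertools import product


def quad(Q, c, x):
    """Evaluate x^T Q x + c^T x."""
    n = len(x)
    quadratic = sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))
    linear = sum(ci * xi for ci, xi in zip(c, x))
    return quadratic + linear


def convexify(Q, c, u):
    """Add sum_i u_i (x_i^2 - x_i): Q becomes Q + diag(u), c becomes c - u."""
    n = len(c)
    Q_u = [[Q[i][j] + (u[i] if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    c_u = [c[i] - u[i] for i in range(n)]
    return Q_u, c_u


# Hypothetical indefinite matrix with eigenvalues -1 and +1.
Q = [[0.0, 1.0],
     [1.0, 0.0]]
c = [0.0, 0.0]
u = [1.0, 1.0]  # shift by minus the smallest eigenvalue; Q + I is PSD

Q_u, c_u = convexify(Q, c, u)

# The two objectives agree at every binary point ...
same_on_binary = all(
    abs(quad(Q, c, x) - quad(Q_u, c_u, x)) < 1e-12
    for x in product([0.0, 1.0], repeat=2)
)
# ... but differ at fractional points of the continuous relaxation.
frac = (0.5, 0.5)
print(same_on_binary, quad(Q, c, frac), quad(Q_u, c_u, frac))
```

Different choices of $u$ give different continuous relaxation bounds, which is why the best parameters are sought by semidefinite programming.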
The paper is organized as follows. In Section 2, we review the current state-of-the-art perspective reformulation for MIQP with semi-continuous variables, and in Section 3 we review the recent work on LCR for MIQP with semi-continuous variables. In Section 4, we conduct numerical experiments to compare the effectiveness of the perspective reformulation and LCR. Conclusions are given in Section 5.
Notation: Throughout the paper, we denote by $v(\cdot)$ the optimal value of problem $(\cdot)$, and by $\mathbb{R}^n_+$ the nonnegative orthant of $\mathbb{R}^n$. For any $a \in \mathbb{R}^n$, we denote by $\mathrm{diag}(a) = \mathrm{diag}(a_1, \dots, a_n)$ the diagonal matrix with $a_i$ as its $i$th diagonal element. For any matrix $A$, $A \succeq 0$ means that $A$ is positive semidefinite.

Perspective Reformulation Review
In this section, we review the perspective reformulation of mixed-integer quadratic programming with semi-continuous variables and a cardinality constraint. The perspective reformulation requires the objective function to be separable, whereas the objective function $f(x)$ in (QP) is usually nonseparable. Frangioni and Gentile (2007) proposed a diagonal decomposition method that splits the quadratic form $x^T Q x$ as $x^T (Q - \mathrm{diag}(d)) x + x^T \mathrm{diag}(d) x$ in order to extract the separable terms, where $d \in \mathbb{R}^n_+$ satisfies $Q - \mathrm{diag}(d) \succeq 0$ and $\mathrm{diag}(d)$ is the diagonal $n \times n$ matrix with the elements of $d$ on the diagonal. Replacing the separable term $x^T \mathrm{diag}(d) x$ with its convex envelope over the semi-continuous variables, which is the sum of the perspective functions of $d_i x_i^2$, the perspective reformulation of (QP) can be expressed as

$$
\begin{aligned}
(\mathrm{PR}(d))\quad \min\ & x^T (Q - \mathrm{diag}(d)) x + \sum_{i=1}^n d_i \frac{x_i^2}{y_i} \\
\text{s.t.}\ & (4), (5), (6), (7).
\end{aligned}
$$

Because of the high nonlinearity of the objective function caused by the fractional term in the perspective function, efficient solution methods cannot be applied directly to (PR(d)). Two tractable reformulation methods were proposed to deal with this. One is the second-order cone programming (SOCP) reformulation proposed in (Aktürk, Atamtürk, & Gürel, 2009; Günlük & Linderoth, 2010), which introduces an auxiliary variable $\phi_i \ge x_i^2 / y_i$ and rewrites $x_i^2 \le \phi_i y_i$ as an SOCP constraint. The resulting SOCP reformulation of (PR(d)) can be written in the following form:

$$
\begin{aligned}
(\mathrm{SOCP}(d))\quad \min\ & x^T (Q - \mathrm{diag}(d)) x + \sum_{i=1}^n d_i \phi_i \\
\text{s.t.}\ & x_i^2 \le \phi_i y_i,\ \phi_i \ge 0,\ i = 1, \dots, n, \\
& (4), (5), (6), (7).
\end{aligned}
$$

The other is the perspective cut (P/C) reformulation proposed in (Frangioni and Gentile, 2007), which represents the epigraph of $x_i^2 / y_i$ by a set of perspective cut inequalities and can be expressed in the following form:

$$
\begin{aligned}
(\mathrm{PC}(d))\quad \min\ & x^T (Q - \mathrm{diag}(d)) x + \sum_{i=1}^n d_i \phi_i \\
\text{s.t.}\ & \phi_i \ge 2 \bar{x}_i x_i - \bar{x}_i^2 y_i,\ \forall\, \bar{x}_i \in [\alpha_i, \beta_i],\ i = 1, \dots, n, \\
& (4), (5), (6), (7).
\end{aligned}
$$

A key point when implementing the SOCP reformulation (SOCP(d)) and the P/C reformulation (PC(d)) is how to choose the parameter vector $d$ so that the lower bound is as tight as possible. One simple way is to set every element of $d$ to the smallest eigenvalue of $Q$. Frangioni and Gentile (2007) then proposed a heuristic method that finds a "better" $d$ than the smallest eigenvalue method by solving the following "small" semidefinite programming (SDP) problem:

$$
(\mathrm{SDP}_s)\quad \max\ \{\, e^T d \ :\ Q - \mathrm{diag}(d) \succeq 0,\ d \ge 0 \,\}.
$$

Zheng et al. (2014) proposed a large SDP approach to find the "best" parameter vector $d$ in the perspective reformulation. Since the continuous relaxations $(\overline{\mathrm{PR}}(d))$, $(\overline{\mathrm{SOCP}}(d))$ and $(\overline{\mathrm{PC}}(d))$ have the same continuous bounds, the best parameter vector $d$, which maximizes $v(\overline{\mathrm{PR}}(d))$, can be found by solving the following problem:

$$
(\mathrm{MAX}_d)\quad \max\ \{\, v(\overline{\mathrm{PR}}(d)) \ :\ Q - \mathrm{diag}(d) \succeq 0,\ d \ge 0 \,\}.
$$

According to (Zheng et al., 2014), problem (MAX$_d$) is equivalent to a large SDP problem, denoted by (SDP$_l$). Although solving (SDP$_l$) takes longer than solving (SDP$_s$) because of its larger dimension, the dramatic reduction in the time needed to solve the SOCP or P/C reformulations more than compensates for the time spent on (SDP$_l$). Moreover, the computational results in (Zheng et al., 2014) showed that using the parameter vector $d$ computed by the large SDP formulation can considerably improve the performance of the perspective reformulations, largely because of the improvement in the continuous bounds.
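To make the role of the fractional term and of the perspective cuts concrete, the sketch below evaluates the perspective function $x^2 / y$ of a single semi-continuous variable and a few linear perspective cuts of the form $2 \bar{x} x - \bar{x}^2 y$, which follow from linearizing $x^2$ at a point $\bar{x}$. The identity $x^2 / y - (2 \bar{x} x - \bar{x}^2 y) = (x - \bar{x} y)^2 / y \ge 0$ shows that each cut underestimates the perspective function and is tight when $x = \bar{x} y$. The function names, the thresholds and the sample point are hypothetical.

```python
def perspective(x, y):
    """Perspective of x^2: x^2 / y for y > 0, and 0 at the point x = y = 0."""
    if y == 0.0:
        if x != 0.0:
            raise ValueError("x must be 0 when y is 0")
        return 0.0
    return x * x / y


def perspective_cut(x_bar, x, y):
    """Linear underestimator of x^2 / y generated at the point x_bar."""
    return 2.0 * x_bar * x - x_bar * x_bar * y


def cut_lower_bound(x, y, points):
    """Best bound given by a finite set of cut-generating points."""
    return max(perspective_cut(x_bar, x, y) for x_bar in points)


# Hypothetical thresholds alpha = 0.1, beta = 0.4 and a fractional
# relaxation point with alpha * y <= x <= beta * y.
x, y = 0.15, 0.5
plain = x * x                       # value of the standard relaxation term
persp = perspective(x, y)           # value of the perspective term
bound = cut_lower_bound(x, y, [0.1, 0.2, 0.3, 0.4])
print(plain, persp, bound)
```

At this fractional point the perspective term is twice the plain quadratic term, which is the source of the tighter continuous bound, and the cut generated at $\bar{x} = x / y = 0.3$ recovers the perspective value.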


The lift-and-convexification approach adds a quadratic equivalent term, parameterized by the vectors $u$ and $v$, to the objective function and convexifies the objective function at the same time, so that the resulting formulation is equivalent to the original problem. The reformulated problem (P(u, v)) is subject to constraints (9) and (10) and can be solved by the classical branch-and-bound algorithm based on its lower bound, which is the optimal value of its continuous relaxation. Let $u = (u_1, \dots, u_n)^T$ and $v = (v_1, \dots, v_n)^T$, and let $(\overline{\mathrm{P}}(u, v))$ denote the continuous relaxation of (P(u, v)) obtained by relaxing $y \in \{0, 1\}^n$ to $y \in [0, 1]^n$. We denote the optimal value of $(\overline{\mathrm{P}}(u, v))$ by $v(\overline{\mathrm{P}}(u, v))$. The best parameters $(u, v)$ can be found by solving problem (MAX$_{uv}$), which maximizes $v(\overline{\mathrm{P}}(u, v))$ over the parameters $(u, v)$.

Theorem 1. Problem (MAX$_{uv}$) is equivalent to a semidefinite programming (SDP) problem, denoted by (SDP$_q$).

Computational results
In this section, we conduct a series of computational experiments on the mean-variance portfolio selection problem (QP) with cardinality and minimum buy-in threshold constraints described in Section 1. The aim of our numerical tests is to compare the performance of the perspective reformulation and the LCR, reviewed in Section 2 and Section 3 respectively, within the branch-and-bound algorithm. Since Frangioni and Gentile (2009) concluded that (PC(d)) is more competitive than (SOCP(d)) if properly implemented, we focus on the following two reformulation methods:
• (PC): the perspective cut reformulation (PC(d)) with d = d*, where d* is obtained by solving (SDP$_l$).
• (LCR): the lift-and-convexification reformulation (P(u, v)) with (u, v) = (u*, v*), where (u*, v*) is obtained by solving (SDP$_q$).
To test the two approaches on the above MV problem, we randomly generated 5 instances for each problem size (n = 200, 300, 400). For each instance, tests were conducted with K = 4, 6, 8, 12 and without the cardinality constraint. The diagonal and off-diagonal elements of the real symmetric matrix Q are randomly generated in the intervals [4, 1000] and [1, 10], respectively. The elements of the vector µ and the expected return level ρ are randomly generated in [0.002, 0.01]. The minimum and maximum buy-in thresholds α$_i$ and β$_i$ are randomly generated in [0.075, 0.125] and [0.375, 0.425], respectively. Both SDPs are implemented in Matlab R2016b and run on a PC (2.5 GHz, 8 GB RAM). The SDP solutions are then passed to CPLEX 12.7, whose MIQP solver is used to solve the reformulated problems. The CPU time limit is set to 10000 seconds and CPLEX 12.7 is run with its default settings.
Table 1 reports the computational results of the two reformulation methods (PC and LCR) for the MV problem, where the column "time$_l$" is the computational time for solving (SDP$_l$) and "time$_q$" is the computational time for solving (SDP$_q$). Each line reports the average results over the 5 instances in a subset. The column "gap" is the relative gap, expressed as a percentage, between the objective value of the solution returned at termination and the best lower bound. The column "time" is the computing time (in seconds) and the column "nodes" is the number of nodes explored by CPLEX. The label "nonK" denotes the instances without the cardinality constraint. From Table 1 we can see that the average computing time of (SDP$_l$) is far less than that of (SDP$_q$); moreover, the difference in computing time between (SDP$_l$) and (SDP$_q$) increases dramatically as the dimension increases. On the other hand, when comparing the time spent in the MIQP solver of CPLEX, the computing time for (PC) is longer than that for (LCR) when n = 200, but shorter when n = 300 and 400. From the column "gap", we can see that the lower bounds obtained from (SDP$_l$) and (SDP$_q$) are nearly the same as the objective values obtained by CPLEX 12.7. Moreover, in terms of the total computing time (the time for solving the SDP plus the time for solving the corresponding PC or LCR reformulation), (PC) performs better than (LCR) on all instances. In terms of the number of nodes explored, (PC) performs better than (LCR) on 12 of the 15 instance subsets. These results are in line with our expectations, because the relaxations in (LCR) are in general looser than those in (PC) at child nodes.
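The two derived quantities used in this comparison, the relative gap and the total computing time, can be computed as in the following sketch. It assumes that the gap is measured relative to the objective value of the returned solution, in the manner of the relative MIP gap reported by CPLEX, with a small constant guarding against division by zero; the function names and the numbers are hypothetical and are not taken from Table 1.

```python
def relative_gap_percent(best_objective, best_bound):
    """Relative gap between the returned solution and the best lower bound."""
    # Assumed convention: |bound - objective| / (1e-10 + |objective|), in percent.
    return 100.0 * abs(best_bound - best_objective) / (1e-10 + abs(best_objective))


def total_time(sdp_seconds, miqp_seconds):
    """Total time: solving the SDP plus solving the reformulated MIQP."""
    return sdp_seconds + miqp_seconds


# Hypothetical values, for illustration only.
gap = relative_gap_percent(best_objective=2.0, best_bound=1.5)
total = total_time(sdp_seconds=12.5, miqp_seconds=30.0)
print(gap, total)
```

A gap of zero means that the lower bound proves the returned solution optimal; a positive gap at the time limit indicates how far the solution may still be from optimality.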

Table 1. Comparison Results of Reformulations for (MV)