Estimating Smooth and Convex Functions

Abstract We propose a new method for estimating an unknown regression function $f^* : [\alpha, \beta] \to \mathbb{R}$ from a dataset $(X_1, Y_1), \ldots, (X_n, Y_n)$ when the only information available on $f^*$ is the fact that $f^*$ is convex and twice differentiable. In the proposed method, we fit a convex function to the dataset that minimizes the sum of the roughness of the fitted function and the average squared differences between the fitted function and the observed responses. We prove that the proposed estimator can be computed by solving a convex quadratic programming problem with linear constraints. Numerical results illustrate the superior performance of the proposed estimator compared to existing methods when i) $f^*$ is the price of a stock option as a function of the strike price and ii) $f^*$ is the steady-state mean waiting time of a customer in a single server queue.


Introduction
In this paper, we propose a new method for estimating an unknown regression function $f^* : [\alpha, \beta] \to \mathbb{R}$ from a dataset $(X_1, Y_1), \ldots, (X_n, Y_n)$ satisfying $\alpha < X_1 < \cdots < X_n < \beta$ and $Y_i = f^*(X_i) + \epsilon_i$ for $1 \le i \le n$ and some error terms $(\epsilon_i : 1 \le i \le n)$, when the only information available on $f^*$ is the fact that $f^*$ is convex and twice differentiable.
In many applications, the unknown regression function $f^*$ is not assumed to have any particular functional form, but is known to have certain shape characteristics, such as convexity, twice differentiability, or both. For example, the price of a call option is known to be a twice differentiable convex function of the option's strike price; see Section 2 of Aït-Sahalia and Duarte (2003) for details. In such a setting, incorporating both shape restrictions, convexity and twice differentiability, into the estimator of $f^*$ is a natural step.
When $f^*$ is only known to be convex, an intuitive way to estimate $f^*$ is to fit a convex function $f$ minimizing the average of squared errors $\frac{1}{n}\sum_{i=1}^n (Y_i - f(X_i))^2$. This estimator can be found by solving the following problem:

Minimize $\frac{1}{n}\sum_{i=1}^n (Y_i - f(X_i))^2$ subject to $f$ is convex, (1)

which is equivalent to the following finite-dimensional quadratic programming problem in the decision variables $f(X_1), \ldots, f(X_n)$:

Minimize $\frac{1}{n}\sum_{i=1}^n (Y_i - f(X_i))^2$ subject to $\frac{f(X_2) - f(X_1)}{X_2 - X_1} \le \cdots \le \frac{f(X_n) - f(X_{n-1})}{X_n - X_{n-1}}$; (2)

see Hildreth (1954) for details. The solution to (2) is referred to as the "convex regression estimator," and is typically not smooth, but piecewise linear.
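The finite-dimensional problem (2) is straightforward to set up with any constrained quadratic solver. The following sketch fits the values $\theta_i = f(X_i)$ under the nondecreasing-slope constraints; it uses SciPy's general SLSQP solver rather than the dedicated QP software used in the paper, and the function name is ours.

```python
import numpy as np
from scipy.optimize import minimize

def convex_regression(x, y):
    """Least-squares fit of theta_i = f(X_i) subject to nondecreasing
    slopes, as in problem (2). A sketch, not the paper's implementation."""
    n = len(x)

    def obj(theta):
        return np.mean((y - theta) ** 2)

    # slope over [X_{i+1}, X_{i+2}] minus slope over [X_i, X_{i+1}] must be >= 0
    cons = [{"type": "ineq",
             "fun": (lambda theta, i=i:
                     (theta[i + 2] - theta[i + 1]) / (x[i + 2] - x[i + 1])
                     - (theta[i + 1] - theta[i]) / (x[i + 1] - x[i]))}
            for i in range(n - 2)]
    res = minimize(obj, y.copy(), constraints=cons, method="SLSQP")
    return res.x
```

The fitted values interpolate a piecewise linear convex function, which is exactly the non-smooth behavior of the convex regression estimator noted above.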
On the other hand, when $f^*$ is only known to be smooth enough to be twice differentiable, the roughness of $f^*$ can be measured by $\int_\alpha^\beta f^{*(2)}(x)^2 \, dx$, where $f^{*(2)}$ denotes the second derivative of $f^*$. We can then estimate $f^*$ by finding the solution to

Minimize $\frac{1}{n}\sum_{i=1}^n (Y_i - f(X_i))^2 + \lambda \int_\alpha^\beta f^{(2)}(x)^2 \, dx$ over $f \in \mathcal{F}$ (3)

for some positive constant $\lambda$, where $\mathcal{F} = \{f : f$ is differentiable on $[\alpha, \beta]$ and has absolutely continuous first derivative$\}$.
It should be noted that $f \in \mathcal{F}$ if and only if $f$ is differentiable everywhere on $[\alpha, \beta]$ with derivative $f^{(1)}$ and there is an integrable function $f^{(2)}$ such that $f^{(1)}(y) - f^{(1)}(x) = \int_x^y f^{(2)}(t) \, dt$ for $\alpha \le x \le y \le \beta$. The first term in (3) is called the goodness-of-fit term, and measures how close the fitted function $f$ is to the dataset $(X_1, Y_1), \ldots, (X_n, Y_n)$, while the second term in (3) is called the roughness term, and measures how "wiggly" $f$ is. There is a trade-off between the goodness-of-fit and roughness terms because as the function $f$ gets closer to the data points, its roughness increases. The constant $\lambda$ appearing in (3) is called the smoothing constant, and balances the goodness-of-fit and roughness terms. The solution to (3) is called the penalized least squares estimator of $f^*$ and can be computed by solving a system of linear equations; see, for example, Theorem 2.4 on page 19 of Green & Silverman (1994).
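The exact solution of (3) is a spline computed from a banded linear system, as in Green & Silverman (1994). As a rough illustration of the trade-off only, here is a discrete surrogate in which the integrated squared second derivative is replaced by squared second differences on an equally spaced design grid; the surrogate also reduces to a single linear solve. The function name and the equal-spacing assumption are ours, not the paper's.

```python
import numpy as np

def penalized_ls_discrete(y, lam):
    """Discrete analogue of problem (3): minimize
    (1/n) * sum_i (y_i - f_i)^2 + lam * sum (second differences of f)^2.
    Setting the gradient to zero gives the closed form
    f = (I + n * lam * D2'D2)^{-1} y. A sketch assuming equally spaced
    design points; the paper solves the exact spline problem instead."""
    n = len(y)
    D2 = np.diff(np.eye(n), n=2, axis=0)   # (n-2) x n second-difference matrix
    A = np.eye(n) + n * lam * D2.T @ D2
    return np.linalg.solve(A, y)
```

As $\lambda \to 0$ the fit interpolates the data, and as $\lambda \to \infty$ the fit approaches the least-squares line, mirroring the role of the smoothing constant in (3).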
The convex regression estimator and the penalized least squares estimator each incorporate only one of the shape characteristics of $f^*$; the former incorporates convexity only, while the latter incorporates twice differentiability only. In many settings, $f^*$ is known to be both convex and twice differentiable, and hence, one wishes to incorporate both convexity and twice differentiability into an estimator of $f^*$. With this goal in mind, our proposed estimator is motivated as follows. Since $f^*$ is twice differentiable, the solution to (3) provides a twice differentiable estimator of $f^*$, which is not necessarily convex. We notice that the solution $\hat{f}_n$ to (3) exists uniquely and has the following properties: P1. $\hat{f}_n$ is twice continuously differentiable, P2. $\hat{f}_n$ is a polynomial of degree 3 on $[X_{i-1}, X_i]$ for $2 \le i \le n$, and P3. $\hat{f}_n$ is a polynomial of degree 1 on $[\alpha, X_1]$ and $[X_n, \beta]$; see, for example, Green & Silverman (1994) or Exercise 5.7 of Hastie et al. (2009). A function satisfying P1, P2, and P3 is called a natural spline of degree 3. In particular, $\hat{f}_n$ can be represented as

$\hat{f}_n(x) = a_0 + a_1 x + \sum_{i=1}^n b_i (x - X_i)_+^3$

for some constants $a_0, a_1, b_1, \ldots, b_n \in \mathbb{R}$ and $x \in [\alpha, \beta]$, where $x_+ = \max(0, x)$ for $x \in \mathbb{R}$; see, for example, page 3 of Greville (1969).
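The truncated-power representation above is easy to evaluate numerically. The following sketch (function name ours) evaluates $a_0 + a_1 x + \sum_i b_i (x - X_i)_+^3$ at arbitrary points given the coefficients and knots:

```python
import numpy as np

def eval_truncated_power(x, a0, a1, b, knots):
    """Evaluate f(x) = a0 + a1*x + sum_i b_i * (x - X_i)_+^3,
    the truncated-power form of the cubic spline given above."""
    x = np.asarray(x, dtype=float)
    # (x - X_i)_+ = max(0, x - X_i), cubed, one column per knot
    basis = np.clip(x[..., None] - np.asarray(knots, dtype=float), 0.0, None) ** 3
    return a0 + a1 * x + basis @ np.asarray(b, dtype=float)
```

With all $b_i = 0$ the function is linear, matching property P3 to the left of the first knot.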
Inspired by this, we consider the following variation of (3):

Minimize $\frac{1}{n}\sum_{i=1}^n (Y_i - f(X_i))^2 + \lambda \int_\alpha^\beta f^{(2)}(x)^2 \, dx$ over $f \in \mathcal{G}$, (4)

where $\mathcal{G} = \{f : f(x) = a_0 + a_1 x + \sum_{i=1}^n b_i (x - X_i)_+^3$ for some $a_0, a_1, b_1, \ldots, b_n \in \mathbb{R}\}$. Additionally, when $f^*$ is known to be convex, i.e., $f^{*(2)}(x) \ge 0$ for all $x \in [\alpha, \beta]$, we can incorporate the convexity requirement by modifying (4) as follows:

Minimize $\frac{1}{n}\sum_{i=1}^n (Y_i - f(X_i))^2 + \lambda \int_\alpha^\beta f^{(2)}(x)^2 \, dx$ over $f \in \mathcal{G}$ subject to $f^{(2)}(x) \ge 0$ for $x \in [\alpha, \beta]$. (5)

Problem (5) appears to have an infinite number of constraints. However, one can observe that for any $f \in \mathcal{G}$, $f^{(2)}$ is linear over $[X_{i-1}, X_i]$ for $i = 2, \ldots, n$, over $[\alpha, X_1]$, and over $[X_n, \beta]$. Hence, the condition $f^{(2)}(x) \ge 0$ for $x \in [\alpha, \beta]$ can be replaced by $f^{(2)}(\alpha) \ge 0$, $f^{(2)}(X_i) \ge 0$ for $1 \le i \le n$, and $f^{(2)}(\beta) \ge 0$. We can therefore suggest the following formulation, which is equivalent to (5):

Minimize $\frac{1}{n}\sum_{i=1}^n (Y_i - f(X_i))^2 + \lambda \int_\alpha^\beta f^{(2)}(x)^2 \, dx$ over $f \in \mathcal{G}$ subject to $f^{(2)}(\alpha) \ge 0$, $f^{(2)}(X_i) \ge 0$ for $1 \le i \le n$, and $f^{(2)}(\beta) \ge 0$. (6)

Proposition 1 in Section 2 establishes that Problem (6) can be converted to a finite-dimensional convex quadratic programming problem. The solution $\hat{g}_n$ to Problem (6) is our proposed estimator of $f^*$. Figure 1 displays instances of the convex regression estimator, the penalized least squares estimator, and the proposed estimator when $f^*(x) = x^2$ for $x \in [0, 1]$ and $Y_i = f^*(X_i) + \epsilon_i$ for $1 \le i \le n$, where the $\epsilon_i$'s are independent and identically distributed (iid) normal random variables with a mean of 0 and a standard deviation of 0.5. Figure 1 shows that the convex regression estimator is not smooth and the penalized least squares estimator is not convex, while the proposed estimator is both convex and smooth.
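To illustrate the idea behind (6) without the spline machinery, the following sketch solves a discrete surrogate on the design grid: roughness-penalized least squares with convexity imposed through nonnegative second differences. This is an illustration of the principle only; the paper's estimator works in the spline class $\mathcal{G}$, and the function name and equal-spacing assumption are ours.

```python
import numpy as np
from scipy.optimize import minimize

def smooth_convex_fit(y, lam):
    """Discrete sketch of the proposed estimator: minimize
    (1/n) * sum_i (y_i - f_i)^2 + lam * ||D2 f||^2  subject to  D2 f >= 0,
    where D2 takes second differences (the discrete convexity condition)."""
    n = len(y)
    D2 = np.diff(np.eye(n), n=2, axis=0)

    def obj(f):
        return np.mean((y - f) ** 2) + lam * np.sum((D2 @ f) ** 2)

    cons = {"type": "ineq", "fun": lambda f: D2 @ f}  # convexity: D2 f >= 0
    res = minimize(obj, y.copy(), constraints=cons, method="SLSQP")
    return res.x
```

The returned fit is simultaneously smooth (small second differences) and convex (nonnegative second differences), which is the qualitative behavior of $\hat{g}_n$ in Figure 1.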
Proposition 1 in Section 2 ensures that the proposed estimator $\hat{g}_n$ always exists and can be found by solving a convex quadratic programming problem with linear constraints. There exist numerous algorithms that can solve a convex quadratic programming problem in polynomial time; see, for example, Ye & Tse (1984). Thus, Proposition 1 provides a framework that enables us to compute the proposed estimator with numerical efficiency. The numerical results in Section 3 suggest that $\hat{g}_n$ converges to $f^*$ as $n \to \infty$, providing numerical support for the consistency of the proposed estimator.
They also indicate that $\hat{g}_n$ shows superior performance compared to the convex regression estimator and the penalized least squares estimator.
The problem of enforcing a smoothness condition on shape constrained estimators has received a considerable amount of attention in the literature. However, most work has focused on incorporating monotonicity and smoothness conditions (e.g., Mammen (1991) and Hall and Huang (2001)). When incorporating convexity and smoothness conditions, most work has proposed methods for the case where $f^*$ is both convex and monotone, and these methods cannot be easily generalized to the case where $f^*$ is convex but not necessarily monotone, because they use basis functions that are both monotone and convex (Dole (1999) and Meyer (2008)). A number of researchers have considered the case where $f^*$ is convex and not necessarily monotone, but their methods are two-step procedures, in which the first step computes a smoothed version of the $Y_i$'s using nonparametric smoothing methods, and the second step finds the shape restricted estimator of the smoothed $Y_i$ values; see, for example, Mammen and Thomas-Agnan (1999). Aït-Sahalia and Duarte (2003) also consider a two-step procedure, in which the first step computes a shape restricted estimator and the second step smoothes the shape restricted estimator using local polynomial regression. In contrast, we propose an estimator for a smooth convex function $f^*$ that is not necessarily monotone, and our method uses a single step in which convexity and smoothness are incorporated simultaneously. In the numerical experiments of Section 3, we compare our proposed estimator to an estimator computed from a two-step procedure, and observe that the proposed estimator shows superior performance. To our knowledge, this paper is the first to incorporate the convexity condition directly into the framework of the penalized least squares estimator in order to estimate a smooth convex function.
The rest of this paper is organized as follows. Section 2 establishes that the proposed estimator can be computed by solving a convex quadratic programming problem. The numerical performance of the proposed estimator is examined in Section 3 in three settings: i) $f^*$ is a quadratic function, ii) $f^*$ is the price of a European call option as a function of the strike price, and iii) $f^*$ is the steady-state mean waiting time of a customer in a single server queue as a function of the service rate. Concluding remarks are included in Section 4.

Proposed Estimator
In this section, we establish Proposition 1, which proves that the solution to (6) exists and can be found by solving a convex quadratic programming problem. In Proposition 1, we view vectors as columns, and write $x^T$ to denote the transpose of a vector $x$.
It remains to show that Problem (7) is a convex quadratic programming problem and that a solution to Problem (7) exists. We will show that $\frac{1}{n}\sum_{i=1}^n (Y_i - y_i)^2 + \lambda b^T Q b$ is convex and coercive in $(a, b, y)$. First, note that $Q$ is positive definite, so $\lambda b^T Q b$ is convex in $b$, and hence the objective is convex in $(a, b, y)$.

Numerical Results
In this section, we examine the numerical performance of the proposed estimator in three settings: i) $f^*(x) = x^2$ for $x \in [0, 1]$, ii) $f^*(x)$ is the price of a European call option when the strike price is $x \in [0, 1]$, and iii) $f^*(x)$ is the steady-state mean waiting time of a customer in a single server queue when the service rate is $x \in [1.2, 1.7]$.
In each of these settings, we compute the proposed estimator along with three competing estimators: the convex regression estimator, the penalized least squares estimator $\hat{f}_n$, and an estimator $\hat{h}_n$ that combines convex regression and penalized least squares regression in a two-step procedure. To compute $\hat{h}_n$, we take the penalized least squares estimator $\hat{f}_n$, which is a smooth estimator of $f^*$, and compute its convex regression estimator by solving (2) with the $Y_i$'s replaced by the $\hat{f}_n(X_i)$'s. The proposed estimator, the convex regression estimator, and the penalized least squares estimator are computed by solving (7), (2), and (3), respectively. When solving the quadratic programming formulations (7) and (2), we use CVX (Grant and Boyd (2014)), a software package designed to solve convex programs. When solving (3) and (7), we choose $\lambda$ using the cross validation method. In particular, to find $\lambda$ in (3), we define $f_\lambda^{[k]}$ as the minimizer of $\frac{1}{n}\sum_{i \ne k} (Y_i - f(X_i))^2 + \lambda \int_\alpha^\beta f^{(2)}(x)^2 \, dx$ over $f \in \mathcal{F}$. The cross validation function is defined by $V(\lambda) = \frac{1}{n}\sum_{k=1}^n (Y_k - f_\lambda^{[k]}(X_k))^2$, and $\lambda^*$ is chosen as the minimizer of $V(\lambda)$ over $\lambda \in \{10^{-6}, 10^{-4}, 10^{-2}, 1, 10^2, 10^4, 10^6\}$; see p. 47 of Wahba (1990) for details. $\lambda = \lambda^*$ is used in (3) when computing the penalized least squares estimator. The smoothing parameter $\lambda$ in (7) for the proposed estimator is chosen in a similar fashion.
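The leave-one-out selection rule can be sketched with a discrete roughness-penalized smoother standing in for the spline fit: dropping observation $k$ amounts to zeroing its weight in the least-squares term, after which the fit is still a single linear solve and its value at $X_k$ plays the role of $f_\lambda^{[k]}(X_k)$. The function name and the discrete surrogate are ours, not the paper's exact computation.

```python
import numpy as np

def cv_score(y, lam):
    """Leave-one-out cross validation V(lam) for a discrete
    roughness-penalized smoother (a sketch of the selection rule above).
    Dropping observation k = zero weight in W; f solves (W + P) f = W y."""
    n = len(y)
    D2 = np.diff(np.eye(n), n=2, axis=0)
    P = n * lam * D2.T @ D2
    err = 0.0
    for k in range(n):
        W = np.eye(n)
        W[k, k] = 0.0                      # drop observation k from the fit
        f = np.linalg.solve(W + P, W @ y)  # penalized fit without point k
        err += (y[k] - f[k]) ** 2          # predict the held-out point
    return err / n

# lam_star = min(grid, key=lambda l: cv_score(y, l)) over the grid
# {1e-6, 1e-4, 1e-2, 1, 1e2, 1e4, 1e6} used in the paper.
```

For exactly linear data the penalty vanishes at the truth, so large $\lambda$ predicts each held-out point perfectly; noisy data yields a strictly positive score, and the grid minimizer balances the two regimes.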
Numerical results in Sections 3.1, 3.2, and 3.3 indicate that the proposed estimator converges to the true function f * as n → ∞ and that the proposed estimator displays superior performance compared to other competing estimators.
We conducted all simulations using a 64-bit computer with an Intel(R) Core(TM) i7-6700K CPU at 4 GHz and a RAM of 32 GB, and programmed all simulations in MATLAB R2018a.

Stylized Model
We consider the case where $f^* : [0, 1] \to \mathbb{R}$ is defined by $f^*(x) = x^2$ for $x \in [0, 1]$, $X_i = i/n - 1/(2n)$ for $1 \le i \le n$, and $Y_i = f^*(X_i) + \epsilon_i$ for $1 \le i \le n$, where the $\epsilon_i$'s are iid normal random variables with a mean of 0 and a standard deviation of 0.15.
We generated $((X_i, Y_i) : 1 \le i \le n)$ and computed the proposed estimator, the convex regression estimator, the penalized least squares estimator, and $\hat{h}_n$. To measure the accuracy of the proposed estimator, we computed the empirical integrated mean square error (EIMSE) between the true function $f^*$ and the proposed estimator $\hat{g}_n$ as $\frac{1}{n}\sum_{i=1}^n (\hat{g}_n(X_i) - f^*(X_i))^2$. The EIMSE values for the other competing estimators are computed in a similar fashion. We repeated this procedure 100 times independently, generating 100 copies of the EIMSE value for each of the estimators. Using these 100 copies, we then computed the 95% confidence interval of the EIMSE value. The 95% confidence intervals are reported in Table 1 for a variety of $n$ values. The proposed estimator produces lower EIMSE values than the other competing estimators.
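The accuracy metric itself is a one-liner; the following sketch (names ours) computes the EIMSE given the true and estimated values at the design points:

```python
import numpy as np

def eimse(f_star_vals, est_vals):
    """Empirical integrated mean squared error over the design points:
    (1/n) * sum_i (est(X_i) - f*(X_i))^2, the comparison metric above."""
    f_star_vals = np.asarray(f_star_vals, dtype=float)
    est_vals = np.asarray(est_vals, dtype=float)
    return float(np.mean((est_vals - f_star_vals) ** 2))
```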

Call Option
We consider the case where $f^*(x)$ is the price of a European call option with strike price $x \in (0, 1)$ when the current price of the underlying stock is 0.5, the risk-free annual interest rate is $r = 0.05$, the stock price volatility is 0.3, and the time to maturity of the option is 1 year. We set $X_i = i/n - 1/(2n)$ for $1 \le i \le n$. For each $i \in \{1, \ldots, n\}$, we assumed that the strike price of the option is $X_i$, generated a sample path $(S_t : 0 \le t \le 1)$ of a geometric Brownian motion up to time 1 with a drift of 0.05 and a volatility of 0.3, and computed the price of the option as $Y_i = \exp(-r) \max(0, S_1 - X_i)$, where $S_1$ is the value of the geometric Brownian motion at time 1.
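The data-generation scheme above can be sketched as follows. Since only the terminal value of the geometric Brownian motion enters the payoff, a single draw of $S_1$ suffices rather than a full path; the function name and default arguments (taken from the parameter values stated above) are ours.

```python
import numpy as np

def simulate_option_price(strike, s0=0.5, r=0.05, sigma=0.3, t=1.0, rng=None):
    """One noisy observation Y_i of the call price: draw the terminal
    value of a geometric Brownian motion and discount the payoff,
    following the scheme described above. A sketch, not the paper's code."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal()
    # terminal GBM value: S_1 = S_0 * exp((r - sigma^2/2) t + sigma sqrt(t) Z)
    s_t = s0 * np.exp((r - 0.5 * sigma ** 2) * t + sigma * np.sqrt(t) * z)
    return np.exp(-r * t) * max(0.0, s_t - strike)
```

Averaging many such draws for a fixed strike recovers the Black-Scholes price, but the experiment deliberately uses a single noisy draw per design point.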
Using $((X_i, Y_i) : 1 \le i \le n)$, we computed the proposed estimator, the convex regression estimator, the penalized least squares estimator, $\hat{h}_n$, and their EIMSE values. The 95% confidence intervals of these EIMSE values, based on 100 iid replications, are reported in Table 2 for a variety of $n$ values. The proposed estimator produces lower EIMSE values than the other competing estimators.

M/M/1 Queue
We consider the case where $f^*(x)$ is the steady-state mean waiting time of a customer in a single server first-come/first-served queue with an infinite capacity buffer, unit arrival rate, and a service rate of $x \in [1.2, 1.7]$, when the service times and the interarrival times are iid exponential random variables. We set $X_i = 1.2 + i/(2n) - 1/(4n)$ for $1 \le i \le n$. For each $i \in \{1, \ldots, n\}$, we computed the average of the waiting times of the first 500 customers arriving at the queue when the service rate is $X_i$ and the queue is initialized empty and idle. We repeated this procedure 10 times independently, generated 10 averages, and set $Y_i$ equal to the average of the 10 averages.
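The waiting times in the simulation above satisfy Lindley's recursion $W_{j+1} = \max(0, W_j + S_j - A_{j+1})$, where $S_j$ is the $j$-th service time and $A_{j+1}$ the next interarrival time. A minimal sketch of one replication (function name and defaults ours, matching the setup described above):

```python
import numpy as np

def mm1_avg_wait(service_rate, n_customers=500, arrival_rate=1.0, rng=None):
    """Average waiting time of the first n_customers in an M/M/1 queue
    started empty and idle, via Lindley's recursion
    W_{j+1} = max(0, W_j + S_j - A_{j+1}). A sketch of one replication."""
    rng = np.random.default_rng() if rng is None else rng
    w, total = 0.0, 0.0
    for _ in range(n_customers):
        total += w                                # waiting time of this customer
        s = rng.exponential(1.0 / service_rate)   # service time S_j
        a = rng.exponential(1.0 / arrival_rate)   # next interarrival time A_{j+1}
        w = max(0.0, w + s - a)
    return total / n_customers
```

Each $Y_i$ in the experiment is then the average of 10 independent calls with `service_rate=X_i`. Slower service rates in $[1.2, 1.7]$ load the queue more heavily and produce longer waits, which is the source of the convexity of $f^*$ in $x$.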
Using $((X_i, Y_i) : 1 \le i \le n)$, we computed the proposed estimator, the convex regression estimator, the penalized least squares estimator, $\hat{h}_n$, and their EIMSE values. The 95% confidence intervals of these EIMSE values, based on 200 iid replications, are reported in Table 3 for a variety of $n$ values. The proposed estimator produces lower EIMSE values than the other competing estimators.

Conclusions
In this paper, we proposed a new method for estimating a smooth and convex regression function. The proposed estimator can be easily computed by solving a convex quadratic programming problem and shows superior numerical performance compared to existing estimators. Possible future research topics include establishing the consistency of the proposed estimator as n → ∞ and incorporating other shape restrictions, such as monotonicity, into the proposed estimator.