Nonparametric Tests for Convexity / Monotonicity / Positivity of Multivariate Functions with Noisy Observations

We propose a new method of testing for a function’s convexity, monotonicity, or positivity, based on some noisy observations of the function made over a finite set T of points in the domain, where the observations can be made multiple times at each point in T . One of the traditional approaches to the test of a function’s shape characteristic is to fit a convex, a monotone, or a positive function, depending on the shape characteristic we wish to test for, to the data set minimizing the sum of squared errors, and to compute the sum of squared differences (SSD) between the fit and the data set. While the traditional approach proceeds by observing the SSD as the number of points in T increases to infinity, we propose observing the SSD as r, the number of observations taken at each point in T , increases to infinity. This new way of observing the asymptotic behavior of the SSD leads to a test procedure that does not require the estimation of any additional parameters, and hence, is easy to implement. The proposed test procedure is proven to achieve a prescribed power as r → ∞. Numerical examples illustrate that the proposed test successfully detects the convexity/monotonicity/positivity of a function, as well as the non-convexity/non-monotonicity/non-positivity of a function.


Introduction
The goal of this paper is to develop a hypothesis test for determining whether an unknown function f* : R^d → R is convex, monotone, or positive using noisy measurements of f* at points X_1, ..., X_n ∈ R^d.
The convexity/monotonicity/positivity of a function has significant implications in many application areas. In economics, the concavity of the utility curve as a function of a person's income implies that the marginal utility diminishes as income increases (p. 31 of Keynes, 1935). In the context of statistical inference, the monotonicity of a function that one wishes to estimate from noisy data indicates that one can fit a monotone function to the data set to reconstruct the underlying function (Barlow et al., 1972).
In this paper, we assume that we can observe noisy measurements of f*(X_i) for 1 ≤ i ≤ n, and that these observations can be made multiple times independently of each other. Thus, we are able to obtain the data set ((X_i, Y_ij) : 1 ≤ i ≤ n, 1 ≤ j ≤ r), where Y_ij = f*(X_i) + ε_ij, the X_i's are either R^d-valued random vectors or deterministic points in R^d, and the ε_ij's are independent and identically distributed (iid) random variables with E(ε_ij | X_1, ..., X_n) = 0 and E(ε_ij^2 | X_1, ..., X_n) = σ^2 for some σ < ∞. To determine whether the function f* is convex/monotone/positive or not, we consider the following pairs of the null and alternative hypotheses:

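Test            Null Hypothesis               Alternative Hypothesis
Convexity       H_0^c : f* is not convex      H_a^c : f* is convex
Monotonicity    H_0^m : f* is not monotone    H_a^m : f* is monotone
Positivity      H_0^p : f* is not positive    H_a^p : f* is positive

Here, monotonicity is understood with respect to the componentwise ordering of points v = (v(1), ..., v(d)) and w = (w(1), ..., w(d)) in R^d. When the context is clear, H_0^c, H_0^m, and H_0^p will be referred to as the null hypotheses and H_a^c, H_a^m, and H_a^p as the alternative hypotheses.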
The proposed test procedure is described as follows.
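Test of Convexity:
When one suspects that the unknown function f* is convex, a natural step to take in order to estimate f* is to fit a convex function f̂_c : R^d → R to the data set ((X_i, Ȳ_i) : 1 ≤ i ≤ n), where Ȳ_i = ∑_{j=1}^r Y_ij / r for 1 ≤ i ≤ n, which minimizes the sum of squared distances between the fit and the data set. The fitted function f̂_c can be defined by the solution to the following infinite-dimensional minimization problem:

$$\text{minimize} \quad \frac{1}{n}\sum_{i=1}^{n}\bigl(\bar{Y}_i - g(X_i)\bigr)^2 \tag{1}$$

over g ∈ F_c, where F_c denotes the set of convex functions from R^d to R. Problem (1) turns out to be equivalent to the following finite-dimensional quadratic programming problem:

$$\begin{aligned} \text{minimize} \quad & \frac{1}{n}\sum_{i=1}^{n}\bigl(\bar{Y}_i - g(X_i)\bigr)^2 \\ \text{subject to} \quad & g(X_j) \ge g(X_i) + \xi_i^{T}(X_j - X_i), \quad 1 \le i, j \le n, \end{aligned} \tag{2}$$

in the decision variables g(X_1), ..., g(X_n) ∈ R and ξ_1, ..., ξ_n ∈ R^d, where ξ_i^T denotes the transpose of ξ_i (Lemma 2.5 of Seijo & Sen, 2011).
When f* is truly convex, the mean square error (MSE), defined by

$$\frac{1}{n}\sum_{i=1}^{n}\bigl(\bar{Y}_i - \hat{f}_c(X_i)\bigr)^2,$$

converges to 0 almost surely as r → ∞, because

$$\frac{1}{n}\sum_{i=1}^{n}\bigl(\bar{Y}_i - \hat{f}_c(X_i)\bigr)^2 \le \frac{1}{n}\sum_{i=1}^{n}\bigl(\bar{Y}_i - f^*(X_i)\bigr)^2 = \frac{1}{n}\sum_{i=1}^{n}\bar{\epsilon}_i^{\,2} \to 0$$

by the minimizing property of f̂_c and the strong law of large numbers, where ε̄_i = ∑_{j=1}^r ε_ij / r for 1 ≤ i ≤ n. Proposition 1 of this paper also shows that the rate of convergence is of order 1/r. However, when f* is not convex, the MSE may converge to a positive number as r → ∞ because f̂_c converges to a function that is different from f*. Figure 1 shows a graph of f*, which is not convex, and the function to which f̂_c converges as r → ∞.
The test statistic of our proposed procedure is therefore the MSE multiplied by r:

$$TS_c = \frac{r}{n}\sum_{i=1}^{n}\bigl(\bar{Y}_i - \hat{f}_c(X_i)\bigr)^2. \tag{3}$$

The asymptotic behavior of the MSE suggests that we do not reject H_0^c if the test statistic TS_c diverges to infinity as r → ∞. The critical value will be derived from Propositions 1 and 2 in Section 2 at a prescribed value of the Type II error.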
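As a rough sanity check on the scaling by r (an illustrative calculation, not part of the formal argument), note that when f* is convex the minimizing property of the fit bounds TS_c by (r/n)∑ ε̄_i^2, where ε̄_i is the average of the r errors at X_i. Using only the assumption that the ε_ij's are iid with conditional mean 0 and conditional variance σ^2, the expectation of this bound can be computed directly:

```latex
\mathbb{E}\bigl(\bar{\epsilon}_i^{\,2} \mid X_1,\dots,X_n\bigr)
  = \frac{1}{r^{2}}\sum_{j=1}^{r}\mathbb{E}\bigl(\epsilon_{ij}^{2} \mid X_1,\dots,X_n\bigr)
  = \frac{\sigma^{2}}{r},
\qquad\text{so}\qquad
\mathbb{E}\Bigl(\frac{r}{n}\sum_{i=1}^{n}\bar{\epsilon}_i^{\,2}\Bigr) = \sigma^{2}.
```

The cross terms vanish because the errors are independent with mean zero. The bound therefore has mean σ^2 for every r, which matches the mean of (σ^2/n)χ^2_n, since a chi-square random variable with n degrees of freedom has mean n.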

Test of Monotonicity:
For a test of monotonicity, we use a similar procedure. When one suspects that f* is monotone, one can fit a monotone function f̂_m : R^d → R to the data set ((X_i, Ȳ_i) : 1 ≤ i ≤ n), which minimizes the sum of squared errors. The fitted function f̂_m is the solution to the following quadratic program:

$$\begin{aligned} \text{minimize} \quad & \frac{1}{n}\sum_{i=1}^{n}\bigl(\bar{Y}_i - g(X_i)\bigr)^2 \\ \text{subject to} \quad & g(X_i) \le g(X_j) \ \text{whenever} \ X_i \le X_j, \quad 1 \le i, j \le n, \end{aligned} \tag{4}$$

in the decision variables g(X_1), ..., g(X_n) ∈ R, where X_i ≤ X_j is understood componentwise. The proposed test statistic is then defined by

$$TS_m = \frac{r}{n}\sum_{i=1}^{n}\bigl(\bar{Y}_i - \hat{f}_m(X_i)\bigr)^2, \tag{5}$$

and H_0^m is not rejected if the test statistic TS_m diverges to infinity as r → ∞.

Test of Positivity:
For the test of positivity, the proposed test statistic is given by

$$TS_p = \frac{r}{n}\sum_{i=1}^{n}\bigl(\bar{Y}_i - \hat{f}_p(X_i)\bigr)^2, \tag{6}$$

where f̂_p is the solution to the following quadratic program:

$$\begin{aligned} \text{minimize} \quad & \frac{1}{n}\sum_{i=1}^{n}\bigl(\bar{Y}_i - g(X_i)\bigr)^2 \\ \text{subject to} \quad & g(X_i) \ge 0, \quad 1 \le i \le n, \end{aligned} \tag{7}$$

in the decision variables g(X_i) for 1 ≤ i ≤ n. H_0^p is not rejected if the test statistic TS_p diverges to infinity as r → ∞.

Tests of convexity/monotonicity/positivity have been widely studied in the statistics literature. Various types of hypothesis tests have been proposed with different test statistics. However, most work in the literature has focused on observing the behavior of the test statistic as n → ∞ with a fixed value of r (Yatchew, 1992; Hall & Heckman, 2000; Baraud et al., 2005) or has imposed a condition that requires the normality of the ε_ij's (Bartholomew, 1959; Shapiro, 1988; Baraud et al., 2005). For example, the MSE has been studied as a test statistic in Shapiro (1988), but the behavior of the test statistic is studied only for the case where n → ∞ with r fixed. Empirical studies suggest that the weights used in Shapiro (1988) are difficult to compute exactly, so the test procedure proposed by Shapiro (1988) is computationally burdensome (Sen & Silvapulle, 2002). Bartholomew (1959) used the MSE as a test statistic, but he assumed that the ε_ij's are normally distributed and did not consider the case where r → ∞. Yatchew (1992) also considered the MSE as a test statistic, but did not consider the case where r → ∞, and focused only on the case where f* is defined on the one-dimensional set R.
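For the test of monotonicity described above, the one-dimensional case admits a particularly simple computation, sketched below. The sketch assumes d = 1, distinct design points sorted in increasing order, and monotone meaning non-decreasing; it replaces the general-purpose quadratic programming solver with the pool-adjacent-violators algorithm, which solves the equal-weight least-squares problem under an ordering constraint. The function names `isotonic_fit` and `ts_monotone` are illustrative and do not come from the paper.

```python
from statistics import fmean


def isotonic_fit(y):
    """Least-squares non-decreasing fit to y (equal weights), via pool-adjacent-violators."""
    blocks = []  # each block is [sum of pooled values, number of pooled values]
    for value in y:
        blocks.append([value, 1])
        # Pool the last two blocks while their means violate the ordering.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)
    return fit


def ts_monotone(Y):
    """r times the MSE of the monotone fit.

    Y[i][j] is the j-th observation at the i-th design point; the design
    points are assumed to be sorted in increasing order (d = 1).
    """
    n, r = len(Y), len(Y[0])
    y_bar = [fmean(row) for row in Y]      # per-point sample means
    fit = isotonic_fit(y_bar)              # monotone least-squares fit
    return (r / n) * sum((a - b) ** 2 for a, b in zip(y_bar, fit))
```

For instance, sample means (1, 3, 2, 4) are fitted by (1, 2.5, 2.5, 4), so only the two pooled points contribute to the statistic. For d > 1 the ordering is only partial, and a general quadratic programming solver is needed.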
Others have used various types of test statistics to test a function's convexity/monotonicity/positivity. For example, Ghosal et al. (2000) use a locally weighted version of Kendall's tau statistic as a test statistic, whereas Wang & Meyer (2011) use regression splines and their derivatives to define a test statistic.
In this paper, we take a different point of view from the existing literature. Even though increasing n to infinity may result in a good estimator of the true function f*(x) over all x ∈ R^d, increasing r to infinity can provide simpler and more practical tests for convexity/monotonicity/positivity detection that are easier to implement. We thus observe the asymptotic behavior of our test statistic as r → ∞ and derive the critical value accordingly. The critical value turns out to be a percentile of (σ^2/n)χ^2_n, where χ^2_n follows the chi-square distribution with n degrees of freedom. Considering the fact that σ^2 can be easily estimated from the sample variance of Y_11, ..., Y_1r, the critical value can be readily computed from the data set.
The situation where r is large arises frequently in practice. In particular, this situation arises when f* is an unknown function that we want to estimate using "computer simulation" and when we are able to select any point x in the domain of f*, get an estimate of f*(x) through computer simulation, and repeat this procedure as many times as we wish. For example, when f* is the price of a stock option that is contingent on the price x ∈ R of the underlying stock, we can use computer simulation to get an estimate of f*(x) at any point x as many times as we wish, and hence, r can be made as large as we wish.
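To make the computation of the critical value concrete, the following sketch estimates σ^2 from the observations at the first design point and scales a chi-square percentile by the estimate divided by n. It is a minimal illustration under stated assumptions: the chi-square percentile is obtained from the Wilson-Hilferty approximation so that only the Python standard library is needed (an exact quantile routine, for example `scipy.stats.chi2.ppf`, could be substituted), and the function names `chi2_quantile` and `critical_value` are illustrative.

```python
from statistics import NormalDist, variance


def chi2_quantile(p, df):
    """Approximate p-th quantile of the chi-square distribution with df
    degrees of freedom (Wilson-Hilferty approximation, not exact)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3


def critical_value(first_point_obs, n, beta=0.05):
    """Approximate 100(1 - beta)th percentile of (sigma^2 / n) * chi2_n.

    sigma^2 is estimated by the sample variance of Y_11, ..., Y_1r
    (divisor r - 1), passed in as first_point_obs.
    """
    sigma2_hat = variance(first_point_obs)
    return sigma2_hat / n * chi2_quantile(1.0 - beta, n)
```

With n = 8 and β = 0.05, the approximation gives a chi-square percentile of roughly 15.49, compared with the tabulated value 15.507, so the approximation error is small relative to the sampling variability of the variance estimate.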
Our main contribution is therefore a simple and practical test procedure based on the following idea: as r → ∞, the MSE converges to 0 at the rate 1/r if f* is convex, so the test statistic (r times the MSE) remains stochastically bounded, whereas the MSE converges to a positive number if f* is not convex, so the test statistic diverges to infinity. The proposed procedure does not require estimation of any additional parameters. Furthermore, it does not rely on any assumptions regarding the distribution of the X_i's and the ε_ij's. The test statistic and the critical value can be easily computed from the data set.
This paper is organized as follows. In Section 2, we prove that the proposed test achieves a prescribed power as r → ∞. We also describe the proposed test procedure in more detail. In Section 3, we apply the proposed test to different types of f*, and observe the conclusion of our test as r → ∞. The numerical results in Section 3 illustrate that the proposed test successfully rejects the null hypothesis when the alternative hypothesis is true for r sufficiently large. Concluding remarks are included in Section 4.

The Asymptotic Behavior of the Test Statistics and the Proposed Test Procedure
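In order to analyze the behavior of the test statistics, we impose some probabilistic assumptions on the ε_ij's, referred to as A1, A2, and A3. We first establish in Proposition 1 that the asymptotic distribution of the test statistics defined in (3), (5), and (6) as r → ∞ can be bounded in terms of the distribution of (σ^2/n)χ^2_n.

Proposition 1 Assume A1-A3. Then, for a fixed n and for any τ ≥ 0,

$$\begin{aligned} \liminf_{r\to\infty} P(TS_c \le \tau) &\ge P\bigl((\sigma^2/n)\chi^2_n \le \tau\bigr) \quad \text{if } f^* \text{ is convex}, \\ \liminf_{r\to\infty} P(TS_m \le \tau) &\ge P\bigl((\sigma^2/n)\chi^2_n \le \tau\bigr) \quad \text{if } f^* \text{ is monotone}, \\ \liminf_{r\to\infty} P(TS_p \le \tau) &\ge P\bigl((\sigma^2/n)\chi^2_n \le \tau\bigr) \quad \text{if } f^* \text{ is positive}, \end{aligned}$$

where χ^2_n follows the chi-squared distribution with n degrees of freedom.

Proof. We begin by proving the existence and the uniqueness of f̂_c, f̂_m, and f̂_p. The existence and the uniqueness of f̂_c is proven in Lemma 2.3 of Seijo & Sen (2011). To prove the existence of f̂_m, we note that Problem (4) is a minimization problem of a coercive function over a non-empty closed subset of R^n. By Theorem 2.32 on page 25 of Beck (2014), the solution to Problem (4) exists. To prove the uniqueness of the solution, suppose on the contrary that Problem (4) has two distinct minimizers, say v = (v(1), ..., v(n)) and w = (w(1), ..., w(n)). Since φ : R^n → R, defined by φ(u) = (1/n)∑_{i=1}^n (Ȳ_i − u(i))^2 for u = (u(1), ..., u(n)) ∈ R^n, is strictly convex and (v + w)/2 is feasible for Problem (4), we have

$$\varphi\Bigl(\frac{v+w}{2}\Bigr) < \frac{1}{2}\varphi(v) + \frac{1}{2}\varphi(w) = \varphi(v),$$

which contradicts the fact that v is a minimizer of Problem (4). Therefore, Problem (4) has a unique minimizer. The existence and the uniqueness of f̂_p follow using similar arguments.

Now, we are ready to prove the main statement of Proposition 1. Suppose f* ∈ F_c. Then,

$$TS_c = \frac{r}{n}\sum_{i=1}^{n}\bigl(\bar{Y}_i - \hat{f}_c(X_i)\bigr)^2 \le \frac{r}{n}\sum_{i=1}^{n}\bigl(\bar{Y}_i - f^*(X_i)\bigr)^2 = \frac{r}{n}\sum_{i=1}^{n}\bar{\epsilon}_i^{\,2} \tag{8}$$

since f̂_c minimizes ∑_{i=1}^n (Ȳ_i − g(X_i))^2 / n over all convex functions g. Next, we will prove that

$$P\Bigl(\frac{r}{n}\sum_{i=1}^{n}\bar{\epsilon}_i^{\,2} \le \tau\Bigr) \to P\bigl((\sigma^2/n)\chi^2_n \le \tau\bigr) \tag{9}$$

as r → ∞. To prove (9), we first note that, for any η ∈ R,

$$P\Bigl(\frac{r}{n}\sum_{i=1}^{n}\bar{\epsilon}_i^{\,2} \le \eta \,\Big|\, X_1,\dots,X_n\Bigr) \to P\bigl((\sigma^2/n)\chi^2_n \le \eta\bigr)$$

almost surely as r → ∞ by the central limit theorem (under A1, A2, and A3) and the continuous mapping theorem.
Applying the bounded convergence theorem to P((r/n)∑_{i=1}^n ε̄_i^2 ≤ η | X_1, ..., X_n) yields

$$P\Bigl(\frac{r}{n}\sum_{i=1}^{n}\bar{\epsilon}_i^{\,2} \le \eta\Bigr) = E\Bigl[P\Bigl(\frac{r}{n}\sum_{i=1}^{n}\bar{\epsilon}_i^{\,2} \le \eta \,\Big|\, X_1,\dots,X_n\Bigr)\Bigr] \to P\bigl((\sigma^2/n)\chi^2_n \le \eta\bigr),$$

and hence, (9) follows.
Combining (8) and (9) yields

$$\liminf_{r\to\infty} P(TS_c \le \tau) \ge \lim_{r\to\infty} P\Bigl(\frac{r}{n}\sum_{i=1}^{n}\bar{\epsilon}_i^{\,2} \le \tau\Bigr) = P\bigl((\sigma^2/n)\chi^2_n \le \tau\bigr),$$

and hence, the first inequality of Proposition 1 follows. The remaining inequalities of Proposition 1 follow from similar arguments. □

Proposition 1 enables us to suggest the following test procedure.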

Proposed Hypothesis Test
1. Using the data set ((X_i, Y_ij) : 1 ≤ i ≤ n, 1 ≤ j ≤ r), compute the test statistic TS_c from (3) for a test of convexity, TS_m from (5) for a test of monotonicity, or TS_p from (6) for a test of positivity.
2. Let β be the prescribed value of the Type II error. In other words, β is the desired probability of not rejecting the null hypothesis when the alternative hypothesis is true. Let z_{1−β} be the 100(1 − β)th percentile of (σ^2/n)χ^2_n. When σ^2 is not known, it can be estimated from the sample variance of Y_11, ..., Y_1r, i.e., ∑_{j=1}^r (Y_1j − Ȳ_1)^2 / (r − 1).
3. If the test statistic is less than or equal to z_{1−β}, then reject the null hypothesis in favor of the alternative hypothesis. Otherwise, do not reject the null hypothesis.
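The three steps above can be sketched end to end for the test of positivity, where the fit has a closed form. The sketch assumes that positivity is imposed through the constraint g(X_i) ≥ 0, in which case the least-squares fit simply clips each sample mean at zero. The chi-square percentile is passed in by the caller (for example, 15.507 for n = 8 and β = 0.05 from a standard table); the function name `positivity_test` and its argument names are illustrative.

```python
from statistics import fmean, variance


def positivity_test(Y, chi2_percentile):
    """Run the three-step procedure for the test of positivity.

    Y[i][j] is the j-th observation at the i-th design point.
    chi2_percentile is the 100(1 - beta)th percentile of the chi-square
    distribution with n = len(Y) degrees of freedom, supplied by the caller.
    Returns (test statistic, critical value, reject_null).
    """
    n, r = len(Y), len(Y[0])
    y_bar = [fmean(row) for row in Y]

    # Step 1: under g(X_i) >= 0 the fit is max(y_bar_i, 0), so the
    # residual at point i is min(y_bar_i, 0).
    ts_p = (r / n) * sum(min(m, 0.0) ** 2 for m in y_bar)

    # Step 2: estimate sigma^2 from the observations at the first point
    # and scale the chi-square percentile.
    sigma2_hat = variance(Y[0])
    z_crit = sigma2_hat / n * chi2_percentile

    # Step 3: reject the null (f* is not positive) when the statistic is small.
    return ts_p, z_crit, ts_p <= z_crit
```

When all sample means are non-negative the statistic is exactly zero and the null hypothesis is rejected. When the means are clearly negative the statistic grows in proportion to r and the null hypothesis is retained.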
The following proposition shows that the proposed test achieves the prescribed power as r → ∞.

Numerical Results
In this section, we apply the proposed test procedure to various types of f*. We conduct the proposed hypothesis test for each case of f*, and observe whether the null hypothesis is rejected or not as we increase r. By repeating the procedure multiple times for each case of f*, we estimate the proportion of time that the null hypothesis is rejected. Numerical results in Sections 3.1, 3.2, and 3.3 show that the proportion of time that H_0^c, H_0^m, or H_0^p is rejected converges to 1 as r increases to infinity when f* is convex, monotone, or positive, respectively, whereas the proportion of time that H_0^c, H_0^m, or H_0^p is rejected converges to 0 as r increases to infinity when f* is not convex, not monotone, or not positive, respectively. These results support Proposition 2 in Section 2, which claims that the power of the proposed test converges to 1 as r → ∞. They also suggest that the Type I error converges to 0 as r → ∞ for n sufficiently large.
We conducted all simulations using a 64-bit computer with an Intel(R) Core(TM) i7-6600U CPU at 2.6 GHz and a memory of 237 GB. We programmed all simulations in MATLAB R2010a.

Test of Convexity
We consider the case where f * is one of the following test functions: For f 1 , f 2 , and f 3 , we let {X 1 , . . ., X n } be given by } .
We then generate the Y_ij's from Y_ij = f_k(X_i) + U_ij(−1, 1) for 1 ≤ i ≤ n, 1 ≤ j ≤ r, and 1 ≤ k ≤ 3, where the U_ij(−1, 1)'s are iid random variables uniformly distributed between −1 and 1. We next compute f̂_c by solving the quadratic programming problem in (2) using CVX, a package for specifying and solving convex programs (Grant & Boyd, 2014), and the test statistic TS_c by using Equation (3). When conducting the proposed test procedure, β is set as 0.05. The proposed test procedure is repeated 2,000 times. The 95% confidence interval of the proportion of time that H_0^c is rejected is computed using these 2,000 trials and is reported in Table 1 when n = 8, in Table 2 when n = 27, and in Table 3 when n = 64 for a variety of r values. Figure 2 reports the 95% confidence interval of the proportion of time that H_0^c is rejected, based on 100 iid replications, when n = 64 for a variety of r values.
Tables 1, 2 and 3 and Figure 2 show that the proportion of time that H 0 c is rejected becomes close to 1 for the convex functions, f 1 and f 2 , and to 0 for the non-convex function f 3 (when n is sufficiently large) as r → ∞.
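The confidence intervals for the rejection proportion can be computed along the following lines. The paper does not state which interval is used; this sketch assumes the usual normal-approximation interval for a binomial proportion, and the function name `rejection_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def rejection_ci(num_rejected, num_trials, level=0.95):
    """Normal-approximation confidence interval for a rejection proportion.

    Returns (center, half_width), to be read as center +/- half_width.
    """
    p_hat = num_rejected / num_trials
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 for level = 0.95
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / num_trials)
    return p_hat, half_width
```

With 2,000 trials and an observed proportion of 0.41, this gives a half-width of about 0.02. The half-width collapses to zero when every trial rejects, which is a known limitation of the normal approximation near 0 and 1.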
Table 1. The 95% confidence interval of the proportion of time rejecting H_0^c in the case where f* is f_1, f_2, and f_3 when n = 8.

Test of Monotonicity
We next consider the case where f* is one of the test functions f_4, f_5, and f_6, where f_5 : R^3 → R is given by f_5(x) = 0 for x = (x(1), x(2), x(3)) ∈ R^3. For f_4, f_5, and f_6, we let {X_1, ..., X_n} be given by
We then generate the Y_ij's from Y_ij = f_k(X_i) + U_ij(−1, 1) for 1 ≤ i ≤ n, 1 ≤ j ≤ r, and 4 ≤ k ≤ 6, where the U_ij(−1, 1)'s are iid random variables uniformly distributed between −1 and 1. We next compute f̂_m by solving the quadratic programming problem in (4) using CVX, and the test statistic TS_m by using Equation (5). When conducting the proposed test procedure, β is set as 0.05. The proposed test procedure is repeated 2,000 times. The 95% confidence interval of the proportion of time rejecting H_0^m is computed using these 2,000 trials and is reported in Table 4 when n = 8, in Table 5 when n = 27, and in Table 6 when n = 64 for a variety of r values. Figure 2 reports the 95% confidence interval of the proportion of time that H_0^m is rejected, based on 100 iid replications, when n = 64 for a variety of r values.
Tables 4, 5 and 6, and Figure 2 show that the proportion of time rejecting H_0^m becomes close to 1 for the monotone functions, f_4 and f_5, and to 0 for the non-monotone function f_6 (when n is sufficiently large) as r → ∞.

Table 4. The 95% confidence interval of the proportion of time rejecting H_0^m in the case where f* is f_4, f_5, and f_6 when n = 8.

Test of Positivity
We consider the case where f* is one of the following test functions:
We then generate the Y_ij's from Y_ij = f_k(X_i) + U_ij(−1, 1) for 1 ≤ i ≤ n, 1 ≤ j ≤ r, and 7 ≤ k ≤ 9, where the U_ij(−1, 1)'s are iid random variables uniformly distributed between −1 and 1. We next compute f̂_p by solving the quadratic programming problem in (7) using CVX, and the test statistic TS_p by using Equation (6). When conducting the proposed test procedure, β is set as 0.05. The proposed test procedure is repeated 2,000 times. The 95% confidence interval of the proportion of time rejecting H_0^p is computed using these 2,000 trials and is reported in Table 7 when n = 8, in Table 8 when n = 27, and in Table 9 when n = 64 for a variety of r values. Figure 2 reports the 95% confidence interval of the proportion of time that H_0^p is rejected, based on 100 iid replications, when n = 64 for a variety of r values.
Tables 7, 8 and 9, and Figure 2 show that the proportion of time rejecting H_0^p becomes close to 1 for the positive functions, f_7 and f_8, and to 0 for the non-positive function f_9 (when n is sufficiently large) as r → ∞.

Comparisons with an Existing Method
In this section, we apply our proposed method and the method proposed by Yatchew (1992) to test for a function's convexity, and compare their performance.
We assume that f* is one of the following test functions: f_10 : R → R given by f_10(x) = x^2 for x ∈ R, and f_11 : R → R given by f_11(x) = 2.3(x − 0.5)x(x + 0.5) for x ∈ R.
We let
To apply our proposed method, we generate the Y_ij's from Y_ij = f*(X_i) + N_ij(0, 1^2) when f* = f_10 or Y_ij = f*(X_i) + N_ij(0, 0.5^2) when f* = f_11 for 1 ≤ i ≤ n and 1 ≤ j ≤ r, where the N_ij(0, 1^2)'s and the N_ij(0, 0.5^2)'s are iid random variables normally distributed with a mean of 0 and variances of 1 and 0.5^2, respectively. We next compute f̂_c by solving the quadratic programming problem in (2) using CVX, and the test statistic TS_c by using Equation (3). When conducting the proposed test procedure, β is set as 0.05. The proposed test procedure is repeated 100 times. The 95% confidence interval of the proportion of time that H_0^c is rejected is computed using these 100 trials and is reported in Tables 10 and 11 for a variety of r values when n = 5.
We next apply the method proposed by Yatchew (1992). In this method, only one observation of f*(x) is made at each point x in the domain of f*. So, we generate the Y_i's from Y_i = f*(X_i) + N_i(0, 1^2) when f* = f_10 or Y_i = f*(X_i) + N_i(0, 0.5^2) when f* = f_11 for 1 ≤ i ≤ n, where the N_i(0, 1^2)'s and the N_i(0, 0.5^2)'s are iid random variables normally distributed with a mean of 0 and variances of 1 and 0.5^2, respectively. We then compute the test statistic proposed by Yatchew (1992), given in Equation (10), where L_0, L_1, and L_2 are the upper bounds on |f*|, the absolute value of the first derivative of f*, and the absolute value of the second derivative of f*, respectively. We set L_0 = 40, L_1 = 40, and L_2 = 80 for f_10, and L_0 = 3, L_1 = 9, and L_2 = 18 for f_11. Once the test statistic is evaluated from Equation (10), H_0^c is rejected if the test statistic is between the 100(β/2)th percentile and the 100(1 − β/2)th percentile of the standard normal distribution with β = 0.05. This procedure is repeated 100 times. The 95% confidence interval of the proportion of time that H_0^c is rejected is computed using these 100 trials and is reported in Tables 10 and 11 for a variety of r and n values.

Table 7. The 95% confidence interval of the proportion of time rejecting H_0^p in the case where f* is f_7, f_8, and f_9 when n = 8.

r     f_7            f_8            f_9
2     1.00 ± 0.00    0.41 ± 0.02    0.01 ± 0.00
10    1.00 ± 0.00    0.96 ± 0.01    0.00 ± 0.00
20    1.00 ± 0.00    1.00 ± 0.00    0.00 ± 0.00
The results in Tables 10 and 11 show that the proposed method exhibits good performance in identifying the convexity or non-convexity of a function.

Conclusions
In this paper, we proposed a new method of testing for a function's convexity/monotonicity/positivity. The proposed method differs from the existing methods in the literature in that it observes the behavior of the test statistic as r → ∞ rather than as n → ∞. Propositions 1 and 2 establish that the proposed method successfully detects a function's convexity/monotonicity/positivity and achieves a prescribed value of the Type II error as r → ∞. An interesting point that can be raised next is whether the proposed test procedure can successfully detect a function's non-convexity/non-monotonicity/non-positivity. Our numerical results in Section 3 indicate successful detection of non-convexity/non-monotonicity/non-positivity when n is large enough to capture the overall shape of the underlying function f*. Therefore, a promising research topic for the future is the study of the probability of the Type I error of the proposed test procedure when both r and n increase to infinity.

Figure 1. The solid line is f* and the dashed line is the function to which f̂_c converges as r → ∞.

Figure 2. The dotted lines are the lower and upper limits for the 95% confidence interval of the proportion of time rejecting H_0^c (upper three graphs), H_0^m (middle three graphs), or H_0^p (bottom three graphs) in the case where f* is f_1, f_2, f_3, f_4, f_5, f_6, f_7, f_8, or f_9 from top left to bottom right. The solid lines are the centers of the 95% confidence intervals. The horizontal axis is r, the number of observations made at each point in the domain, in all graphs. n = 64.
where min(a, b) denotes the minimum of a and b for a, b ∈ R.

Table 10. The 95% confidence interval of the proportion of time rejecting H_0^c in the case where f* = f_10.

Table 11. The 95% confidence interval of the proportion of time rejecting H_0^c in the case where f* = f_11.