A Note on Linear and Second Order Significance Testing in Nonlinear Models

A modified ANOVA setting is used to develop a one degree of freedom test for intrinsic curvature effect. The approach provides clear cutoffs for the measurement of intrinsic curvature effect. It is shown that the significance of the correction factor itself and its usefulness in correcting a global test based on linear approximation are distinct elements. They depend on the local frame of reference provided by the model-data combination and null hypothesis considered.


Introduction
Nonlinear models are applied in many areas of science, often reflecting scientific theory, for example in toxicology, cancer research and econometrics. Procedures for statistical inference are typically described in terms of Wald statistics, corresponding to the use of a local linear approximation of the regression surface, or likelihood based regions corresponding to use of the likelihood function and likelihood ratio (Seber and Wild, 1989). While likelihood ratio based inference procedures are preferred theoretically, use of Wald statistics in nonlinear regression is common as likelihood ratio based regions can be difficult to interpret (see Bates and Watts, 1988).
The presence of significant nonlinearity or local curvature in the regression surface when a linear approximation is used creates a type of model mis-specification and can affect the accuracy of p-values and confidence regions (Donaldson and Schnabel, 1987). This requires some correction to the relevant test statistics, typically based on squared lengths and related sums of squares. Note that nonlinear models often represent discipline specific theories or solutions to differential equations and have a specified structure and parameterization. They are often more difficult to analyze than linear models. A typical nonlinear regression model can be written

$$y_i = \eta(x_i, \beta) + \varepsilon_i, \qquad i = 1, \ldots, n, \tag{1}$$

where the $x_i$ are fixed values of the explanatory variable $x$, the model function $\eta$ is of known form depending on the parameter vector $\beta \in R^p$ and $x_i$, and the $\varepsilon_i$ are independent error terms, each normally distributed with mean zero and variance $\sigma^2$. Some typical examples are given in Table 1.
To assess possible approximation error, Bates and Watts (1980, 1988), following the work of Beale (1960), developed relative measures of the intrinsic curvature of the regression surface based on comparing differences between linear and second order approximations. Details can be found in Seber and Wild (1989), Chapter 4. The work of Beale can be viewed as developing an average measure of relative curvature, while the Bates and Watts measures reflect a measure of maximum relative curvature, assessed over all possible directions of the response vector $y$ in relation to $\eta(x_i, \beta)$ evaluated at the m.l.e. Note that intrinsic curvature effects in differential geometry average minimum and maximum projection lengths onto the principal axes at the point in question (Kreyszig, 1991), thus the values suggested by Bates and Watts are not truly geometric curvatures. They applied their measure to many nonlinear models, finding it insignificant in almost all (Bates and Watts, 1988).
Here the intrinsic curvature of a regression surface is revisited within a local frame of reference: the observed direction of departure, the null hypothesis of interest and the data based properties of the nonlinear model in the direction of departure only. Within this context, an intrinsic curvature correction can be given precise calibration, a result lacking in the earlier development (Seber and Wild, 1989, p. 218), and the results organized in an ANOVA table. Significance testing of both the intrinsic curvature effect and the related corrected global test are developed. It is seen that less than significant intrinsic curvature effects may still lead to significant adjusted global tests, depending on the local frame of reference considered.
It is also the case that significant intrinsic curvature effect may only moderately correct the global test, implying that the intrinsic curvature effect need not be useful even when it exists and vice-versa.These elements are distinct in their significance, depending on the local frame of reference, model-data combination and null hypothesis considered.
Corrections for model mis-specification are often first stage adjustments of the assumed model using observed data. Such first stage adjustments in applied statistical modeling include propensity scores (Luellen et al., 2005), Box-Cox transformations (Box and Cox, 1964, 1982), the selection of link functions (McCullagh and Nelder, 1972), the selection of ridge regression constants (Hoerl and Kennard, 1970), variance stabilizing transformations and log transformations to remove interaction effects (Kutner et al., 2005). Here we locally adjust the sums of squares in the ANOVA table using the differential geometric curvature of the surface, adjusting only the projection length related to the observed data.
There is no need to assess over all possible curvature related adjustments just as there is no need to average over all possible Box-Cox reparameterizations or variance stabilizing transformations.
Here intrinsic curvature effects are defined only in relation to the direction of the projection length of the observed response $y$ onto the linear approximation by adding the projection onto the intrinsic curvature vector $v$ defined at the point $\eta(x; \beta_0)$. An ANOVA based testing approach is then developed for tests of $H_0: \beta = \beta_0$, using $v$ to define an independent dimension in the table. It is argued here that first stage intrinsic curvature correction should reflect the observed direction of departure.
It is only in this direction that the (intrinsic) curvature of the regression surface is relevant to modification of the relevant sums of squares and the determination of the Wald statistic value based on the assumed null value $\eta(x; \beta_0)$ and the m.l.e. $\eta(x; \hat{\beta})$. The linear approximation here is taken at $\beta = \beta_0$, not at the maximum likelihood value, giving the analysis a focused local aspect.
Note that a "parameter-effects" curvature measure (Bates and Watts, 1980; Cook and Witmer, 1985) is sometimes used as a formal diagnostic for indicating when re-parameterisation of the nonlinear model should be considered. This measure is technically the leftover second order effect after subtracting the Bates and Watts intrinsic curvature measure from the Hessian matrix evaluated at the m.l.e. It can be interpreted as reflecting the degree of non-parallel parameter coordinates on the linear approximating surface examined over all directions (Cook and Witmer, 1985). It requires the assumption of normality and is useful only in smaller sample sizes, not agreeing very well with other measures of curvature defined for asymptotic likelihood calculations (Cook and Witmer, 1985). The standard errors in the ANOVA table here are based on the squared length of the projection orthogonal to the linear surface and are independent of parameter effects. As we do not examine all directions, the linearity of the parameter contours along the surface is not relevant to the analysis.
In this paper, a local intrinsic curvature measure is developed that can be viewed as a first stage modification and used to more accurately test global hypotheses. The inferential approach taken here is based on the Wald statistic, using linear approximation and squared projection lengths with intrinsic curvature based correction factors. In the sections that follow, the basic geometry of the nonlinear regression model related to first and second order approximation is briefly reviewed. A breakdown of orthogonal projections available to assess significance of local surface curvature and related ANOVA tables is given. A global test corrected for intrinsic curvature and a one-degree-of-freedom test for curvature effect are then obtained in an easily interpreted ANOVA table. It is seen that less than significant intrinsic curvature effects may still lead to significant adjusted global tests, depending on the observed data $y$ and null hypothesis in question. Locally defined intrinsic curvature effects are unique to the specific nonlinear model, null hypothesis and dataset considered. In this sense the significance of first and second order elements are distinct. Several examples are discussed.

Method
Consider the standard nonlinear regression model given above, $y_i = \eta(x_i, \beta) + \varepsilon_i$. The set of possible mean values generated by considering all possible $\beta$ values defines the surface

$$N = \{\eta(\beta) : \beta \in \Omega\},$$

where $\Omega$ is the parameter space, the explanatory variable $x_i$ is suppressed in the present notation and $\eta(\beta)$ is the $n \times 1$ column vector with $i$th component given by $\eta(x_i, \beta)$. This surface is assumed to be a locally smooth manifold, differentiable to the third order with respect to the coordinates $\beta_j$, $j = 1, \ldots, p$, at each point on the surface. In particular it follows that the tangent plane at any point $\eta(\beta_0)$ on $N$ is uniquely defined, as are the second derivatives necessary to calculate surface curvature.

Linear and Quadratic Approximation
The geometry of local linear and second order approximation is briefly reviewed. See for example Seber and Wild (1989).

Second Order Properties and Local Curvature
The quadratic approximation to $\eta(\beta)$ at $\beta = \beta_0$ using Taylor expansion is given by

$$\eta(\beta) \approx \eta(\beta_0) + D_0\theta + \tfrac{1}{2}\,\theta' H_0 \theta,$$

where $D_0 = \partial\eta(\beta)/\partial\beta'$ is the $n \times p$ matrix of first derivatives, $H_0$ is the $p \times p$ Hessian array with vector elements $h_{ij} = \partial^2\eta(\beta)/\partial\beta_i\,\partial\beta_j$, both evaluated at $\beta = \beta_0$, and $\theta = (\beta - \beta_0)$. The local second order properties of $\eta(\beta)$ in the specific direction $u$ are used here to define local relative curvature effects. The intrinsic acceleration vector in the direction $u$ can be defined as $(I - uu')\eta''(\beta)$ and represented as $-v/\rho$, where $v$ is the unit vector chosen perpendicular to the acceleration vector in the direction $u$ and $\rho = \rho(\beta_0)$ is the radius of curvature at $\eta(\beta_0)$. Note that $v$ depends by definition on $u$ and is defined by

$$-\frac{v}{\rho} = (I - uu')\,\eta''(\beta).$$

Taking the norm of this we have

$$\frac{1}{\rho} = \lVert (I - uu')\,\eta''(\beta) \rVert, \tag{2}$$

where all matrices are evaluated at $\beta = \beta_0$ and the local intrinsic curvature $\kappa$, the inverse radius of $\eta(\beta)$ at $\beta = \beta_0$ in the direction $u$, is given here by $1/\rho$. Note that for statistical interpretation these need to be compared to the underlying SSE in the fitted model and data. Note further that the intrinsic curvature is invariant to one-to-one reparameterizations of the parameter space (Seber and Wild, 1989, p. 692). The intrinsic relative curvature measure of Bates and Watts is similar in basic definition to (2), see for example Seber and Wild (1989), p. 131. However, the tangent to the solution locus is taken at $\eta(\hat{\beta})$, the m.l.e. for $\eta(\beta)$, and their measure of curvature is a maximum measured over the entire set of directions out from $\eta(\hat{\beta})$.
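As a rough illustration of these quantities, the sketch below computes a tangent direction $u$, a curvature vector $v$ and a radius $\rho$ for the asymptotic growth model $\eta(x, \beta) = \beta_1(1 - e^{-\beta_2 x})$ used in Example 1. Several choices are assumptions of this sketch rather than statements from the text: $u$ is taken as the normalized projection of $y - \eta(\beta_0)$ onto the tangent plane, the acceleration is the second directional derivative $\theta' H_0 \theta$ with $\theta$ chosen so that the first-derivative matrix times $\theta$ equals that projection, and the acceleration is projected off the whole tangent plane so that $v$ lies in the residual space. The data values are those commonly distributed as the BOD data (for instance in R) and are assumed to match Table 4; all function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def unit(a):
    norm = math.sqrt(dot(a, a))
    return [s / norm for s in a]

def off_plane(basis, a):
    """Return (I - P) a, where P projects onto the orthonormal basis."""
    out = list(a)
    for q in basis:
        c = dot(q, a)
        out = [o - c * s for o, s in zip(out, q)]
    return out

def curvature_frame(x, y, b1, b2):
    """Tangent direction u, curvature vector v and radius rho at (b1, b2)."""
    e = [math.exp(-b2 * xi) for xi in x]
    r = [yi - b1 * (1 - ei) for yi, ei in zip(y, e)]     # y - eta(beta0)
    d1 = [1 - ei for ei in e]                             # d eta / d beta1
    d2 = [b1 * xi * ei for xi, ei in zip(x, e)]           # d eta / d beta2
    h12 = [xi * ei for xi, ei in zip(x, e)]               # mixed second derivative
    h22 = [-b1 * xi * xi * ei for xi, ei in zip(x, e)]    # second derivative in beta2
    # Orthonormal basis of the tangent plane by Gram-Schmidt.
    q1 = unit(d1)
    q2 = unit(off_plane([q1], d2))
    basis = [q1, q2]
    r_perp = off_plane(basis, r)
    pr = [ri - si for ri, si in zip(r, r_perp)]           # P (y - eta(beta0))
    u = unit(pr)
    # Solve the 2 x 2 normal equations so that D theta = P (y - eta(beta0)).
    a11, a12, a22 = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    g1, g2 = dot(d1, pr), dot(d2, pr)
    det = a11 * a22 - a12 * a12
    t1 = (a22 * g1 - a12 * g2) / det
    t2 = (a11 * g2 - a12 * g1) / det
    # Second directional derivative theta' H theta (h11 = 0 for this model).
    acc = [2 * t1 * t2 * a + t2 * t2 * b for a, b in zip(h12, h22)]
    acc_n = off_plane(basis, acc)                         # component normal to the plane
    kappa = math.sqrt(dot(acc_n, acc_n)) / dot(pr, pr)    # normal curvature along u
    v = [-s for s in unit(acc_n)]                         # sign convention of -v / rho
    return u, v, 1.0 / kappa

# Assumed BOD data (time in days, demand in mg/l); hypothesis (18, 1.5) from Example 1.
x = [1, 2, 3, 4, 5, 7]
y = [8.3, 10.3, 19.0, 16.0, 15.6, 19.8]
u, v, rho = curvature_frame(x, y, 18.0, 1.5)
```

The radius returned here is in the units of the response and depends on scaling conventions that cannot be recovered from the text, so it is not expected to reproduce the $\rho$ values quoted in Example 1.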
Here the relative curvature reflects a localized frame of reference in the specific direction provided by the global null hypothesis and related linear approximation about $\eta(\beta_0)$. This better reflects the nature of the problem as $\eta(\beta)$ is fixed, $x$ is fixed, $\beta = \beta_0$ and the tangent plane to be used to calculate the squared length of the orthogonal projection of $(y_{obs} - \eta(\beta_0))$ is also fixed once $y$ is observed. Averaging curvatures or considering them across many directions does not make sense in terms of either the differential geometry or the nature of these surfaces, which can vary greatly in form and related curvature across different directions.

Modified Projection Lengths in ANOVA
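As noted in Seber and Wild (1989), Chapter 2, inference based on the linear approximation where $\sigma^2$ is unknown uses the following exact $\alpha$-level test procedure: the hypothesized value $\beta$ is not rejected when

$$\frac{(y - \eta(\beta))'\,P\,(y - \eta(\beta))/p}{(y - \eta(\beta))'\,(I - P)\,(y - \eta(\beta))/(n - p)} \le F^{\alpha}_{p,\,n-p}, \tag{3}$$

where $F^{\alpha}_{p,\,n-p}$ is the upper $\alpha$ critical value of the $F$ distribution with $p$ and $n - p$ degrees of freedom, and the set of $\beta$ values satisfying (3) yields a $100(1 - \alpha)\%$ exact confidence region for $\beta$ (with the underlying linear model itself being approximate). As noted in Section 2, $P = D(D'D)^{-1}D'$ is the projection matrix onto $L(D)$, the column space of the $n \times p$ first-derivative matrix $D = \partial\eta(\beta)/\partial\beta'$, and $(I - P)$ is the projection matrix onto $L^{\perp}(D)$. With the assumption of normality, the derivation of (3) is formally a consequence of considering the efficient scores (Cox and Hinkley, 1974, p. 324), which are $D'(y - \eta(\beta))/\sigma^2$. Geometrically, (3) is a ratio of squared projection lengths: the projections of $(y - \eta(\beta_0))$ onto, respectively, the tangent plane approximation $L(D)$ and $L^{\perp}(D)$, the space orthogonal to $L(D)$ in $R^n$. See Table 1.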

Modified Orthogonal Decomposition
The usual orthogonal decomposition of regression and error can be replaced with a more detailed orthogonal decomposition $y = z_1 u + z_2 v + V z_3$, where the residual space is viewed as being spanned by the curvature vector $v$ and the column vectors of $V$, which are any set of orthonormal vectors spanning the remaining dimensions orthogonal to both the tangent plane and the curvature vector $v$ evaluated at $\beta = \beta_0$. The curvature in the direction $u$ at $\beta = \beta_0$ can be assessed by considering the orthogonal projection(s) of $(y - \eta(\beta_0))$ onto the linear and quadratic components in this direction. This is presented in Table 3, where $P_v = vv'$ is the projection matrix onto the (normed) intrinsic curvature vector $v$.
To test $H_0: \beta = \beta_0$ within the approximating linear model, without reference to curvature, the Regression component of Table 2 can be used to generate an F-test with $p$ and $(n - p)$ degrees of freedom. Normality is assumed here and, as the projections are orthogonal by construction, the usual theorems regarding squared lengths and quadratic forms apply (see for example Seber and Wild, p. 24).
In particular, assuming $y \sim N(\eta(\beta_0), \sigma^2 I)$ where $\sigma^2$ is unknown, we have under the null

$$F = \frac{\lVert P\,(y - \eta(\beta_0)) \rVert^2 / p}{\lVert (I - P)\,(y - \eta(\beta_0)) \rVert^2 / (n - p)} \sim F_{p,\,n-p},$$

with large values of the test statistic leading to rejection of $H_0: \beta = \beta_0$. Note again that in the terminology of nonlinear models, this is an exact result, but its accuracy depends on the relevance of the local approximating tangent plane.
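The uncorrected test can be sketched numerically as follows, again for the asymptotic growth model $\eta(x, \beta) = \beta_1(1 - e^{-\beta_2 x})$ of Example 1. The regression sum of squares is obtained as the quadratic form $r'D(D'D)^{-1}D'r$ with $r = y - \eta(\beta_0)$ and $D$ the $n \times 2$ first-derivative matrix, solved directly from the 2 by 2 normal equations. The data are the commonly distributed BOD values, assumed to match Table 4, and `linear_f_test` is a hypothetical helper name.

```python
import math

def linear_f_test(x, y, b1, b2):
    """SSR, SSE and F for H0: beta = (b1, b2) under the tangent-plane model."""
    n, p = len(x), 2
    e = [math.exp(-b2 * xi) for xi in x]
    r = [yi - b1 * (1 - ei) for yi, ei in zip(y, e)]   # y - eta(beta0)
    d1 = [1 - ei for ei in e]                           # d eta / d beta1
    d2 = [b1 * xi * ei for xi, ei in zip(x, e)]         # d eta / d beta2
    # Entries of D'D and D'r.
    a11 = sum(s * s for s in d1)
    a12 = sum(s * t for s, t in zip(d1, d2))
    a22 = sum(t * t for t in d2)
    g1 = sum(s * ri for s, ri in zip(d1, r))
    g2 = sum(t * ri for t, ri in zip(d2, r))
    det = a11 * a22 - a12 * a12
    # r' P r = g' (D'D)^{-1} g, written out for the 2 x 2 case.
    ssr = (a22 * g1 * g1 - 2 * a12 * g1 * g2 + a11 * g2 * g2) / det
    sst = sum(ri * ri for ri in r)                      # squared length of r
    sse = sst - ssr                                     # r' (I - P) r
    f_stat = (ssr / p) / (sse / (n - p))
    return ssr, sse, f_stat

# Assumed BOD data; the hypothesis (18, 1.5) is the first one used in Example 1.
x = [1, 2, 3, 4, 5, 7]
y = [8.3, 10.3, 19.0, 16.0, 15.6, 19.8]
ssr, sse, f_stat = linear_f_test(x, y, 18.0, 1.5)
```

The statistic `f_stat` would then be compared with the upper critical value of the $F$ distribution on $p$ and $n - p$ degrees of freedom. When the function is evaluated at the least squares estimate itself, the regression component is essentially zero, since the residual vector is then orthogonal to the tangent plane.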

One-Degree-of-Freedom Test for Curvature
With moderate sample sizes, we can determine whether curvature adjustment for the test applied in Table 2 is necessary, and the orthogonal decomposition given in Table 3 can be used to obtain a test of significance for local curvature defined in the direction $u$ based on the relative squared length of the orthogonal projection onto the curvature vector $v$. The arguments underlying (3) give

$$F_v = \frac{\lVert P_v\,(y - \eta(\beta_0)) \rVert^2}{\lVert (I - P - P_v)\,(y - \eta(\beta_0)) \rVert^2 / (n - p - 1)} \sim F_{1,\,n-p-1},$$

where a large value for the test statistic reflects a pronounced projection length on the curvature vector $v$ in the direction $u$. In the context of the approximating linear model, this is an exact test when the errors are independent and normally distributed.

Adjusted Sums of Squares and F-Test
The linear regression component can be adjusted to account for second order effect by "borrowing" the squared length projection onto the local curvature vector $v$ from the residual space projection and adding it to the squared length projection of the linear component, giving an adjusted global test

$$F_{adj} = \frac{\{\lVert P\,(y - \eta(\beta_0)) \rVert^2 + \lVert P_v\,(y - \eta(\beta_0)) \rVert^2\}/(p + 1)}{\lVert (I - P - P_v)\,(y - \eta(\beta_0)) \rVert^2 / (n - p - 1)} \sim F_{p+1,\,n-p-1}.$$

See Table 3. This uses the properties of normality, orthogonality implying independence and the sum of independent chi-square distributed variables having a chi-square distribution.
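The bookkeeping behind the three tests can be sketched as a small function of the squared projection lengths. The degrees of freedom used here, $(p, n - p)$ for the linear test, $(1, n - p - 1)$ for the curvature test and $(p + 1, n - p - 1)$ for the adjusted test, are an assumption read off the orthogonal decomposition described above, and the function name and the numbers in the usage line are purely illustrative.

```python
def curvature_anova(ss_lin, ss_curv, ss_total, n, p):
    """F statistics for the linear, curvature and curvature-adjusted tests.

    ss_lin   : squared length of the projection onto the tangent plane
    ss_curv  : squared length of the projection onto the curvature vector v
    ss_total : squared length of y - eta(beta0)
    """
    ss_err = ss_total - ss_lin        # error SS under the linear approximation
    ss_res = ss_err - ss_curv         # residual SS after removing the v component
    ms_res = ss_res / (n - p - 1)
    return {
        "F_linear": (ss_lin / p) / (ss_err / (n - p)),
        "F_curvature": ss_curv / ms_res,
        "F_adjusted": ((ss_lin + ss_curv) / (p + 1)) / ms_res,
    }

# Illustrative values only, not taken from the tables of this paper.
table = curvature_anova(ss_lin=10.0, ss_curv=4.0, ss_total=30.0, n=6, p=2)
```

Because the curvature component is moved from the residual to the regression line of the table, the adjusted statistic can change noticeably even when `F_curvature` on its own is small.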
Both the curvature test and adjusted global test can be carried out as their respective projection lengths and significance levels are not directly correlated. This implies that even if not significant, the curvature related projection used as a correction factor may affect the significance of the linear regression component. If a specific sequence of global tests is to be examined, for example to study the curvature properties of the nonlinear regression surface over a specific region, then for each respective null hypothesis a new ANOVA table should be generated, as the correction to the projected sum of squares components will alter depending on the direction $u$ and associated intrinsic curvature vector $v$. This will depend on where on the actual regression surface the linear approximation is developed and on the relative position of the response $y$ in relation to the linear approximation. Here we work in relation to local properties defined about $\eta(\beta_0)$.

Example 1
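We apply these concepts in the context of an asymptotic growth model applied to the BOD dataset found in Bates and Watts (1988). This displays curvature related properties and is given by

$$y_i = \beta_1\left(1 - e^{-\beta_2 x_i}\right) + \varepsilon_i$$

for $i = 1, \ldots, n$. The dataset is given in Table 4. The related 2 by 2 by $n$ second order Hessian matrix is given by

$$H = \begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix}, \qquad h_{11,i} = 0, \quad h_{12,i} = h_{21,i} = x_i e^{-\beta_2 x_i}, \quad h_{22,i} = -\beta_1 x_i^2 e^{-\beta_2 x_i},$$

where each $h_{jk}$, $j = 1, 2$ and $k = 1, 2$, is an $n$-dimensional vector. The 0 value denotes a linear aspect to the model in certain directions, with the regression model here sometimes termed partially linear.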
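The second derivatives for this example can be checked numerically. The sketch below assumes the model form $\eta(x, \beta) = \beta_1(1 - e^{-\beta_2 x})$, which is the form implied by the first order derivatives quoted for the BOD example, and compares analytic second derivatives with central differences of the gradient. The function names, the step size and the evaluation point are choices made for this illustration.

```python
import math

def gradient(x, b1, b2):
    """First derivatives of b1 * (1 - exp(-b2 * x)) with respect to (b1, b2)."""
    e = math.exp(-b2 * x)
    return (1.0 - e, b1 * x * e)

def hessian(x, b1, b2):
    """Analytic second derivatives; the (1, 1) entry is zero (linear in b1)."""
    e = math.exp(-b2 * x)
    return ((0.0, x * e), (x * e, -b1 * x * x * e))

def max_hessian_error(xs, b1, b2, step=1e-4):
    """Largest gap between analytic and central-difference second derivatives."""
    worst = 0.0
    for x in xs:
        up1, dn1 = gradient(x, b1 + step, b2), gradient(x, b1 - step, b2)
        up2, dn2 = gradient(x, b1, b2 + step), gradient(x, b1, b2 - step)
        numeric = (
            ((up1[0] - dn1[0]) / (2 * step), (up2[0] - dn2[0]) / (2 * step)),
            ((up1[1] - dn1[1]) / (2 * step), (up2[1] - dn2[1]) / (2 * step)),
        )
        analytic = hessian(x, b1, b2)
        for j in range(2):
            for k in range(2):
                worst = max(worst, abs(numeric[j][k] - analytic[j][k]))
    return worst

# Time points of the BOD design (assumed) and the hypothesis (18, 1.5).
err = max_hessian_error([1, 2, 3, 4, 5, 7], 18.0, 1.5)
```

A small value of `err` indicates that the analytic array agrees with the numerical derivatives at the chosen point.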
The maximum likelihood or least squares value for $(\beta_1, \beta_2)$ is given by $(\hat{\beta}_1, \hat{\beta}_2) = (19.143, 0.5311)$ with $s^2 = 6.498$ on 4 degrees of freedom. The Bates and Watts overall maximum relative curvature is given by 0.184 and can be interpreted versus an F-statistic based cutoff of 0.49, implying overall curvature effects exist in some direction. Figure 2 shows the presence of curvature primarily in the $\beta_2$ direction as expected.
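As a quick arithmetic check on the fitted values, the residual sum of squares and the residual mean square can be evaluated at the reported estimates. The data below are the commonly distributed BOD values and are assumed to match Table 4; with them, the residual mean square on $n - p = 4$ degrees of freedom comes out close to the quoted 6.498.

```python
import math

# Assumed BOD data: time (days) and biochemical oxygen demand (mg/l).
x = [1, 2, 3, 4, 5, 7]
y = [8.3, 10.3, 19.0, 16.0, 15.6, 19.8]

def residual_ss(b1, b2):
    """Residual sum of squares for the model b1 * (1 - exp(-b2 * x))."""
    return sum((yi - b1 * (1 - math.exp(-b2 * xi))) ** 2 for xi, yi in zip(x, y))

rss = residual_ss(19.143, 0.5311)   # at the reported least squares estimates
s2 = rss / (len(x) - 2)             # residual mean square on n - p = 4 d.f.
```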
Three global null hypotheses are considered here to examine the effects of nonlinearity at various points on the regression surface. We take $\beta_1 = 18$ for all three, a minor perturbation from the observed $\beta_1$ least squares value, and $\beta_2 = 1.5, 3.5, 4.0$ respectively. This gives a set of values for which the intrinsic curvature increases sequentially. The applied results of the ANOVA based testing in Tables 2 and 3 are summarized in Table 5. These include the uncorrected linear test, the test for curvature effect and the curvature corrected global test.
For the first hypothesis $(\beta_1, \beta_2) = (18, 1.5)$, where $\rho = 8.79$, the regression SS based on linear approximation is non-significant. The test for curvature is also non-significant and the adjusted regression does not attain significance. For the hypothesis $(\beta_1, \beta_2) = (18, 3.5)$ the intrinsic curvature alters ($\rho = 8.79$ to $\rho = 25.58$). While the regression SS based on linear approximation alone remains non-significant, the curvature test is close to significant and the curvature adjusted regression is also close to significant. For the third hypothesis $(\beta_1, \beta_2) = (18, 4.0)$ the relative curvature is $\rho = 49.79$, the curvature test remains close to significant and the curvature adjusted regression SS becomes significant. The SSE value also alters as the hypothesized value moves away from the least squares value, as does the overall SST or squared length of $(y - \eta(\beta_0))$, reflecting the importance of the relative positioning of the response $y$ in relation to the regression surface $\eta(\beta)$.
These considerations serve as useful, detailed diagnostics for curvature related effects and can be applied to any specified η(β 0 ) value.

Example 2
A simulated dataset in Bates and Watts (1980) is used to examine the Michaelis-Menten model given by

$$y_i = \frac{\beta_1 x_i}{\beta_2 + x_i} + \varepsilon_i,$$

where the $\varepsilon_i$ are i.i.d. $N(0, \sigma^2)$. The data are given in Table 6. The first order derivatives are given here by an $n$ by 2 matrix with $i$th row

$$\left[\frac{x_i}{\beta_2 + x_i},\; -\frac{\beta_1 x_i}{(\beta_2 + x_i)^2}\right]$$

for $i = 1, \ldots, n$. The related 2 by 2 by $n$ second order Hessian matrix is given by

$$H = \begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix}, \qquad h_{11,i} = 0, \quad h_{12,i} = h_{21,i} = -\frac{x_i}{(\beta_2 + x_i)^2}, \quad h_{22,i} = \frac{2\beta_1 x_i}{(\beta_2 + x_i)^3},$$

where each $h_{jk}$, $j = 1, 2$ and $k = 1, 2$, is an $n$-dimensional vector. The parameter estimates are given by $(\hat{\beta}_1, \hat{\beta}_2) = (212.68, 0.064)$. The hypothesized value $(\beta_1, \beta_2) = (200, 0.3)$ is examined, which is relatively far from the m.l.e. value. We obtain Table 7. The SSR element here is clearly significant. We can also see that there is a small but significant relative curvature effect for the $\eta(x_i, \beta_0)$ value considered and this only moderately corrects the significant regression element, implying that the intrinsic curvature effect need not be useful even when it exists. This analysis can also be applied for various values and regions of interest along the regression surface. It is worth noting that the Michaelis-Menten model can be re-expressed and re-parameterized as

$$y_i = \frac{x_i}{(\beta_2/\beta_1) + (x_i/\beta_1)}, \qquad \frac{1}{y_i} = \frac{1}{\beta_1} + \frac{\beta_2}{\beta_1}\cdot\frac{1}{x_i}.$$

Letting $y_i^* = 1/y_i$ and $x_i^* = 1/x_i$, this has a linear form. Note that this implies the restrictions $\theta_1 > 0$, $y_i > 0$ and $x_i > 0$, which may not be acceptable in a given scientific setting, limiting re-expression of the regression model.
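The reciprocal re-expression can be illustrated with a short sketch: for noise-free responses generated from the Michaelis-Menten mean function, a straight line fitted to $(1/x_i, 1/y_i)$ has intercept $1/\beta_1$ and slope $\beta_2/\beta_1$, so both parameters can be read back from the fit. The design points below are hypothetical values chosen for illustration, the parameters are the hypothesized $(200, 0.3)$, and `reciprocal_fit` is a hypothetical helper name. With noisy data the transformed errors are no longer homoscedastic, so this is a demonstration of the algebra rather than a recommended estimator.

```python
def reciprocal_fit(x, y):
    """Least squares line through (1/x, 1/y), mapped back to (beta1, beta2)."""
    xs = [1.0 / xi for xi in x]
    ys = [1.0 / yi for yi in y]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx                  # estimates beta2 / beta1
    intercept = my - slope * mx        # estimates 1 / beta1
    return 1.0 / intercept, slope / intercept

# Hypothetical design points; responses are noise-free by construction.
x = [0.02, 0.06, 0.11, 0.22, 0.56, 1.10]
y = [200.0 * xi / (0.3 + xi) for xi in x]
b1, b2 = reciprocal_fit(x, y)
```

The transformation requires strictly positive $x_i$ and $y_i$, which is the restriction noted in the text.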

Discussion
First stage modifications of sums of squares are used here to improve the global testing of parameters in a nonlinear regression model with potential intrinsic curvature effects. A one degree of freedom test for intrinsic curvature effect is also developed and the sums of squares adjustments presented in an easily interpreted ANOVA format. The local nature of intrinsic curvature is emphasized with a focus on the specific linear approximation and related observed direction of departure. The linear approximation here is taken at $\beta = \beta_0$, not at the maximum likelihood value, giving the analysis a more focused local aspect. The use of ANOVA and focus on observed direction of departure provide clearly defined cutoffs for significance testing of the intrinsic curvature effect and corrected sums of squares.
Each individual model-data combination and selected null hypothesis may reflect intrinsic curvature effects which can correct and yield global test significance, even when they themselves are not formally significant. It is also the case that significant intrinsic curvature effect may only moderately correct the global test, implying that the intrinsic curvature effect need not be useful even when it exists. This was not considered in earlier work on intrinsic curvature corrections in nonlinear models, which did not consider the local frame of reference provided by model-data combination and null hypothesis. Intrinsic curvature on its own does not tell the whole story and computing these across many examples is not particularly useful. Each model-data and null hypothesis combination must be analyzed on its own merits. Note that the approach given here can also be used as a diagnostic defining regions of intrinsic curvature on a regression surface in relation to the observed response.
The original data is shown in Diagram 1.

Diagram 1. Plot of Data and Fitted Nonlinear Regression Line

The log-likelihood is plotted in Diagram 2.

Diagram 2. Log-Likelihood for BOD Model and Data

As discussed in Bates and Watts (1988), p. 203, the non-standard behaviour of this model can be observed by considering the model as $\beta_2 \to \infty$. In this case the model converges to $\beta_1$, which is estimated by $\bar{y}$. It follows that likelihood regions are open at confidence levels above 94% in the $\beta_2$ direction. The first order derivatives are given here by an $n$ by 2 matrix with $i$th row $[(1 - e^{-\beta_2 x_i}),\; \beta_1 x_i e^{-\beta_2 x_i}]$.

Table 2. Basic ANOVA Decomposition Using Linear Approximation

Table 3. ANOVA with Local Second Order Adjusted Sum of Squares

Table 5. ANOVA Tables for Specified Global Hypotheses with BOD Model and Data

Table 7. ANOVA Table for Specified Global Hypothesis