### The Hannan-Quinn Proposition for Linear Regression

#### Abstract

We consider the variable selection problem in linear regression. Suppose that we have a set of random variables $X_1,\cdots,X_m,Y,\epsilon$ ($m\geq 1$) such that $Y=\sum_{k\in \pi}\alpha_kX_k+\epsilon$ with $\pi\subseteq \{1,\cdots,m\}$ and reals $\{\alpha_k\}_{k=1}^m$, assuming that $\epsilon$ is independent of any linear combination of $X_1,\cdots,X_m$. Given $n$ examples $\{(x_{i,1}\cdots,x_{i,m},y_i)\}_{i=1}^n$ actually independently emitted from $(X_1,\cdots,X_m, Y)$, we wish to estimate the true $\pi$ based on information criteria in the form of $H+(k/2)d_n$, where $H$ is the likelihood with respect to $\pi$ multiplied by $-1$, and $\{d_n\}$ is a positive real sequence. If $d_n$ is too small, we cannot obtain consistency because of overestimation. For autoregression, Hannan-Quinn proved that the rate $d_n=2\log\log n$ is the minimum satisfying strong consistency. This paper solves the statement affirmative for linear regression. Thus far, there was no proof for the proposition while $d_n=c\log\log n$ for some $c>0$ was shown to be sufficient.

#### Full Text:

PDFDOI: http://dx.doi.org/10.5539/ijsp.v1n2p179

International Journal of Statistics and Probability ISSN 1927-7032(Print) ISSN 1927-7040(Online)

Copyright © Canadian Center of Science and Education

To make sure that you can receive messages from us, please add the 'ccsenet.org' domain to your e-mail 'safe list'. If you do not receive e-mail in your 'inbox', check your 'bulk mail' or 'junk mail' folders.

----------------------------------------------------------------------------------------------------------------------------------------------------------------------