The Hannan-Quinn Proposition for Linear Regression

  •  Joe Suzuki    


We consider the variable selection problem in linear regression. Suppose that we have a set of random variables $X_1,\cdots,X_m,Y,\epsilon$ ($m\geq 1$) such that $Y=\sum_{k\in \pi}\alpha_kX_k+\epsilon$ with $\pi\subseteq \{1,\cdots,m\}$ and reals $\{\alpha_k\}_{k=1}^m$, assuming that $\epsilon$ is independent of any linear combination of $X_1,\cdots,X_m$. Given $n$ examples $\{(x_{i,1},\cdots,x_{i,m},y_i)\}_{i=1}^n$ independently emitted from $(X_1,\cdots,X_m,Y)$, we wish to estimate the true $\pi$ based on information criteria of the form $H+(k/2)d_n$, where $H$ is the maximum log-likelihood with respect to $\pi$ multiplied by $-1$, $k$ is the number of parameters in $\pi$, and $\{d_n\}$ is a positive real sequence. If $d_n$ is too small, consistency fails because of overestimation. For autoregression, Hannan and Quinn proved that $d_n=2\log\log n$ is the minimum rate that guarantees strong consistency. This paper proves the analogous statement affirmatively for linear regression. Thus far, no proof of the proposition existed, although $d_n=c\log\log n$ for some $c>0$ was known to be sufficient.
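The criterion described above can be sketched in code. The following is a minimal illustration, not the paper's method: it assumes Gaussian noise, so the negated maximum log-likelihood $H$ reduces (up to additive constants) to $(n/2)\log(\mathrm{RSS}/n)$, and the names `hq_criterion` and `select_subset` are hypothetical.

```python
# Sketch of Hannan-Quinn-type subset selection in linear regression.
# Assumptions (not from the paper): Gaussian noise, no intercept term,
# exhaustive search over all 2^m subsets.
import itertools
import math

import numpy as np


def hq_criterion(X, y, subset, d_n=None):
    """H + (k/2) d_n for the sub-model indexed by `subset`, k = |subset|."""
    n = len(y)
    if d_n is None:
        d_n = 2.0 * math.log(math.log(n))  # the Hannan-Quinn rate
    if subset:
        Xs = X[:, list(subset)]
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss = float(np.sum((y - Xs @ beta) ** 2))
    else:
        rss = float(np.sum(y ** 2))
    # Negated maximum Gaussian log-likelihood, additive constants dropped.
    H = 0.5 * n * math.log(rss / n)
    return H + 0.5 * len(subset) * d_n


def select_subset(X, y):
    """Minimize the criterion over all subsets of {0, ..., m-1}."""
    m = X.shape[1]
    candidates = itertools.chain.from_iterable(
        itertools.combinations(range(m), k) for k in range(m + 1)
    )
    return min(candidates, key=lambda s: hq_criterion(X, y, s))


# Toy data: true pi = {0, 2}, i.e. y depends only on X_0 and X_2.
rng = np.random.default_rng(0)
n, m = 2000, 4
X = rng.standard_normal((n, m))
y = 1.5 * X[:, 0] - 2.0 * X[:, 2] + rng.standard_normal(n)
chosen = select_subset(X, y)
```

Exhaustive search is exponential in $m$ and only feasible for small problems; the point here is the penalty term $(k/2)\,2\log\log n$, not the search strategy.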

This work is licensed under a Creative Commons Attribution 4.0 License.
  • ISSN(Print): 1927-7032
  • ISSN(Online): 1927-7040
  • Started: 2012
  • Frequency: bimonthly
