
Consider the model [1]

$$\mathbf{y}_n = X_n \boldsymbol{\beta}_n + \boldsymbol{\epsilon}_n$$
$$\beta_i \mid \sigma^2, v_i \sim \mathcal{N}(0, \sigma^2 v_i), \quad i = 1, \ldots, p$$
$$v_i \sim \beta^\prime(a, b)$$
$$\sigma^2 \sim \mathcal{IG}(c, d)$$

where $\boldsymbol{\beta}_n = (\beta_1, \ldots, \beta_p)^\top$ is the coefficient vector, $\beta^\prime$ is the beta prime distribution, and $\mathcal{IG}$ is the inverse gamma distribution.
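
For concreteness, here is a minimal sketch of forward-sampling this hierarchy for a small fixed $p$; the hyperparameters $a, b, c, d$ and the design matrix below are illustrative placeholders, not values from [1]:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p = 200, 3                      # small fixed p
a, b, c, d = 0.5, 0.5, 2.0, 2.0    # hypothetical hyperparameters, not from [1]

X = rng.normal(size=(n, p))        # fixed n x p design matrix
sigma2 = stats.invgamma(c, scale=d).rvs(random_state=rng)  # sigma^2 ~ IG(c, d)
v = stats.betaprime(a, b).rvs(size=p, random_state=rng)    # v_i ~ beta'(a, b)
beta = rng.normal(0.0, np.sqrt(sigma2 * v))   # beta_i | sigma^2, v_i ~ N(0, sigma^2 v_i)
eps = rng.normal(0.0, np.sqrt(sigma2), size=n)  # Gaussian noise, as in [1]
y = X @ beta + eps                              # y_n = X_n beta_n + eps_n
```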

In [1], the authors prove posterior consistency in the high-dimensional regime where $p\rightarrow \infty$ as $n\rightarrow \infty$. Is there a way to show posterior consistency for fixed/small $p$ as $n\rightarrow \infty$?

[1] Bai, R. and Ghosh, M., 2021. On the beta prime prior for scale parameters in high-dimensional Bayesian regression models. Statistica Sinica, 31(2), pp. 843-865.

  • @SextusEmpiricus, in https://arxiv.org/pdf/1807.06539.pdf the authors set $f$ to be a beta prime prior and prove posterior consistency in high dimensions. My question: is there a similar proof for fixed/small $p$ with the same prior and/or different priors? – MrDi Mar 06 '23 at 21:54
  • @SextusEmpiricus, I think it is very familiar to anyone following the latest research on normal scale-mixture models and posterior consistency. – MrDi Mar 06 '23 at 22:11
  • Rather than answering in the comments, please [edit] your question to include all of this information in the question, so the question is self-contained and people don't have to read the comments, then flag the comments as no longer needed. Ideally, questions should be self-contained so we don't need to find and read some other paper to understand the question. I still don't understand what $v_i \sim f(v_i)$ means or why this is not a circular definition of $v_i$. – D.W. Mar 07 '23 at 00:48
  • Unless the $\beta_i$ values are your data (in which case posterior consistency is asking about $p_n \rightarrow \infty$ by definition) then you appear to be missing a layer of your model. Please amend your question accordingly. – Ben Mar 07 '23 at 01:07
  • @Ben, I have edited my question. – MrDi Mar 08 '23 at 06:59
  • @D.W., $f$ is the beta prime distribution. – MrDi Mar 08 '23 at 07:00
  • @SextusEmpiricus $\beta$ are the coefficients. – MrDi Mar 08 '23 at 07:02
  • The use of $\beta_n$ and $\beta_i$ in the first two equations is still confusing. One needs to read the article to understand that $\beta_n$ is a vector and not just the same as $\beta_i$ with a different subscript. – Sextus Empiricus Mar 08 '23 at 09:02

1 Answer


Based on your reference I believe that you are estimating the vector $\boldsymbol{\beta}$ of size $p_n$ with a posterior distribution based on the observation of the vector $\mathbf{Y}$ of size $n$ in the model $$\mathbf{Y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}$$ where $\mathbf{X}$ is a fixed $n \times p_n$ regression matrix. The components $\beta_i$ of the vector $\boldsymbol{\beta}$ have the prior distribution that you described.


If you have a fixed $p$ then consistency seems guaranteed by a Bayesian law of large numbers, i.e. the consistency of the likelihood/posterior for an i.i.d. sample (When do posteriors converge to a point mass?). Or maybe I am missing something?

Say we use some fixed $p$ while letting $n$ increase; then the Bayesian posterior density $f(\boldsymbol{\beta}\mid\mathbf{Y})$ will concentrate near the true $\boldsymbol{\beta}$ (provided the prior density is nonzero there).
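
As a rough numerical illustration of this (not a proof): conditional on $v$ and $\sigma^2$ the prior on $\boldsymbol{\beta}$ is Gaussian, so the posterior is Gaussian in closed form, and for fixed $p$ its spread shrinks as $n$ grows. The values of $v$, $\sigma^2$, and $\boldsymbol{\beta}$ below are hypothetical, and the $\beta^\prime$ and $\mathcal{IG}$ layers are conditioned away purely to keep the example short:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 3
beta_true = np.array([1.0, -2.0, 0.5])  # hypothetical true coefficients
sigma2 = 1.0                            # conditioning on sigma^2 ...
v = np.ones(p)                          # ... and on the local scales v_i

for n in (50, 500, 5000):
    X = rng.normal(size=(n, p))
    y = X @ beta_true + rng.normal(0.0, np.sqrt(sigma2), size=n)
    # beta | y, v, sigma^2 is Gaussian when the prior is N(0, sigma^2 diag(v)):
    prec = (X.T @ X + np.diag(1.0 / v)) / sigma2  # posterior precision
    cov = np.linalg.inv(prec)
    mean = cov @ (X.T @ y) / sigma2               # posterior mean
    print(n, mean.round(3), np.sqrt(np.diag(cov)).round(4))
```

The posterior standard deviations shrink at roughly the $1/\sqrt{n}$ rate, which is the concentration that the law-of-large-numbers argument formalizes.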

The same is true when $p$ is not fixed but has a finite upper bound. We can consider all values of $p$ below the bound together while letting $n \to \infty$: if the posterior is consistent for each of them individually, then we also have consistency when $p$ changes with $n$ while staying below the bound.

  • I don't think this answers the OP's question. – Isaac Mar 08 '23 at 07:15
  • @Isaac I answered the question "Is there a way to show posterior consistency for fixed/small $p$" by explaining a way that it can be shown. Could you tell me what is wrong with it? – Sextus Empiricus Mar 08 '23 at 07:40
  • @SextusEmpiricus, could you elaborate more about this? How will the theorem in [1] change with fixed/small $p$? – MrDi Mar 08 '23 at 09:09
  • @MrDi for a fixed $p$ one can use the law of large numbers. When $p$ is not fixed that approach becomes difficult, because for a larger $p$ the 'error' in the posterior may become larger. Consistency is then not obtained if the increase in error due to the increase in $p$ is faster than the decrease in error due to the increase in $n$. – Sextus Empiricus Mar 08 '23 at 09:20
  • @SextusEmpiricus, could you please reference some source where the law of large numbers is used to prove posterior consistency for the normal scale-mixture model shown in the question? – MrDi Mar 08 '23 at 09:45
  • @MrDi I believe that the more formal approach uses martingales and was applied by Doob in 1949 in "Applications of the theory of martingales", but I cannot find that source. You can also look at Lorraine Schwartz's "On Bayes procedures" or "On consistency of Bayes procedures". The use of the law of large numbers I came up with (with the help of a comment) 2 years ago; I do not remember anymore how I did that. Yesterday I came across a similar approach in some sources that I was scanning, but I lost it... – Sextus Empiricus Mar 08 '23 at 10:02
  • ... before using the law of large numbers in that linked question, I applied something that resembles a central limit theorem and that relates to the Bernstein-von Mises theorem. – Sextus Empiricus Mar 08 '23 at 10:04
  • Thank you for your answer. I just have two more subquestions so I can fully understand the full picture. When authors talk about "high dimensionality", do they mean $p$ is large, or $p$ is large with respect to $n$ (i.e. if $p$ is large but $n>p$, is that a high-dimensional regime)? Secondly, could you please elaborate more on why "for a larger $p$ the 'error' in the posterior may become larger"? – user3879021 Mar 10 '23 at 08:25
  • @user3879021 Analogous example: let $x_{mn} = \frac{m^2}{n}$. If $m$ is fixed, then $\lim_{n\to \infty} x_{mn} = 0$. Also, if $m$ is not fixed but a function of $n$ bounded by a value $m_c$, then $0 = \lim_{n\to \infty} x_{0n} \leq \lim_{n\to \infty} x_{mn} \leq \lim_{n\to \infty} x_{m_c n} = 0$. However, if $m \to \infty$ then the limit might not exist. – Sextus Empiricus Mar 10 '23 at 08:54
  • @SextusEmpiricus, OK, thank you. What about the first question? – user3879021 Mar 10 '23 at 08:57
  • @user3879021 I am not sure whether 'high dimensionality' has a clear definition like $p > n$. But in this question I assumed that $p$ has a finite bound (because it is contrasted with $p \to \infty$). In the setting of OLS/ridge/lasso regression, people do speak about high dimensionality if $p > n$, but I am not sure whether the same language is used with these Bayesian problems. In the linked article from the question they speak about $p \gg n$ as 'the high-dimensional setting'. – Sextus Empiricus Mar 10 '23 at 09:04