Instructions

Please make sure your final output file is a pdf document. You can submit handwritten solutions for non-programming exercises or type them using R Markdown, LaTeX or any other word processor. All programming exercises MUST be done in R, typed up clearly and with all code attached. Submissions should be made on gradescope: go to Assignments \(\rightarrow\) Homework 8.

Questions

  1. Biased coin problem. Suppose we have univariate data \(y_1, \ldots, y_n | \theta \sim \text{Bernoulli}(\theta)\) and wish to test: \(\mathcal{H}_0: \theta = 0.5 \ \ \text{vs. } \mathcal{H}_1: \theta \neq 0.5\).
    • Part (a): Formulate this hypothesis testing problem in a Bayesian way. Specify all the necessary steps and come up with your own priors where necessary.
    • Part (b): Derive and simplify the marginal likelihoods \(\mathcal{L}[Y | \mathcal{H}_0]\) and \(\mathcal{L}[Y | \mathcal{H}_1]\).
      For each marginal likelihood, you’ll need to start with the joint posterior between hypothesis and model parameter, and integrate out the parameter. For example, \(\mathcal{L}[Y | \mathcal{H}_1] = \int_{\Theta} p(Y, \theta | \mathcal{H}_1) \text{d}\theta = \int_{\Theta} \mathcal{L}[Y | \mathcal{H}_1, \theta] \cdot \pi(\theta | \mathcal{H}_1) \text{d}\theta\), where \(\mathcal{L}[Y | \mathcal{H}_1, \theta]\) is the likelihood for the data under the alternative hypothesis and \(\pi(\theta | \mathcal{H}_1)\) is your prior under the alternative hypothesis. Keep all normalizing factors when integrating.
    • Part (c): Derive the Bayes factor in favor of \(\mathcal{H}_1\). Also, derive the posterior probability of \(\mathcal{H}_1\) being true. Simplify both as much as possible.
    • Part (d): Study the asymptotic behavior of the Bayes factor in favor of \(\mathcal{H}_1\). For \(\theta \in \{0.25, 0.46, 0.5, 0.54\}\), make a plot of the Bayes factor (y-axis) against sample size (x-axis). You should have four plots. Comment (in detail) on the implications of the true value of \(\theta\) on the behavior of the Bayes factor in favor of \(\mathcal{H}_1\), as a function of sample size. When answering this question, remind yourself of what Bayes factors actually mean and represent! One line answers will not be sufficient here; explain clearly and in detail what you think the plots mean or represent.
      When you are making the plots, generate the data sequentially. That is, the way the data will be generated as Bernoulli outcomes 0 or 1 for each \(y_i\) in a real experiment. For example, for \(n = 1\), sample \(y_1 = 0\) or \(y_1 = 1\). For \(n = 2\), sample \(y_2 = 0\) or \(y_2 = 1\) and add it to the existing \(y_1\). For \(n = 3\), sample \(y_3 = 0\) or \(y_3 = 1\) and add it to the existing \(y_1\) and \(y_2\). Keep doing that for a sequence of \(n\) values, as \(n\) gets very large, until you see a very discernible pattern in your plots.
  2. Metropolis-Hastings. Consider the following sampling model: \[y_1, \ldots, y_n | \theta_1, \theta_2 \sim p(y | \theta_1, \theta_2),\] with the priors on \(\theta_1\) and \(\theta_2\) set as \(\pi_1(\theta_1)\) and \(\pi_2(\theta_2)\) respectively, where \(\theta_1, \theta_2 \in \mathbb{R}\).

    Suppose we are interested in generating random samples from the posterior distribution \(\pi(\theta_1, \theta_2 | y_1, \ldots, y_n)\). For each of the following proposal distributions, write down the acceptance ratio for using Metropolis-Hastings to generate the samples we desire. Make sure to simplify the ratios as much as possible for each proposal!
    • Part (a): Full conditionals: \[ \begin{split} g_{\theta_1}[\theta_1^\star | \theta_1^{(s)}, \theta_2^{(s)}] & = p(\theta_1^\star | y_1, \ldots, y_n, \theta_2^{(s)});\\ g_{\theta_2}[\theta_2^\star | \theta_1^{(s)}, \theta_2^{(s)}] & = p(\theta_2^\star | y_1, \ldots, y_n, \theta_1^{(s)}). \end{split} \]
    • Part (b): Priors: \[ \begin{split} g_{\theta_1}[\theta_1^\star | \theta_1^{(s)}, \theta_2^{(s)}] & = \pi_1(\theta_1^\star);\\ g_{\theta_2}[\theta_2^\star | \theta_1^{(s)}, \theta_2^{(s)}] & = \pi_2(\theta_2^\star). \end{split} \]
    • Part (c): Random walk: \[ \begin{split} g_{\theta_1}[\theta_1^\star | \theta_1^{(s)}, \theta_2^{(s)}] & = \mathcal{N}(\theta_1^{(s)}, \delta^2);\\ g_{\theta_2}[\theta_2^\star | \theta_1^{(s)}, \theta_2^{(s)}] & = \mathcal{N}(\theta_2^{(s)}, \delta^2). \end{split} \]

    In each case, you only need to spend time working through the acceptance ratio for one of the two parameters. The other one should become obvious once you’ve completed the first.

Grading

20 points.