Reducing Uncertainty with Bayesian Regression

Regression

\[ Y = \boldsymbol{X}\boldsymbol{\beta} + \boldsymbol{\epsilon} \]

\[ \begin{bmatrix} Y_1 \\ Y_2 \\ Y_3 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & X_{11} & X_{12} & \dots & X_{1p} \\ 1 & X_{21} & X_{22} & \dots & X_{2p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & X_{n1} & X_{n2} & \dots & X_{np} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix} + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \vdots \\ \epsilon_n \end{bmatrix} \]

Beta

\[ \boldsymbol{\hat{\beta}} = (\boldsymbol{X}^T\boldsymbol{X})^{-1}\boldsymbol{X}^T\boldsymbol{Y} \]
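
As a rough illustration of the estimator above, the following base R sketch computes the coefficients from the normal equations and compares them with `lm()`. The simulated data, sample size, and true coefficients are assumptions made up for this example, not part of the talk.

```r
# Simulated data; sample size, coefficients, and noise level are arbitrary.
set.seed(42)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n, sd = 0.3)

# Design matrix with a leading column of ones for the intercept.
X <- cbind(Intercept = 1, x1 = x1, x2 = x2)

# Solve (X'X) beta = X'y rather than forming the inverse explicitly.
beta_hat <- drop(solve(crossprod(X), crossprod(X, y)))

# lm() fits the same model (via a QR decomposition), so the estimates should agree.
fit <- lm(y ~ x1 + x2)
all.equal(unname(beta_hat), unname(coef(fit)))
```

Solving the linear system with `solve(A, b)` is generally preferred over computing the inverse and multiplying, since it is cheaper and numerically more stable.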

Curse of Dimensionality
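
One way to see the problem for the least squares formula, sketched here under the assumption that the concern is having more predictors than observations: the design matrix then cannot have full column rank, so \(\boldsymbol{X}^T\boldsymbol{X}\) is singular and has no inverse. The dimensions below are arbitrary.

```r
# Arbitrary dimensions chosen so that predictors outnumber observations.
set.seed(1)
n <- 5    # observations
p <- 10   # predictors

# Design matrix with an intercept column: n rows, p + 1 columns.
X <- cbind(1, matrix(rnorm(n * p), nrow = n, ncol = p))

# The rank of X (and therefore of X'X) cannot exceed the number of rows.
rank_x <- qr(X)$rank
c(rank = rank_x, columns = ncol(X))
```

Because the rank is smaller than the number of columns, the system \(\boldsymbol{X}^T\boldsymbol{X}\boldsymbol{\beta} = \boldsymbol{X}^T\boldsymbol{Y}\) has no unique solution.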

Solutions

Weakly Informative Priors

Penalized Regression

Weakly Informative Priors

Voting Data

Bayes' Theorem

\[ \text{P}(A|B) = \frac{\text{P}(B|A)\text{P}(A)}{\text{P}(B)} \]

Bayes' Theorem

\[ \text{P}(AB) = \text{P}(A)\text{P}(B|A) = \text{P}(B)\text{P}(A|B) \]

\[ \text{P}(A|B) = \frac{\text{P}(B|A)\text{P}(A)}{\text{P}(B)} \]
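
A small numeric check of the identity above, in base R. The three input probabilities are hypothetical values chosen only for illustration.

```r
# Hypothetical inputs: P(A), P(B | A), and P(B | not A).
p_a <- 0.01
p_b_given_a <- 0.95
p_b_given_not_a <- 0.05

# Law of total probability gives the denominator P(B).
p_b <- p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' theorem: P(A | B) = P(B | A) P(A) / P(B).
p_a_given_b <- p_b_given_a * p_a / p_b
p_a_given_b

# Both factorizations of the joint probability P(AB) agree.
all.equal(p_a * p_b_given_a, p_b * p_a_given_b)
```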

Bayes' Theorem

\[ \pi(\theta | x) = \frac{f(x|\theta)\pi(\theta)}{m(x)} \]

Bayes' Theorem

\[ \pi(\theta|x) = \frac{f(x|\theta)\pi(\theta)}{\int_\theta f(x|\theta)\pi(\theta)d\theta} \]

Bayes' Theorem

\[ \color{red}{\pi(\theta|x)} = \frac{\color{blue}{f(x|\theta)}\color{green}{\pi(\theta)}}{\color{gray}{\int_\theta f(x|\theta)\pi(\theta)d\theta}} \]

Bayes' Theorem

\[ \color{red}{\text{Posterior}} = \frac{\color{blue}{\text{Likelihood}} \times \color{green}{\text{Prior}}}{\color{gray}{\text{Normalizing Constant}}} \]

Bayes' Theorem

\[ \color{red}{\text{Posterior}} \propto \color{blue}{\text{Likelihood}} \times \color{green}{\text{Prior}} \]

Bayesian Regression

\[ Y \sim{} \text{N}(\boldsymbol{X}\boldsymbol{\beta}, \sigma) \]

\[ \beta \sim{} \text{Cauchy}(l, s) \]

Priors

Posterior

\[ \pi(\theta | x) \propto \frac{1}{\sigma}e^{\frac{-(x-\theta)^2}{2\sigma^2}} \dfrac{s}{\big(s^2 + (\theta - l)^2\big)} \]
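
The expression above is only known up to a constant, but for a single parameter it can be normalized numerically. The base R sketch below does this with `integrate()`; the observation `x`, the standard deviation `sigma`, and the prior location `l` and scale `s` are assumed values picked for illustration.

```r
# Assumed values: one observation, known sigma, Cauchy(l, s) prior.
x <- 1.5
sigma <- 1
l <- 0
s <- 2.5

# Unnormalized posterior: normal likelihood times Cauchy prior.
unnorm_post <- function(theta) {
  dnorm(x, mean = theta, sd = sigma) * dcauchy(theta, location = l, scale = s)
}

# The normalizing constant m(x) is the integral over theta.
m_x <- integrate(unnorm_post, lower = -Inf, upper = Inf)$value
posterior <- function(theta) unnorm_post(theta) / m_x

# The normalized posterior should integrate to (approximately) one.
total <- integrate(posterior, lower = -Inf, upper = Inf)$value

# The posterior mean lies between the prior location and the observation.
post_mean <- integrate(function(theta) theta * posterior(theta),
                       lower = -Inf, upper = Inf)$value
c(total = total, posterior_mean = post_mean)
```

With many coefficients this integral is no longer tractable by direct quadrature, which is where a sampler such as Stan comes in.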

Coefficient Posterior Density

Coefplot

Secret Weapon

Stan

data
{
    int<lower=0> N;
    ...
}

parameters
{
    real alpha_std;
    ...
}

model
{
    alpha ~ normal(0, 10);
    ...
}

Stan GLM

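library(rstanarm)
# Bayesian logistic regression on the 1964 subset of the voting data (ideo).
# iter=200 keeps the run short; the default is 2000 iterations per chain.
stan64 <- stan_glm(Vote ~ Race + Income + Gender + Education, 
                   family=binomial(), data=ideo, subset=Year==1964, iter=200)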
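
The call above uses rstanarm's default priors. As a sketch of how Cauchy priors could be set explicitly, the example below uses the built-in `mtcars` data as a stand-in, because the `ideo` data is not included here. The formula, the prior scales (2.5 and 10), and the seed are assumptions for illustration, not the settings used in the talk.

```r
# Guarded so the script still runs where rstanarm is not installed.
if (requireNamespace("rstanarm", quietly = TRUE)) {
  library(rstanarm)

  # Logistic regression on a stand-in dataset with explicit Cauchy priors:
  # `prior` applies to the coefficients, `prior_intercept` to the intercept.
  stan_mt <- stan_glm(am ~ wt + hp, family = binomial(), data = mtcars,
                      prior = cauchy(0, 2.5),
                      prior_intercept = cauchy(0, 10),
                      iter = 200, seed = 1234, refresh = 0)

  # Show the priors that were actually used and 90% posterior intervals.
  print(prior_summary(stan_mt))
  print(posterior_interval(stan_mt, prob = 0.9))
}
```

With so few iterations Stan may warn about convergence diagnostics; raising `iter` is the usual remedy.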

Priors

Jared P. Lander

Chief Data Scientist of Lander Analytics
Author of R for Everyone
Adjunct Professor at Columbia University
Organizer of New York Open Statistical Programming (The R) Meetup
Website: http://www.jaredlander.com

Further Reading

The Tools