As I said before, I firmly side with Andrew Gelman (see e.g. here) in that model checking is dangerously neglected in Bayesian practice. The philosophical criticism against “rejecting” models (double-using data etc. etc.) is all well, but when using Bayesian methods in practice, I see few sensible alternatives to residual checks (both guessing a model and stick with it, and fitting a maximally complex model are no workable alternatives imo).
One potential reason for the low popularity of residual checks in Bayesian analysis may be that one has to code them by hand. I therefore wanted to point out that the DHARMa package (disclosure, I’m the author), which essentially creates the equivalent of posterior predictive simulations for a large number of (G)LM(M)s fitted with MLE, can also be used for Bayesian analysis (see also my earlier post about the package). When using the package, the only step required for Bayesian residual analysis is creating the posterior predictive simulations. The rest (calculating the Bayesian p-values, plotting, and tests) is taken care of by DHARMa.
I want to demonstrate the approach with a synthetic dataset of Beetle counts across an altitudinal gradient. Apart from the altitudinal preference of the species (in ecology called the niche), the data was created with a random intercept on year and additional zero-inflation (full code of everything I do at the end of the post).
Now, we might start in JAGS (or another Bayesian software for that matter) with a simple Poisson GLM, testing for counts ~ alt + alt^2, thus specifying a likelihood such as this one
for (i in 1:nobs) { lambda[i] <- exp(intercept + alt * altitude[i] + alt2 * altitude[i] * altitude[i]) beetles[i]~dpois(lambda[i]) }
To create posterior predictive simulations, add (and observe) the following chunk to the JAGS model code
for (i in 1:nobs) {
beetlesPred[i]~dpois(lambda[i])
}
The nodes beetlesPred are unconnected, so this will cause JAGS to simulate new observations, based on the current model prediction lambda (i.e. posterior predictive simulations).
We can now convert these simulations into into Bayesian p-values with the createDHARMa function. What this essentially does is to measure where the observed data falls on the distribution of simulated data (see code below, and the DHARMa vignette for more explanations of what that does). The resulting residuals are scaled between 0 and 1, and should be roughly uniformly distributed (the non-asymptotic distribution under H0:(model correct) might not be is not entirely uniform, but in simulations so far, I have not seen a single example where you would go seriously wrong assuming it is). Plotting the calculated residuals, we get:
As explained in the DHARMa vignette , this is how overdispersion looks like. As explained in the Vignette, we should now usually first investigate if there is a model misspecification problem, e.g. by plotting residuals against predictors per group. To speed things up, however, knowing that the issue is both a missing random intercept on year and zero-inflation, I have created a model that corrects for both issues.
So, here’s the residual check with the corrected (true model) – now, things looks fine.
When doing this in practice, I highly recommend to not only rely on the overview plots I used here, but also check
- Residuals against all predictors (use plotResiduals with myDHARMaObject$scaledResiduals)
- All previous plots split up across all grouping factors (e.g. plot, year)
- Spatial / temporal autocorrelation
which is all supported by DHARMa.
There is one additional subtlety, which is the question of how to create the posterior predictive simulations for multi-level models. In the example below, I create the simulations conditional on the fitted random effects and the fitted zero-inflation terms. Most textbook examples of posterior predictive simulations I have seen use this approach. There is nothing wrong with this, but one has to be aware that this doesn’t check the full model, but only the final random level, i.e. the Poisson part. The default for the MLE GLMMs in DHARMa is to re-simulate all random effects. I plan to discuss the difference between the two options in more detail in an extra post, but for the moment, let me say that, as a default, I recommend re-simulating the entire model also in a Bayesian analysis. An example for this extended simulations is here.
The full code for the simple (conditional posterior predictive simulation) example is here
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
############ CREATE ZERO-INFLATED GLMM DATA ################# | |
# This first part creates a dataset with beetles counts across an altitudinal gradient (several plots each observed several years), with a random intercept on year and zero-inflation. | |
altitude = rep(seq(0,1,len = 50), each = 20) | |
dataID = 1:1000 | |
spatialCoordinate = rep(seq(0,30, len = 50), each = 20) | |
# random effects + zeroinflation | |
plot = rep(1:50, each = 20) | |
year = rep(1:20, times = 50) | |
yearRandom = rnorm(20, 0, 1) | |
plotRandom = rnorm(50, 0, 1) | |
overdispersion = rnorm(1000, sd = 0.5) | |
zeroinflation = rbinom(1000,1,0.6) | |
beetles <- rpois(1000, exp( 0 + 12*altitude – 12*altitude^2 | |
# + overdispersion + plotRandom[plot] | |
+ yearRandom[year]) * zeroinflation ) | |
data = data.frame(dataID, beetles, altitude, plot, year, spatialCoordinate) | |
plot(year, altitude, cex = beetles/50, pch =2, main = "Beetle counts across altitudinal gradient\n triangle is proportional to counts") | |
############## Analysis with JAGS ############################ | |
library(R2jags) | |
modelData=as.list(data) | |
modelData = append(data, list(nobs=1000, nplots = 50, nyears = 20)) | |
head(data) | |
# 1) Fit GLM only | |
modelstring=" | |
model { | |
# Likelihood | |
for (i in 1:nobs) { | |
lambda[i] <- exp(intercept + alt * altitude[i] + alt2 * altitude[i] * altitude[i]) | |
beetles[i]~dpois(lambda[i]) | |
} | |
# Fixed effect priors | |
intercept ~ dnorm(0,0.0001) | |
alt ~ dnorm(0,0.0001) | |
alt2 ~ dnorm(0,0.0001) | |
# Posterior predictive simulations | |
for (i in 1:nobs) { | |
beetlesPred[i]~dpois(lambda[i]) | |
} | |
Prediction <- sum(beetlesPred) | |
} | |
" | |
model=jags(model.file = textConnection(modelstring), data=modelData, n.iter=10000, parameters.to.save = c("intercept", "alt", "alt2", "beetlesPred", "lambda"), DIC = F) | |
library(DHARMa) | |
simulations = model$BUGSoutput$sims.list$beetlesPred | |
pred = apply(model$BUGSoutput$sims.list$lambda, 2, median) | |
dim(simulations) | |
sim = createDHARMa(simulatedResponse = t(simulations), observedResponse = data$beetles, fittedPredictedResponse = pred, integerResponse = T) | |
plotSimulatedResiduals(sim) | |
# 2) GLMM with random intercept on year, observation-level RE for overdispersion, and zero-inflation | |
modelstring=" | |
model { | |
# Likelihood | |
for (i in 1:nobs) { | |
lambda[i] <- exp(intercept + alt * altitude[i] + alt2 * altitude[i] * altitude[i] + Ryear[year[i]] + RID[i] ) * Zero[i] + 0.00000001 | |
beetles[i]~dpois(lambda[i]) | |
} | |
# Fixed effect priors | |
intercept ~ dnorm(0,0.0001) | |
alt ~ dnorm(0,0.0001) | |
alt2 ~ dnorm(0,0.0001) | |
# Random effects | |
for (i in 1:nyears) { | |
Ryear[i]~dnorm(0,sigmaYear) | |
} | |
for (i in 1:nobs) { | |
RID[i]~dnorm(0,sigmaID) | |
} | |
# Variance priors | |
sigmaYear~dgamma(1,2) | |
sigmaID~dgamma(0.001,0.001) | |
# Zeroinflation | |
for (i in 1:nobs) { | |
Zero[i]~dbern(zeroMu + altZero * altitude[i]) | |
} | |
zeroMu ~ dunif(0,1) | |
altZero ~ dnorm(0,0.0001) | |
# Posterior predictive simulations | |
for (i in 1:nobs) { | |
beetlesPred[i]~dpois(lambda[i]) | |
} | |
Prediction <- sum(beetlesPred) | |
} | |
" | |
model=jags(model.file = textConnection(modelstring), data=modelData, n.iter=10000, parameters.to.save = c("intercept", "alt", "alt2", "beetlesPred", "Ryear", "sigmaYear", "lambda", "altZero", "zeroMu"), DIC = F) | |
library(DHARMa) | |
simulations = model$BUGSoutput$sims.list$beetlesPred | |
pred = apply(model$BUGSoutput$sims.list$lambda, 2, median) | |
dim(simulations) | |
sim = createDHARMa(simulatedResponse = t(simulations), observedResponse = data$beetles, fittedPredictedResponse = pred, integerResponse = T) | |
plotSimulatedResiduals(sim) | |
Pingback: MCMC chain analysis and convergence diagnostics with coda in R | theoretical ecology
Thanks for this nice post, it is nice to see that DHARMa can also be used for bayesian models. I do have a small question, why do you add a tiny constant (0.00000001) line 75?
LikeLike
It’s a hack because Jags can’t handle a zero in the dpois. Shouldn’t have any consequences for the estimates. Not sure if there’s a more elegant way to code this.
LikeLike