This is a somewhat belated introduction of a package that we published on CRAN at the beginning of the year already, but I hadn’t found the time to blog about this earlier.

In the R environment and beyond, a large number of packages exist that estimate posterior distributions via MCMC sampling, either for specific statistical models (e.g. MCMCglmm, INLA), or via general model specifications languages (such as JAGS, STAN).

Most of these packages are not designed for sampling from an arbitrary target density provided as an R function. For good reasons, as statistical modelers will seldom have the need for such a tool – it’s hard to come up with a likelihood that cannot be specified in one of the existing packages, and if so, the problem is usually so complicated that it requires especially adopted MCMC strategies.

It’s another story, however, for people that work with process-based models (simulation models, differential equation models, …). These models are usually implemented in standalone code that cannot easily be integrated into JAGS or STAN (yes, I know STAN has an ODE solver, but generally speaking … ). What we can do, however, is to call any process model from R, calculate model predictions and then calculate a likelihood based on some distributional assumptions, e.g. as in the following pseudocode.

likelihood = function(param){ predicted = processmodel(param) ll = sum(dnorm(observed, mean = predicted, sd =param[x], log = T)) return(ll) }

This leaves us with a function likelihood(par) or posterior(par) that we want to sample from.

BayesianTools is a toolbox for this problem. It provides general-purpose MCMC and SMC samplers, as well as plot and diagnostic functions for Bayesian statistics, with a particular focus on calibrating complex system models. The samplers implemented in the package are optimized for problematic target functions (strong correlations, multimodal targets) that are more commonly found in process-based than in statistical models. Available samplers include various Metropolis MCMC variants (including adaptive and/or delayed rejection MH), the T-walk, two differential evolution MCMCs, two DREAM MCMCs, and a sequential Monte Carlo (SMC) particle filter.

Here an example how to run an MCMC on a likelihood function in BayesianTools

library(BayesianTools) setup = createBayesianSetup(likelihood, lower = c(-100,-100,0), upper = c(100,100,100)) out = runMCMC(setup, sampler = "DEzs", settings = NULL)

Obviously, you should read the help for details. The package supports most common summaries and diagnostics, including convergence checks, plots, and model selection indices, such as DIC, WAIC, or marginal likelihood (to calculate the Bayes factor).

gelmanDiagnostics(out) summary(out) plot(out, start = 1000) correlationPlot(out, start = 1000) marginalPlot(out, start = 1000) MAP(out) DIC(out) WAIC(out) # requires special definition of the likelihood, see help marginalLikelihood(out)

Below some example outputs obtained from calibrating the VSEM model, a simple ecosystem model that is provided as a test model in the package. The code to produce these plots is available when you type ?VSEM in R.

**References**

Hartig, F, Minunno, F, Paul, S (2017) BayesianTools: General-Purpose MCMC and SMC Samplers and Tools for Bayesian Statistics. R package version 0.1.3 [CRAN] [GitHub]

What is the processmodel function and the observed data?

It’s the processmodel you want to fit, and your observed data that you want to use to calibrate this model. As written in the text, this is pseudocode, it’s not supposed to run, you have to adjust this to your specific problem.