This is a somewhat belated introduction of a package that we published on CRAN at the beginning of the year already, but I hadn’t found the time to blog about this earlier.

In the R environment and beyond, a large number of packages exist that estimate posterior distributions via MCMC sampling, either for specific statistical models (e.g. MCMCglmm, INLA), or via general model specifications languages (such as JAGS, STAN).

Most of these packages are not designed for sampling from an arbitrary target density provided as an R function. For good reasons, as statistical modelers will seldom have the need for such a tool – it’s hard to come up with a likelihood that cannot be specified in one of the existing packages, and if so, the problem is usually so complicated that it requires especially adopted MCMC strategies.

It’s another story, however, for people that work with process-based models (simulation models, differential equation models, …). These models are usually implemented in standalone code that cannot easily be integrated into JAGS or STAN (yes, I know STAN has an ODE solver, but generally speaking … ). What we can do, however, is to call any process model from R, calculate model predictions and then calculate a likelihood based on some distributional assumptions, e.g. as in the following pseudocode.

likelihood = function(param){ predicted = processmodel(param) ll = sum(dnorm(observed, mean = predicted, sd =param[x], log = T)) return(ll) }

This leaves us with a function likelihood(par) or posterior(par) that we want to sample from.

BayesianTools is a toolbox for this problem. It provides general-purpose MCMC and SMC samplers, as well as plot and diagnostic functions for Bayesian statistics, with a particular focus on calibrating complex system models. The samplers implemented in the package are optimized for problematic target functions (strong correlations, multimodal targets) that are more commonly found in process-based than in statistical models. Available samplers include various Metropolis MCMC variants (including adaptive and/or delayed rejection MH), the T-walk, two differential evolution MCMCs, two DREAM MCMCs, and a sequential Monte Carlo (SMC) particle filter.

Here an example how to run an MCMC on a likelihood function in BayesianTools

library(BayesianTools) setup = createBayesianSetup(likelihood, lower = c(-100,-100,0), upper = c(100,100,100)) out = runMCMC(setup, sampler = "DEzs", settings = NULL)

Obviously, you should read the help for details. The package supports most common summaries and diagnostics, including convergence checks, plots, and model selection indices, such as DIC, WAIC, or marginal likelihood (to calculate the Bayes factor).

gelmanDiagnostics(out) summary(out) plot(out, start = 1000) correlationPlot(out, start = 1000) marginalPlot(out, start = 1000) MAP(out) DIC(out) WAIC(out) # requires special definition of the likelihood, see help marginalLikelihood(out)

Below some example outputs obtained from calibrating the VSEM model, a simple ecosystem model that is provided as a test model in the package. The code to produce these plots is available when you type ?VSEM in R.

**References**

Hartig, F, Minunno, F, Paul, S (2017) BayesianTools: General-Purpose MCMC and SMC Samplers and Tools for Bayesian Statistics. R package version 0.1.3 [CRAN] [GitHub]

What is the processmodel function and the observed data?

It’s the processmodel you want to fit, and your observed data that you want to use to calibrate this model. As written in the text, this is pseudocode, it’s not supposed to run, you have to adjust this to your specific problem.

Dear Florian Hartig,

Thank you for your informations and package. I have been using this BayesianTools for estimating Posterior distribution of non-communicable disease model. I have a question “How can I turn these parameter chains into outcome chains”? Do you have an example code in your package if this is possible directly from the chains or if we have to rerun the model with parameters from the chains.

Best regards,

Wiriya

See https://github.com/florianhartig/BayesianTools/issues/70

Is the lower plot using plotTimeSeriesResults()? I had trouble getting this to work because I do not have an observation at every time step, and it didn’t seem to like NA. Does the error function have to return a vector of absolute errors corresponding to the “mean”?

Yes. I had thought NAs should work, but maybe it doesn’t – I have filed this as a question at https://github.com/florianhartig/BayesianTools/issues/102

Thanks for the hint with the error function, it’s really not clear in the help. I updated the help to

@param error function with signature f(mean, par) that generates observations with error (error = stochasticity according to what is assumed in the likelihood) from mean model predictions. Par is a vector from the matrix with the parameter samples (full length). f needs to know which of these parameters are parameters of the error function. See example in \code{\link{VSEM}}