Approximate Bayesian Computation and similar techniques, which compute approximate likelihood values from samples of a stochastic simulation model, have attracted a lot of attention in recent years, owing to their promise of providing a general statistical technique for stochastic processes of any complexity, without the limitation of keeping the likelihood function "tractable" that applies to traditional statistical models. If you are unsure what all this means, I recommend our recent review on statistical inference for stochastic simulation models, which aims to give a pedagogical introduction to this exciting topic.
A colleague recently asked me for a simple example of the Approximate Bayesian Computation MCMC (ABC-MCMC) algorithm that we discussed in our review. For more background on this algorithm, read the excellent paper by Marjoram et al., who first proposed it. Below, I provide a minimal example, similar to my example of a simple Metropolis-Hastings MCMC in R; the only main difference is that the Metropolis-Hastings acceptance has been exchanged for an ABC acceptance.
```r
library(coda)

# assuming the data are 10 samples of a normal distribution
# with mean 5.3 and sd 2.7
data = rnorm(10, mean = 5.3, sd = 2.7)

# we want to use ABC to infer the parameters that were used.
# we sample from the same model and use mean and variance
# as summary statistics. We return true for ABC acceptance when
# the difference to the data is smaller than a certain threshold

meandata <- mean(data)
standarddeviationdata <- sd(data)

ABC_acceptance <- function(par){

  # prior to avoid negative standard deviation
  if (par[2] <= 0) return(F)

  # stochastic model generates a sample for given par
  samples <- rnorm(10, mean = par[1], sd = par[2])

  # comparison with the observed summary statistics
  diffmean <- abs(mean(samples) - meandata)
  diffsd <- abs(sd(samples) - standarddeviationdata)
  if ((diffmean < 0.1) & (diffsd < 0.2)) return(T) else return(F)
}

# we plug this into a standard Metropolis-Hastings MCMC,
# with the Metropolis acceptance exchanged for the ABC acceptance

run_MCMC_ABC <- function(startvalue, iterations){

  chain = array(dim = c(iterations + 1, 2))
  chain[1, ] = startvalue

  for (i in 1:iterations){

    # proposal function
    proposal = rnorm(2, mean = chain[i, ], sd = c(0.7, 0.7))

    if (ABC_acceptance(proposal)){
      chain[i + 1, ] = proposal
    } else {
      chain[i + 1, ] = chain[i, ]
    }
  }
  return(mcmc(chain))
}

posterior <- run_MCMC_ABC(c(4, 2.3), 300000)
plot(posterior)
```
The result should look something like this:
Figure: Trace and marginal plots for the posterior sample. From the marginal plots to the right, you see that we are approximately retrieving the original parameter values, which were 5.3 and 2.7. If you are unsure how to read these plots, look at this older post.
Hi Florian,
First let me say that I am a newbie to ABC approaches. I am considering adapting your code above (ABC-MCMC) to estimate parameters for a network individual-based mathematical model that returns year-specific prevalences. I just have 2 questions: 1) Since I need to estimate year-specific parameters for my model (which runs for 10 years with 3 parameters per year, so 30 in total), is the model sensitive to the number of parameters to be estimated? 2) For the proposal distribution part, I am considering specifying it as: proposal = runif(30, min=chain[i,], max=rep(1,times=30)), where chain consists of i+iterations rows and 30 columns. Is this okay?
Hi Marshal,
first of all, instead of my code, you could also consider the EasyABC package http://cran.r-project.org/web/packages/EasyABC/index.html for R. It's pretty new, and I haven't tried it out, but it implements the same algorithm and apparently produces output in a format that fits the abc package http://cran.r-project.org/web/packages/abc/index.html , which might be an advantage.
Regarding your questions:
* In principle, you can estimate as many parameters as you want in a Bayesian analysis, and ABC is no different in that respect, but remember that to uniquely identify 30 parameters you need at least 30 data points (unless your model imposes additional constraints), so you need at least a 30-dimensional summary statistic. And having 30 dimensions in the summary statistic might or might not lead to computational problems with the ABC. If you have fewer summary statistics, your inference is still valid, but you're bound to get a lot of trade-offs between parameters, which makes the analysis difficult (see e.g. here https://theoreticalecology.wordpress.com/2011/12/09/mcmc-chain-analysis-and-convergence-diagnostics-with-coda-in-r/).
* A uniform proposal function is not a good idea, because it has a limited maximum step length. Choose a function with infinite support, such as a normal distribution. Also, your proposal, as it stands, has the current state as its minimum, so you would only jump towards larger parameter values, which is clearly not desirable.
* Have a look at this paper by Simon Wood http://www.nature.com/nature/journal/v466/n7310/full/nature09319.html, which discusses an approach that might be an alternative to the ABC algorithm above. We also discuss this in our review in EL, see https://theoreticalecology.wordpress.com/2011/06/21/statistical-inference-for-stochastic-simulation-models/ . Can’t say whether this is more suitable without knowing your problem in more detail.
I tried out EasyABC, see this recent post https://theoreticalecology.wordpress.com/2012/12/02/the-easyabc-package-for-approximate-bayesian-computation-in-r/ , bottom line: it seems to work fine. But you can of course also use the code above; it doesn't affect your main questions.
Hi Florian,
Many thanks indeed for your input! I tried out the EasyABC package, but I realised that it only allows one to obtain a sample of accepted parameters, after which one has to use another package, say "abc", for "post-treatment" analysis, in the case of the rejection sampling approach. So it looks like it's not yet possible to obtain the actual posterior estimates with posterior credibility intervals in the case of abc-mcmc or abc-smc…
As regards the number of parameters, I agree that since I do not have sufficient degrees of freedom (30), estimating the 30 parameters would, statistically speaking, lead to an unidentifiable model. However, I guess that one can still estimate the parameters if informative priors are provided for some of them, which is actually my main problem since there's no prior info available for them. So I guess I just have to make do with fewer parameters… As regards the uniform distribution, I used it because the parameters I am estimating are probabilities, which have to be bounded between 0 and 1, which might be a problem as you've rightly suggested…
If your parameter must be between 0 and 1, you should encode this as a prior from 0 to 1, not in the proposal. You are still free to choose the proposal as you like; typically you would want the proposal width to be substantially smaller than the prior interval.
It might well be that for your type of problem it doesn't matter much whether you choose a uniform or a normal proposal, because you will get posterior support across the whole prior interval. So you could also consider a uniform distribution centered around your current value, but in general I would still prefer a normal distribution, as in my example, because it is not "cut off".
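To illustrate this point, here is a minimal sketch in R (the helper name `prior_ok` and the numbers are my own, purely illustrative): the prior restricts the parameter to [0, 1] by rejecting values outside that interval, while the proposal itself remains an unbounded normal whose width is well below the prior interval.

```r
# Sketch: a probability parameter bounded in [0,1] via a uniform prior,
# combined with an unbounded normal proposal. Values outside the support
# are rejected by the prior check, not by truncating the proposal.
prior_ok <- function(par) all(par >= 0 & par <= 1)  # uniform prior on [0,1]

set.seed(1)
current  <- 0.5
proposal <- rnorm(1, mean = current, sd = 0.1)  # sd well below prior width 1

if (prior_ok(proposal)) {
  new_state <- proposal   # would then go on to the ABC acceptance step
} else {
  new_state <- current    # reject: stay at the current value
}
```

The proposal can wander anywhere; the prior check simply sends out-of-range proposals back to the current state, which keeps the chain correct without limiting the step length.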
As far as I can see, EasyABC produces the same output as my algorithm above, so you don't have to use the abc package, although it might be advisable.
About your degrees of freedom: you can just try what you get when fitting all parameters. As I said, the Bayesian inference stays valid. Note, however, that a simple MCMC such as the one implemented above will need considerably more samples to converge if you have this type of correlation in the posterior, so you will have to run more simulations and check convergence carefully.
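A hedged sketch of such a convergence check with coda (the chains below are just illustrative white noise standing in for real MCMC output; with a real ABC-MCMC you would pass in several independently started chains):

```r
library(coda)

# Sketch: convergence diagnostics on two independent chains.
# Here the "chains" are white noise purely for illustration.
set.seed(1)
chain1 <- mcmc(matrix(rnorm(2000), ncol = 2))
chain2 <- mcmc(matrix(rnorm(2000), ncol = 2))
chains <- mcmc.list(chain1, chain2)

gelman.diag(chains)    # potential scale reduction factor; want values near 1
effectiveSize(chains)  # effective sample size after autocorrelation
```

For well-mixed chains the Gelman-Rubin factor should be close to 1; values clearly above that suggest the chains have not yet converged and need more iterations.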
Thanks Florian. I’ll give your suggestions a try!
Hi! (sorry for my English) I found your post while searching for ABC and MCMC methods.
I am currently working on my master's thesis on "data association" for use in target tracking. I'm at the stage of literature review and testing some ideas.
One such idea is to implement the "data association" step (in tracking) using an ABC method rather than a classic MCMC. I've been using MCMC approaches (different algorithms and implementations such as RBMCDA, Rao-Blackwellized Monte Carlo Data Association), but not an implementation with ABC. I am trying to work out the pros and cons of attempting this kind of method for the data association task. Do you have any documentation, thoughts, or relevant information?
Thanks in advance
ABC makes sense when you are pretty sure about what the exact likelihood for a problem is (i.e. you know the data-generating process) but can't calculate it directly. So I would say ABC is a sensible approach in fields where you have good knowledge about the model structure (e.g. genetics, sampling models, …).
If you don't know the data-generating process exactly, it may often be easier to just "invent" some likelihoods / objective functions instead of going the costly way of approximating them by simulating from a process that may or may not be correct. Hence, in this situation, I would rather go for machine-learning techniques, which typically make very few structural assumptions about the model.
I know too little about the problems you are trying to solve to know whether your problem belongs to one or the other category.
In any case, if you want to understand better what ABC does, I recommend reading
Hartig, F.; Calabrese, J. M.; Reineking, B.; Wiegand, T. & Huth, A. (2011) Statistical inference for stochastic simulation models – theory and application Ecol. Lett., 14, 816-827
Beaumont, M. A. (2010) Approximate Bayesian computation in evolution and ecology Annu. Rev. Ecol. Evol. Syst., 41, 379-406
in that order, not because our review is better, but because it has a somewhat wider scope than the review by Marc Beaumont, while the latter gives a more detailed account of ABC techniques.
Thank you for the nice introduction. I have a question about setting priors. In your Metropolis-Hastings example, I can see how priors are incorporated in the method (via posteriors), but in this example (ABC-MCMC), I don't see priors in your code (except for avoiding negative values). It would be very educational if the example contained specific prior specifications (like you did in the Metropolis-Hastings example). I never used EasyABC, but it also asks users to set specific priors, so even if I were to use EasyABC, I would like to know how they are incorporated.
In practice, if I were to code by myself, is it not necessary to use specific probability distributions for priors (e.g., just reject infeasible parameter values like you did in your code)?
I also want to know how to decide the values 0.1 and 0.2 in [(diffmean < 0.1) & (diffsd < 0.2)] and 0.7 in [proposal = rnorm(2,mean = chain[i,], sd= c(0.7,0.7))].
Hi Nori,
good point about the prior. OK, Bayes' formula posterior = prior * likelihood implies that prior and likelihood factor out, so you can write the probability of accepting a proposed new value as (the ABC acceptance for the likelihood) * (the conventional Metropolis ratio of the priors). So what you want to add is an additional step in the MCMC where you accept proportional to the ratio prior(new) / prior(old).
In the original publication (Marjoram, P.; Molitor, J.; Plagnol, V. & Tavare, S. (2003) Markov chain Monte Carlo without likelihoods. Proc. Natl. Acad. Sci. USA, 100, 15324-15328), they suggest doing this after the ABC step, but this is computationally inefficient if the ABC step is costly, so I would always do it before the ABC step.
The values 0.1 and 0.2 were a quick hack from me. The point here is that you need to standardize the summary statistics to a comparable range. In ABC-Rejection, this is often done by dividing by the standard deviation of the summary statistics. For ABC-MCMC, I would recommend setting these values to the standard deviation of the summary statistics at reasonable parameter values.
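A sketch of what such a standardization might look like (the parameter values and number of replicates are illustrative, reusing the model from the post):

```r
# Sketch: estimate the natural scale of each summary statistic by
# simulating many datasets at a plausible parameter value, then use
# the standard deviations of the summaries as acceptance thresholds.
set.seed(1)
n_rep <- 1000
sim_summaries <- t(replicate(n_rep, {
  s <- rnorm(10, mean = 5.3, sd = 2.7)  # same model as in the post
  c(mean(s), sd(s))
}))

eps <- apply(sim_summaries, 2, sd)  # one threshold per summary statistic
eps
```

The resulting `eps` values could then replace the hard-coded 0.1 and 0.2 in `ABC_acceptance`, so that both summary statistics are compared on a comparable scale.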
In the case of MCMC without likelihoods, suppose prior() is a function that takes a vector of parameters and produces a log-density value (e.g., like the one you have in the MH example). Is the following modification correct?
proposal = rnorm(2, mean = chain[i,], sd = c(0.7, 0.7))
c1 <- ABC_acceptance(proposal)
c2 <- exp(prior(proposal) - prior(chain[i,]))
probab <- c1 * c2
And is it true that the acceptance calculation (c2) has to be in raw likelihood, rather than loglikelihood?
If your c1 is either 0 or 1, I guess this would work, but it's inefficient to first do the ABC step and then the prior. Rather, do:
proposal = draw from the proposal function
accept with probability exp(prior(proposal) - prior(current))
if accepted, do the ABC step
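Putting that order of steps into the sampler from the post might look like the sketch below (the `log_prior` function and its normal priors are purely illustrative placeholders, not part of the original example):

```r
# Sketch: ABC-MCMC where the prior ratio is evaluated BEFORE the
# (potentially costly) ABC simulation step. log_prior is illustrative.
log_prior <- function(par){
  if (par[2] <= 0) return(-Inf)  # standard deviation must be positive
  dnorm(par[1], mean = 5, sd = 10, log = TRUE) +
    dnorm(par[2], mean = 3, sd = 10, log = TRUE)
}

run_MCMC_ABC_prior <- function(startvalue, iterations, ABC_acceptance){
  chain <- array(dim = c(iterations + 1, 2))
  chain[1, ] <- startvalue
  for (i in 1:iterations){
    proposal <- rnorm(2, mean = chain[i, ], sd = c(0.7, 0.7))
    # cheap Metropolis step on the prior ratio first; the costly ABC
    # simulation only runs if the prior step accepts (short-circuit &&)
    if (log(runif(1)) < log_prior(proposal) - log_prior(chain[i, ]) &&
        ABC_acceptance(proposal)){
      chain[i + 1, ] <- proposal
    } else {
      chain[i + 1, ] <- chain[i, ]
    }
  }
  chain
}
```

Because `&&` short-circuits in R, a proposal rejected by the prior never triggers a simulation, which is exactly the saving discussed above.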
Thank you very much! You have been very helpful.