*It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience. (From “On the Method of Theoretical Physics,” the Herbert Spencer Lecture, Oxford, June 10, 1933.)*

I would like to extend some recent musings about the probabilistic models that underlie statistical analysis. To give credit where credit is due, a recent post by Amy Hurford and an older one by Jeremy Fox inspired me to dive deeper into this issue.

**Stochasticity emerges from hidden or unobserved processes**

To be on solid ground regarding our definition of stochasticity, let me first establish that, when speaking about stochasticity in statistical models, we usually do not refer to fundamental indeterminism. There is currently no hard evidence for any fundamentally stochastic process at any scale in nature. We obviously cannot rule out the unknown, but as far as we can tell from the means that science provides nowadays, the world unfolds according to a few deterministic laws that govern the interactions between matter and ultimately the future of everything there is (contrary to popular belief, this remains valid even for quantum systems; I make a few remarks on that at the end of the post).

Despite that, we experience stochasticity in practically all systems of higher complexity, particularly also in ecology. I shamelessly cite from one of my own papers:

As ecologists and biologists, we try to find the laws that govern the functioning and the interactions among nature’s living organisms. Nature, however, seldom presents itself to us as a deterministic system. Demographic stochasticity, movement and dispersal, variability of environmental factors, genetic variation and limits on observation accuracy are only some of the reasons. We have therefore learnt to accept stochasticity as an inherent part of ecological and biological systems and, as a discipline, we have acquired an impressive arsenal of statistical inference methods.

I guess most people immediately accept that this type of stochasticity is caused by deterministic, but potentially complex or even chaotic processes that act at, below, or outside the level of description of the respective theory or model. We may not know why a particular seed of a plant ends up in any particular place, but we have little reason to doubt that the causes, be it wind or a bird, were acting with some purpose that is only cloaked from our knowledge. I want to stress that we are not only speaking about some microscopic processes here that are way below the level of ecology: all the work on chaotic systems has impressively demonstrated how easily “practical indeterminism” can emerge in extremely simple settings from processes at the same scale as the phenomenon that we examine. The question of stochasticity is thus one of the level of interest (we may not be interested in some lower-level processes), but also of predictability (it may be that those processes are in our scope, but their outcome is so sensitive to boundary conditions that they are effectively unpredictable). Thus, we have to balance different problems and considerations when we ask:

**Which processes do we need or want to explain explicitly in ecology, and which can we average out and describe as “stochastic”?**

This is the crucial question, or, as we say in German: this is where the dog is buried. I fear that, ultimately, this question is as unsolvable with logical arguments as the discussion about “what makes a good model”, but I do think that we can give a bit more guidance regarding how we find a good description of stochasticity than the proverbial “I know it when I see it”. I’ll give my idea of what is appropriate shortly below, but to get us started, I want to speak about the distinction between determinism and causality. I’ll take the liberty to quote a part of Jeremy’s post here (apologies for taking this out of context), because it nicely summarizes a view that I believe to be very common in ecology, certainly not entirely without reason:

… if we knew enough, much or even all of this apparent randomness could be explained away. But why would we want to explain it away? What would we gain? I’d argue that we’d actually lose a lot. We’d be replacing the generally-applicable concepts […] with a stamp collection of inherently case-specific, and hugely complex, deterministic causal stories.

I should say that I agree with the general sentiment of this post (certainly, it can’t be our goal to develop a “complete” model of the world), but I want to push a slightly different view on the praise of randomness, because there is randomness and there is randomness. More precisely, I think we must make a distinction between the ability of the model to make deterministic predictions (something like R^2) and the ability of the model to give causal explanations for the phenomena that we observe, the latter including the observation of stochasticity. Imagine two population models with equal R^2 (or some generalization of that) and an equal number of parameters. One model simply explains the observed stochasticity by a normal distribution without any causality implied, while the other explains it by a specific mechanism, e.g. by the way individuals reproduce or search for mates. Both are equally “random”, but the randomness in the second one is more causal.
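To make the contrast concrete, here is a minimal sketch of two such models. Everything in it is an assumption for illustration: the function names, the parameter values, and the deliberately crude life history in which each individual leaves either two offspring or none. Both models have the same expected growth, but in the second one the size of the noise is not a free parameter: it follows from the reproduction mechanism and scales with the square root of the population size.

```python
import random
import statistics

def step_phenomenological(n, growth, sigma, rng):
    """Deterministic growth plus Gaussian noise of fixed size sigma."""
    return growth * n + rng.gauss(0.0, sigma)

def step_demographic(n, p_reproduce, rng):
    """Each of n individuals independently leaves 2 offspring with
    probability p_reproduce, otherwise 0 (a hypothetical life history)."""
    return sum(2 for _ in range(n) if rng.random() < p_reproduce)

def relative_spread(step, n0, reps=500):
    """Standard deviation of next year's population, relative to n0."""
    outcomes = [step(n0) for _ in range(reps)]
    return statistics.pstdev(outcomes) / n0

rng = random.Random(1)
p = 0.55      # assumed reproduction probability, so mean growth is 2 * p
sigma = 5.0   # assumed free noise parameter of the phenomenological model

for n0 in (10, 1000):
    demo = relative_spread(lambda n: step_demographic(n, p, rng), n0)
    phen = relative_spread(
        lambda n: step_phenomenological(n, 2 * p, sigma, rng), n0
    )
    print(f"N = {n0:5d}  demographic: {demo:.3f}  phenomenological: {phen:.3f}")
```

In the demographic version the standard deviation is pinned to sqrt(4 p (1 - p) N), so the same parameter p describes both the mean and the scatter at any population size. In the phenomenological version, sigma has to be chosen or estimated separately for each dataset.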

I believe that searching for more mechanistic or causal descriptions of stochastic terms has a number of advantages, including that parameters and models can ideally be reused and compared across systems and datasets. It must be our hope that there are sensible sub-units into which we can deconstruct stochasticity, and I do think that recent developments in statistics (I’ll talk about them soon) encourage this view. In comparison, simple regression models may be more universal in their use, but every application of such a model uses a slightly different, phenomenological description of stochasticity that lumps observation error, process error and model error together. All this ultimately makes it difficult to compare the results of parameter and model selection (including the parameters of the stochastic process or “error” model) between and across datasets, scales, taxa and so on.
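As a toy illustration of what it means to take the bulk error apart, consider a state that follows a random walk (process error) and is observed with independent noise (observation error). This setup, the function names and the numbers are all assumptions of the sketch, and the moment-based estimator is a textbook shortcut rather than what one would use in a real analysis. The variance of the observed year-to-year differences mixes both sources, but their lag-1 autocovariance depends only on the observation error, so the two can be separated.

```python
import random
import statistics

def simulate_observed_series(steps, sd_process, sd_obs, rng):
    """Random-walk state (process error) observed with independent noise."""
    state, observed = 0.0, []
    for _ in range(steps):
        state += rng.gauss(0.0, sd_process)               # process error
        observed.append(state + rng.gauss(0.0, sd_obs))   # observation error
    return observed

def split_variances(observed):
    """Moment estimates of process and observation variance.

    For differences d_t of the observations:
      Var(d_t)          = var_process + 2 * var_obs
      Cov(d_t, d_{t+1}) = -var_obs
    """
    d = [b - a for a, b in zip(observed, observed[1:])]
    mean_d = statistics.fmean(d)
    var_d = statistics.fmean([(x - mean_d) ** 2 for x in d])
    cov_lag1 = statistics.fmean(
        [(x - mean_d) * (y - mean_d) for x, y in zip(d, d[1:])]
    )
    var_obs = -cov_lag1
    var_process = var_d + 2.0 * cov_lag1
    return var_process, var_obs, var_d

rng = random.Random(7)
series = simulate_observed_series(20000, sd_process=1.0, sd_obs=0.5, rng=rng)
var_process, var_obs, var_bulk = split_variances(series)
print(f"bulk variance of differences: {var_bulk:.2f}")
print(f"process variance: {var_process:.2f}, observation variance: {var_obs:.2f}")
```

A single residual variance would only report the bulk number. The split yields two quantities that have a chance of being comparable across datasets, because one belongs to the ecological process and the other to the sampling method.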

**The costs of complexity and methodological developments**

The worry is of course that descriptions of stochasticity that are more grounded in mechanistic thinking will increase model complexity, and increasing complexity comes with all sorts of trouble, including computational costs, overfitting and other inferential issues related to complexity, and a loss of model interpretability. However, I am not actually convinced that a more mechanistic view on the generating stochastic processes in statistical models necessarily increases model complexity. Take, for example, the paper by Simon Wood, in which the error model is generated by the population model itself, namely by the chaotic population dynamics that this deterministic model creates. Statistically, this model has fewer parameters than a comparable “phenomenological” regression where the variance of the error is adjusted to the observed residuals. What is more complex is not the model as such, but the statistical estimation of this model, which requires more sophisticated methods than a simple regression (see our review here).
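To give a flavour of what such simulation-based estimation looks like, here is a heavily simplified sketch. It is not a reproduction of the method in the paper: the noisy Ricker map, the single summary statistic (the mean abundance), the normal approximation and all parameter values are assumptions chosen to keep the example short. The idea is to simulate the model many times for a candidate parameter, summarize each simulation, and score the observed summary against the distribution of simulated summaries.

```python
import math
import random
import statistics

def simulate_ricker(log_r, sigma, steps, rng, n0=1.0):
    """Noisy Ricker map: N[t+1] = N[t] * exp(log_r - N[t] + e[t])."""
    n, series = n0, []
    for _ in range(steps):
        n = n * math.exp(log_r - n + rng.gauss(0.0, sigma))
        series.append(n)
    return series

def synthetic_loglik(observed, log_r, sigma, rng, n_sim=200):
    """Normal log-density of the observed summary statistic, with mean and
    standard deviation estimated from simulations at the candidate log_r."""
    s_obs = statistics.fmean(observed)
    sims = [
        statistics.fmean(simulate_ricker(log_r, sigma, len(observed), rng))
        for _ in range(n_sim)
    ]
    mu = statistics.fmean(sims)
    sd = statistics.stdev(sims)
    z = (s_obs - mu) / sd
    return -0.5 * math.log(2.0 * math.pi * sd ** 2) - 0.5 * z ** 2

rng = random.Random(42)
true_log_r, sigma = 3.8, 0.1   # assumed values; log_r = 3.8 is in the chaotic regime
observed = simulate_ricker(true_log_r, sigma, 200, rng)

for candidate in (2.5, 3.8):
    ll = synthetic_loglik(observed, candidate, sigma, rng)
    print(f"log_r = {candidate}: approximate log-likelihood = {ll:.1f}")
```

The model itself has only two parameters here (log_r and sigma). The extra effort sits entirely in the estimation, which needs hundreds of simulations per candidate value where a regression would need one line.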

Thus, I feel that we should not sing the praise of simplicity too loudly. Quotes such as “as simple as possible” or “all models are wrong” are prevalent, but I personally suspect that, in many cases, those are not so much expressions of a well-pondered inferential philosophy as justifications for simplifications that were technically necessary at their time. With sophisticated methods such as hierarchical Bayesian models (and specialized versions of them such as state-space models or DRMs) coming up, as well as simulation-based likelihood approximations such as ABC or the approach in the Wood paper, we are much less limited than before in creating stochastic model structures that separate and reflect the mechanisms that we know to be acting. As noted in the comments to my previous post, the work of Jim Clark (e.g. this) is a good example of that.

**Conclusion**

In conclusion, it is unquestioned that we have to make simplifications when modeling. However, assuming that there is only so much complexity that we can statistically support with our data, we still have a lot of freedom in how to build and design our statistical models. I hold that the concentration on phenomenological error models is grounded more in historical and computational than in fundamental reasons. In fact, I think much could be learned from accentuating more strongly that the variability in our data often originates from the same ecological processes as the “signal” (e.g. dispersal, population growth, inheritance). Of course, in some cases it doesn’t, and then it might be perfectly OK to average it away with some phenomenological distribution. However, even then, separating different sources of error as done in hierarchical models may provide a lot of interesting information.

———————-

**Some notes on indeterminism in quantum mechanics**

Quantum mechanics (QM) is frequently brought forward in this context with the argument that it includes some fundamentally stochastic elements, thus killing off the Laplacian demon, giving freedom to the way the future unravels and so on. Without diving into this discussion in detail, I believe that this is the physics version of what Jeremy Fox calls a “zombie idea”: it originates from a historical fallacy that was made in the early days of this theory (the Copenhagen interpretation). In retrospect, I believe that one can clearly trace the Copenhagen interpretation back to the understandable human “desire” for a semi-classical world-view at that time. The Copenhagen interpretation of QM was hard enough to accept as it was, but alternatives seemed even crazier. Therefore, it was readily picked up by physicists, philosophers and the general public, and has remained popular until today, including frequent reincarnations in school books, popular science programs and philosophical discussions related to “free will”.

The problems with this interpretation were noted early, but empirical methods to distinguish between different alternative interpretations were lacking at that time. In the last two decades, however, there have been a number of sophisticated experiments that clearly show that no collapse of the wave function (which is the “stochastic mechanism” in the Copenhagen interpretation) takes place. What happens instead is that quantum objects are non-local, and when such objects interact with their environment, they undergo a process called decoherence that is in principle deterministic (at least we have no reason to think it isn’t), but unpredictable because its boundary conditions are fundamentally cloaked from our experience (basically, we would have to know the state of the whole universe). It’ll probably require a bit of reading to understand the physics behind this, but if you want to read more about it, this is a good (and moderately technical) starting point.

Nice post. Path-dependence is one of the reasons why noise or stochasticity is largely dismissed in ecology. Why, for example, are matrix models still popular in population dynamics rather than more process-based, individual-oriented models?

Some justifications for the use of outdated tools are of course post-rationalizations of choices forced by limited computational power in the early days. Of course, if adding stochasticity means resorting to stochastic differential equations (SDEs), then 1) the solvable space is extremely limited and 2) few ecologists have the skills to use SDEs appropriately. That’s one of the reasons why stochastic approaches were dismissed.

Another role was played by the view of ecology as a retrospective rather than a predictive science, which still largely persists today. This has also led some very sophisticated analytical approaches to emerge with few useful predictions (see theoretical genetics).

It is undeniable that at some point the description or inclusion of processes has to stop; it is unclear when. Will detailing the physiology of an individual in a population dynamics model make it more useful? Better able to predict?

I would guess that more mechanistic error models are not necessarily leading to better predictions when the phenomenological alternative provides an equally good description of the residuals (see my argument about the two models with the same R^2, they would possibly predict with the same precision). Yet,

1) Separating / explaining stochasticity increases our ecological understanding

2) It might even reduce model complexity when we understand the process that creates the stochasticity (e.g. a simple dispersal kernel vs. complicated correction for RSA)

3) I think that it also guards us against getting the right results for the wrong reasons, after all, there’s a lot of nonsense that can be done with inappropriate error models.

Maybe I should stress again that I’m not trying to promote going to ever finer details, it’s more about our general attitude towards stochasticity as either a nuisance that is used to calculate a likelihood function or a pattern that deserves and rewards ecological explanation.

Great post. I’m looking forward to hearing about these “recent developments in statistics”. Thanks for the write-up on indeterminism in QM at the end. Great stuff.

Thanks!

About these “developments”: I was hinting towards

1) Hierarchical Bayesian models (e.g. http://www.esajournals.org/doi/abs/10.1890/0012-9658%282003%29084%5B1382:HBMFPT%5D2.0.CO%3B2 or as a recent application http://onlinelibrary.wiley.com/doi/10.1111/j.1466-8238.2011.00663.x/full), which I would actually rate as quite established by now (quite a few textbooks out there), but which still have some ground to gain in practical applications

2) Likelihood-approximations of stochastic simulations. Regarding the latter, the paper by Simon Wood http://www.nature.com/nature/journal/v466/n7310/abs/nature09319.html that I mentioned in the text is an interesting example that would fit well to the population dynamic model example in your post. We have written a review on these methods which is available here http://onlinelibrary.wiley.com/doi/10.1111/j.1461-0248.2011.01640.x/abstract .

Still, you’re right, it would probably make a nice post to discuss this stuff in a bit more detail; the two approaches also share more similarities than it appears at first glance.
