It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience. (From “On the Method of Theoretical Physics,” the Herbert Spencer Lecture, Oxford, June 10, 1933.)
I would like to extend some recent musings about the probabilistic models that underlie statistical analysis. To give credit where credit is due, a recent post by Amy Hurford and an older one by Jeremy Fox inspired me to dive deeper into this issue.
Stochasticity emerges from hidden or unobserved processes
To be on solid ground regarding our definition of stochasticity, let me first establish that, when speaking about stochasticity in statistical models, we usually do not refer to fundamental indeterminism. There is currently no hard evidence for any fundamentally stochastic process at any scale in nature. We obviously cannot rule out the unknown, but as far as we can tell from the means that science provides nowadays, the world unfolds according to a few deterministic laws that govern the interactions between matter and ultimately the future of everything there is (contrary to popular belief, this remains valid even for quantum systems; I make a few remarks on that at the end of the post).
Despite that, we experience stochasticity in practically all systems of higher complexity, and particularly in ecology. I shamelessly cite from one of my own papers:
As ecologists and biologists, we try to find the laws that govern the functioning and the interactions among nature’s living organisms. Nature, however, seldom presents itself to us as a deterministic system. Demographic stochasticity, movement and dispersal, variability of environmental factors, genetic variation and limits on observation accuracy are only some of the reasons. We have therefore learnt to accept stochasticity as an inherent part of ecological and biological systems and, as a discipline, we have acquired an impressive arsenal of statistical inference methods.
I guess most people immediately accept that this type of stochasticity is caused by deterministic, but potentially complex or even chaotic processes that act at, below, or outside the level of description of the respective theory or model. We may not know why a particular seed of a plant ends up in any particular place, but we have little reason to doubt that the causes, be it wind or a bird, were acting with some purpose that is only cloaked from our knowledge. I want to stress that we are not only speaking about microscopic processes here that are way below the level of ecology: all the work on chaotic systems has impressively demonstrated how easily “practical indeterminism” can emerge in extremely simple settings from processes at the same scale as the phenomenon that we examine. The question of stochasticity is thus one of the level of interest (we may not be interested in some lower-level processes), but also of predictability (it may be that those processes are in our scope, but their outcome is so sensitive to boundary conditions that they are effectively unpredictable). Thus, we have to balance different problems and considerations when we ask:
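To make “practical indeterminism” a bit more tangible, here is a minimal sketch using the logistic map, a textbook example of deterministic chaos. The map, the parameter value r = 4 and the starting values are my own illustrative assumptions and not taken from any of the papers mentioned in this post. The rule is fully deterministic, yet two trajectories that start almost identically end up unrelated after a few dozen steps.

```python
def logistic_map(x0, r=4.0, steps=100):
    """Iterate x_{t+1} = r * x_t * (1 - x_t) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two runs whose starting values differ by one part in ten billion.
a = logistic_map(0.2)
b = logistic_map(0.2 + 1e-10)

# Absolute difference between the two trajectories at every time step.
gaps = [abs(x - y) for x, y in zip(a, b)]

for t in (0, 10, 20, 30, 40, 50):
    print(t, gaps[t])
```

Running the same starting value twice reproduces the trajectory exactly, so nothing here is random in the fundamental sense. The unpredictability comes only from the fact that we never know the starting value to infinite precision.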
Which processes do we need or want to explain explicitly in ecology, and which can we average out and describe as “stochastic”?
This is the crucial question, or, as we say in German: this is where the dog is buried. I fear that, ultimately, this question is as unsolvable with logical arguments as the discussion about “what makes a good model”, but I do think that we can give a bit more guidance regarding how we find a good description of stochasticity than the proverbial: “I know it when I see it”. I’ll give my idea of what is appropriate shortly, but to get us started, I want to speak about the distinction between determinism and causality. I’ll take the liberty to quote a part of Jeremy’s post here (apologies for taking this out of context), because it nicely summarizes a view that I believe to be very common in ecology, certainly not entirely without reason:
… if we knew enough, much or even all of this apparent randomness could be explained away. But why would we want to explain it away? What would we gain? I’d argue that we’d actually lose a lot. We’d be replacing the generally-applicable concepts […] with a stamp collection of inherently case-specific, and hugely complex, deterministic causal stories.
I should say that I agree with the general sentiment of this post (certainly, it can’t be our goal to develop a “complete” model of the world), but I want to push a slightly different view on the praise of randomness, because: there is randomness and there is randomness. More precisely, I think we must make a distinction between the ability of the model to make deterministic predictions (something like R^2), and the ability of the model to give causal explanations for the phenomena that we observe, the latter including the observation of stochasticity. Imagine two population models with equal R^2 (or some generalization of that) and an equal number of parameters. One model simply explains the observed stochasticity by a normal distribution without any causality implied, while the other explains it by a specific mechanism, e.g. by the way individuals reproduce or search for mates. Both are equally “random”, but the randomness in the second one is more causal.
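As a rough illustration of the two kinds of randomness, consider the following sketch. Everything in it is a toy assumption of mine: the function names, the reproduction rule (each individual leaves either two offspring or none) and all parameter values are hypothetical and chosen only so that both models have the same expected population size. In the first model the variance is a free parameter of a normal distribution; in the second it follows from the reproduction mechanism itself.

```python
import random
import statistics

def step_phenomenological(n, rng, growth, sigma):
    """Deterministic growth plus a normal 'error' whose sd is a free parameter."""
    return growth * n + rng.gauss(0.0, sigma)

def step_demographic(n, rng, p_reproduce):
    """Each of the n individuals independently leaves 2 offspring or none."""
    return sum(2 for _ in range(n) if rng.random() < p_reproduce)

def variance_of_next_step(step, n, reps, rng, **params):
    """Empirical variance of the next population size over many replicates."""
    return statistics.pvariance([step(n, rng, **params) for _ in range(reps)])

rng = random.Random(42)  # fixed seed so the sketch is reproducible
var_phen, var_demo = {}, {}
for n in (100, 400):
    var_phen[n] = variance_of_next_step(
        step_phenomenological, n, 2000, rng, growth=1.0, sigma=10.0)
    var_demo[n] = variance_of_next_step(
        step_demographic, n, 2000, rng, p_reproduce=0.5)
    print(n, round(var_phen[n]), round(var_demo[n]))
```

With these settings the two models are hard to tell apart at a population size of 100, where both have a variance of roughly 100. The difference shows when we move to another population size: the mechanistic model predicts how the variance changes with the number of individuals, while the normal error term would have to be refitted.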
I believe that searching for more mechanistic or causal descriptions of stochastic terms has a number of advantages, including that parameters and models can ideally be reused and compared across systems and datasets. It must be our hope that there are sensible sub-units into which we can deconstruct stochasticity, and I do think that recent developments in statistics (I’ll talk about them soon) encourage this view. In comparison, simple regression models may be more universal in their use, but every application of such a model uses a slightly different, phenomenological description of stochasticity that lumps together observation error, process error and model error. All this ultimately makes it difficult to compare the results of parameter and model selection (including the parameters of the stochastic process or “error” model) between and across datasets, scales, taxa and so on.
The costs of complexity and methodological developments
The worry is of course that descriptions of stochasticity that are more grounded in mechanistic thinking will increase model complexity, and increasing complexity comes with all sorts of trouble, including computational costs, overfitting and other inferential issues related to complexity, and a loss of model interpretability. However, I am not actually convinced that a more mechanistic view of the generating stochastic processes in statistical models necessarily increases model complexity. Take, for example, the paper by Simon Wood, where he generates the error model from a population model with deterministic chaos, that is, from the chaotic population dynamics that this model creates. Statistically, this model has fewer parameters than a comparable “phenomenological” regression where the variance of the error is adjusted to the observed residuals. What is more complex is not the model as such, but the statistical estimation of this model, which requires more sophisticated methods than a simple regression (see our review here).
Thus, I feel that we should not sing the praise of simplicity too loudly. Quotes such as “as simple as possible” or “all models are wrong” are prevalent, but I personally suspect that, in many cases, they are not so much expressions of a well-pondered inferential philosophy as justifications for simplifications that were technically necessary at their time. With sophisticated approaches such as hierarchical Bayesian models (and specialized versions of them such as state-space models or DRMs) coming up, as well as simulation-based likelihood approximations such as ABC or the method in the Wood paper, we are much less limited than before in creating stochastic model structures that separate and reflect the mechanisms that we know to be acting. As noted in the comments to my previous post, the work of Jim Clark (e.g. this) is a good example of that.
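For readers who have not seen a simulation-based approximation before, here is a minimal sketch of ABC in its simplest form, the rejection sampler. Note that this is plain rejection ABC and not the method used in the Wood paper. The population model, the summary statistic, the flat prior, the tolerance and all function names are hypothetical choices of mine, and the “observed” series is itself simulated with a known parameter so that the result can be checked.

```python
import random
import statistics

def simulate(p_reproduce, n0, steps, rng):
    """Toy stochastic model: each individual leaves 2 offspring with prob p, else none."""
    n, series = n0, [n0]
    for _ in range(steps):
        n = sum(2 for _ in range(n) if rng.random() < p_reproduce)
        series.append(n)
    return series

def summary(series):
    """Summary statistic: mean per-step growth ratio (steps starting at 0 are skipped)."""
    ratios = [b / a for a, b in zip(series, series[1:]) if a > 0]
    return statistics.fmean(ratios) if ratios else 0.0

def abc_rejection(observed, n0, steps, draws, tolerance, rng):
    """Keep the prior draws whose simulated summary lands close to the observed one."""
    target = summary(observed)
    accepted = []
    for _ in range(draws):
        p = rng.uniform(0.0, 1.0)  # flat prior on the reproduction probability
        if abs(summary(simulate(p, n0, steps, rng)) - target) < tolerance:
            accepted.append(p)
    return accepted

rng = random.Random(1)  # fixed seed so the sketch is reproducible
true_p = 0.55           # known only because the "data" are simulated here
observed = simulate(true_p, n0=200, steps=6, rng=rng)
posterior = abc_rejection(observed, n0=200, steps=6, draws=1000,
                          tolerance=0.05, rng=rng)
print(len(posterior), statistics.fmean(posterior))
```

The point of the sketch is that no likelihood function is ever written down. All we need is the ability to simulate from the stochastic mechanism, which is exactly what makes mechanistic descriptions of stochasticity statistically tractable. The price is computation: most simulated parameter values are thrown away.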
In conclusion, it is unquestioned that we have to make simplifications when modeling. However, assuming that there is only so much complexity that we can statistically support with our data, we still have a lot of freedom in how to build and design our statistical models. I hold that the concentration on phenomenological error models is grounded more in historical and computational than in fundamental reasons. In fact, I think much could be learned from accentuating more strongly that the variability in our data often originates from the same ecological processes as the “signal” (e.g. dispersal, population growth, inheritance). Of course, in some cases it doesn’t, and then it might be perfectly OK to average it away with some phenomenological distribution. However, even then, separating different sources of error as done in hierarchical models may provide a lot of interesting information.
Some notes on indeterminism in quantum mechanics
Quantum mechanics (QM) is frequently brought forward in this context with the argument that it includes some fundamentally stochastic elements, thus killing off Laplace’s demon, giving freedom to the way the future unravels and so on. Without diving into this discussion in detail, I believe that this is the physics version of what Jeremy Fox calls a “zombie idea”: it originates from a historical fallacy that was made in the early days of this theory (the Copenhagen interpretation). In retrospect, I believe that one can clearly trace the Copenhagen interpretation back to the understandable human “desire” for a semi-classical world-view at that time. The Copenhagen interpretation of QM was hard enough to accept as it was, but the alternatives seemed even crazier. Therefore, it was readily picked up by physicists, philosophers and the general public, and has remained popular until today, including frequent reincarnations in school books, popular science programs and philosophical discussions related to “free will”.
The problems with this interpretation were noted early, but empirical methods to distinguish between the alternative interpretations were lacking at that time. In the last two decades, however, there have been a number of sophisticated experiments that clearly show that no collapse of the wave function (which is the “stochastic mechanism” in the Copenhagen interpretation) takes place. What happens instead is that quantum objects are non-local, and when such objects interact with their environment, they undergo a process called decoherence that is in principle deterministic (at least we have no reason to think it isn’t), but unpredictable because its boundary conditions are fundamentally cloaked from our experience (basically, we would have to know the state of the whole universe). It’ll probably require a bit of reading to understand the physics behind this, but if you want to read more about it, this is a good (and moderately technical) starting point.