What does the mean mean anyway?

In this post, Senior Editor Phil Stephens discusses a paper he recently handled by Angela Brennan and colleagues, ‘Managing more than the mean: using quantile regression to identify factors related to large elk groups’.

Recently, a colleague and friend left his UK university job and returned to his native Australia. As I gaze out of my window at the inky darkness of the northern afternoon, pictures of him enjoying sun-drenched picnics on the beach with his family cast some light on his decision. From my own selfish point of view, my colleague’s move was a shame, both personally and professionally. He has a way of looking at the ecological world that is rare in my experience. As a trained applied mathematician, he looks on the natural world as a collection of data. Where others see plants and animals, perhaps species and habitats, he sees age, stage and spatial distributions; where others see noise and variation, he sees signals and generating processes. His perspective is refreshing, because so many ecologists fixate on the mean relationships between variables, regarding variance as an inconvenience – something to be explained away, or overcome by increased sample sizes. His effect on me was to make me realise that variance is not an inconvenience; rather, variance is the source of information that tells us about the processes that generated the data.

The ecologist’s obsession with mean relationships is puzzling because ecology is, inter alia, the study of distribution and abundance. Both of these show huge variation, and the limits to that variation are often far more interesting than the means. In my own work, I notice that many papers have addressed the mean relationship between animal abundance and body size, but only a handful have given thought to the scaling of minimum or maximum abundance. This is despite the fact that the limits to abundance can tell us so much about the processes driving community composition and extinction. Perhaps one reason for the relative neglect of extremes in ecology is that they can be hard even to describe. During our early training, most of us rapidly become familiar with standard statistical procedures to explore the mean relationship between variables. By contrast, data at the extremes are inevitably patchy, unreliable and statistically awkward.

With their recent paper on the management of elk (Cervus canadensis) in Wyoming (Fig. 1), Brennan, Cross and Creel provide an excellent example of how useful insights can be gained by looking at the more extreme parts of statistical distributions. Their paper is particularly useful in two ways. First, it illustrates the application of a neglected statistical technique to explore the outer reaches of relationships that exhibit substantial variation. Second, it shows that applying this technique to data is not merely an exercise in statistical exploration; it can provide new insights to guide applied management.


Fig. 1. Brennan, Cross and Creel investigated the factors that determine the size of the larger elk groups wintering in their study area in western Wyoming, USA [Photo Credit: USFWS / Ann Hough, National Elk Refuge volunteer (https://www.flickr.com/photos/usfwsmtnprairie/12395370205/in/album-72157627801102827/); released under the Creative Commons Attribution 2.0 license]

Brennan and colleagues were particularly interested in the factors that determine group sizes of elk. A common way to look at this question would be to collect data on observed group sizes, and to use a standard regression framework to relate those observed group sizes to aspects of the environment in which they were found. However, whilst a standard regression approach would tell us about the factors influencing mean group sizes, from a management perspective, the determinants of mean group sizes are rather less important than the determinants of the larger group sizes. This is because, whilst the majority of groups might be small, the majority of animals are found in a small number of very large groups. Moreover, it is the large groups that cause problems for managers, attracting predators, serving as foci for the rapid spread of disease and, potentially, causing damage to private property. Thus, a more interesting question from a management perspective is: what determines the size of the larger elk groups?

To answer this question, Brennan and colleagues used quantile regression. Quantile regression allows the researcher to quantify the relationship between a selected quantile of the data and one or more predictors. If the chosen quantile is 0.5, this amounts to a regression of the median. However, higher quantiles (such as 0.9, or 0.95) can be chosen to see what influences the data towards the upper end of its distribution. As Brennan and colleagues show, a factor that has a small or even non-significant impact on median group sizes can have a very strong impact on the sizes of the larger groups. A striking example was the impact of elk density itself, which had no discernible effect on median elk group sizes, but a strong positive effect on the sizes of large elk groups.
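For readers who would like to see the idea in action, here is a minimal sketch in Python. It is not the authors' analysis: the data are entirely synthetic, the names (`density`, `group_size`, `fit_quantile_line`, `fit_mean_line`) are hypothetical, and the brute-force fitting routine is a stand-in for the dedicated quantile regression software one would use in practice. It relies on the fact that, with a single predictor, there is always an optimal quantile regression line passing through at least two data points, so for a small data set we can simply try every pair.

```python
from itertools import combinations


def pinball_loss(residual, tau):
    """Asymmetric absolute loss that quantile regression minimises."""
    return residual * tau if residual >= 0 else residual * (tau - 1)


def fit_quantile_line(xs, ys, tau):
    """Brute-force quantile regression with one predictor.

    Tries the line through every pair of points and keeps the one with the
    lowest total pinball loss. Exact, but only practical for small data sets.
    Returns (intercept, slope).
    """
    points = list(zip(xs, ys))
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # a vertical line is not a valid regression fit
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(pinball_loss(y - (intercept + slope * x), tau)
                   for x, y in points)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


def fit_mean_line(xs, ys):
    """Ordinary least squares line, for comparison. Returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic data (an assumption for illustration, not real elk counts):
# at every density level most groups stay small, while a few groups
# grow in proportion to density.
multipliers = [0] * 7 + [1, 2, 3, 5]
density, group_size = [], []
for d in range(1, 9):
    for m in multipliers:
        density.append(d)
        group_size.append(5 + m * d)

_, mean_slope = fit_mean_line(density, group_size)
_, median_slope = fit_quantile_line(density, group_size, tau=0.5)
_, upper_slope = fit_quantile_line(density, group_size, tau=0.9)

print("mean slope:        ", round(mean_slope, 3))
print("median slope:      ", round(median_slope, 3))
print("0.9 quantile slope:", round(upper_slope, 3))
```

On these made-up data the median line is flat, the 0.9 quantile line climbs steeply with density, and the least-squares line sits in between, describing neither the typical group nor the large ones particularly well. For real data sets, an established implementation will be much faster and will also provide standard errors.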

Overall, Brennan and colleagues found that large elk groups tend to be larger in open and irrigated areas, on private land and on land without late-season hunting, and in areas with relatively high wolf abundance (a result that was particularly pronounced in open areas and on private land). The authors were therefore able to recommend management strategies that would promote the removal or dispersal of individuals from very large groups, including promoting hunting on private lands, especially where that land is irrigated. Creating more varied habitat is also likely to reduce the sizes of the largest herds. The impact of wolves (Fig. 2) was harder to infer. The authors suggest that, in regions with higher wolf densities, elk form larger local aggregations in areas of greater safety. Thus, they recommend greater tolerance of wolves among private landowners, in order to limit the occurrence of those areas of greater safety.


Fig. 2. Wyoming’s wolves have a complex relationship with the size of elk herds. Greater tolerance of wolves on private land might discourage the formation of very large elk herds in those areas. [Photo Credit: Doug Smith – NPS (http://www.nps.gov/index.htm); photo in public domain]

There is something for everyone in the article by Brennan and colleagues. Some will be fascinated by insights into factors favouring the formation of such large herds of elk. Many of us may be motivated to use quantile regression to explore neglected aspects of our data sets. And some may even be motivated to ask, what does the mean mean anyway, is it always meaningful, and why is it so often the focus of our attention?