226 – Modelling versus science
Mick Keogh, from the Australian Farm Institute, recently argued that “much greater caution is required when considering policy responses for issues where the main science available is based on modelled outcomes”. I broadly agree with that conclusion, although there were some points in the article that didn’t gel with me.
In a recent feature article in Farm Institute Insights, the Institute’s Executive Director Mick Keogh identified increasing reliance on modelling as a problem in policy, particularly policy related to the environment and natural resources. He observed that “there is an increasing reliance on modelling, rather than actual science”. He discussed modelling by the National Land and Water Resources Audit (NLWRA) to predict salinity risk, modelling to establish benchmark river condition for the Murray-Darling Rivers, and modelling to predict future climate. He expressed concern that the modelling was based on inadequate data (salinity, river condition) or used poor methods (salinity) and that the modelling results are “unverifiable” and “not able to be scrutinised” (all three). He claimed that the reliance on modelling rather than “actual science” was contributing to poor policy outcomes.
While I’m fully on Mick’s side regarding the need for policy to be based on the best evidence, I do have some problems with some of his arguments in this article.
Firstly, there is the premise that “science and modelling are not the same”. The reality is nowhere near as black-and-white as that. Modelling of various types is ubiquitous throughout science, including in what might be considered the hard sciences. Every time a scientist conducts a statistical test using hard data, she or he is applying a numerical model. In a sense, all scientific conclusions are based on models.
I think what Mick really has in mind is a particular type of model: a synthesis or integrated model that pulls together data and relationships from a variety of sources (often of varying levels of quality) to make inferences or draw conclusions that cannot be tested by observation, usually because the issue is too complex. This is the type of model I’m often involved in building.
I agree that these models do require particular care, both by the modeller and by decision makers who wish to use results. In my view, integrated modellers are often too confident about the results of a model that they have worked hard to construct. If such models are actually to be used for decision making, it is crucial for integrated modellers to test the robustness of their conclusions (e.g. Pannell, 1997), and to communicate clearly the realistic level of confidence that decision makers can have in the results. In my view, modellers often don’t do this well enough.
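To make that concrete, here is a minimal sketch (in Python) of the kind of robustness check I have in mind: a one-at-a-time sensitivity analysis in the spirit of Pannell (1997). The toy model, parameter names and ranges below are invented purely for illustration; the point is simply to see whether the headline result survives plausible variation in the uncertain inputs.

```python
# A toy one-at-a-time sensitivity analysis. The model ("net_benefit"),
# its parameters and the ranges below are invented for illustration only.

def net_benefit(adoption_rate, benefit_per_ha, cost_per_ha, area_ha=1000):
    """Toy integrated-model output: net benefit of some intervention."""
    return area_ha * adoption_rate * (benefit_per_ha - cost_per_ha)

# Best-guess values and plausible ranges for each uncertain parameter.
base = {"adoption_rate": 0.4, "benefit_per_ha": 120.0, "cost_per_ha": 80.0}
ranges = {
    "adoption_rate": (0.1, 0.7),
    "benefit_per_ha": (60.0, 180.0),
    "cost_per_ha": (50.0, 110.0),
}

print(f"Baseline result: {net_benefit(**base):,.0f}")

# Vary one parameter at a time across its range and see how far the
# headline result moves; large swings are caveats that should travel
# with the number when it is reported to decision makers.
for name, (low, high) in ranges.items():
    outcomes = [net_benefit(**dict(base, **{name: v})) for v in (low, high)]
    print(f"{name}: result ranges from {min(outcomes):,.0f} to {max(outcomes):,.0f}")
```

If the sign or the ranking of options flips within plausible parameter ranges, that is exactly the caveat that needs to accompany the single number handed to policy makers.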
But even in cases where they do, policy makers and policy advisors often tend to look for the simple message in model results, and to treat that message as if it was pretty much a fact. The salinity work that Mick criticises is a great example of this. While I agree with Mick that aspects of that work were seriously flawed, the way it was interpreted by policy makers was not consistent with caveats provided by the modellers. In particular, the report was widely interpreted as predicting that there would be 17 million hectares of salinity, whereas it actually said that there would be 17 million hectares with high “risk” or “hazard” of going saline. Of that area, only a proportion was ever expected to actually go saline. That proportion was never stated, but the researchers knew that the final result would be much less than 17 million. They probably should have been clearer and more explicit about that, but it wasn’t a secret.
The next concern expressed in the article was that models “are often not able to be scrutinised to the same extent as ‘normal’ science”. It’s not clear to me exactly what this means. Perhaps it means that the models are not available for others to scrutinise. To the extent that that’s true (and it is true sometimes), I agree that this is a serious problem. I’ve built and used enough models to know how easy it is for them to contain serious undetected bugs. For that reason, I think that when a model is used (or is expected to be used) in policy, the model should be freely available for others to check. It should be a requirement that all model code and data used in policy is made publicly available. If the modeller is not prepared to make it public, the results should not be used. Without this, we can’t have confidence that the information being used to drive decisions is reliable.
Once the model is made available, if the issue is important enough, somebody will check it, and any flaws can be discovered. Or if the time frame for decision making is too tight for that, government may need to commission its own checking process.
This requirement would cause concerns to some scientists. In climate science, for example, some scientists have actively fought requests for data and code. (Personally, I think the same requirement should be enforced for peer-reviewed publications, not just for work that contributes to policy. Some leading economics journals do this, but not many in other disciplines.)
Perhaps, instead, Mick intends to say that even if you can get your hands on a model, it is too hard to check. If that is what he means, I disagree. I don’t think checking models generally is harder than checking other types of research. In some ways it is easier, because you should be able to replicate the results exactly.
Then there is the claim that poor modelling is causing poor policy. Of course, that can happen, and probably has happened. But I wouldn’t overstate how great a problem this is at the moment, because model results are only one factor influencing policy decisions, and they often have a relatively minor influence.
Again, the salinity example is illustrative. Mick says that the faulty predictions of salinity extent were “used to allocate funding under the NAP”. Apparently they influenced decisions about which regions would qualify for funding from the salinity program. However, in my judgement, they had no bearing on how much funding each of the 22 eligible regions actually received. That depended mainly on how much and how quickly each state was prepared to allocate new money to match the available Federal money, coupled with a desire to make sure that no region or state missed out on an “equitable” share (irrespective of their salinity threat).
The NLWRA also reported that dryland salinity is often a highly intractable problem. Modelling indicated that, in most locations, a very large proportion of the landscape area would need to be planted to perennials to get salinity under control. This was actually even more important information than the predicted extent of salinity because it ran counter to the entire philosophy of the NAP, of spreading the available money thinly across numerous small projects. But this information, from the same report, was completely ignored by policy makers. The main cause of the failure of the national salinity policy was not that it embraced dodgy modelling about the future extent of salinity, but that it ignored much more soundly based modelling that showed that the strategy of the policy was fundamentally flawed.
Mick proposes that “Modellers may not necessarily be purely objective, and ‘rent seeking’ can be just as prevalent in the science community as it is in the wider community.” The first part of that sentence definitely is true. The last part definitely is not. Yes, there are rent-seeking scientists, but most scientists are influenced to a greater-or-lesser extent by the explicit culture of honesty and commitment to knowledge that underpins science. The suggestion that, as a group, scientists are just as self-serving in their dealings with policy as other groups that lack this explicit culture is going too far.
Nevertheless, despite those points of disagreement, I do agree with Mick’s bottom line that “Governments need to adopt a more sceptical attitude to modelling ‘science’ in formulating future environmental policies”. This is not just about policy makers being alert to dodgy modellers. It’s also about policy makers using information wisely. The perceived need for a clear, simple answer for policy sometimes drives modellers to express results in a way that portrays a level of certainty that they do not deserve. Policy makers should be more accepting that the real world is messy and uncertain, and engage with modellers to help them grapple with that messiness and uncertainty.
Having said this, I’m not optimistic that it will actually happen. There are too many things stacked against it.
Perhaps one positive thing that could conceivably happen is adoption of Mick’s recommendation that “Governments should consider the establishment of truly independent review processes in such instances, and adopt iterative policy responses which can be adjusted as the science and associated models are improved.” You would want to choose carefully the cases when you commissioned a review, but there are cases when it would be a good idea.
Some scientists would probably argue that there is no need for this because their research has been through a process of peer review before publication. However, I am of the view that peer review is not a sufficient level of scrutiny for research that is going to be used as the basis for large policy decisions. In most cases, peer review provides a very cursory level of scrutiny. For big policy decisions, it would be a good idea for key modelling results to be independently audited, replicated and evaluated.
Further reading
Keogh, M. (2012). Has modelling replaced science? Farm Institute Insights 9(3): 1-5.
Pannell, D.J. (1997). Sensitivity analysis of normative economic models: Theoretical framework and practical strategies. Agricultural Economics 16(2): 139-152.
Pannell, D.J. and Roberts, A.M. (2010). The National Action Plan for Salinity and Water Quality: A retrospective assessment. Australian Journal of Agricultural and Resource Economics 54(4): 437-456.
Hi David
Models are scientific if they follow scientific principles. The problem is that many do not; they fail at calibration and at verification. For example, a forecasting audit was conducted by Green and Armstrong (2007) to determine whether the IPCC’s authors followed standard scientific forecasting procedures, e.g. as outlined by the International Institute of Forecasters (www.forecastingprinciples.com). They [found that] climate models did not and that “those forecasting long-term climate change have no apparent knowledge of evidence-based forecasting methods” (p. 1016). Recently, McKitrick and his colleagues tested a number of climate model predictions against actual outcomes and found that only three models passed the test — NCAR (out of Colorado), a Chinese model and a Russian model. The Canadian Climate Centre model was the worst — an ‘outlier’. So calibration and verification are important if models are to count as science.
This discussion reminds me of the succinct words of Henri Theil, one of the early economists in the field of econometrics: “It does require maturity to realize that models are to be used but not to be believed” (Theil, 1971, p. vi). I continually remind myself about this issue as our students at the University of Alberta build various farm models.
Theil, H. (1971). Principles of Econometrics. John Wiley and Sons, New York.
Good discussion. But there is one aspect I believe can be further elaborated on when dealing with important policy decisions, that of treating model output (and perhaps all scientific output) as a ‘risky venture’: one can apply the Markowitz framework, or any expectation-risk tradeoff equivalent, to outputs from sensitivity and robustness analysis. Of course, the tradeoff curves are themselves subject to potential modelling errors, but the nature of the game becomes the right one: as with all real-life decisions, risk-taking is a fundamental aspect of the problem, precisely because the world is, as you say, “messy and uncertain”. No amount of modelling (or of science, for that matter) can do away with that when real people and real policies are involved. In short, I believe that, whether one considers “experimental scientific output” or “modelling output” (I don’t see any real distinction between the two), dealing with (inescapable) scientific uncertainties and risk-taking in decision making is the crux of the matter. Here economists and colleagues can usefully contribute. Rationality in risky decision making is not always optimal amongst decision makers, let alone scientists!
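To illustrate the suggestion above, here is a small hypothetical sketch (Python, with invented numbers): each policy option's outputs across sensitivity-analysis scenarios are summarised by their mean and dispersion, and the options are then compared on that Markowitz-style expectation-risk tradeoff.

```python
# Hypothetical sketch: mean-variance comparison of policy options using
# scenario outputs from a sensitivity analysis. All numbers are invented.
import statistics

# Net outcome of each policy option under several model scenarios.
scenario_outputs = {
    "option_A": [10.0, 12.0, 9.0, 11.0, 14.0],
    "option_B": [18.0, 2.0, 25.0, -4.0, 20.0],
    "option_C": [8.0, 8.5, 7.5, 9.0, 8.0],
}

for option, outcomes in scenario_outputs.items():
    mean = statistics.mean(outcomes)
    risk = statistics.stdev(outcomes)  # dispersion as a simple risk proxy
    print(f"{option}: expected outcome {mean:.1f}, risk (std dev) {risk:.1f}")

# A risk-averse decision maker might trade some expected outcome for lower
# dispersion, e.g. by maximising a simple certainty-equivalent score.
risk_aversion = 0.5
best = max(
    scenario_outputs,
    key=lambda o: statistics.mean(scenario_outputs[o])
    - risk_aversion * statistics.stdev(scenario_outputs[o]),
)
print("Preferred option under this tradeoff:", best)
```

Nothing here resolves the underlying uncertainty; it simply makes the risk-taking explicit rather than hiding it behind a single point estimate.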
As a post-scriptum, I have a feeling that this sort of discussion, presumably involving biological and ecological scientists, might have taken a different turn had it involved, instead, physicists or astronomers. They do live “by and for” modelling, but it seems to me that their attitudes can differ markedly from those of bio-scientists. Perhaps.
Enjoying the discussion, and a couple of points of clarification. The reference to ‘rent-seeking’ in relation to researchers was a reference to efforts to direct funding to particular research areas, and not for the personal gain of the researchers themselves (save for continued employment!). A second point relates to Dave’s discussion about the misuse of modelling information by policymakers, who invariably seek a single ‘number’ to use in their communications. Rather than blame the policymakers for this, I think there is a need for some responsibility on behalf of the modellers, to be more explicit about the confidence intervals associated with their results.
Thanks Mick. I think both sides bear some responsibility for the weak handling of uncertainty in some cases.
Hi David,
I agree with the caution around the use of models. They are great learning tools, first and foremost, from which the researcher can produce scenarios for consideration and presentation beside other ways of knowing. For understanding, models also require their assumptions to be made explicit, especially where causality and scale relationships are built in and feedback loops increase complexity. Such understanding will allow ‘appropriate’ conclusions to be drawn with the caveats that qualify or frame them.
I am of the view that modellers, in general, are a pretty cautious bunch. The level of discussion often required to understand the complexity of the output and its associated caveats is often beyond the capacity of decision makers, and/or their willingness to engage. I think this is what often leads to output being condensed to single numbers, rather than irresponsible modellers. Regarding the NAP, I have to agree with David’s suggestion that policy makers chose to ignore the more soundly based modelling. In some cases the gap between projected areas of salinity estimated prior to the implementation of the NAP and estimates available soon after was stark. It became evident that modellers provided the results they were told to provide, even though better estimates had been made prior to NAP implementation.
As my life experiences have accumulated, I have believed less and less that “most scientists are influenced to a greater-or-lesser extent by the explicit culture of honesty and commitment to knowledge that underpins science.” Like all other humans, they are influenced by the gains and losses that come from their actions. I have just read the book review of “Bad Pharma: How Drug Companies Mislead Doctors and Harm Patients” by Ben Goldacre, which makes most of the points about suppression of data and misleading information for policy makers that have been at the centre of this discussion:
“Medicine is broken. We like to imagine that it’s based on evidence and the results of fair tests. In reality, those tests are often profoundly flawed. We like to imagine that doctors are familiar with the research literature surrounding a drug, when in reality much of the research is hidden from them by drug companies. We like to imagine that doctors are impartially educated, when in reality much of their education is funded by industry. We like to imagine that regulators let only effective drugs onto the market, when in reality they approve hopeless drugs, with data on side effects casually withheld from doctors and patients.
“All these problems have been protected from public scrutiny because they’re too complex to capture in a sound bite. But Dr. Ben Goldacre shows that the true scale of this murderous disaster fully reveals itself only when the details are untangled. He believes we should all be able to understand precisely how data manipulation works and how research misconduct on a global scale affects us. In his own words, “the tricks and distortions documented in these pages are beautiful, intricate, and fascinating in their details.” With Goldacre’s characteristic flair and a forensic attention to detail, Bad Pharma reveals a shockingly broken system and calls for something to be done. This is the pharmaceutical industry as it has never been seen before.” Luckily things are not as bad in NRM in Australia.
Secondly, I do not think that you, David, have thought enough about how decision makers can cope with these situations. Sure, most of the money spent in NRM was generated more by vote buying than by trying to solve problems. But in some cases, like military strategy, inaction is more disastrous than action (think climate change and the Murray-Darling). Whatever the shortcomings of the model and its boosters, the message that action is needed sooner rather than later may be the most important part of the modellers’ message.
As I said in the post, I’m sure that there are cases where the information provided by scientists is distorted in one way or another for one reason or another – I’ve seen it happen, most strikingly in the salinity example identified by Mick. However, the description of medical science above (which I don’t dispute) simply does not accord with my actual experience of the usual practice of science in the areas of agriculture, environment and natural resource management. In my experience, the distortions are introduced far more from the way that policy makers use the information provided by scientists.
In your last paragraph you seem to be arguing that the end justifies the means — that exaggeration may be justified, or at least excused, if it helps to prompt action. I don’t accept that. If science doesn’t stick to its core value of honesty, it has little value. I don’t agree that inaction is necessarily more disastrous than action. In the salinity example we’ve been focusing on, inaction would have been far better than the actions we actually took. Similarly, in climate change, I’m not convinced that the actions we are taking, or look likely to take, are better than inaction. My feeling is that they are wasting, and will waste, massive amounts of resources without making the slightest jot of difference to climate change. Incidentally, climate change is the sort of high profile, highly politicised issue that is prone to the same sort of distortion of the science as occurred in the salinity example. (Not in most of salinity science, by the way. Mainly that example of the NLWRA work on risk.)
How can decision makers cope? Not sure what you’re referring to (cope with what?). Cope with the uncertainty/messiness? I appreciate that in some ways they face a no-win situation, as the community they must serve and satisfy is generally hopeless at dealing with the complexities of issues. That’s where the need for simplicity often springs from. It’s partly a question of whether the policy makers are strong enough to resist that pathway in order to increase the chances of achieving real outcomes. History says they often aren’t.
Or do you mean, cope with the distortions provided to them by scientists? If they genuinely want to do that, I think that’s do-able, using the sort of review process suggested in the post.
I think about the Japanese reactor builders and the results of their modeling.
We don’t know whether inaccurate modelling or poor handling of uncertainty had anything to do with what happened there. If it did, it’s unlikely to be due to modellers distorting their results for personal gain or to meet political objectives.
I think there are two big factors leading to the disconnect between models, proper science, and policy that you didn’t explicitly mention.
One that is implicit in your post is that modelers are not clear. The most important thing they tend to omit is the exact simplifying and arbitrary assumptions they made and why they made those choices. When a paper presents a model with some parameters that it sweeps through to show result robustness, it often sweeps under the rug all the structural parts of the model that were arbitrarily chosen, often specifically to create a publishable result. This makes the presentation of a model much more convincing than it would otherwise be. I see this most often in the cognitive sciences with neural net models. Often authors will argue for how robust their model is to reasonable parameters, when in fact they spent weeks before publication tweaking the structure of their features to get exactly the results that would be publishable. The feature representation, although essential to this modeling paradigm, is never varied in print for the vast majority of articles.
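The point about sweeps hiding structural choices can be shown with a deliberately tiny sketch (Python, with an invented toy model): a sweep over a continuous parameter looks reassuringly stable, while an unreported structural choice, here whether an interaction term is included at all, shifts the prediction far more.

```python
# Toy illustration: parameter sweeps can mask structural assumptions.
# The model forms and numbers are invented for illustration only.

def model(x, weight, include_interaction):
    """Two structural variants of a toy model of some response to input x."""
    base = weight * x
    return base + (0.5 * x * x if include_interaction else 0.0)

x = 2.0

# What gets reported: the result is "robust" to the swept parameter.
for weight in (0.8, 1.0, 1.2):
    print(f"weight={weight}: prediction={model(x, weight, include_interaction=True):.2f}")

# What rarely gets reported: the structural choice moves the prediction far more.
for include in (True, False):
    print(f"interaction={include}: prediction={model(x, 1.0, include):.2f}")
```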
A second disconnect is the abundance of models and the tendency of policy makers to select models that reinforce their pre-existing views or policies. This is especially dangerous when combined with the previous point. The structural parts of a model require a great amount of knowledge to discover and scrutinize. A policy maker is often not qualified to judge between the underlying structural assumptions of various models and tends to pick models based on their conclusions matching the policy maker’s desired conclusion. This also allows the policy maker the escape of saying ‘well, I made my decision based on a model, so it is not my fault the policy failed, but the model’s’. It is important to remove this scapegoat.
There is a very good “letter to the editor” on this topic from Andrew Moore on the Australian Farm Institute web site here: http://www.farminstitute.org.au/newsletter/November_letter.html