Would you want your cancer treatment to be defined by one doctor, supported by the opinion of two additional referees? Surely, you would hope that the treatment follows best-practice standards reached by consensus among several doctors working in the field.

Surprisingly, no such standards exist in models entering biodiversity assessments, but we now provide such standards for a frequently used class of models: species distribution models.

We trust our study in Science Advances will encourage others to develop best practice standards in additional methodological applications used to support policy decisions in biodiversity science.

• Planet Earth is undergoing changes of enormous magnitude, and anthropogenic pressures on biodiversity are now of geological significance. Demand for models in biodiversity assessments is rising, but which models are adequate for the task?

• Despite growing use of distribution models in biodiversity assessments, no generally agreed-upon best-practice standards exist for guiding the use of data and the building of models, or for evaluating the adequacy of the inferences that feed into biodiversity assessments.

• An international team of biodiversity modelers has developed, for the first time, a best-practice standards framework with detailed guidelines enabling scoring of studies based on species distribution models for use in biodiversity assessments.

• The proposed best-practice standards now published in Science Advances constitute a tangible step toward improving the scientific foundation of future biodiversity assessments while providing a cornerstone of increased transparency and accountability.

Over the past 20 years, more than 6000 studies have used one of the most common classes of biodiversity modeling: species distribution models (SDMs).

Over half of the studies using SDMs sought to apply their results to at least one type of biodiversity assessment, including forecasting the effects of climate change on biodiversity, or selecting places for protected areas, habitat restoration, and/or species translocation.

Results of SDMs are now feeding into major global assessments of the impacts of human activities on the living world, such as those by the IPBES (Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services), the IUCN (International Union for Conservation of Nature), and the IPCC (Intergovernmental Panel on Climate Change). Despite the high demand for use of species distribution models in biodiversity assessments, no generally agreed-upon standards for best practices exist for guiding the building of these models and for evaluating the adequacy of the models that feed into these assessments.

“In practice, assessors often make ad hoc judgments about which studies to include, and papers with greater visibility, such as those published in high-profile journals, are frequently favored. The problem is that journal decisions depend on many factors that extend well beyond the appropriateness of the data and models, and the impact of a journal, or even the number of citations of a given paper, is a poor indicator of a study’s appropriateness for inclusion in biodiversity assessments,” explains Miguel Araújo, lead author of the study, from the Spanish National Research Council (CSIC) at the National Museum of Natural Sciences in Madrid. One solution, proposed by IPBES in one of its most recent assessments, is for the scientific community to establish and agree upon a specific set of best-practice standards and guidelines to support the evaluation of studies and weigh the appropriateness of data and models used in the assessments supporting policy recommendations and decisions.

This is exactly what this international consortium of leading biodiversity modelers did, with core funding from the European Commission’s COST program.

“The aim was to reach consensus on best-practice standards for models in biodiversity assessments so as to provide a hierarchy of reliability, ensure transparency and consistency in the translation of scientific results into policy, and encourage improvements in the underlying science”, explains Carsten Rahbek from the Center for Macroecology, Evolution and Climate at the University of Copenhagen.

“For different aspects of modeling, namely the choice of data, the choice of model-fitting strategy, and model evaluation, we determined specific standards and guidelines. In particular, we proposed four levels of standards. The gold standard is aspirational. It usually requires data and next-generation modeling approaches that remain under development, as well as results obtained through multiple sources of evidence. The silver standard corresponds to current cutting-edge approaches, typically involving imperfect data combined with analyses that allow uncertainty and bias to be reduced, accounted for, or at least estimated. The bronze standard encompasses data and procedures that represent the minimum currently acceptable practices. It includes approaches to characterize and address limitations of data and models, and to interpret their implications for the results. The final category (deficient) involves the use of data and/or modeling practices that are considered unacceptable for models used in driving policy and practice”.
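
For readers who want to picture how such a scoring might be recorded, the following is a minimal Python sketch, not the authors' actual procedure. It assumes each of the three aspects named above receives one of the four levels. The names `Level`, `ASPECTS`, and `summarize` are hypothetical, and reporting the weakest aspect as a summary is an illustrative assumption rather than a rule stated in the study.

```python
from enum import IntEnum


class Level(IntEnum):
    """The four levels described above, ordered from worst to best."""
    DEFICIENT = 0
    BRONZE = 1
    SILVER = 2
    GOLD = 3


# The three aspects of modeling named in the quote (identifiers are hypothetical).
ASPECTS = ("data", "model_fitting", "model_evaluation")


def summarize(scores):
    """Summarize a study's per-aspect levels.

    Assumption (not from the study): the weakest aspect is used as a
    conservative one-word summary of the whole study.
    """
    missing = [a for a in ASPECTS if a not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    weakest = min(scores[a] for a in ASPECTS)
    return {
        "per_aspect": {a: scores[a].name.lower() for a in ASPECTS},
        "weakest": weakest.name.lower(),
        # Bronze is described as the minimum currently acceptable practice.
        "acceptable": weakest >= Level.BRONZE,
    }


example = summarize({
    "data": Level.SILVER,
    "model_fitting": Level.GOLD,
    "model_evaluation": Level.BRONZE,
})
print(example)
```

Keeping the per-aspect levels alongside any summary preserves the transparency the standards aim for: an assessor can see which aspect of a study limits its reliability.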

The use of standards is not new in applied sciences. For example, guidelines for applying best-practice standards in health care have been shown to save lives in a variety of medical applications. Guidelines for best-practice standards have also long existed in aviation, where they are used to check that every step in the operation of complex machinery has been taken.

Why have similar guidelines not previously been established for applications of models in biodiversity science?

“Firstly, there is still relatively little pressure for findings in biodiversity research to percolate through to biodiversity management decisions. In practice, many decisions are still based on opportunistic considerations, expert judgment, or intuition. Secondly, while human survival, or passenger safety on aircraft, can be easily measured, the myriad facets of biodiversity are considerably harder to define, let alone measure. What is to be maximized? Species richness? Functional diversity? Persistence? Thirdly, there is a lack of agreement among the modelers themselves about what constitutes a best practice. This was indeed the main reason why the authors of this research decided to get together and work through their differences over fundamental conceptual and methodological issues.

The consensus achieved represents a landmark in the field and we hope it will help biodiversity assessors navigate the jungle of published papers as well as contribute to increasing the quality of the data and models used in biodiversity assessments”, concludes Araújo.