Does putting science into a lower, more powerful gear slow progress?
Or does it actually speed it up?
This is the question that Andrew Elmore and I asked ourselves as we discussed how to conduct the research that led to our recent article in Nature Ecology and Evolution. We knew that one of the most important regulators of global carbon fluxes is the availability of nitrogen to terrestrial plants. But there was uncertainty about whether N availability was increasing or decreasing in terrestrial ecosystems globally.
To address this uncertainty, we planned to synthesize measurements of foliar N isotopes from around the world over the past few decades. In short, if foliar δ15N was going up over time, that would be evidence of increasing N availability. But if it was going down, it would support declining N availability.
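The core test is simple enough to sketch in a few lines. The authors wrote their analysis in R; this is only an illustrative Python sketch with hypothetical numbers, showing how the sign of the trend maps onto the two hypotheses:

```python
# Illustrative sketch (not the authors' actual code): the core question is
# whether foliar δ15N trends up or down with collection year.
import numpy as np

# Hypothetical site-level data: year of collection and foliar δ15N (per mil)
years = np.array([1980, 1985, 1990, 1995, 2000, 2005, 2010, 2015])
d15n = np.array([1.8, 1.6, 1.5, 1.1, 1.0, 0.7, 0.6, 0.3])

# Ordinary least-squares trend: a negative slope reads as declining
# N availability, a positive slope as increasing availability
slope, intercept = np.polyfit(years, d15n, 1)
trend = "declining" if slope < 0 else "increasing"
print(f"slope = {slope:.3f} per mil/yr -> {trend} N availability")
```

The real synthesis, of course, had to handle covariates, site effects, and tens of thousands of points rather than a single regression.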
But how should we conduct this research? We knew that our findings were potentially important enough to try to publish the results as quickly as possible. But we also knew the results would be scrutinized carefully and could be considered controversial. For a synthesis like this, it was our top priority to have the results be as trusted as possible. Trust, here, could not (or should not) be conveyed by the status or number of the authors. Instead, trust relied on maximizing transparency: reducing any doubt that data or analyses were being crafted to generate a particular result. Trust meant putting the research into a low gear. The scientific engine (us) works harder, but progress is faster because it is more certain. Trust meant pre-registration.
The idea of scientific pre-registration of analyses is simple. Set up your protocols for data capture and analysis ahead of time. Remove options to alter data or analyses, which can inject bias into the results. Gather the data according to the pre-determined protocols. Then hit the big red button to analyze the data, and you are done.
In practice, it’s not that simple. Effective pre-registration requires thinking through eventualities ahead of time. What are your hypotheses? What is your sampling plan? How specifically will these data be evaluated? Which covariates do you include? What technique will you use to identify outliers?
Writing papers can be simpler when one collects data and then crafts the story of a paper and the analyses together. Writing all of your R code before data are collected? Not simple.
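One example of the kind of decision that gets locked in ahead of time is the outlier rule. It can be written as a function before any data exist, so it cannot be tuned to the results later. A minimal sketch in Python, using a hypothetical median-based rule (not the rule from the paper):

```python
# Hypothetical a priori outlier rule, written before data collection.
# Uses a modified z-score (median/MAD based), which is robust to the
# very outliers it is trying to flag.
import numpy as np

def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score exceeds the threshold."""
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    mad = np.median(np.abs(values - med))  # median absolute deviation
    modified_z = 0.6745 * (values - med) / mad
    return np.abs(modified_z) > threshold

# Usage on hypothetical δ15N measurements: only the implausible value is flagged
data = [1.2, 0.9, 1.1, 1.0, 9.5, 1.3]
flags = flag_outliers(data)
print(flags)
```

The point is not this particular rule; it is that the rule exists in writing before anyone has seen which points it would exclude.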
We used the Center for Open Science’s pre-registration system for this project (http://osf.io/thnyf/). Their system was well thought out. It is more than an archive for R code: there is a series of questions about the study design. Andrew and I carefully thought through a number of potential scenarios as we pre-registered the study. We laid out our hypotheses, wrote the R code for data analysis and figure generation, and made a priori decisions on issues like what constituted an outlier. After answering the questions, our registration was reviewed by staff, and the questions that came back from the reviewer were on point.
Once the pre-registration was done, the “easy” work began. Over 450 papers were examined to determine which data sets could potentially be included in the synthesis. Over 140 emails were sent out to individuals requesting data. Correspondence occurred with over 100 individuals. Over 43,000 data points were examined. 36 individuals became coauthors and commented on manuscripts. 384 days after pre-registration, the paper was published.
Our pre-registration wasn’t perfect. I made an error in formulating one analysis that was fixed post-registration. And we still added a secondary set of analyses to explore patterns. But these changes were made transparently. The main results were based on analyses written before the data had been assembled.
Effective pre-registration is a skill to be honed over time. Pre-registration is not for the lazy. Yet, proper pre-registration is like putting the science process into low gear. Progress is slower, but more powerful.
And (hopefully) a lot less likely to slide backwards.
See https://rdcu.be/9Pjv for the article.
Dear Joseph,
Sounds really transparent. But how, within this pre-registration framework, do you deal with re-analyses that come up during rounds of peer review, or during interactions with colleagues (e.g., at conferences)?
Thanks for this interesting post.
Hi Ruben...There are a couple of different lines of thinking on this. 1) Once analyses are pre-registered, not only should authors have less latitude to add analyses; reviewers should also have less latitude to propose them. One extreme is to have the analyses reviewed ahead of time and then locked in. Once that happens, you can't deviate. 2) Additional analyses can be added (or current ones changed), but there must be a record of what is "exploratory". We did something similar to this in our methods section. I think the protocols would progress to differentiate additional analyses suggested by coauthors vs. reviewers.