Advanced Methods in Reproducible Science 2020

At the start of January 2020 I attended the Advanced Methods in Reproducible Science course at Cumberland Lodge.

Our first day started with a lecture by Florian Markowetz, who discussed selfish reasons to do reproducible science: it helps avoid disaster (e.g. losing lab notes), helps with writing papers (having easily accessible code and data), helps reviewers see it your way (they can run the code themselves), enables continuity (the first re-user of your data will be your future self), and helps build your reputation. A project is more than a beautiful result.

Then, Alex Etz introduced us to Bayesian statistics using JASP. While frequentist approaches use p values and alpha levels to help us decide whether to reject the null hypothesis, in Bayesian statistics inferences are probability statements about hypotheses or models. Data is used to update what we knew beforehand (prior uncertainty) and so reduce our uncertainty (posterior uncertainty). Recommended resources are: Intro to Bayesian inference for psychology, Statistical Rethinking course, Theoretical advantages and practical ramifications, and Example applications with JASP. He also has a great blog called The Etz Files, with well-explained posts on Bayesian stats.
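To make the idea of updating a prior into a posterior concrete, here is a minimal sketch in Python. It is not taken from the course (which used JASP). It assumes a Beta prior on a success rate and binomial data, and the function names and the 7-out-of-10 data are made up for illustration.

```python
# Sketch only: Beta-binomial updating with hypothetical data.

def update_beta(prior_a, prior_b, successes, failures):
    """Conjugate update: a Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution (smaller means less uncertainty)."""
    return (a * b) / ((a + b) ** 2 * (a + b + 1))


prior = (1.0, 1.0)  # flat prior: every success rate equally plausible (assumption)
posterior = update_beta(*prior, successes=7, failures=3)  # hypothetical data

print("posterior mean:", beta_mean(*posterior))
print("prior variance:", beta_variance(*prior))
print("posterior variance:", beta_variance(*posterior))
```

The posterior variance is smaller than the prior variance, which is the "reduce our uncertainty" part: the data have narrowed down which success rates remain plausible.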

Density plots of original and replication p values (Open Science Collaboration, 2015, Science, 349)

Marcus Munafo emphasised the scale of the reproducibility problem and gave a brief introduction to potential solutions.

Marcus suggested a bunch of great papers that are now at the top of my reading list:

A manifesto for reproducible science

Scientists behaving badly

The natural selection of bad science

How citation distortions create unfounded authority

Why science is not necessarily self-correcting

Estimating the reproducibility of psychological science

Scanning the horizon: towards transparent and reproducible neuroimaging research

Power failure: why small sample size undermines the reliability of neuroscience

An Open, large-scale collaborative effort to estimate the reproducibility of psychological science

Current incentives for scientists lead to underpowered studies with erroneous conclusions

False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant

We knew the future all along: Scientific Hypothesizing is much more accurate than other forms of precognition

Day 2 started with Chris Chambers explaining why we need Preregistration and Registered Reports. He talked about the discrepancy between what's best for science (i.e. high quality research published regardless of outcome) and what's best for scientists (i.e. producing a lot of "great results") and how this results-driven culture distorts incentives. Registered Reports are proposed as a potential solution, by making results "dead currency" in quality evaluation. A good intro to the topic is the UKRN primer. A message that stayed with me was that hypotheses are ~5 times more likely to be unsupported in Registered Reports than in regular articles. Chris provided us with a list of journals (from cos.io/rr) that currently accept Registered Reports (including Royal Society Open Science, Nature Human Behaviour, BMC Medicine, European Journal of Neuroscience, Brain and Behaviour, Brain and Neuroscience Advances). A template for Registered Reports is accessible on the OSF, and completed examples can be found on the OSF (for stage 1 protocols) or Zotero (for stage 2 reports). Hannah Hobson then presented her own experience of doing a Registered Report as an early career researcher.

George Davey Smith went on to discuss causal inference in observational studies, and in particular triangulation. A good resource to understand this is this short YouTube video. Mathematically, triangulation is a method that calculates a distance that is difficult to measure from two or more distances that are easier to measure. The same idea is important in science: different approaches are unlikely to be biased in the same way, so agreement between them is more convincing than any single result. Triangulation can be applied across any of the following: data sources, investigators, methodologies, theoretical approaches and data analyses.

George was followed by Mike Smith, who taught us the basics of using RMarkdown, which can be used to combine "human readable" prose with source code and output in a single document, making analyses easier to understand and reproduce (example). I was shocked by how simple this was to implement, and finding out that this RMarkdown cheatsheet exists helped me convert one of my existing R scripts to Markdown in very little time.

Unfortunately I did not manage to go to Malika Ihle's session on using GitHub for version control, but I went through her slides in my own time, as well as her step-by-step tutorial. She also provided a great resource with info on GitHub for R users.

Lastly, Courtney Soderberg introduced us all to the Open Science Framework, which can be used to improve the transparency of the research workflow.

Visual representation of p values being used as a threshold to publish

Day 3 started with Daniel Lakens' lecture on diagnosing publication bias and other anomalies. He discussed the many ways in which bias can be introduced into the research process, from research misconduct and fraud (e.g. the "vaccines cause autism" paper), to errors in statistical reporting (Statcheck can be used to detect these), inconsistencies (see the GRIM test), HARKing (hypothesizing after the results are known) and publication bias (see figure). My main takeaway was to ask myself, whenever one of my results is p > .05, "Had this p value been significant, would I still have dismissed it as a flawed study?" A great resource to help you understand the flaws of p values is p-curve.com.
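The GRIM test mentioned above is easy to sketch in code. The idea is that when responses are whole numbers, a mean from n participants must equal some integer total divided by n, so certain reported means are impossible. The Python below is a simplified illustration and not the official implementation: the function name and arguments are my own, and it deliberately accepts borderline rounding cases.

```python
import math


def grim_consistent(reported_mean, n, decimals=2):
    """Return True if some integer total divided by n rounds to the reported mean.

    Assumes each of the n responses is a whole number (e.g. a single Likert item).
    """
    half_unit = 0.5 * 10 ** (-decimals)  # largest error that rounding could explain
    approx_total = reported_mean * n
    # Only the two integer totals nearest to mean * n can possibly match.
    for total in (math.floor(approx_total), math.ceil(approx_total)):
        if abs(total / n - reported_mean) <= half_unit + 1e-12:
            return True
    return False


print(grim_consistent(5.19, 28))  # no integer total out of 28 gives 5.19
print(grim_consistent(5.18, 28))  # 145 / 28 = 5.1786, which rounds to 5.18
```

A failed check does not prove misconduct. It only flags that the reported mean, sample size and scale cannot all be right, which is a prompt to look more closely.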

Further, Dorothy Bishop and Lisa DeBruine taught us how to use R to simulate datasets and perform power calculations in complex designs. Simulating data can help us conduct power calculations and give us insights into study design and optimal analyses. Slides are openly available. They also suggested a great resource for understanding Cohen's d.
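The session used R, but the logic of a simulation-based power calculation carries over to any language. Below is a minimal Python sketch, not the course material: it assumes two independent groups with normally distributed scores and a known standard deviation of 1, so that a simple z-test can stand in for a t-test. The function name, the defaults and the effect size of d = 0.5 are all illustrative assumptions.

```python
import random
from statistics import NormalDist


def simulate_power(effect_size, n_per_group, alpha=0.05, n_sims=2000, seed=1):
    """Estimate power as the share of simulated studies that reach significance."""
    rng = random.Random(seed)  # fixed seed so the simulation is reproducible
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    se = (2 / n_per_group) ** 0.5  # SE of the mean difference when sd = 1 in both groups
    significant = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treatment = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        diff = sum(treatment) / n_per_group - sum(control) / n_per_group
        if abs(diff / se) > z_crit:
            significant += 1
    return significant / n_sims


power = simulate_power(effect_size=0.5, n_per_group=64)
false_positive_rate = simulate_power(effect_size=0.0, n_per_group=64)
print("estimated power:", power)
print("estimated false positive rate:", false_positive_rate)
```

With these settings the estimated power should land at roughly 0.8 and the false positive rate near alpha. The appeal of the approach is that the data-generating lines can be swapped for a more complex design where no closed-form power formula exists.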

Lastly, Emily Sena and Kate Button ran a great session on analysing and reducing the risk of bias in the literature. Some relevant resources include the EQUATOR Network (Enhancing the QUAlity and Transparency Of health Research), and papers like The Garden of Forking Paths ... and Sifting the evidence: What's wrong with statistical tests.

On the morning of Day 4, Nick Brown talked to us about his efforts to correct the literature without an advanced degree.

He was followed by Paul Thompson, who taught us how to use R for data cleaning and data wrangling. A great online resource is An introduction to data cleaning with R, as well as the all-time favourite R for data science and the dplyr cheat sheet. Suggested packages include assertr (to verify assumptions about data early in the process) and codebookr (uses data from a codebook for cleaning). I learned about assertive programming, which means checking that something is correct before moving on to the next operation (super important, especially in large analyses where errors may be missed).
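As a rough illustration of assertive programming (in Python rather than the R packages from the session), the sketch below checks a small dataset before any analysis is run on it. The column names, the age range and the function name are hypothetical and chosen only for the example.

```python
def check_participants(rows, min_age=18, max_age=100):
    """Raise an error early if the data break our assumptions; otherwise return them."""
    ids = [row["id"] for row in rows]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate participant ids found")
    for row in rows:
        age = row["age"]
        if age is None:
            raise ValueError(f"missing age for participant {row['id']}")
        if not min_age <= age <= max_age:
            raise ValueError(f"implausible age {age} for participant {row['id']}")
    return rows


# Hypothetical data: passes every check, so the analysis step can safely run.
participants = check_participants([
    {"id": 1, "age": 25},
    {"id": 2, "age": 31},
])
mean_age = sum(row["age"] for row in participants) / len(participants)
print("mean age:", mean_age)
```

Because the checks raise an error instead of quietly carrying on, a data-entry mistake such as an age of 250 stops the pipeline at the point where it is easiest to trace.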

The day ended with Sam Parsons reflecting on what he learned on the course last year. One of the most important points he made was that we (science) need to make reproducibility and transparency part of the research process, rather than an add-on.

On Day 5, Kirstie Whitaker discussed how we can transform research with collaborative working. We were introduced to the principles of open leadership: understanding (making work accessible and clear), sharing (making work easy to adapt, reproduce and spread), and participation & inclusion (building shared ownership and agency to make sure the work is inviting and sustainable for all). While open research can seem a bit overwhelming at first, Kirstie told us that incremental progress is still progress, and every little helps. A great example for designing an open project and creating a positive culture for contribution and collaboration is the BIDS starter kit on GitHub (essentials include the readme and contributing files, as well as the code of conduct), and some additional resources can be found on The Turing Way.

Dorothy Bishop then went on to discuss the question of why literature reviews should be systematic, followed by Marta Topor and Jade Pickering's guidelines for systematic reviews beyond clinical trials.

The day ended with Brian Nosek talking about why reproducible research is the future, and on the following morning we got some practical advice from Kirstie about how to take open, reproducible practices forward in our home organisation.

I feel incredibly lucky to have learned so much and to have met all of these amazing people. Hopefully the resources above can be helpful for others who are trying to find ways to make their research more open, transparent and reproducible.

ALEXANDRA LAUTARESCU
