Concepts in Reproducibility
- Analytic data: Data used as the final input in a workflow to produce a statistic (display item) in the paper (including appendices).
- (Research) claim: According to a definition provided by the repliCATS project, a research claim is a single major finding from a published study, as well as details of the methods and results that support this finding. A paper can include more than one claim. Claims are usually described in the paper’s abstract; however, sometimes, the claim described in the abstract does not match the claim that is tested in the paper.
In the SSRP framework, different claims in a paper may be tested using different methodologies, and their results may be presented in one or more display items, such as tables and figures (figure 0.1 illustrates this idea). There are several types of claims, including:
- Causal claim: An assertion that invokes causal relationships between variables. A paper may estimate the effect of X on Y for population P, using method F. E.g., “This paper investigates the impact of bicycle provision on secondary school enrollment among young women in Bihar/India, using a Difference in Difference approach.”
- Descriptive/predictive claim: An assertion that estimates the value of Y (estimated or predicted) for population P under dimensions X using method M. E.g., “Drawing on a unique Swiss data set (population P) and exploiting systematic anomalies in countries’ portfolio investment positions (method M), I find that around 8% of the global financial wealth of households is held in tax havens (value of Y).”
- Coding error: A coding error may occur when a section of the code executes a procedure that contradicts the intended procedure expressed in the documentation (paper or comments of the code). For example, an error happens if the paper specifies that the analysis is performed on the population of males, but the code restricts the analysis to females.
- Data citation: The practice of referencing a dataset, rather than just the paper in which a dataset was used. Data citations help other researchers find data and reward data sharing. Find further guidance on data citations here.
- Data sharing: Making the data used in an analysis available to others, ideally through a trusted repository (see below).
- Disclosure: In addition to publicly declaring all potential conflicts of interest, researchers should provide rich details on the methods used for testing a hypothesis, e.g., by including the outcomes of all regression specifications tested. This can be presented in the appendix or supplementary material.
- Digital Object Identifier (DOI): A DOI is a string of numbers, letters, and symbols used to permanently identify digital scientific contributions, such as articles, datasets, software, and others (also see DOI.org). Once linked to a contribution, DOIs are permanent and are thus helpful to include in references.
- Intermediate data: Data not directly used as final input for analyses presented in the final paper (including appendices). Intermediate data don’t contain direct identifiers.
- Literate programming: The practice of writing and commenting code such that it can be read and understood by a human.
- Pre-specification (or pre-registration): The act of detailing, ahead of time, the statistical analyses that will be conducted for a given research project. Expected outcomes, control variables, and regression specifications are all written in as much detail as possible. This serves to make research confirmatory in nature.
Researchers commonly record such details in a hypothesis or trial registry, usually while also providing a pre-analysis plan.
- Processed data: Raw data that have gone through any transformation other than removing personally identifiable information (PII).
- Raw data: Unmodified data files obtained by the authors from the sources cited in the paper. Data from which PII have been removed are still considered raw. All other modifications to raw data make them processed.
- (Trial) registry: A public database of registered studies or trials, e.g., the American Economic Association’s registry for randomized controlled trials, the Open Science Framework registries, or ClinicalTrials.gov. Some of the largest registries only accept randomized trials, hence the frequent discussion of “trial registries.” Registration is the act of publicly declaring that a hypothesis is being, has been, or will be tested, regardless of publication status (see also “Pre-specification” above). Registrations are time-stamped.
- Replication: Conducting an existing research project again. A subtle taxonomy exists, and there is disagreement, as explained in Hamermesh, 2007 and Clemens, 2015. Pure Replication, Reproduction, or Verification entails re-running existing code, with error-checking, on the original dataset to obtain the published results. Scientific Replication entails attempting to reproduce the published results with a new sample, either with the same code or with slight variations on the original analysis.
- Reproducibility: A research paper, or a specific display item (an estimate, a table, or a graph) included in a research paper, is reproducible if it can be reproduced within a reasonable margin of error (generally 10%) using the data, code, and materials made available by the author. Computational reproducibility is assessed through the process of reproduction.
- Reproduction package: A collection of all the materials necessary to reproduce a specific display item or an entire paper. A reproduction package may contain data, code, and other documentation. When the materials are provided in the original publication, they should be labeled as “original reproduction package”; when a reproducer provides them, they should be referred to as “Reproducer X’s reproduction package.”
- Researcher degrees of freedom: The flexibility a researcher has in data analysis, whether consciously abused or not. This can take a number of forms, including specification searching (also referred to as “p-hacking”), covariate adjustment, selective reporting, or hypothesizing after the results are known (HARKing).
- Robustness check: A change in a computational choice, in either data analysis or data cleaning, and its subsequent effect on the main estimates of interest. The SSRP distinguishes between reasonable specifications and feasible specifications. Reasonable robustness checks (Simonsohn et al., 2018) are (i) sensible tests of the research question, (ii) expected to be statistically valid, and (iii) not redundant with other specifications in the set. The set of feasible robustness checks is defined by all the specifications that can be computationally reproduced. We assume that the specifications already published in the paper are part of the set of reasonable specifications.
- Specification searching: Searching blindly or repeatedly through data to find statistically significant relationships. While not necessarily inherently wrong, if done without a plan or without adjusting for multiple hypothesis testing, test statistics and results no longer hold their traditional meaning, resulting in false positives, thus impeding reproducibility.
- Trusted digital repository: An online repository whose mission is to provide “reliable, long-term access to managed digital resources to its customers, now and in the future” (see OCLC (2002)). Storing data here is superior to simply posting on a personal website since it is more easily accessed, less easily altered, and more permanent.
- Version control: The act of tracking every change made to a computer file. Version control is useful for empirical researchers who may edit their programming code often.
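The margin-of-error criterion in the Reproducibility entry above can be illustrated with a short check. This is only a sketch: it assumes the margin is read as a relative difference between the published and the reproduced estimate, which the definition does not spell out, and the function name `within_margin` and its parameters are hypothetical.

```python
def within_margin(published: float, reproduced: float, margin: float = 0.10) -> bool:
    """Return True if `reproduced` lies within `margin` of `published`.

    Assumption (not stated in the glossary): the margin is a relative
    difference measured against the published value.
    """
    if published == 0:
        # Relative difference is undefined at zero, so this sketch
        # falls back to an absolute comparison against the margin.
        return abs(reproduced) <= margin
    return abs(reproduced - published) / abs(published) <= margin


# Illustrative numbers only: a published estimate of 0.080 and a
# reproduced estimate of 0.085 differ by 6.25% in relative terms.
print(within_margin(0.080, 0.085))
```

Other readings are possible (for example, comparing against the standard error of the estimate), so a reproducer would state which one was used.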
Concepts in the ACRE exercise and the platform
- Analysis code: A script associated primarily with analysis. Most of its content is dedicated to actions like running regressions, running hypothesis tests, computing standard errors, and imputing missing values.
- Candidate paper: A paper that has been considered for reproduction, but the reproducer decided not to move forward with the analysis due to failure to locate a reproduction package. Learn more here.
- Cleaning code: A script associated primarily with data cleaning. Most of its content is dedicated to actions like deleting variables or observations, merging data sets, removing outliers, or reshaping the structure of the data (from long to wide, or vice versa).
- Declared paper: The paper that the reproducer analyzes throughout the exercise.
- Display item: A display item is a figure or table that presents the results associated with a given research claim found in a research paper.
- Reproduction tree (or reproduction diagram): A diagram generated at the Scoping stage on the SSRP, which links display items with the code and data files that are required to reproduce them. The tree is meant to represent the entire computational workflow behind a result from the paper. It can also be used to guide users of the reproduction package and/or to identify missing components for a complete reproduction.
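To make the idea of a reproduction tree concrete, the sketch below represents one as a plain mapping from display items to the code and data files they depend on, and then lists the components missing from a reproduction package. The file names, the dictionary layout, and the function `missing_components` are hypothetical illustrations, not the format used by the SSRP.

```python
# Hypothetical reproduction tree: each display item points to the code
# and data files required to reproduce it.
tree = {
    "Table 1": {
        "code": ["clean.py", "analysis.py"],
        "data": ["raw_survey.csv", "analytic.csv"],
    },
    "Figure 2": {
        "code": ["plot.py"],
        "data": ["analytic.csv"],
    },
}


def missing_components(tree: dict, available: set) -> dict:
    """Map each display item to the required files absent from `available`."""
    missing = {}
    for item, deps in tree.items():
        needed = deps["code"] + deps["data"]
        absent = [name for name in needed if name not in available]
        if absent:
            missing[item] = absent
    return missing


# Files actually found in a (hypothetical) reproduction package.
package = {"clean.py", "analysis.py", "analytic.csv"}
print(missing_components(tree, package))
```

Display items that map to an empty result have every component on hand; the others show which files a reproducer would still need to locate or request.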