Chapter 11 Definitions

Abandoned paper: A paper that was considered for reproduction but that the reproducer decided not to analyze further because no reproduction package could be located. Learn more here.
Analysis code: A script associated primarily with analysis. Most of its content is dedicated to actions like running regressions, running hypothesis tests, computing standard errors, and imputing missing values.
Analytic data: Data used as the final input in a workflow to produce a statistic (display item) in the paper (including appendices).
Candidate paper: A paper that is currently being considered for reproduction. The reproducer can move forward to scope the paper or abandon the paper if there is no reproduction package. Learn more here.
(Research) Claim: According to a definition provided by the repliCATS project, a research claim is a single major finding from a published study, as well as details of the methods and results that support this finding. A paper can include more than one claim. Claims are usually described in the paper’s abstract; however, sometimes, the claim described in the abstract does not match the claim that is tested in the paper. In the SSRP framework, different claims in a paper may be tested using different methodologies, and their results may be presented in one or more display items, such as tables and figures (figure 0.1 illustrates this idea). There are different types of claims, including:
- Causal claim: An assertion that invokes causal relationships between variables. A paper may estimate the effect of X on Y for population P, using method F. E.g., “This paper investigates the impact of bicycle provision on secondary school enrollment among young women in Bihar/India, using a Difference in Difference approach.”
- Descriptive/predictive claim: An assertion that estimates the value of Y (estimated or predicted) for population P under dimensions X using method M. E.g., “Drawing on a unique Swiss data set (population P) and exploiting systematic anomalies in countries’ portfolio investment positions (method M), I find that around 8% of the global financial wealth of households is held in tax havens (value of Y).”
Cleaning code: A script associated primarily with data cleaning. Most of its content is dedicated to actions like deleting variables or observations, merging data sets, removing outliers, or reshaping the structure of the data (from long to wide, or vice versa).
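As a minimal sketch of what a cleaning script might contain, the following Python snippet deletes an observation with a missing value and reshapes a small data set from long to wide. The records, the variable names (`id`, `year`, `income`), and the helpers `drop_missing` and `long_to_wide` are hypothetical and serve only as an illustration.

```python
# Hypothetical long-format records: one row per (id, year) pair.
long_rows = [
    {"id": 1, "year": 2019, "income": 100.0},
    {"id": 1, "year": 2020, "income": 110.0},
    {"id": 2, "year": 2019, "income": None},  # missing value
    {"id": 2, "year": 2020, "income": 95.0},
]


def drop_missing(rows, column):
    """Delete observations whose value in `column` is missing."""
    return [row for row in rows if row[column] is not None]


def long_to_wide(rows, key, time, value):
    """Reshape long rows into one row per `key`, with one column per period."""
    wide = {}
    for row in rows:
        record = wide.setdefault(row[key], {key: row[key]})
        record[f"{value}_{row[time]}"] = row[value]
    return list(wide.values())


clean_rows = drop_missing(long_rows, "income")
wide_rows = long_to_wide(clean_rows, key="id", time="year", value="income")
```

Under the definitions in this chapter, `long_rows` would play the role of raw data and `wide_rows` the role of intermediate or analytic data, depending on whether further processing follows.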
Coding error: A coding error may occur when a section of the code executes a procedure that contradicts the intended procedure expressed in the documentation (paper or comments of the code). For example, an error happens if the paper specifies that the analysis is performed on the population of males, but the code restricts the analysis to females only.
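The example in the definition above can be sketched in Python as follows. The records, the constant `DOCUMENTED_POPULATION`, and the helper `sample_matches_documentation` are hypothetical; the sketch only illustrates how a sample restriction in the code can contradict the documented one.

```python
# Hypothetical records; the paper is assumed to document a male-only sample.
records = [
    {"sex": "male", "wage": 10.0},
    {"sex": "female", "wage": 12.0},
    {"sex": "male", "wage": 11.0},
]
DOCUMENTED_POPULATION = "male"

# Coding error: the restriction contradicts the documented population.
buggy_sample = [r for r in records if r["sex"] == "female"]

# Restriction that agrees with the documentation.
fixed_sample = [r for r in records if r["sex"] == DOCUMENTED_POPULATION]


def sample_matches_documentation(sample, documented):
    """Check that every observation belongs to the documented population."""
    return all(r["sex"] == documented for r in sample)


buggy_ok = sample_matches_documentation(buggy_sample, DOCUMENTED_POPULATION)
fixed_ok = sample_matches_documentation(fixed_sample, DOCUMENTED_POPULATION)
```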
Data availability statement: A description, normally included in the paper or the appendix, of the terms of use for data used in the paper, as well as the procedure to obtain the data (especially important for restricted-access data). Data availability statements expand on and complement data citations. Find guidance on data availability statements for reproducibility here.
Data citation: The practice of referencing a dataset, rather than just the paper in which a dataset was used. Data citations help other researchers find data and reward data sharing. Find further guidance on data citations here.
Data sharing: Making the data used in an analysis available to others, ideally through a trusted repository (see below).
Declared paper: The paper that the reproducer analyzes throughout the exercise.
Digital Object Identifier (DOI): A DOI is a string of numbers, letters, and symbols used to permanently identify digital scientific contributions, such as articles, datasets, software, and others (also see DOI.org). Once linked to a contribution, DOIs are permanent and are thus helpful to include in references.
Disclosure: In addition to publicly declaring all potential conflicts of interest, researchers should provide rich details on the methods used for testing a hypothesis, e.g., by including the outcomes of all regression specifications tested. This can be presented in the appendix or supplementary material.
Display item: A display item is a figure or table that presents the results associated with a given research claim found in a research paper.
Intermediate data: Data that has been processed in some fashion but is not directly used as the final input for analyses presented in the final paper (including appendices).
Literate programming: The practice of writing and commenting code in an intertwined fashion, such that both components (code and text) can be read in narrative form.
Pre-specification (or pre-registration): The act of detailing, ahead of time, the statistical analyses that will be conducted for a given research project. Expected outcomes, control variables, and regression specifications are all written in as much detail as possible. This serves to make research confirmatory in nature. Researchers commonly record such details in a hypothesis or trial registry, usually while also providing more details in a pre-analysis plan.
Raw data: Unmodified data files obtained by the authors from the sources cited in the paper. Data from which only personally identifiable information (PII) has been removed is still considered raw. Any other modification to raw data makes it intermediate data.
(Trial) Registry: A public database of registered studies or trials, e.g., the American Economic Association’s registry for randomized controlled trials, the Open Science Framework registries, or ClinicalTrials.gov. Some of the largest registries only accept randomized trials, hence the frequent discussion of “trial registries.” Registration is the act of publicly declaring that a hypothesis is being, has been, or will be tested, regardless of publication status (see also “Pre-specification” above). Registrations are time-stamped.
Replication: Conducting an existing research project again. A subtle taxonomy exists, and its terms are disputed, as explained in Hamermesh (2007) and Clemens (2015). Pure Replication, Reproduction, or Verification entails re-running existing code, with error-checking, on the original dataset to obtain the published results. Scientific Replication entails attempting to reproduce the published results with a new sample, either with the same code or with slight variations on the original analysis.
Reproducibility: A research paper, or a specific display item (an estimate, a table, or a graph) included in a research paper, is reproducible if its results can be reproduced within a reasonable margin of error using the data, code, and materials made available by the author. Computational reproducibility is assessed through the process of reproduction.
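One way to make "within a reasonable margin of error" concrete is to compare each published estimate with its reproduced counterpart under an explicit tolerance. The sketch below uses made-up numbers, and the 1% relative tolerance is an assumption chosen for illustration, not a threshold prescribed by the SSRP.

```python
import math

# Made-up values for illustration only.
published = {"coef_treatment": 0.152, "std_error": 0.031}
reproduced = {"coef_treatment": 0.1523, "std_error": 0.0312}


def compare_estimates(published, reproduced, rel_tol=0.01):
    """Flag, per estimate, whether the reproduced value is within `rel_tol`."""
    return {
        name: math.isclose(reproduced[name], value, rel_tol=rel_tol)
        for name, value in published.items()
    }


comparison = compare_estimates(published, reproduced)
all_within_tolerance = all(comparison.values())
```

What counts as a reasonable tolerance depends on the display item; rounding in the published table is often a sensible lower bound.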
Reproduction package: A collection of all the materials necessary to reproduce a specific display item or an entire paper. A reproduction package may contain data, code, and other documentation. When the materials are provided in the original publication, they should be labeled as “original reproduction package”; when a reproducer provides them, they should be referred to as “Reproducer X’s reproduction package.”
Reproduction tree (or reproduction diagram): A diagram generated at the Assessment stage on the SSRP, which links display items with the code and data files that are required to reproduce them. The tree is meant to represent the entire computational workflow behind a result from the paper. It can also be used to guide users of the reproduction package and/or to identify missing components for a complete reproduction.
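A reproduction tree can be thought of as a mapping from each output to the files it is produced from. The sketch below is a hypothetical representation (the file names, the `tree` dictionary, and the `dependencies` helper are assumptions, not the format used on the SSRP) that walks the tree to list everything a display item requires and to identify missing components. It assumes the tree contains no cycles.

```python
# Hypothetical tree: each key is produced from the files listed as its value.
tree = {
    "Table 1": ["analysis.py", "analytic_data.csv"],
    "analytic_data.csv": ["cleaning.py", "raw_data.csv"],
}

# Files assumed to be present in the reproduction package.
available_files = {"analysis.py", "analytic_data.csv", "cleaning.py"}


def dependencies(tree, node):
    """Return every file needed, directly or indirectly, to produce `node`."""
    found = []
    for parent in tree.get(node, []):
        if parent not in found:
            found.append(parent)
        for ancestor in dependencies(tree, parent):
            if ancestor not in found:
                found.append(ancestor)
    return found


needed = dependencies(tree, "Table 1")
missing = [name for name in needed if name not in available_files]
```

In this made-up package, the raw data file is the only missing component, so "Table 1" could be reproduced from the analytic data but not from the raw data.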
Researcher degrees of freedom: The flexibility a researcher has in data analysis and data cleaning to, consciously or unconsciously, make certain analytical choices to obtain a desired result (most commonly to achieve statistical significance). This can take a number of forms, including specification searching (also referred to as “p-hacking”), covariate adjustment, selective reporting, or hypothesizing after the results are known (HARKing).
Robustness check: A change in a computational choice, in either data analysis or data cleaning, and its subsequent effect on the main estimates of interest. The SSRP distinguishes between reasonable specifications and feasible specifications. Reasonable robustness checks (Simonsohn, Simmons, and Nelson 2019) are (i) sensible tests of the research question, (ii) expected to be statistically valid, and (iii) not redundant with other specifications in the set. The set of feasible robustness checks is defined by all the specifications that can be computationally reproduced. We assume that the specifications already published in the paper are part of the set of reasonable specifications.
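To give a sense of how quickly the set of feasible specifications grows, the sketch below enumerates every combination of a few computational choices. The choices and their labels are hypothetical; deciding which of the resulting specifications are reasonable remains a substantive judgment that code cannot make.

```python
from itertools import product

# Hypothetical computational choices in data cleaning and data analysis.
choices = {
    "outlier_rule": ["keep_all", "trim_top_1pct"],
    "controls": ["none", "age", "age_and_education"],
    "standard_errors": ["robust", "clustered"],
}

# Each specification is one combination of choices (a Cartesian product).
specifications = [
    dict(zip(choices, combination))
    for combination in product(*choices.values())
]

n_specifications = len(specifications)  # 2 * 3 * 2 combinations
```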
Specification searching: Searching blindly or repeatedly through data to find statistically significant relationships. Although not inherently wrong, specification searching done without a plan or without adjusting for multiple hypothesis testing means that test statistics and results no longer hold their traditional meaning, which can result in false positives.
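The false-positive problem can be illustrated with a short calculation. Assuming that all tests are independent and that every null hypothesis is true (both simplifying assumptions), the chance of at least one significant result among k tests at level alpha is 1 - (1 - alpha)^k. The Bonferroni threshold shown here is one common adjustment, included as an example rather than as the method recommended by the SSRP.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise rate near alpha."""
    return alpha / n_tests


rate_one_test = familywise_error_rate(0.05, 1)
rate_twenty_tests = familywise_error_rate(0.05, 20)  # roughly 0.64
adjusted_level = bonferroni_threshold(0.05, 20)
```

With twenty unplanned tests at the 5% level, a "significant" result is more likely than not even when no true relationship exists.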
Trusted digital repository: An online repository whose mission is to provide “reliable, long-term access to managed digital resources to its customers, now and in the future” (see OCLC 2002). Storing data in such a repository is superior to simply posting it on a personal website, since a repository makes the data more easily accessed, less easily altered, and more permanent.

References

Simonsohn, Uri, Joseph P Simmons, and Leif D Nelson. 2019. “Specification Curve: Descriptive and Inferential Statistics on All Reasonable Specifications.” Available at SSRN 2694998.