Chapter 7 Additional resources

  • Coding errors: A coding error occurs when a section of code in the reproduction package executes a procedure that directly contradicts the intended procedure expressed in the documentation (the paper or the code comments). For example, an error occurs if the paper specifies that the analysis is performed on the population of males, but the code restricts the analysis to females only. Please follow the ACRE procedure to report coding errors.
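The males/females example above can be made concrete with a minimal sketch. The records, field names, and the helper `restrict_sample` are all hypothetical and exist only to show what a contradiction between documentation and code looks like.

```python
# Hypothetical records; the field names and values are made up for illustration.
records = [
    {"id": 1, "sex": "male", "income": 100},
    {"id": 2, "sex": "female", "income": 120},
    {"id": 3, "sex": "male", "income": 90},
    {"id": 4, "sex": "female", "income": 110},
]


def restrict_sample(rows, sex):
    """Return only the rows whose 'sex' field equals the requested value."""
    return [row for row in rows if row["sex"] == sex]


# Documentation (paper): "the analysis is performed on the population of males".
documented_sample = restrict_sample(records, "male")

# Coding error: the script restricts to females, contradicting the paper.
coded_sample = restrict_sample(records, "female")

documented_ids = [row["id"] for row in documented_sample]
coded_ids = [row["id"] for row in coded_sample]
print("sample described in the paper:", documented_ids)
print("sample used by the code:", coded_ids)
```

Comparing the two samples side by side is one simple way for a reproducer to confirm that the code and the documentation disagree before reporting the error.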

Create a section with short summaries of great resources for computational reproducibility, and invite readers to contribute.

7.1 Some summaries

7.1.1 Summary on reproducible workflow (Chapter 11) from Christensen, Freese, and Miguel (2019):

Workflow practices, coding practices, and version control are three tools to make your work more reproducible:

Folder organization

  • Create a master folder with a descriptive name for the project, which should contain:
    • separate folders for programming script files, raw data, edited data, output, final paper or article text
    • a README file: description of contents of each folder, and installation and operating instructions for a reproducer
  • Keep raw data intact: any edits or datasets generated using raw data should be stored in a “data” folder separate from the “raw data” folder
  • When naming a directory or file, stick to lowercase letters with underscores (instead of spaces) to avoid cross-operating-system issues
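The folder layout described above could be set up with a short script. This is a minimal sketch under stated assumptions: the subfolder names in `SUBFOLDERS`, the `README.md` filename, the helpers `safe_name` and `create_project`, and the project name are all illustrative choices, not names prescribed by the source.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical folder names; the source only lists the kinds of folders needed.
SUBFOLDERS = ["scripts", "raw_data", "data", "output", "paper"]


def safe_name(name: str) -> str:
    """Lowercase a name and replace runs of whitespace with underscores."""
    return re.sub(r"\s+", "_", name.strip().lower())


def create_project(parent: Path, project_name: str) -> Path:
    """Create the master folder, its subfolders, and a placeholder README."""
    root = parent / safe_name(project_name)
    for sub in SUBFOLDERS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    readme = root / "README.md"
    if not readme.exists():
        readme.write_text("Describe each folder and how to run the code here.\n")
    return root


# Demonstrate in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    project_root = create_project(Path(tmp), "Minimum Wage Replication")
    created = sorted(p.name for p in project_root.iterdir())
    print(project_root.name)
    print(created)
```

Keeping `raw_data` and `data` as separate folders makes it easy to check that no script ever writes into the raw data.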

Efficient and readable programming

  • Leave a record of any changes to the data: write code in the programming environment, instead of modifying data by hand in a spreadsheet or relying on point-and-click options
  • Include comments in code to explain changes, and save intermediate datasets used in analysis
  • Give variables names that are informative to future reproducers
  • Use relative directory paths, not absolute paths, so the work can be more easily reproduced from different computers.
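The practices above can be combined in one small cleaning script, sketched here. The file names (`survey.csv`, `survey_clean.csv`), the variable names, and the helper `clean_data` are hypothetical. The sketch records a data edit in code, comments on it, saves the intermediate dataset, and builds every path relative to a project root.

```python
import csv
import tempfile
from pathlib import Path


def clean_data(project_root: Path) -> Path:
    """Read raw data, apply a documented edit, and save an intermediate dataset."""
    # Paths are built relative to the project root, never hard-coded absolutes.
    raw_path = project_root / "raw_data" / "survey.csv"
    clean_path = project_root / "data" / "survey_clean.csv"

    with raw_path.open(newline="") as f:
        rows = list(csv.DictReader(f))

    # Edit recorded in code: drop respondents with a missing age.
    kept = [row for row in rows if row["age_years"] != ""]

    clean_path.parent.mkdir(parents=True, exist_ok=True)
    with clean_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["respondent_id", "age_years"])
        writer.writeheader()
        writer.writerows(kept)
    return clean_path


# Demonstrate with a small made-up raw file in a temporary project folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "raw_data").mkdir()
    raw_file = root / "raw_data" / "survey.csv"
    raw_contents = "respondent_id,age_years\n1,34\n2,\n3,51\n"
    raw_file.write_text(raw_contents)

    output_path = clean_data(root)
    relative_output = output_path.relative_to(root).as_posix()
    cleaned_lines = output_path.read_text().splitlines()
    raw_unchanged = raw_file.read_text() == raw_contents
    print(relative_output)
    print(cleaned_lines)
```

In a real project the root would typically be derived from the location of the script itself rather than a temporary directory, so that the same code runs unchanged on another computer.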

Version control

  • Maintain a written record of work:
    • in a central research log: log activities in a single central file whenever work is done on the project (“which team member writes what code, produces what output, edits which files, and when.”)
    • in script files: “who edited which part of which file when, and why”
    • with a version control system, such as Git: Git records changes made to files, by whom, and when
  • A brief explanation of Git: users add changed files to the staging area, and then commit those changes to the repository (the project folder tracked by Git). With each commit, Git keeps the filename and records the new version of each file from the staging area.
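The add-then-commit sequence can be illustrated with a toy model. This is only a sketch of the idea, not how Git is implemented and not Git's API: the class `ToyRepository`, its attributes, and the author and file names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ToyRepository:
    """A toy model of add/commit; it is not how Git is implemented."""

    working_files: dict = field(default_factory=dict)  # filename -> current content
    staging_area: dict = field(default_factory=dict)   # filename -> staged content
    commits: list = field(default_factory=list)        # recorded history

    def add(self, filename: str) -> None:
        """Copy the current version of a file into the staging area."""
        self.staging_area[filename] = self.working_files[filename]

    def commit(self, author: str, message: str) -> None:
        """Record the staged versions, with who made the change, when, and why."""
        self.commits.append(
            {
                "author": author,
                "when": datetime.now(timezone.utc).isoformat(),
                "message": message,
                "files": dict(self.staging_area),  # snapshot of what was staged
            }
        )


repo = ToyRepository()
repo.working_files["analysis.py"] = "print('v1')"
repo.add("analysis.py")
repo.working_files["analysis.py"] = "print('v2')"  # edited after staging
repo.commit("reproducer_a", "Add first version of analysis script")

latest = repo.commits[-1]
print(latest["author"], latest["message"], latest["files"])
```

The commit records the version that was staged, not the later edit, which is why a file changed after staging has to be added again before the next commit.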

References

Christensen, Garret, Jeremy Freese, and Edward Miguel. 2019. Transparent and Reproducible Social Science Research: How to Do Open Science. University of California Press.