Document with care: README, Metadata, code comments, … to make both others and future you happy 😃

E-mail course: 6 steps towards reproducible research (step 3)

Nov 11, 2022

Hello friend 👋,

Welcome back! In this newsletter we will discuss step 3 of our 6 steps towards reproducible research: Document with care: README, Metadata, code comments, …

Reproducible Research: 6 helpful steps. (1) Get your files + folders in order; (2) Use good names for files, folders, functions, …; (3) Document with care: README, Metadata, code comments, …; (4) Version control code, text, …; (5) Stabilize computing environment and software; (6) Publish your research outputs: Code, data, documents, …

What parts of my research project need documenting?

That’s for you to decide. There is no clear catch-all answer. Here are a few thoughts from my side, though 😉

README

One thing that I always do is add a README text file to each project. In the README I write the most important info about the project: What is it about? Who is involved? Where to find files? How to cite it? Where to find the paper? …
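If you like to start from a template, here is a minimal sketch in R that writes a README skeleton into a project folder. The function name `create_readme`, the file name `README.txt`, and the example title are my assumptions for illustration; the section headings simply repeat the questions above, so adapt them to your project.

```r
## Hypothetical helper: write a README skeleton into a project folder.
## The headings repeat the questions listed above; change them as needed.
create_readme <- function(project_dir, title) {
  readme_path <- file.path(project_dir, "README.txt")

  ## never overwrite an existing README
  if (file.exists(readme_path)) {
    stop("README already exists: ", readme_path)
  }

  lines <- c(
    title,
    "",
    "What is it about?",
    "",
    "Who is involved?",
    "",
    "Where to find files?",
    "",
    "How to cite it?",
    "",
    "Where to find the paper?"
  )
  writeLines(lines, readme_path)
  invisible(readme_path)
}

## Usage: create a README in a fresh temporary folder and print it
project_dir <- tempfile("my_project")
dir.create(project_dir)
readme <- create_readme(project_dir, "My research project")
cat(readLines(readme), sep = "\n")
```

After that, fill in each heading with a sentence or two. A short README that exists beats a perfect one that doesn't.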

Code documentation

In my research projects code plays an important role. That might be different for you → feel free to skip.

To make my code as understandable as possible for others, I use literate programming (mixing text and code to make it easier to read, e.g. RMarkdown) or add clear code comments. When writing functions in R, I additionally use the standardised way to document R functions (via roxygen2).

An example of code comments in R (“#”):

## Load package + data
library("model4you")
data("MathExam14W", package = "psychotools")

## scale points achieved to [0, 100] percent
MathExam14W$tests <- 100 * MathExam14W$tests/26
MathExam14W$pcorrect <- 100 * MathExam14W$nsolved/13

## select variables to be used
MathExam <- MathExam14W[ , c("pcorrect", "group", "tests", "study",
                             "attempt", "semester", "gender")]
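
For functions, roxygen2 documentation goes into special comments (“#'”) directly above the function. Here is a small sketch of what that can look like. The function `scale_to_percent` is a hypothetical example of mine that wraps the scaling step from the snippet above; the tags `@param`, `@return`, `@examples` and `@export` are standard roxygen2 tags.

```r
#' Scale points to percent
#'
#' Converts achieved points to a percentage in the range [0, 100].
#'
#' @param points Numeric vector of points achieved.
#' @param max_points Maximum number of points achievable
#'   (a single positive number).
#'
#' @return Numeric vector of percentages, same length as points.
#'
#' @examples
#' scale_to_percent(c(0, 13, 26), max_points = 26)
#'
#' @export
scale_to_percent <- function(points, max_points) {
  ## basic input checks so that wrong inputs fail early
  stopifnot(is.numeric(points), is.numeric(max_points),
            length(max_points) == 1, max_points > 0)
  100 * points / max_points
}
```

Inside an R package, roxygen2 turns these comments into help pages. In a plain script they are still useful as a consistent way to describe what a function expects and returns.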

Metadata

Metadata is information about your data: the license of the data, who owns it, what information the data contain, …

Many research fields have standards for metadata. If you can’t find one for your field, you can use a common standard (e.g. Dublin Core) or just ask a data manager or librarian at your institution. You can write metadata in a form similar to a README (see e.g. this guide from Cornell University). If you upload your data to a data platform (e.g. Dryad), you won’t have to think about it much, as the platform usually takes care of that (Dryad uses Dublin Core).
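
To give you an idea of what README-style metadata could look like, here is a minimal sketch in R that uses the 15 elements of the Dublin Core Metadata Element Set as field names and writes them to a plain text file. All values are placeholders, and the file name `METADATA.txt` is my assumption; check with your data manager or repository which fields and formats they expect.

```r
## The 15 elements of the Dublin Core Metadata Element Set.
## All values are placeholders (assumptions), not real project data.
metadata <- c(
  title       = "Example dataset title",
  creator     = "Jane Doe",
  subject     = "keywords describing the topic",
  description = "short summary of what the data contain",
  publisher   = "Example University",
  contributor = "John Doe",
  date        = "YYYY-MM-DD",
  type        = "Dataset",
  format      = "text/csv",
  identifier  = "DOI or URL of the dataset",
  source      = "where the data come from",
  language    = "en",
  relation    = "related paper or dataset",
  coverage    = "time period or region covered",
  rights      = "license of the data"
)

## write one "element: value" line per field, similar to a README
metadata_path <- file.path(tempdir(), "METADATA.txt")
writeLines(paste0(names(metadata), ": ", metadata), metadata_path)
cat(readLines(metadata_path), sep = "\n")
```

Keeping such a file next to the data means the metadata travels with the data, even before you upload anything to a platform.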

Other

Whatever you work on, there might be parts of your research project that are difficult to understand. If you work in a lab, your documentation is a lab notebook. If you do interviews, your documentation may be your interview strategy. Anything that might be useful for others is worth keeping and worth sharing. After all, we all want to build on the work of others in order to make the world a little better.

Your tasks ✅

Check if your current research project already has a README. If not, create one 🙌
Do you write code? Make a habit of writing code comments right when you create the code.
- Will you be coding this week? Start doing it (if you don’t already 😉).
- Won’t be coding this week? Go to a recent script and check if you did a good job. If not, add code comments 💪.
Check out the literature linked in this newsletter issue. Anything in particular you find interesting? Share your newly gained knowledge with your peers 🤓🤓🤓.

Heidi’s Newsletter
