Version control code, text,… and stop worrying 🙌
E-mail course: 6 steps towards reproducible research (step 4)
Hi friend 🤗
Version control is a big topic for me. It completely changed the way I work. I am happy that we get to talk about version control as part of this e-mail course.
Step 4: Version control code, text, …
What is version control? Let’s say you are writing a paper. You will edit your paper and might want to keep different versions of it. A common way to handle that is to use a different file name for each version (think paper_v1, paper_v2, paper_final, paper_final_really).
This way of “version control” is outdated and error-prone. The most common proper version control system today is Git, which I’d like to introduce to you now.
Git for version control
Git is free and open source 😃🙌
With Git you can track different versions of your paper. For each version you can add a description (“commit message”) and you even automatically track who made which change if you are working in a group. You can always go back to old versions.
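Here is a minimal sketch of what that can look like on the command line. The folder name paper-demo, the file paper.md and the author details are made up for illustration, and everything happens in a temporary directory, so treat it as a playground rather than a recommended setup.

```shell
set -eu

# Work in a throwaway directory so nothing on your machine is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create a new, empty version database (a "repository").
git init --quiet paper-demo
cd paper-demo

# Git records who made each change; these are placeholder values.
git config user.name "Ada Example"
git config user.email "ada@example.org"

# First version of the paper, saved with a commit message.
echo "# My paper" > paper.md
git add paper.md
git commit --quiet -m "Add title"

# Second version.
echo "Introduction: first draft." >> paper.md
git add paper.md
git commit --quiet -m "Draft the introduction"

# List all versions with author and commit message, newest first.
git log --format="%an: %s"

# Look at the file as it was one version ago.
git show HEAD~1:paper.md
```

The last two commands are the interesting ones: git log shows the history with authors and descriptions, and git show prints an old version without changing your current file.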
With Git, the version database lives both on your computer and on a server. To move changes between the two you use commands: pull downloads changes from the server, push uploads your changes to the server.
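To get a feeling for push and pull without needing an account anywhere, here is a small sketch that runs entirely on one computer. It assumes a local folder called server.git standing in for the server, and two made-up collaborators, Alice and Bob, each with their own copy. With GitLab or GitHub the server would be a URL instead of a folder.

```shell
set -eu

# Work in a throwaway directory so nothing on your machine is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the server: a bare repository in a local folder.
git init --quiet --bare server.git

# Alice gets her own copy of the version database and adds a first version.
git clone --quiet server.git alice
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.org"
echo "Shared notes" > notes.md
git add notes.md
git commit --quiet -m "Start shared notes"
git push --quiet origin HEAD      # upload to the stand-in server
cd ..

# Bob gets his own copy, which already includes Alice's first version.
git clone --quiet server.git bob

# Alice adds a second version and uploads it.
cd alice
echo "Idea: try a larger sample" >> notes.md
git commit --quiet -am "Add idea about sample size"
git push --quiet origin HEAD
cd ..

# Bob downloads the new version from the stand-in server.
cd bob
git pull --quiet
git log --format="%an: %s"
```

After the pull, Bob's log should list both of Alice's commits, and his notes.md contains her new line.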
Most researchers use GitLab or GitHub as platforms for working with Git. They host the server side for you and give you a neat web front end for it. On top of that, they offer extra features for collaboration (e.g. issues, wikis, …).
Learning Git can be daunting 🙀. I recommend learning it with a group or in a class. I am always happy to teach version control (get in touch!). You can also check if there is a free Software Carpentry class in your area.
Other version control systems
There are many other ways of doing version control out there.
Subversion: Simpler, centralized systems like Subversion are used less these days because Git offers more flexibility.
Google Docs and friends: Many online editors and cloud storage services (Google Docs, OneDrive, …) offer versioning now. It is not as advanced and versatile as Git, but it is a nice way to keep working in a WYSIWYG (What You See Is What You Get) editor. Git works best with plain text files: it can store other file types too, but it cannot show meaningful line-by-line changes for them. That is why people usually write in LaTeX or Markdown (not WYSIWYG) when using Git.
Versioning data: Version control of data is a difficult task. Let’s leave that for another newsletter. See here for more info for now.
Your task
Today’s task is best done together with a peer. Getting started with Git can be difficult, but with a friend and a cup of tea 🍵 it is possible. Also I promise: it’s worth it and gets easier over time 😌.
Install Git on your computer, create an account on GitLab or GitHub, and start your first repository. For R users, I recommend following these instructions. Everyone else, please check the “Further reading” links below.
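If you like to see the rough shape of it first, here is a small sketch of the local part of the task. The folder name first-repo and the URL are placeholders I made up; use the address that your own GitLab or GitHub project shows you. Nothing is sent over the network here.

```shell
set -eu

# Check that Git is installed; this prints the installed version.
git --version

# Start a first repository in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet first-repo
cd first-repo

# Tell the local repository where its server copy lives.
# The URL is a placeholder, not a real project.
git remote add origin https://example.com/your-name/first-repo.git

# Show the configured server address.
git remote -v
```

The instructions linked in this section cover the account setup and the first upload.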
Further reading
Version Control, The Turing Way
Version Control with Git, Software Carpentry
Version Control with Git (for R users), Anna Krystalli
Set up Git with RStudio & GitLab, Heidi Seibold
That’s all for today. Next time we’ll look into how to stabilize your computing environment and software in order to make your research reproducible.
Cheers,
Heidi