Version control with Git

Fundamentals

by: Marco and Valeria

Why version control?

Has this ever happened to you?

Your project is to produce a report, and you take the initiative to start it.

Natural instinct: create the first version and name it "v1".

problematic_workflow1

You send it to Ana and Bob for their parts and revision, while you also keep working on it.

Natural instinct: Ana and Bob work on their parts and save their copies under different names.

problematic_workflow2

The project now exists in 3 different versions!

All of them at different stages of progress.

problematic_workflow3

You or someone else has to put them all together manually!

This will probably lead to a lot of confusion and to errors!

problematic_workflow4

Now what? A better approach

  • Define and design the project's structure: which parts it has and how they interact

  • Avoid creating and storing countless copies of the same file(s)

  • Document changes as the files evolve, and go back to previous versions when needed

  • Collaborate effectively with other people on shared projects



"Never lose control on what you’re doing" - MV

What is version control?

  • A robust way to structure and organize your projects

  • A system that lets you track changes and merge files effectively

  • A hub to connect project developers

  • A platform to share your projects locally (a folder on your computer) and remotely



Version control is mostly used in the context of software development

But it can be used in the development of any well-structured digital project: e.g. collaborative reports, academic publications, etc.

Here’s how it works!
We start with the original version of the file. Each feature is colored differently.

correctworkflow1

Each user gets their own version.

correctworkflow2

They make their modifications. Each collaborator will work only on a previously agreed feature.

correctworkflow3

When Ana finishes, her work updates the existing version of the file.

correctworkflow4

When Bob finishes his work, we update the existing file, leading us to the final version of the document.
Everything in one single file!

correctworkflow5

Which version control system?

Version control systems

Git, Mercurial, Apache Subversion (SVN)

Hosting services for remote repositories
  • GitHub

  • GitLab

  • Bitbucket

  • SourceForge

  • Launchpad

Git

In British slang, a "git" is an unpleasant or contemptible person

"I’m an egotistical bastard, and I name all my projects after myself. First 'Linux', now 'git'."

- Linus Torvalds, creator of Linux

Git became the most widely used open-source version control system

Git

Use Git through a command window

git terminal

Use Git through user friendly interfaces

git_GUI

Basic Workflow

How can we properly and cleanly work with Git?

Time to try it out!

Start a repository: basic jargon

  • Repository (repo): project directory (project folder). This directory is version-controlled

  • Local repo: the copy of the repository that lives on your local machine

  • Remote repo: the copy of the repository that lives on a remote machine or server

Start a repository

basic workflow 1

Either clone an existing remote repository, or create a new one locally and push it to a remote
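Both ways of starting a repository can be sketched in a terminal. This is a minimal sketch, not a definitive recipe: the folder names (report, report-ana), the identity values and the file README.md are hypothetical, and a bare repository on the local disk stands in for a hosted remote so the example runs without a network connection.

```shell
set -eu
demo="$(mktemp -d)"                        # throwaway sandbox directory

# Stand-in for a remote server (assumption): a bare repository on local disk.
git init --quiet --bare "$demo/remote.git"
git -C "$demo/remote.git" symbolic-ref HEAD refs/heads/main

# Option A: create a new repository locally and push it to the remote.
mkdir "$demo/report"
cd "$demo/report"
git init --quiet
git config user.name "You"                 # identity stored only in this repo
git config user.email "you@example.com"
echo "# Report" > README.md
git add README.md
git commit --quiet -m "Start the report"
git branch -M main                         # call the default branch "main"
git remote add origin "$demo/remote.git"
git push --quiet -u origin main            # publish and remember the upstream

# Option B: clone an existing remote repository (here, the one from option A).
git clone --quiet "$demo/remote.git" "$demo/report-ana"
ls "$demo/report-ana"                      # the clone contains README.md
```

With a hosted service, the path "$demo/remote.git" would be replaced by the URL the service shows for your repository.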

Start a repository pt.2

How to start a repository video tutorial. Video only available during the workshop

While working on the project: basic jargon

  • Fetch: download remote history to your local repository to see what others have done.

  • Branch: a movable pointer to a commit that tracks one line of development in the repo. Several branches can be created to develop different parts of the project separately; they can later be merged into a main branch.

  • Pull: download remote branch to your local branch to integrate what others have done. (pull = fetch + merge)

  • Commit: a documented checkpoint of the status of the repository

  • Push: upload the commits of a local branch to the corresponding branch of the remote repository
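The five terms above map onto five commands. The sketch below is one possible session under stated assumptions: the collaborator "Ana", the branch name results, the file report.txt and the local bare repository acting as the remote are all hypothetical, and the setup part exists only so the example is self-contained.

```shell
set -eu
demo="$(mktemp -d)"                        # throwaway sandbox directory

# --- Setup (assumption): a bare repo on disk plays the remote server ---
git init --quiet --bare "$demo/remote.git"
git -C "$demo/remote.git" symbolic-ref HEAD refs/heads/main
git init --quiet "$demo/you"
cd "$demo/you"
git config user.name "You"
git config user.email "you@example.com"
echo "Introduction" > report.txt
git add report.txt
git commit --quiet -m "Add introduction"
git branch -M main
git remote add origin "$demo/remote.git"
git push --quiet -u origin main

# --- Setup: Ana clones the project and publishes a change on main ---
git clone --quiet "$demo/remote.git" "$demo/ana"
cd "$demo/ana"
git config user.name "Ana"
git config user.email "ana@example.com"
echo "Methods" >> report.txt
git commit --quiet -am "Add methods section"
git push --quiet origin main

# --- Your working session ---
cd "$demo/you"
git fetch --quiet origin                   # Fetch: download history, local files untouched
behind_after_fetch="$(git rev-list --count main..origin/main)"
git pull --quiet origin main               # Pull: fetch + merge into your local main
behind_after_pull="$(git rev-list --count main..origin/main)"
git checkout --quiet -b results            # Branch: start a separate line of work
echo "Results" >> report.txt
git commit --quiet -am "Add results section"   # Commit: documented checkpoint
git push --quiet -u origin results         # Push: publish the branch to the remote
echo "behind after fetch: $behind_after_fetch, after pull: $behind_after_pull"
```

The two counters show the difference between fetching and pulling: after the fetch your main still lacks Ana's commit, and after the pull it does not.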

While working on the project

basic workflow 1

While working on the project pt.2

Good workflow with Git video tutorial. Video only available during the workshop

Merging: basic jargon

  • Merge: integrate one branch into another branch

  • Merge conflict: an inconsistency between 2 branches that are being merged, where the same files have different changes at the same lines. When a merge conflict occurs, resolve the conflict, then add and commit.

Merging

basic workflow 1

When you are done with a feature, merge the branch back into main
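A merge conflict and its resolution can be reproduced on purpose in a sandbox. In this sketch the branch name ana, the file report.txt and its contents are hypothetical; two branches edit the same line so that Git cannot merge them automatically.

```shell
set -eu
demo="$(mktemp -d)"                        # throwaway sandbox directory
git init --quiet "$demo/report"
cd "$demo/report"
git config user.name "You"
git config user.email "you@example.com"
echo "Title: draft" > report.txt
git add report.txt
git commit --quiet -m "Start report"
git branch -M main

# One branch changes the title...
git checkout --quiet -b ana
echo "Title: Ana's version" > report.txt
git commit --quiet -am "Ana edits the title"

# ...and main changes the very same line.
git checkout --quiet main
echo "Title: Bob's version" > report.txt
git commit --quiet -am "Bob edits the title"

# Merge: Git stops and reports a conflict (non-zero exit status).
conflict="no"
git merge ana || conflict="yes"
git status --short                         # "UU report.txt" marks the conflicted file

# Resolve: write the agreed content, then add and commit.
echo "Title: agreed version" > report.txt
git add report.txt
git commit --quiet -m "Merge branch 'ana' into main"
git log --oneline --graph                  # history now shows the merge commit
```

In a real conflict you would open the file, choose between the sections Git marks with <<<<<<<, ======= and >>>>>>>, and delete the markers before running git add.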

Examples / Applications of Git

Python package development: packages stored in remote repositories to distribute them with conda

is number github
is number conda

R package development: packages stored in remote repositories to distribute them with install_github

ggplot2 github
Personal projects: publications, collaborative academic projects, coding challenges, hobbies, …

Figure labels: Ana's local computer, Bob's local computer, Remote

Create blogs or personal websites

website

Good practices for Git pt.1

  • Define a solid structure of your project before starting a repository

  • Add a README file to the repository with a detailed description of the project

  • Learn to read Git's output when you run commands such as pull and push

  • Expect to make mistakes, and read the error messages carefully when you do

Good practices for Git pt.2

  • Create effective branches: what feature? which contributor?

  • Before you start working: git fetch and git pull/merge to work on the latest version

  • Commit frequently and descriptively, then push accordingly

  • Forgot what you are doing? Run git status to see the current state of your files

  • Something unexpected? Run git log to review the commit history

Don't panic, you can always create branches and reset to previous commits when things go really wrong
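As a hedged illustration of checking where you are and stepping back, the sketch below uses git status, git log and git reset in a sandbox. The file names, commit messages and the branch name backup are hypothetical. Note that git reset --hard discards uncommitted changes to tracked files, so the sketch first keeps a branch pointing at the current state.

```shell
set -eu
demo="$(mktemp -d)"                        # throwaway sandbox directory
git init --quiet "$demo/report"
cd "$demo/report"
git config user.name "You"
git config user.email "you@example.com"
echo "Good paragraph" > report.txt
git add report.txt
git commit --quiet -m "Add good paragraph"
echo "Bad paragraph" >> report.txt
git commit --quiet -am "Add a paragraph we will regret"

echo "scratch" > notes.txt
git status --short                         # "?? notes.txt": a file Git is not tracking yet
git log --oneline                          # two commits, newest first

git branch backup                          # safety net: keep a pointer to the current state
git reset --quiet --hard HEAD~1            # move the current branch back one commit
cat report.txt                             # only the good paragraph remains
```

The commit that was reset away is still reachable through the backup branch, so nothing is lost until that branch is deleted.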

Be curious, Git is your friend to keep control!

git thank-you!