GitHub

In this lab, we will learn how to use GitHub for version control.

Goal: by the end of this lab, you will be able to commit, push, pull, and send pull requests.


What is version control?

Version control is a mechanism for collaborative software development that preserves histories. The major objective is to keep track of all the different changes that get made, so that nothing is lost and you can always go back to any previous state.

Version control systems have been in use for a long time, and many different systems have been used. Currently, git is the dominant version control system. git is a standalone command line application. Interfaces to git include an official git GUI, and a built-in tab in RStudio.

Most of the things we do with GitHub can be done in RStudio, but it is occasionally necessary to use the command line (and you should embrace that!).

GitHub is a website that hosts many git projects. We will be using GitHub extensively, mainly through our dedicated GitHub organization.

For questions about git and GitHub, please see Jenny Bryan’s excellent book on the subject: Happy Git and GitHub for the useR. In particular, please read the troubleshooting chapter when you run into trouble!

In this lab, we will focus on the use of two functions, as detailed here:

Verifying your connection to GitHub

The git_sitrep() function provides comprehensive information about the status of your connection to GitHub.

library(usethis)
git_sitrep()
── Git global (user) 
• Name: "Quarto GHA Workflow Runner"
• Email: "quarto-github-actions-publish@example.com"
• Global (user-level) gitignore file:
• Vaccinated: FALSE
ℹ See `usethis::git_vaccinate()` to learn more.
• Default Git protocol: "https"
• Default initial branch name: <unset>
── GitHub user 
• Default GitHub host: "https://github.com"
• Personal access token for "https://github.com": <discovered>
✖ Can't get user information for this token.
ℹ GitHub API error (403): Resource not accessible by integration
── Active usethis project: "/home/runner/work/sds270/sds270" ──
── Git local (project) 
• Name: "Quarto GHA Workflow Runner"
• Email: "quarto-github-actions-publish@example.com"
• Default branch: "master"
• Current local branch → remote tracking branch:
  "master" → "origin/master"
── GitHub project 
• Type = "theirs"
• Host = "https://github.com"
• Config supports a pull request = FALSE
• origin = "beanumber/sds270" (can not push)
• upstream = <not configured>
! The only configured GitHub remote is "origin", which you cannot push to.
ℹ If your goal is to make a pull request, you must fork-and-clone.
ℹ `usethis::create_from_github()` can do this.
ℹ Read more about the GitHub remote configurations that usethis supports at:
  <https://happygitwithr.com/common-remote-setups.html>.

If you see errors in your output, investigate them!
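The name and email at the top of that output come from your git configuration. If you want to check them directly from the Terminal, here is a minimal sketch. The variable names and the "(not set)" placeholder are just choices made for this example.

```shell
# Read the user-level (global) identity that git attaches to your commits.
# "|| true" keeps the script going when a value has never been configured.
name=$(git config --global --get user.name || true)
email=$(git config --global --get user.email || true)

# Print each value, or a placeholder if it is empty.
echo "Name:  ${name:-(not set)}"
echo "Email: ${email:-(not set)}"
```

If either line says (not set), git does not yet know who you are at the global level, and that is worth fixing before you commit anything.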

Setting up your GitHub token

First, you need to obtain a token from GitHub, if you don’t have one already. Try running:

gitcreds::gitcreds_get()
<gitcreds>
  protocol: NA
  host    : NA
  username: NA
  password: <-- hidden -->

If you get an error or other unexpected output, you need to tell RStudio about your GitHub token. The token is a long random string of characters that starts with ghp_. If you have the token already, run gitcreds::gitcreds_set() to set it.

gitcreds::gitcreds_set()

If you don’t have the token already, run usethis::create_github_token() to create one, and then use gitcreds::gitcreds_set() to set it.

usethis::create_github_token()

Please see this article for comprehensive documentation.

The gh_token_help() function is also helpful for diagnosing issues with your token. Note the “Token scopes” in the output below.

gh_token_help()
• GitHub host: "https://github.com"
• Personal access token for "https://github.com": <discovered>
✖ Can't get user information for this token.
ℹ GitHub API error (403): Resource not accessible by integration

Making a contribution

In this first group exercise, each student will work individually to send a pull request to the maintainer (me) of a single repository. When you make a contribution to someone else’s repo, this is how you will do it. (See also https://happygitwithr.com/fork-and-clone.html)

Setting up the local repo

  1. Run usethis::create_from_github("sds270-s24/ourpackage", fork = TRUE)
Caution

READ THE MESSAGES!!

STOP. If Step 1 worked, proceed to remote verification. If Step 1 failed, pursue the following steps as necessary. THINK before you act!

  1. Fork the ourpackage repo on GitHub.
  2. Clone your fork and make a new project in RStudio.
  3. Set up your upstream remote
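
If you end up doing step 3 by hand, the following is a minimal sketch of one way to do it from the Terminal. The function name ensure_upstream is made up for this example, and the URL is an assumption built from the sds270-s24/ourpackage name used with create_from_github() above. Paste the function into the Terminal, then call it from inside your clone.

```shell
# Add an "upstream" remote that points at the class repo, unless one exists.
# Run this from inside your local clone of ourpackage.
ensure_upstream() {
  # Assumed URL of the original (source) repository.
  url="https://github.com/sds270-s24/ourpackage.git"

  if git remote get-url upstream >/dev/null 2>&1; then
    echo "upstream already configured"
  else
    git remote add upstream "$url"
    echo "added upstream -> $url"
  fi

  # List all remotes so you can check the result yourself.
  git remote -v
}
```

Calling ensure_upstream a second time should just report that upstream is already configured, so it is safe to run more than once.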

Remote verification

By now, you should have your fork set up at https://github.com/YOUR-GITHUB-USERNAME/ourpackage. Next, we will verify that you have your upstream remote set up as well.

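  1. Run git remote -v in the Terminal (not in the R console). You should see something like this:
origin  https://github.com/beanumber/sds270 (fetch)
origin  https://github.com/beanumber/sds270 (push)

This example lists only an origin remote. In your own clone, origin should point to your fork, and you should also see two upstream lines (fetch and push) pointing to sds270-s24/ourpackage.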

Making changes

  1. Run usethis::pr_init("<NAME>"), where in place of <NAME> you write your one-word name for the new branch (all lowercase, no spaces or punctuation).
  2. Add your first and last name, with a link to your GitHub user page, to README.md.
  3. Commit your changes.
  4. Push.
  5. Run usethis::pr_push() to send a pull request.
Caution

STOP

I will resolve all pull requests.
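
For those embracing the command line: steps 1, 3, and 4 under Making changes correspond roughly to the git commands sketched below. The function name commit_and_push and its default commit message are hypothetical, and the sketch assumes your fork is the remote named origin and that you have already edited README.md. It does not open the pull request, so you still need step 5 for that.

```shell
# Create (or switch to) a branch, commit README.md, and push the branch.
# Usage: commit_and_push BRANCH ["commit message"]
commit_and_push() {
  branch="$1"
  msg="${2:-Add my name to README}"

  # Step 1: create the branch, or switch to it if it already exists.
  # Uncommitted edits in your working directory come along with you.
  git checkout -b "$branch" 2>/dev/null || git checkout "$branch" || return 1

  # Step 3: stage only the file you changed, then commit it.
  git add README.md &&
  git commit -m "$msg" &&

  # Step 4: publish the branch to your fork and remember where it lives.
  git push -u origin "$branch"
}
```

For example, commit_and_push addname would leave you on a branch called addname with your README.md change committed and pushed to origin.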

Collaborating on a project

Sync your fork

You need to sync your fork regularly (like every time you finish a pull request). Entering this command in the Terminal should do it.

git pull upstream main

Note that usethis::pr_init() attempts to perform this operation, so if you are able to use pr_init(), you may not have to sync your fork manually via the command line.
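
If you do sync by hand, here is a slightly fuller sketch of the whole round trip. The function name sync_fork is made up for this example. It assumes remotes named origin (your fork) and upstream (the class repo). Two things go beyond the single command above: --ff-only makes the pull refuse to create a merge commit if your local branch has drifted, and the final push updates your fork on GitHub.

```shell
# Bring the local default branch up to date with upstream, then update the fork.
# Usage: sync_fork [BRANCH]   (BRANCH defaults to "main")
sync_fork() {
  branch="${1:-main}"   # pass "master" if that is the default branch

  # Work on the default branch, not on a pull request branch.
  git checkout "$branch" &&

  # Fetch upstream and fast-forward; stop instead of merging if that fails.
  git pull --ff-only upstream "$branch" &&

  # Copy the updated branch to your fork.
  git push origin "$branch"
}
```

If the pull stops with an error about not being able to fast-forward, your local default branch has commits that upstream does not. That is a sign that some work was committed there instead of on a separate branch.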

Merge conflicts

If two or more people commit changes to the same part of the same file, a merge conflict is inevitable. With good git hygiene and clear project roles, the probability of a merge conflict can be minimized. But they will happen and you need to know how to resolve them.

A side-by-side comparison of the set of changes is helpful. A diff is a way to view these changes. Several editors will perform this comparison. I use meld. Another program is opendiff. You can use whatever you want!
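
The safest place to meet your first merge conflict is a throwaway repository. The sketch below builds one in a temporary directory, has two made-up contributors (alice and Bob, both hypothetical) change the same line of README.md, and then resolves the conflict by hand. Nothing here touches your real projects. The identity it sets applies only to the throwaway repo.

```shell
# Build a throwaway repository in a temporary directory.
demo=$(mktemp -d)
cd "$demo"
git init -q
# Local identity so commits work even on an unconfigured machine.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Starting point: one line in README.md on the default branch.
echo "Contributors: nobody yet" > README.md
git add README.md
git commit -q -m "Add README"
base=$(git rev-parse --abbrev-ref HEAD)   # "main" or "master", depending on your settings

# One person edits the line on a branch called alice.
git checkout -q -b alice
echo "Contributors: Alice" > README.md
git commit -q -am "Add Alice"

# Someone else edits the same line on the default branch.
git checkout -q "$base"
echo "Contributors: Bob" > README.md
git commit -q -am "Add Bob"

# The merge cannot be completed automatically, so git stops and writes
# conflict markers into README.md.
git merge alice || true
cat README.md

# Resolve by making the file say what you actually want, then commit.
echo "Contributors: Alice, Bob" > README.md
git add README.md
git commit -q -m "Merge alice, keeping both names"
git --no-pager log --oneline --graph
```

When the merge stops, README.md contains both versions of the line, separated by marker lines that begin with <<<<<<<, =======, and >>>>>>>. Resolving the conflict means editing the file until the markers are gone and the content is right, which is exactly where a tool like meld helps.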

Getting credit

Please respond to the following prompt on Slack in the #questions channel:

Prompt: What would help improve your comfort level with GitHub?