
Recommended git setup

Erik Kluzek edited this page Jan 22, 2022 · 25 revisions

You do not need a GitHub account to simply clone and run CTSM. However, you will need a GitHub account in order to contribute changes back.

If you will be pushing changes to GitHub (either to your own fork or to shared forks), or if you will be pulling changes from private repositories on GitHub, then it's worth the time to set up ssh keys for each machine you'll be using. By doing so, you won't have to enter your password when pushing to or pulling from GitHub.

See https://help.github.com/articles/connecting-to-github-with-ssh/ for instructions. After doing this, you can use the ssh form of GitHub URLs (e.g., git@github.com:ESCOMP/CTSM.git) in place of the https form.
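Once your key is registered with GitHub, an existing clone can be switched from the https form to the ssh form. The following is a minimal sketch, run in a throwaway repository so that nothing real is touched; the remote name origin is an assumption (use whatever name your clone uses):

```shell
# Throwaway repository standing in for an existing https clone
demo=$(mktemp -d)
git init -q "$demo"
git -C "$demo" remote add origin https://github.com/ESCOMP/CTSM.git

# Point the remote at the ssh form of the same URL
git -C "$demo" remote set-url origin git@github.com:ESCOMP/CTSM.git

# Confirm the change
git -C "$demo" remote -v
```

In a real clone, drop the -C "$demo" part. Running ssh -T git@github.com afterwards is a quick way to check that GitHub accepts your key.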

You can configure other GitHub settings by clicking on your profile picture in the upper-right of any GitHub page, then clicking on "Settings".

You can get email and/or web-based notifications for two types of activity:

  1. Issues, Pull Requests and other activity where you are mentioned by name
  2. Issues, Pull Requests and other activity for repositories, teams and conversations that you have chosen to watch

The number of notifications generated by (2) can be overwhelming, but it can be important that you see notifications generated by (1). Therefore, a recommended configuration is to get email notifications for (1) and just web-based notifications for (2). This can be configured by clicking on "Notifications" on your Settings page:

[Screenshot: Recommended GitHub notification settings]

To see your web-based notifications, click on the bell shown at the top of any GitHub page (it will have a blue dot when there are new notifications).

There are a number of required settings and a number of other helpful settings to streamline your git experience. These settings need to be made once per machine. They apply for all git repositories that you work with on that machine.

The following settings are used to add author information to your commits:

git config --global user.name "Your Name"

git config --global user.email you@example.com

The email you use here should be an email address associated with your GitHub account (see GitHub's help on setting your email address for more details).
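To confirm that both values were recorded, you can read them back. This is only a sketch: report_git_identity is a hypothetical helper name, not a git command, and it reads only your global configuration:

```shell
# Print each identity setting, or flag it if it has not been set yet.
# report_git_identity is a hypothetical helper name for this sketch.
report_git_identity() {
    for key in user.name user.email; do
        # "git config --get" exits non-zero when the key is unset
        value=$(git config --global --get "$key" 2>/dev/null || true)
        if [ -n "$value" ]; then
            echo "$key = $value"
        else
            echo "$key is NOT set"
        fi
    done
}

report_git_identity
```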

Also see the section below on LFS, which is required to build the documentation or run our python tests, but not required for just running the main model.

The Git LFS tool provides support for managing large files. For most use and development of CTSM, you will not need this. However, we use Git LFS to:

  1. Manage image files for the documentation (relevant if you want to build the documentation or change or add image files)
  2. Manage NetCDF inputs used to test some of our python code (relevant if you want to run tests from the python directory)

Installing Git LFS on your machine is a two-step process; step (1) needs to be done once per machine, and step (2) needs to be done once per user:

  1. Install the Git LFS tool: Follow the instructions on the Git LFS page for installing Git LFS on your platform.
    • On cheyenne, Git LFS is already available as long as you are using a git module rather than the default system-level git. So just make sure that you are always using git via a git module (module load git).
    • On a Mac using homebrew, this can be done with brew install git-lfs.
  2. Set up your git configuration to use Git LFS by running: git lfs install (this will add a few lines to your .gitconfig file in your home directory). (If you're not sure whether you have already done this, it is safe to rerun that command to be sure.)

For more information on using Git LFS with CTSM, you can search for "lfs" on the Wiki page on editing the documentation, and the README file in the python testinputs directory.

NOTE: If you have run git lfs install on a given machine, then you can get errors when trying to run git commands if the Git LFS tool is no longer available. Solutions are to either reinstall the Git LFS tool or run git lfs uninstall. For example, this can happen in the following scenario:

  • You have loaded a git module that includes an installation of Git LFS
  • You then run git lfs install
  • Later, you are working on the same system, but no longer have the git module loaded, so the Git LFS tool is no longer available

In this case, you can solve the problem either by loading the appropriate git module or by removing the [filter "lfs"] section from your .gitconfig file in your home directory (though in the latter case, you will not be able to obtain the large files stored in the CTSM repository via Git LFS).
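A quick way to tell which situation you are in is to check the two installation steps separately. The sketch below only reads state and changes nothing. It assumes that git lfs install recorded a filter.lfs.clean entry in your global configuration, which is part of the [filter "lfs"] section mentioned above; the variable names are made up for illustration:

```shell
# Step 1: is the Git LFS tool itself available right now?
if git lfs version >/dev/null 2>&1; then
    lfs_tool="available"
else
    lfs_tool="missing"
fi

# Step 2: has "git lfs install" been run for this user?
if git config --global --get filter.lfs.clean >/dev/null 2>&1; then
    lfs_config="configured"
else
    lfs_config="unconfigured"
fi

# Combine the two answers into a suggestion
case "$lfs_tool/$lfs_config" in
    available/configured)   lfs_advice="Git LFS is ready to use" ;;
    available/unconfigured) lfs_advice="run: git lfs install" ;;
    missing/configured)     lfs_advice="stale configuration: load the git module or remove the [filter \"lfs\"] section" ;;
    *)                      lfs_advice="Git LFS is not set up (fine unless you need the documentation or python tests)" ;;
esac
echo "$lfs_advice"
```

If you choose to remove the section, git config --global --remove-section filter.lfs does the same thing as editing .gitconfig by hand.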

You can set which editor to use for log messages, etc., with:

git config --global core.editor [editor of your choice: emacs, vi, vim, etc]

(See http://swcarpentry.github.io/git-novice/02-setup/ for specific settings to use for many common editor choices.)

The following setting often produces more readable diffs and patches than git's default diff algorithm:

git config --global diff.algorithm histogram

The following setting makes it easier to resolve conflicts if you're doing conflict resolution by hand as opposed to with a dedicated conflict resolution tool. (However, for non-trivial conflicts, I recommend configuring git to use a graphical tool like kdiff3 for conflict resolution.) Without this setting, you'll just see your version and the version you're merging in delimited by conflict markers. With this setting, you'll also see the common ancestor of the two sides of the merge, which can make it much easier to figure out how to resolve the conflict:

git config --global merge.conflictstyle diff3
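To see what this setting changes, you can provoke a small conflict in a throwaway repository. This is a minimal sketch: the file name, branch name and identity values are made up for the demo, and the setting is applied only locally so your global configuration is untouched:

```shell
demo=$(mktemp -d)
git init -q "$demo"
git -C "$demo" config user.name "Demo User"          # local to the throwaway repo
git -C "$demo" config user.email "demo@example.com"
git -C "$demo" config merge.conflictstyle diff3      # local, for the demo only

# Common ancestor
echo "original line" > "$demo/file.txt"
git -C "$demo" add file.txt
git -C "$demo" commit -q -m "Common ancestor"

# One side of the conflict
git -C "$demo" checkout -q -b other
echo "their change" > "$demo/file.txt"
git -C "$demo" commit -q -am "Change on other branch"

# The other side, back on the initial branch
git -C "$demo" checkout -q -
echo "my change" > "$demo/file.txt"
git -C "$demo" commit -q -am "Change on initial branch"

# The merge is expected to stop with a conflict
git -C "$demo" merge other >/dev/null 2>&1 || true
cat "$demo/file.txt"
```

The printed file should contain an extra section, introduced by a line of ||||||| characters, that holds "original line" (the common ancestor) between the two conflicting versions.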

The following setting is unnecessary for Git 2.0 or later, but helps when using an older version. This ensures that only the currently checked-out local branch will be pushed to the remote repository (e.g., your GitHub fork):

git config --global push.default simple

The following settings are mainly useful for integrators - that is, people who are making changes to the master branch.

The following is important when merging a branch to master, in order to create actual merge commits; it is equivalent to specifying --no-ff when doing a merge:

git config --global merge.ff false
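The effect can be checked in a throwaway repository where a fast-forward would otherwise be possible. This is a sketch under assumed names (the feature branch, file name and identity values are made up), with the setting applied locally rather than globally:

```shell
demo=$(mktemp -d)
git init -q "$demo"
git -C "$demo" config user.name "Demo User"
git -C "$demo" config user.email "demo@example.com"
git -C "$demo" config merge.ff false        # local, for the demo only

git -C "$demo" commit -q --allow-empty -m "Initial commit"

# A branch that is strictly ahead of the initial branch
git -C "$demo" checkout -q -b feature
echo "new work" > "$demo/file.txt"
git -C "$demo" add file.txt
git -C "$demo" commit -q -m "Work on feature"
git -C "$demo" checkout -q -

# A fast-forward would be possible here, but merge.ff=false forces a merge commit
git -C "$demo" merge -q --no-edit feature

# rev-list --parents prints the commit followed by its parents;
# a merge commit has two parents, so three hashes in total
parent_count=$(git -C "$demo" rev-list --parents -n 1 HEAD | wc -w)
echo "hashes on the HEAD line: $parent_count"
```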

The following makes git pull refuse to proceed unless it can fast-forward, so the pull fails (rather than creating a merge commit) if your local branch has commits that are not in its upstream. This is particularly important when working with master, where the typical workflow is to run git pull to update your local copy of master:

git config --global pull.ff only

NOTE: The above settings assume that you use git merge when merging two different branches together, and git pull when merging a branch's upstream (remote) copy into the local copy. So, with these settings, you should NOT do something like git merge escomp/master to merge the upstream master branch into your local copy: instead, you should always use git pull for that scenario.
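The behavior of pull.ff only can be reproduced with two local clones of a throwaway bare repository. This is a sketch, not a recommended workflow: the directory names seed and work and the identity values are made up, and the setting is applied locally to the work clone only:

```shell
demo=$(mktemp -d)
git init -q --bare "$demo/upstream.git"

# "seed" plays the role of another developer pushing to the shared repository
git clone -q "$demo/upstream.git" "$demo/seed" 2>/dev/null
git -C "$demo/seed" config user.name "Demo User"
git -C "$demo/seed" config user.email "demo@example.com"
git -C "$demo/seed" commit -q --allow-empty -m "First upstream commit"
git -C "$demo/seed" push -q origin HEAD

# "work" is your local clone
git clone -q "$demo/upstream.git" "$demo/work"
git -C "$demo/work" config user.name "Demo User"
git -C "$demo/work" config user.email "demo@example.com"
git -C "$demo/work" config pull.ff only

# Upstream moves on while the local branch gains its own commit: they diverge
git -C "$demo/seed" commit -q --allow-empty -m "Second upstream commit"
git -C "$demo/seed" push -q origin HEAD
git -C "$demo/work" commit -q --allow-empty -m "Local-only commit"

# With pull.ff=only, the pull is refused instead of creating a merge commit
if git -C "$demo/work" pull -q 2>/dev/null; then
    pull_result="merged"
else
    pull_result="refused"
fi
echo "git pull was $pull_result"
```

When the local branch has no commits of its own, the same git pull simply fast-forwards to the upstream state.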

There are two helpful things you can do to streamline your use of git if you use the bash shell. These are completely optional, but improve your git experience. These are also documented here: https://git-scm.com/book/en/v2/Appendix-A%3A-Git-in-Other-Environments-Git-in-Bash

With bash, many command-line tools support smart tab-completion, so that (for example) if you type git ch[Tab], you will get a list of possible completions; if there is only one possible completion (e.g., git chec[Tab]), it will be filled in for you.

On a Mac, if you use homebrew, you can install bash completion for many commands with brew install bash-completion; other systems probably offer something similar.

To just install bash completion for git, you can copy https://github.com/git/git/blob/master/contrib/completion/git-completion.bash to some location on your system, then add the following to your .bashrc file:

source /path/to/git-completion.bash
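If you share one .bashrc across several machines, it can help to guard the source command so that a missing file does not produce an error at login. A minimal sketch, assuming the script was saved as ~/.git-completion.bash (a hypothetical location; the variable name is also made up):

```shell
# Hypothetical location; adjust to wherever you saved the file
git_completion="$HOME/.git-completion.bash"

# Only load the script if it is actually present on this machine
# ("." is the portable spelling of "source")
if [ -f "$git_completion" ]; then
    . "$git_completion"
fi
```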

It can sometimes be hard to keep track of the state of your working copy: you could have changes that are committed but not yet pushed, changes that have been staged but not committed, and/or changes that have not yet been staged. There are git commands to show you these things, but it can be more convenient to have that information right in front of you in your terminal at all times. This is where git-prompt comes in.

For example, I have configured my prompt to look like this:

[Screenshot: git prompt - clean state]

or:

[Screenshot: git prompt - dirty state]

The magenta line shows the git information. This shows:

  • What branch I'm on (master)
  • Whether there are changes that have not yet been staged with git add, indicated by a *
  • Whether there are changes that have been staged but not yet committed, indicated by a +
  • If the current branch has an upstream tracking branch: Whether this branch is up-to-date with the upstream branch (indicated by =), ahead of the upstream branch (indicated by >: in this case, you should do a git push at some point), or behind the upstream branch (indicated by < - for example, if you have fetched from the upstream but not yet merged into your local branch). (It's also possible to see <>, in which case there are some commits that exist only in your local branch and some other commits that exist only in the upstream branch; this can happen after a rebase.)
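The same information is available from git status when you want to check by hand. A minimal sketch in a throwaway repository (the file name and identity values are made up for the demo):

```shell
demo=$(mktemp -d)
git init -q "$demo"
git -C "$demo" config user.name "Demo User"
git -C "$demo" config user.email "demo@example.com"

echo "first" > "$demo/file.txt"
git -C "$demo" add file.txt
git -C "$demo" commit -q -m "Initial commit"

echo "second" >> "$demo/file.txt"
git -C "$demo" add file.txt            # staged but not yet committed
echo "third" >> "$demo/file.txt"       # modified again, not yet staged

# Column 1 reports the staged state, column 2 the unstaged state
git -C "$demo" status --short          # prints: MM file.txt
```

Adding --branch also prints the branch name and, when an upstream is configured, how far ahead of or behind it you are.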

The git prompt can be configured in many ways to suit your needs. In my case, I have the entire path in my standard bash prompt, so I wanted the git prompt on a separate line to avoid having an even longer prompt line.

You can obtain git-prompt.sh from https://github.com/git/git/blob/master/contrib/completion/git-prompt.sh . Put this anywhere on your system, then add the following in your .bashrc file:

source /path/to/git-prompt.sh

There are a number of environment variables that you can set to control git-prompt. The two I find most useful are these (which I have in my .bashrc file after the above source command):

export GIT_PS1_SHOWDIRTYSTATE=1
export GIT_PS1_SHOWUPSTREAM="auto"

Caution: I recommend NOT setting GIT_PS1_SHOWDIRTYSTATE on supercomputers with slow file systems (e.g., cheyenne), because the check for uncommitted changes runs every time the prompt is drawn.

Finally, you need to set your prompt variable. I use:

export PS1='\[\e[1m\]\[\e[35m\]$(__git_ps1 "(%s)\n\[ \b\]")\[\e[0m\][\h:\w]\$ '

Here are details on the above prompt, if you're interested:

  • The key part is __git_ps1, which inserts the prompt from git-prompt.
  • My basic prompt is given by [\h:\w]\$, which prints the hostname and current working directory in square brackets, followed by $.
  • Inserting a newline in a pretty way was tricky. I managed to accomplish this by giving git-prompt the formatting argument (%s)\n\[ \b\]: \n is a newline, but without anything following the newline character, the newline didn't appear for some reason. You can fix this simply by putting a space after the newline, but I didn't like that, aesthetically. Putting a backspace (\b) after the space fixed that problem, but resulted in problems with line wrapping. Enclosing the space and backspace in \[ ... \] fixed the line wrapping problem (treating the combination as a non-printing character, I think).
  • Most of the rest of the prompt controls the colors
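Putting the pieces together, a .bashrc fragment might look like the sketch below. It is one possible arrangement, not the only one: the script location and the hostname pattern are assumptions, and the guards are there so the same file can be shared across machines where git-prompt.sh may be absent:

```shell
# Hypothetical location; adjust to wherever you saved git-prompt.sh
git_prompt_script="$HOME/.git-prompt.sh"

if [ -f "$git_prompt_script" ]; then
    . "$git_prompt_script"
fi

# Skip the dirty-state check on machines with slow file systems.
# The hostname pattern is an assumption; match it to your own systems.
case "$(uname -n)" in
    cheyenne*) ;;
    *) export GIT_PS1_SHOWDIRTYSTATE=1 ;;
esac
export GIT_PS1_SHOWUPSTREAM="auto"

# Only reference __git_ps1 in the prompt if git-prompt.sh actually defined it
if command -v __git_ps1 >/dev/null 2>&1; then
    export PS1='\[\e[1m\]\[\e[35m\]$(__git_ps1 "(%s)\n\[ \b\]")\[\e[0m\][\h:\w]\$ '
fi
```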

For resources on getting more help with git, see the wiki page on Quick start to CTSM development with git.
