Eric Hansen edited this page Jun 30, 2016 · 1 revision

GitHub Tutorial

Here is a summary of Git commands to use on a local Linux machine for interaction with a remote Git repository (Q2MM on GitHub as an example). Most of the information is from git --help, the Git man pages (man git), and man gittutorial (recommended!). Throughout this page, if you see <...>, it means that the enclosed text is a placeholder for something you can optionally type.

Getting Started

Make sure you can use Git, for example run git --help. If not, can you load a module, like module load git? Otherwise, you may need to install git. Google is your friend.

Configure Git for your own use. You can always check the current configuration using git config -l. Identify yourself to Git and choose your editor:

git config --global user.name "Your Name Comes Here"
git config --global user.email "you@yourdomain.example.com"
git config --global core.editor "emacs -nw"

Change what is in quotation marks to whatever is relevant for you.

You can also edit these global settings in a text file, which is located by default at "~/.gitconfig". If you have run the three commands above, you will see those settings there.
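As a small illustration of how these settings end up in a plain-text file, the sketch below writes the same keys to a scratch file instead of your real ~/.gitconfig. The --file option and the scratch path are assumptions made only so the demo leaves your actual configuration alone.

```shell
# Scratch file standing in for ~/.gitconfig (demo assumption, not your real config).
scratch_config="$(mktemp)"

# Same keys as above, written to the scratch file instead of --global.
git config --file "$scratch_config" user.name "Your Name Comes Here"
git config --file "$scratch_config" user.email "you@yourdomain.example.com"
git config --file "$scratch_config" core.editor "emacs -nw"

# Read one value back, then look at the text file Git produced.
git config --file "$scratch_config" --get user.name
cat "$scratch_config"
```

The file is organized in sections such as [user] and [core], which is the same layout you will find in ~/.gitconfig.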

At any point, you can get help with git --help or man git. These also work with subcommands, for example, git commit --help or man git commit.

Create a Git Directory to Work In

Two choices.

  1. If you will work with someone else on a remote repository, create a clone, a local copy, of that repository

    git clone https://github.com/Q2MM/q2mm mylocalq2mm
    

    You will create a new directory called "mylocalq2mm", with a complete copy, including a history, of the code on GitHub.

  2. Make your own independent repository, just for version control, or for later sharing with others

    mkdir mylocalrepository
    cd mylocalrepository
    git init
    

All Git commands in this manual will assume that you're in the directory containing your repository.

Note that the contents of the directory are not automatically part of the repository; Git must be told about them. Also, changes to files are not recorded in the repository until you commit them.
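To see the difference between the directory and the repository, here is a minimal sketch in a throwaway directory. The file name notes.txt and the temporary location are hypothetical.

```shell
# Throwaway location so nothing real is touched (hypothetical names throughout).
repo="$(mktemp -d)/mylocalrepository"
mkdir -p "$repo"
cd "$repo"
git init -q

# A file created in the directory is not yet part of the repository.
echo "first draft" > notes.txt

# "??" in the short status marks files Git has not been told about.
git status --short
```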

Working in a Git Repository

Note that the directory you're working in is NOT the same as the repository. You can add and edit files, but Git will not automatically record those changes.

Adding an existing file (in the directory or subdirectory) to the repository:

git add "filenames"

You can simply do git add . to add every file in the current directory and all of its subdirectories to the repository (files excluded by a .gitignore file are skipped).
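A quick way to check what git add . picks up is to try it in a scratch repository. The directory layout below is made up for the demo.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# One file at the top level and one two levels down (hypothetical names).
mkdir -p data/raw
echo "top" > top.txt
echo "nested" > data/raw/nested.txt

# Stage everything under the current directory, subdirectories included.
git add .

# List the files Git now knows about.
git ls-files
```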

Changes to files are not automatically part of the repository; you have to commit them. The simplest command, committing all changes to KNOWN files (that is, files that have been added, not just files that sit in the directory), is:

git commit -a -m "Title of changes" -m "Detailed description of changes (each additional -m option adds a new paragraph)"

The commit message given with -m is a central feature of Git. It is the tool for documenting your changes, for others and as a reminder for yourself. Don't skip it! There are several ways to simplify the documentation, like writing it all in a file and using that with -F instead of -m. See git commit --help for details.
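The following sketch runs through the add and commit cycle in a scratch repository, once with -m and once with -F. The identity, file names, and messages are placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Identity for this scratch repository only (placeholder values).
git config user.name "Demo User"
git config user.email "demo@example.com"

# First commit: the file must be added before Git tracks it.
echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes file" -m "Start a notes file for the demo."

# Later edits to a tracked file are picked up by -a.
echo "version 2" > notes.txt
git commit -q -a -m "Update notes" -m "Bump the notes to version 2."

# Alternative: keep the message in a file and pass it with -F.
msg_file="$(mktemp)"
printf 'Update notes again\n\nMessage read from a file.\n' > "$msg_file"
echo "version 3" > notes.txt
git commit -q -a -F "$msg_file"

# Show the title line of each commit, newest first.
git --no-pager log --format=%s
```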

You could also simply do git commit -a, in which case Git opens a text editor where you can write the message and see a summary of what will be committed. You will need to know your text editor, usually vi by default; if you're in and don't know how to get out, ":wq" will save your changes and get out, or ":q!" will get out without saving anything (which aborts the commit, since the message is empty). If nothing has been added for commit, git commit without options will only tell you about uncommitted changes. If you changed your default editor earlier on with the global configuration options, then that editor will be used rather than vi.

If you've been working for a while since last commit, and don't know what changes you've made, there are two ways to list changes:

git status
git diff

Try them out; they give somewhat different output: status is brief, diff detailed.

If you want to know the full history of everything that has been committed, use

git log
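To compare the three commands side by side, here is a small sketch in a scratch repository with one committed file and one uncommitted edit. All names are hypothetical, and --no-pager is used only to keep the output from opening a pager.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"

echo "line one" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# An edit that has not been committed yet.
echo "line two" >> notes.txt

git status --short            # brief: one line per changed file
git --no-pager diff           # detailed: the changed lines themselves
git --no-pager log --oneline  # history: one line per commit
```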

Running Git Code

This section is for those not used to running code under Linux. You usually don't want to run calculations in the same place you have your code, so you have to keep track of at least two directories, and you must be able to access them using Linux paths. My description will be centred on using Linux with a bash shell (you can man bash); alternatives would be, for example, tcsh (man tcsh). In particular, you'll need to know how to handle environment variables ($NAME) under your shell (type env under bash for a listing).

Remember that the tilde, ~, always represents your home directory. You normally don't work directly there, but create subdirectories using mkdir NewDirectoryName, then go to the relevant directory using cd DirectoryName. Just cd will always bring you to your top level, your $HOME (an environment variable with the full path name of your home directory). Other good shortcuts are . for the currently active directory and .. for the directory above. Thus, cd ../.. will bring you two levels up. You can find out where you are by looking at $PWD (echo $PWD; under tcsh the variable is $cwd), or typing pwd.
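The navigation shortcuts can be tried safely in a temporary directory. The sketch below assumes a bash-like shell, where the current directory is available as $PWD, and the directory names are hypothetical.

```shell
# Scratch area with two nested directories (hypothetical names).
base="$(mktemp -d)"
mkdir "$base/NewDirectoryName"
mkdir "$base/NewDirectoryName/run1"

# Go to the innermost directory and ask where we are, in two ways.
cd "$base/NewDirectoryName/run1"
pwd
echo "$PWD"

# .. is the directory above, so ../.. is two levels up.
cd ../..
two_levels_up="$PWD"
echo "$two_levels_up"

# . is the current directory, so this lists what is here.
ls .
```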

For this example, I'm going to assume that the code you are working on is in the directory "q2mm" in your home directory, and that you're using the commands on files in a different directory. At any time, you can address files in the code directory by starting the path with ~/q2mm/. If a file is marked as executable (by giving the command chmod +x mycommand), you can run it directly from anywhere (typically the directory where you have files to work on, not the code) by typing

~/q2mm/mycommand <options>

You can also make life simpler by putting the directory in the path where Linux automatically looks for named commands (look out for name conflicts though). You do this by augmenting and exporting the environment variable $PATH, like this:

export PATH="$PATH:$HOME/q2mm"

After this, or easier, if you have this line permanently in your .bashrc file so that it gets activated every time you log in or open a new shell, running the command is simplified to:

mycommand <options>
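Both ways of running a command can be checked with a stand-in script. In the sketch below, a temporary directory plays the role of ~/q2mm and mycommand is a hypothetical script that just echoes its options.

```shell
# Hypothetical stand-in for ~/q2mm: a scratch directory holding one script.
code_dir="$(mktemp -d)"
cat > "$code_dir/mycommand" <<'EOF'
#!/bin/sh
echo "mycommand ran with: $*"
EOF
chmod +x "$code_dir/mycommand"

# Work somewhere else and run the script through its full path.
work_dir="$(mktemp -d)"
cd "$work_dir"
"$code_dir/mycommand" --demo

# Append the code directory to PATH; now the bare name is enough.
export PATH="$PATH:$code_dir"
mycommand --demo
```

Because the directory is appended at the end of PATH, an existing command with the same name elsewhere in PATH would still win, which is the name conflict mentioned above.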

If your command should not be executable, for example if you have python code that needs to be run under Schrödinger, I recommend using the full path instead:

$SCHRODINGER/run ~/q2mm/mycommand.py <options>

Working with Branches

Whenever you have the first fully functional version of a code, it should be considered "master", and you should only change it when you have fully tested any additions. You do that in branches. You create a branch for an issue or a new feature, test it, and when finally satisfied, merge it back into master. You might do a simple, critical bug fix directly on master, but that should be an exception.

You create a new branch, a full copy of the currently checked out branch (normally master) by

git branch "newbranchname"

You can see all existing branches with

git branch

You switch between branches using

git checkout branchname

Or, you can combine the create-branch and checkout commands in one go using the -b option (or -B, which also resets the branch if it already exists)

git checkout -b branchname

Preferably, you should do this before any code changes, so that you never change the code directly in master. Note that when you check out a branch, all the code in the directory will change to the version in that branch, giving you an easy way to run the same calculations (somewhere else, see above) with different versions of the code.
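The effect of switching branches on the files in the directory can be seen in a scratch repository. The branch and file names below are hypothetical, and the initial branch name is read from Git because it may be master or main depending on your setup.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"

echo "stable" > code.txt
git add code.txt
git commit -q -m "Stable version"
# Name of the initial branch (master or main, depending on the Git setup).
main_branch="$(git rev-parse --abbrev-ref HEAD)"

# Create and check out a branch in one go, then change the code there.
git checkout -q -b newfeature
echo "experimental" > code.txt
git commit -q -a -m "Try a new idea"

# List the branches; the current one is marked with *.
git --no-pager branch

# Switching back changes the file in the directory to the stable version.
git checkout -q "$main_branch"
cat code.txt
```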

If you find you want to discard uncommitted changes in a file, you can do that with the following "checkout" command (the discarded changes cannot be recovered):

git checkout -- filename(s)
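A minimal sketch of discarding an edit, again in a scratch repository with a hypothetical file name:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"

echo "good line" > settings.txt
git add settings.txt
git commit -q -m "Add settings"

# An edit we regret and have not committed.
echo "accidental edit" > settings.txt

# Restore the file to its last committed version.
git checkout -- settings.txt
cat settings.txt
```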

When you have finished modifying the code in a branch, committed all changes, and tested the code to ensure it is at least as good as master, you are ready to merge it back. If master is unchanged from when you started your branch, this is trivial: check out master and merge your branch into it

git checkout master
git merge branchname

Since master has no new commits, Git simply moves master forward to the tip of your branch (called a fast-forward). Note that git merge without arguments merges the upstream branch that the current branch is configured to track, which is not necessarily the branch it was created from. If master has changed in the meantime, Git will try to combine your changes with the new master in a merge commit, but there may be conflicts you have to resolve, for example if your branch and master have made different modifications to the same line of code.

When Git cannot combine the changes automatically, you will have to resolve the merge conflicts manually. Git reports the conflicting files in its standard output, and you must then edit those files with a text editor. Fortunately, Git does a good job of marking which sections of the code have conflicts. After a failed merge, the situation usually looks something like the following.

$ git status
# On branch master
# You have unmerged paths.
#   (fix conflicts and run "git commit")
#
# Unmerged paths:
#   (use "git add <file>..." to mark resolution)
#
#       both modified:      favoritepet.txt
#
no changes added to commit (use "git add" and/or "git commit -a")

$ emacs -nw favoritepet.txt

Inside favoritepet.txt, you can search for "<<<<<<<" to locate the sections where merge conflicts arose. Here's an example.

<<<<<<< HEAD
cats
=======
dogs
>>>>>>> branchyoutriedtomerge

In the example above, you must pick whether the line of code should be "cats" or "dogs". Delete the line you do not want together with the three marker lines (<<<<<<<, =======, >>>>>>>), save, and then add the resolved files and commit to complete the merge.

$ git add favoritepet.txt
$ git commit
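The whole conflict scenario can be reproduced in a scratch repository. The sketch below uses the file and branch names from the example and assumes a starting content of "fish"; the resolution simply overwrites the file with the chosen line.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"

echo "fish" > favoritepet.txt
git add favoritepet.txt
git commit -q -m "Start with fish"
main_branch="$(git rev-parse --abbrev-ref HEAD)"

# The development branch changes the line one way...
git checkout -q -b branchyoutriedtomerge
echo "dogs" > favoritepet.txt
git commit -q -a -m "Prefer dogs"

# ...and the main branch changes the same line another way.
git checkout -q "$main_branch"
echo "cats" > favoritepet.txt
git commit -q -a -m "Prefer cats"

# The merge stops because both sides edited the same line.
git merge branchyoutriedtomerge || echo "merge stopped: conflicts to resolve"
marker_count="$(grep -c '^<<<<<<<' favoritepet.txt)"
cat favoritepet.txt

# Resolve by keeping "cats" (this also removes the marker lines), then finish.
echo "cats" > favoritepet.txt
git add favoritepet.txt
git commit -q -m "Merge branchyoutriedtomerge, keeping cats"
```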

You can also merge one or more development branches into the current branch (usually master) in a single command.

git checkout master
git merge branch1 branch2 branch3~2

There are many options to "merge"; picking individual commits from other branches is done with the separate command git cherry-pick. A branch name generally refers to the last commit within that branch, but it is possible to instead reference earlier commits. In the example above, branch3~2 refers to the commit two steps before the tip of branch3, so the last two commits in branch3 are skipped in the merge. You can also refer to commits directly, by tag or by hash. See git merge --help for details.

When you have successfully merged, you can continue working in the branch or delete it (recommended, since you want issue-specific branch names)

git branch -d branchname
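As a sketch of the ~N notation and of deleting a merged branch, the following builds a branch with three commits, merges all but the last two, then merges the rest and removes the branch. The file names are hypothetical, and the branch name is borrowed from the example above.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base version"
main_branch="$(git rev-parse --abbrev-ref HEAD)"

# Three commits on a development branch, one new file each.
git checkout -q -b branch3
for n in 1 2 3; do
    echo "step $n" > "step$n.txt"
    git add "step$n.txt"
    git commit -q -m "Add step $n"
done

# Merge everything except the last two commits of branch3.
git checkout -q "$main_branch"
git merge -q branch3~2
files_after_partial_merge="$(ls)"
echo "$files_after_partial_merge"

# Merge the rest, then delete the branch (allowed because it is fully merged).
git merge -q branch3
git branch -d branch3
```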

Remote Git Repositories

All examples here will assume you work against the Q2MM code located on GitHub. You will create a local copy of the code there, run and modify it locally, download updates, and occasionally contribute by uploading code. You may have local firewall problems interfering with git commands; if so, ask a local expert how to work around them, usually using proxy settings.

You create a fresh copy of the current version of the code by cloning into a new directory.

git clone https://github.com/Q2MM/q2mm myq2mm

This will create the local directory "myq2mm" as a complete Git repository and also store the address of the GitHub repository (Git keeps its own data in the hidden directory .git; don't make changes there unless you REALLY know what you're doing). You can use the code as described above, or create a development branch, check it out, and start modifying the code.

cd myq2mm
git branch mydevelopment
git checkout mydevelopment

At any point, you can download the current version and merge it with your own changes (possibly having to address conflicts, see git merge above).

git pull

Behind the scenes, this command combines a git fetch using the stored GitHub address, followed by a git merge.

The remote repository you cloned from will be known as "origin", and you can, for example, refer to "origin/master" in different git commands.
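The clone, push, and pull cycle can be rehearsed without touching GitHub. In the sketch below a local bare repository stands in for the remote, and the two clones "alice" and "bob" are hypothetical collaborators; everything lives in a temporary directory.

```shell
work="$(mktemp -d)"

# Stand-in for the GitHub repository: a bare repository seeded with one commit.
git init -q "$work/seed"
cd "$work/seed"
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"
echo "v1" > code.txt
git add code.txt
git commit -q -m "Initial version"
git clone -q --bare "$work/seed" "$work/remote.git"

# Two independent clones; each one knows the remote as "origin".
git clone -q "$work/remote.git" "$work/alice"
git clone -q "$work/remote.git" "$work/bob"

# Alice commits a change and uploads it.
cd "$work/alice"
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "v2" > code.txt
git commit -q -a -m "Second version"
git push -q

# Bob downloads and merges it in one step.
cd "$work/bob"
git pull -q
cat code.txt

# Alice uploads again; this time Bob runs the two halves of a pull by hand.
cd "$work/alice"
echo "v3" > code.txt
git commit -q -a -m "Third version"
git push -q
cd "$work/bob"
main_branch="$(git rev-parse --abbrev-ref HEAD)"
git fetch -q origin
git merge -q "origin/$main_branch"
cat code.txt
```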

You can use the pull command to access versions from different addresses, for example if there are personal versions of the code on GitHub. You would create and checkout a new branch, and pull a specific branch from someone else into it.

git checkout -b mytestbranch
git pull https://github.com/OtherPerson/q2mm develop_branch

You can now test out the alternative code, and possibly merge between it and some of your own commits.

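Before uploading anything, it is recommended that you ensure your branch can merge with the current version by pulling master into it, and that you then upload only your currently checked out branch using git push. A branch that exists only locally has no remote counterpart yet, so name the remote and the branch explicitly; the -u option makes Git remember the remote branch, after which plain git push works on that branch.

git checkout mydevelopment
git pull origin master
git push -u origin mydevelopment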
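The upload step can also be rehearsed against a local stand-in for GitHub. This sketch assumes the development branch exists only locally, which is why it names the remote and branch explicitly and uses -u; the bare repository, file names, and identity are all hypothetical.

```shell
work="$(mktemp -d)"

# Stand-in for the GitHub repository, seeded with one commit.
git init -q "$work/seed"
cd "$work/seed"
git config user.name "Demo User"        # placeholder identity
git config user.email "demo@example.com"
echo "v1" > code.txt
git add code.txt
git commit -q -m "Initial version"
main_branch="$(git rev-parse --abbrev-ref HEAD)"
git clone -q --bare "$work/seed" "$work/remote.git"

# Your local clone, with a development branch and one new commit.
git clone -q "$work/remote.git" "$work/myq2mm"
cd "$work/myq2mm"
git config user.name "Demo User"
git config user.email "demo@example.com"
git checkout -q -b mydevelopment
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Make sure the branch still merges with the current remote version.
git pull -q origin "$main_branch"

# Upload only this branch; -u records origin/mydevelopment as its upstream.
git push -q -u origin mydevelopment

# Confirm that the branch now exists on the remote.
git ls-remote --heads origin mydevelopment
```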

At this point, you log in to GitHub using a browser, create a pull request, and ask some other user to check your code and then merge it into master.

Creating a Pull Request

Creating a pull request is simple using the GitHub interface. Go to your repository, in this case https://github.com/Q2MM/q2mm, and open the "Pull requests" tab. Click the green "New pull request" button. For the base, select master using the drop-down options, and for compare select the branch you wish to merge. Then hit the green "Create pull request" button. This opens a new screen where you can give the pull request a title and description. The description should include the motivation for the changes and what changes were made. After adding that, hit "Create pull request", and you're finished! Now wait for someone else to read over your changes, and allow them to merge your changes into master. Never merge your own changes! The reviewers may suggest that you make additional changes before allowing the merge.

The above all assumes that you have read/write access to the repository of interest (in this case, Q2MM/q2mm). If you don't, then you must first fork the repository. Again, this is very simple using the GitHub interface. Go to the main page of the repository and click "Fork" in the upper right corner. You can then clone your fork and make changes in the same way as described above. Submitting pull requests also works the same way, except that you choose to compare across forks when selecting the base and compare branches.