2011-05-24

My Git workflow

The first time I encountered Git, I was confused. Pushing and pulling and branching and merging all sound straightforward in and of themselves, but when you’re first trying to figure out what to do when, it can rapidly descend into witchcraft. Nor was I particularly organised about how I managed code and deployment - it was all a bit ad hoc, which works fine for a while but eventually leads to accidental cockups.

What helped me to get my head around the processes was following other people’s workflows to see how they managed. There’s lots of documentation around for Git, but it didn’t really sink in until I saw it being used in practice.

If enough clever people do enough clever things often enough, eventually some processes emerge, and that’s exactly what Git Flow is. It’s a set of macros that sit on top of raw Git, and set up branches for you, ready to use with a suggested workflow. It’s designed for a full-on development environment complete with staging, hotfixes and all the trimmings - but can be condensed down to the point where you just use the bits that are relevant. This is how it works for me.

Git Flow works with three main kinds of branch - master, which I’ll come back to; develop, which is the main day-to-day branch; and feature branches, which are a series of short-lived branches.

My process runs like this:

Create the project, and run git flow init to get the initial branch structure set up. I use the defaults, and just ignore the branches I don’t need, like hotfixes.

	git flow init
	< work through the prompts, using the defaults >
	< edit the project's readme file >
	git add .
	git commit -am "Initial commit on develop branch"
	git push origin develop

That creates the initial version on the develop branch, which I’ll then deploy to the server (of which more later). That way there’s a ‘ground zero’ version of the project. You don’t need to do this, but it suits my slightly OCD tendencies. Apart from anything else, I always feel better for having made that first commit, because it feels like the project has Started.
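
If you’re curious what git flow init sets up, it can be approximated in plain Git. This is only a sketch run in a scratch directory, assuming the default branch names (master and develop). The placeholder identity and the README contents are made up for illustration, and the real tool does a bit more, such as recording the branch prefixes in the repo’s config.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"        # placeholder identity, an assumption
git config user.email "dev@example.com"

# There has to be something on master to branch from.
git commit -q --allow-empty -m "Initial commit"

# develop is the day-to-day branch, created from master.
git checkout -q -b develop

# The 'ground zero' commit lands on develop.
echo "My project" > README
git add README
git commit -q -m "Initial commit on develop branch"

git branch --list
```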

Then I use git-flow to start a feature.

	git flow feature start <feature name>
	< edit the feature's readme file >
	git add .
	git commit -am "Initial commit on feature XXX"
	git push origin feature/<feature name>

A feature for me corresponds either to a user story, or to a chunk of functionality that seems to be discretely definable. It might be less than a story if it’s just something quick - I know this breaks the story-driven development conventions to an extent, but the point is to make the process work for you rather than the other way around. Mentally I’m trying to structure the development process so that there aren’t any dependencies between features.

I also push the features to the remote repository. There are two schools of thought about this, one being that it’s not really necessary and clutters up the repository. I prefer to look at it from the point of view of the remote repo also being part of my backup, so the tradeoff is worth it over the long term.
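
For what it’s worth, here’s a rough plain-Git sketch of starting a feature and pushing it. It assumes the default feature/ prefix. The feature name "login" is made up, and a local bare repository stands in for the real remote so the whole thing can be run safely in a scratch directory.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
git init -q --bare "$work/origin.git"     # stand-in for the remote repo
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"        # placeholder identity, an assumption
git config user.email "dev@example.com"
git remote add origin "$work/origin.git"

git commit -q --allow-empty -m "Initial commit"
git checkout -q -b develop

# Roughly what 'git flow feature start login' does: branch off develop.
git checkout -q -b feature/login develop

echo "Login feature" > README
git add README
git commit -q -m "Initial commit on feature login"

# Push the feature branch under its full name, prefix included.
git push -q -u origin feature/login

git ls-remote --heads origin
```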

When a feature’s complete, I use git flow to close it and merge it back to the develop branch, which is then pushed back to the repo.

	git flow feature finish <feature name>
	git push origin develop
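
Under the hood, finishing a feature is approximately a merge into develop followed by deleting the feature branch. The sketch below does that by hand in a scratch repo. The feature name and file are invented for the example, and I believe git-flow may fast-forward instead when a feature has only a single commit, so treat this as an approximation.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"        # placeholder identity, an assumption
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b develop

# A feature with a couple of commits on it.
git checkout -q -b feature/login develop
echo "login form" > login.txt
git add login.txt
git commit -q -m "Add login form"
echo "input validation" >> login.txt
git commit -q -am "Validate login input"

# Roughly what 'git flow feature finish login' does.
git checkout -q develop
git merge -q --no-ff -m "Merge branch 'feature/login' into develop" feature/login
git branch -q -d feature/login

git log --oneline --graph
```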

While I’m working on the feature, I’ll often use a temporary “throw away” branch to try things that I’m not sure about. The cost of branching with Git is so low that it’s worth creating a new branch and throwing it away later, rather than trying to undo code changes:

	git checkout -b <temporary branch>
	< do stuff, add and commit >

If the code doesn’t work, then I can throw the branch away:

	git checkout feature/<feature name>
	git branch -d -f <temporary branch>
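
To show how little there is to lose, here’s a small sketch of a failed experiment in a scratch repo. The branch and file names are made up. The point is that the feature branch ends up exactly where it started.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"        # placeholder identity, an assumption
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b feature/login

before=$(git rev-parse HEAD)              # where the feature branch is now

# Try something on a throwaway branch.
git checkout -q -b try-idea
echo "dead end" > idea.txt
git add idea.txt
git commit -q -m "Try an idea that does not pan out"

# Give up: go back to the feature and force-delete the experiment.
git checkout -q feature/login
git branch -q -D try-idea                 # -D is shorthand for -d -f

after=$(git rev-parse HEAD)
echo "feature branch before: $before"
echo "feature branch after:  $after"
```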

Or I’ll merge something successful into the feature that I’m working on:

	git checkout feature/<feature name>
	git merge --no-ff <temporary branch>

There seem to be two schools of thought about the “best” way to merge branches - personally I prefer the --no-ff method because it always creates a merge commit, which preserves the fact that the temporary branch existed along with its associated history. Viewing that in the commit tree gives me a clearer picture of what happened, to my mind at least.
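
The difference is easier to see side by side. This sketch (scratch repo, invented branch and file names) merges one experiment with --no-ff and a second one as a plain fast-forward, then counts the merge commits. Only the first leaves a visible fork in the graph.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"        # placeholder identity, an assumption
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b feature/login

# First experiment: merged with --no-ff, so a merge commit is recorded.
git checkout -q -b try-idea
echo "idea" > idea.txt
git add idea.txt
git commit -q -m "Try an idea"
git checkout -q feature/login
git merge -q --no-ff -m "Merge branch 'try-idea' into feature/login" try-idea

# Second experiment: fast-forwarded, so the branch leaves no trace.
git checkout -q -b try-other
echo "other" > other.txt
git add other.txt
git commit -q -m "Try another idea"
git checkout -q feature/login
git merge -q --ff-only try-other

git log --oneline --graph
merges=$(git rev-list --count --merges HEAD)
commits=$(git rev-list --count HEAD)
echo "$merges merge commit(s) out of $commits commits"
```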

The final step in the process is the release branch, which in my workflow is the one that gets deployed to the staging server for final tweaking and QA. That’s created from the develop branch:

	git flow release start <version>

That gets deployed, and any final tweaks made with the usual add / commit / push cycle. Once the staging version gets signed off, it’s time to complete the release:

	git flow release finish <version>

The magic behind the scenes is that the release gets merged into the master branch automatically (and back into develop, so the final tweaks aren’t lost). If you work on the basis that your production server only ever deploys from master, it becomes a fairly robust process.
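
For the curious, the release step can be approximated by hand like this. It’s a sketch in a scratch repo, assuming the default release/ prefix and a made-up version number of 1.0. The real command has more options and safety checks than this.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"        # placeholder identity, an assumption
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b develop
echo "feature work" > app.txt
git add app.txt
git commit -q -m "Add feature work"

# Roughly 'git flow release start 1.0': branch off develop.
git checkout -q -b release/1.0 develop
echo "final tweak" >> app.txt
git commit -q -am "Final tweak from staging"

# Roughly 'git flow release finish 1.0':
git checkout -q master
git merge -q --no-ff -m "Merge branch 'release/1.0'" release/1.0
git tag -a 1.0 -m "Release 1.0"           # tag the release on master
git checkout -q develop
git merge -q --no-ff -m "Merge branch 'release/1.0' into develop" release/1.0
git branch -q -d release/1.0              # the release branch is done with

git log --oneline --graph --all
```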

To get the code from my local development environment out into the real world, ideally I’ll have a three-server setup and use Capistrano to automate things as far as possible. The day-to-day development stuff runs on one called ‘develop’, which deploys code from the develop branch. That gets deployed as often as there are changes on the develop branch. Sometimes this gets blurred a bit - features might get deployed if there’s a good reason for it, but the point is not to get anally retentive about the process, and to do what’s needed to get the job done.

The second server or instance is for staging - that’s the behind-the-live-curtain version that the client can play with. That gets deployed from the release branch - so once the features are completed and merged in to develop and a release has been started, it can be pushed out to be signed off.

The final piece of the jigsaw is the “live” server - called live, for want of a better term - which runs the code from the master branch.

Because I try to use Capistrano to automate deployment, it’s relatively trivial to nail this all down - the multi-stage functions of Capistrano mean that you can abstract out the stage-specific settings from the main body of configuration. In this case I set the production script to deploy from the master branch, and the staging script to deploy from the release branch. The advantage of doing it this way is that I can’t accidentally deploy the wrong version out to the live environment.
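
Capistrano’s own configuration is written in Ruby, so the following isn’t what goes in the deploy files. It’s just a shell sketch of the underlying idea: each stage is pinned to one branch, so there’s nothing to get wrong at deploy time. The function name branch_for_stage and the stage names are hypothetical.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the only branch a given stage may deploy.
# Staging takes the release version as a second argument.
branch_for_stage() {
    case "$1" in
        develop)    echo "develop" ;;
        staging)    echo "release/${2:?staging needs a release version}" ;;
        production) echo "master" ;;
        *)          echo "unknown stage: $1" >&2; return 1 ;;
    esac
}

branch_for_stage develop          # prints: develop
branch_for_stage staging 1.0      # prints: release/1.0
branch_for_stage production       # prints: master
```

Anything that isn’t one of the known stages is refused outright rather than falling back to a default, which is the property that stops the wrong branch reaching live.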

The point about all of this is not that this is the One True Path to successfully using Git - it’s what works for me. It’s also not my invention - I’m just using the functionality that Git Flow provides, and tweaking it to fit my way of working. Hopefully someone will also find it useful, either as it stands or, even better, as the basis for coming up with something that works better for them. Such is the joy of open source :-)