Red Pill Or Blue Pill
While setting up a new project on Visual Studio Team Services (VSTS), you are given a choice. The choice of source control. That choice provides the direction of how your code will be managed.
In one hand, you have the blue pill. Team Foundation Version Control, or TFVC for short. A centralized version control system with which you might be totally comfortable. Something you might not want to stop using.
In the other hand, you have the red pill. Git version control. A Distributed Version Control System (DVCS) which you might have only heard about. Perhaps something you are curious to find out more about.
Well, go ahead and choose the Red Pill and dive into Git version control. We’ll outline the reasons you might want to do so in this post.
Distributed vs Centralized
You might have noticed above, or heard, that Git is a “Distributed Version Control System” (DVCS). At a very basic level this means that, unlike SVN or TFVC, the repository does not need a central place for hosting.
For example, with TFVC, you “checkout” by getting a snapshot of the latest code that is in the central repository on some server. The history is still tracked on the server. Each time you “check-in” you are pushing your code changes to everyone else. This often means that you hold off on that “check-in” so you don’t mess someone else up, or worse yet, break the build. This results in large “check-ins”. Large “check-ins” are also harder to revert if the need arises, because you would revert unrelated changes that you want to keep along with the ones you don’t.
Conversely, in a DVCS like Git you “clone” the entire repository to your local machine. Not just a snapshot of the latest version. This means that you have the advantage of being able to work completely offline if the need arises. In fact, that’s how you would normally work. Disconnected from any central hosting server. Your “commits” are done locally until you are ready to “push” your commits to the “remote” Git repository (i.e. central hosting server). Though you don’t need a central hosting repository as there are other ways to share code.
You might think keeping the entire history locally would take up a lot of space but, surprisingly, it doesn’t. The reason for this is that Git stores each revision of a file only once and in a compressed state. Meaning that it doesn’t store the file again for every commit made to the repository. Only for the commits in which that file has changes. Every other commit simply points back at the content that is already stored. When Git packs the repository it can also store similar versions of a file as deltas against each other instead of as entire copies. So if only one line of a file changed, the packed history grows by roughly that one line instead of the whole file. Every stored file version, and every commit, is identified by a unique hash of its content.
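To make the "stored only once, and compressed" point concrete, here is a minimal Python sketch. The `git_blob_id` function follows the way Git computes an ID for file content (a SHA-1 over a small header plus the bytes). The `ObjectStore` class is a hypothetical toy for illustration and is not Git's actual on-disk format.

```python
import hashlib
import zlib


def git_blob_id(content: bytes) -> str:
    # Git hashes the header "blob <size>" + NUL byte, followed by the content.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class ObjectStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self):
        self.objects = {}  # object id -> compressed content

    def put(self, content: bytes) -> str:
        oid = git_blob_id(content)
        # Identical content hashes to the same id, so it is stored only once.
        self.objects.setdefault(oid, zlib.compress(content))
        return oid

    def get(self, oid: str) -> bytes:
        return zlib.decompress(self.objects[oid])


store = ObjectStore()
v1 = store.put(b"line one\nline two\n")  # file as of the first commit
v2 = store.put(b"line one\nline two\n")  # unchanged in the second commit
v3 = store.put(b"line one\nline 2\n")    # changed in the third commit

print(v1 == v2)            # the unchanged file reuses the same object
print(len(store.objects))  # number of objects actually stored
```

Three commits touch the file, but only two distinct versions of its content exist, so only two objects are kept.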
Advantages of Git
Since Git is used in a “disconnected” manner, it allows for some distinct advantages. I’ve already stated that you can commit while disconnected. That’s not as big of an advantage as it used to be, but it’s still worth mentioning.
The other side of that coin, though, is being able to commit without adversely affecting teammates. This can be done because you are committing locally for as long as you need to. Then, when you’re ready, you can push those commits to give your teammates access to your changes. This allows you to make smaller, more manageable commits, which makes tracking your history easier. It also allows you to revert commits without reverting things that aren’t related. You might hear the saying “Commit early and commit often” when dealing with Git. That is a great practice to get into.
Another advantage of Git is the ability to take a set of changes on a file and commit individual lines of that file. Say, for example, you are in a state of flow working for a couple of hours and realize that you have a lot of changes you need to commit. One file might have indirect changes as a response to explicit changes made in other files. For example, you renamed some methods or properties in class1 and class2 which caused multiple changes in class3. You can group these related (cohesive) changes together and make individual commits.
Using Git you can make commits by “staging” specific changes using any combination of file, “hunks”, or specific lines in a file. This allows you to make commits that are cohesive across files. Using the same example as above, you can make one commit by “staging” the rename changes of class1 with the specifically affected lines of class3. You can then make another commit by staging class2 with the remaining affected lines of class3.
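The idea of staging only some "hunks" of a file can be sketched in a few lines of Python. This is a toy model built on the standard `difflib` module, not how Git implements staging. The `hunks` and `stage` helpers and the sample lines of class3 are made up for illustration.

```python
import difflib


def hunks(base, work):
    """Return the changed regions ("hunks") between two lists of lines."""
    matcher = difflib.SequenceMatcher(a=base, b=work, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


def stage(base, work, chosen):
    """Build the staged file: chosen hunks come from work, the rest from base."""
    result, pos = [], 0
    for index, (_tag, i1, i2, j1, j2) in enumerate(hunks(base, work)):
        result.extend(base[pos:i1])  # unchanged lines before this hunk
        result.extend(work[j1:j2] if index in chosen else base[i1:i2])
        pos = i2
    result.extend(base[pos:])        # unchanged lines after the last hunk
    return result


# class3 as last committed, and as it looks after both renames.
class3_base = ["a = class1.load()", "log(a)", "b = class2.save(a)"]
class3_work = ["a = class1.load_all()", "log(a)", "b = class2.store(a)"]

# First commit: only the line affected by the class1 rename.
first_commit = stage(class3_base, class3_work, chosen={0})
# Second commit: the remaining line, affected by the class2 rename.
second_commit = stage(class3_base, class3_work, chosen={0, 1})

print(first_commit)
print(second_commit == class3_work)
```

In a real repository you would pick hunks interactively, for example with `git add -p` on the command line or in a GUI, rather than by index.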
Obviously you can make these commits as you go, but sometimes (for me, oftentimes) you get into a working flow and the changes stack up before you realize how many you have.
Branching and Merging
This is where Git shines in particular. Branching, and more importantly, merging are very easy and the preferred method of working with Git. The major difference with branching between Git and TFVC is that TFVC makes copies of the parent from which it branched while Git branches are just pointers to a commit.
This can be a tough concept to understand. When TFVC creates a branch, it creates an entire copy of its parent at a new path. This ends up taking more space, especially if you are using branching strategies to manage features and releases. Git, on the other hand, simply creates a new pointer to an existing commit. Nothing is copied, and new commits on the branch share all unchanged content with the commits they build on. This keeps the repository size very small, especially when comparing an equal number of branches between the two systems.
Merging is even more powerful. Git treats a merge as just another commit, albeit a commit with two parents. Git, by default, will figure out how to merge the two branches and is great at reconciling most changes on its own. This approach to merging keeps the history of both branches intact and allows for much safer branching and merging. Compare that to TFVC where branching can end up rewriting history of one or both branches in order to resolve conflicts.
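A branch that is "just a pointer" can be sketched with two dictionaries in Python. This is a toy model, not Git's real data structures. The `commit` and `create_branch` helpers and the shortened commit ids are assumptions made for illustration.

```python
import hashlib

commits = {}   # commit id -> (parent id or None, message)
branches = {}  # branch name -> id of the commit the branch points at


def commit(branch, message):
    """Record a new commit and move the branch pointer forward to it."""
    parent = branches.get(branch)
    cid = hashlib.sha1(f"{parent}:{message}".encode()).hexdigest()[:7]
    commits[cid] = (parent, message)
    branches[branch] = cid
    return cid


def create_branch(name, start):
    """Creating a branch copies one pointer. No commits or files are copied."""
    branches[name] = branches[start]


commit("master", "initial commit")
base = commit("master", "add class1")

count_before = len(commits)
create_branch("feature", "master")
count_after = len(commits)  # unchanged: the branch added no commits

tip = commit("feature", "rename method in class1")

print(count_before == count_after)
print(branches)
```

After the last commit, "master" still points at the older commit while "feature" has moved ahead by one, and the whole history is shared between them.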
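The "commit with two parents" idea can be sketched as a small Python model. It is a simplified illustration rather than Git's implementation, and the `Commit` class and `history` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    message: str
    parents: tuple = ()


def history(commit):
    """Collect every commit reachable from `commit` through all parents."""
    seen, stack = [], [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(current.parents)
    return seen


root = Commit("initial commit")
master_tip = Commit("fix typo on master", (root,))
feature_tip = Commit("add feature", (root,))

# The merge is an ordinary commit that happens to have two parents.
merge = Commit("merge feature into master", (master_tip, feature_tip))

messages = {c.message for c in history(merge)}
print(len(merge.parents))
print(sorted(messages))
```

Because the merge only adds a new commit on top, the existing commits on both branches are untouched and remain reachable from it.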
The ease of branching allows for some great workflows both locally when developing and when using continuous integration or continuous delivery. That topic could be a whole blog post on its own.
But What About My Shelve-Sets?
TFVC has a great feature called “Shelve-Sets”. This allows one to take pending changes and shelve them in order to work on something different and not affect the current state of the repository.
Git has a similar feature called stashing, or “git stash” using proper parlance. The biggest difference between the two is that stashing is only done in your local repository. You cannot “push” the stash (or stashes, since you can have several) up to the shared remote.
That said, you’ll find that you won’t use stashing very often. I use it maybe once every several months. It’s usually in very extreme circumstances where I need to switch branches immediately to make an emergency fix to some code and don’t have time to break a large stack of pending changes into small commits.
Git has come a long way in recent years when it comes to tooling. What started as a “command line only” approach has now become a plethora of choices. First, there is great support for Git directly inside Visual Studio. There are still features lacking, though. Specifically when it comes to more advanced workflows like stashing and staging “hunks” or lines instead of whole files. That said, you can accomplish 90% of what you need to accomplish using just the tools Visual Studio provides.
Another fantastic, and free, tool put out by Atlassian is called “SourceTree”. It has a great GUI with a full feature set of what can be done with Git. I would highly recommend this tool. Especially if you want to see the power of staging “hunks” or individual lines of a file.
Naturally, there is the command line which gives you all the control you’ll ever need. When I began learning Git, I ended up using the command line quite a bit, and I’m not a command line type of guy. The reason was that all the help and answered questions out there used the command line, so I got used to it over time. In fact I still use it quite a bit. It’s slightly faster than navigating through a GUI in several cases.
I have encountered only one stumbling block when it comes to Git. The use of large binary files, like game assets, videos, sound files, or generally “undiffable” files. Managing these types of files is not the out-of-the-box experience for Git but there are things you can do to overcome this obstacle. There is a Git extension called Large File Storage (LFS). With it, the files and assets are actually stored in a separate remote storage location and Git only keeps track of small pointer files containing metadata about them. This keeps Git light and fast so that your workflow is not impeded. More information about this feature can be found in the VSTS documentation.
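To show what "only keeps track of the metadata" looks like, here is a minimal Python sketch that builds the small text pointer Git LFS commits in place of a large file. The three-line layout (version, oid, size) follows the Git LFS pointer format. The `lfs_pointer` function name and the fake 5 MB asset are assumptions for illustration.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Build the text pointer that stands in for a large file in the repository."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


asset = bytes(5_000_000)  # stand-in for a 5 MB game asset (all zero bytes)
pointer = lfs_pointer(asset)

print(pointer)
print(len(asset), "bytes of content,", len(pointer), "bytes tracked by Git")
```

The repository history only ever grows by the size of the pointer, while the content itself lives in the LFS storage on the server.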
Take the Plunge
Still not sure if you’re ready? Or you’ve made the switch but feel that something is missing? Why not let us guide you in your endeavor to use Git to its maximum potential? Feel free to get in touch and let’s see how we can help you along.