Why Git Ain’t Better Than X

In Version control on March 26, 2010 by Matt Giuca

I’ve been aware of the website Why Git is Better Than X for some time, and it’s always irritated me. This is my rebuttal.

Firstly, some context. I’m an avid Bazaar user. I say this upfront because when I talk about revision control, I’m always biased towards Bazaar. Having said that, since the website, made by Scott Chacon, claims to exist because “I seem to be spending a lot of time lately defending Gitsters against charges of fanboyism, bandwagonism and koolaid-thirst,” I feel it’s necessary to expose this website’s fanboyism, bandwagonism and koolaid-thirst. I’ll try to be objective, but I think this website does a great deal of damage to Bazaar’s reputation so I want to challenge it.

Here’s the deal: Distributed version control systems (DVCSes) are awesome. We all know that, and if you disagree, you’re living in the 90s. Why Git is Better Than X (WGBTX) makes a very good case for DVCSes, comparing the author’s favourite DVCS, Git, against the previous decade’s champion VCS, Subversion. Git works very differently to Subversion, and the site does a good job highlighting the differences. There’s also Perforce, which I don’t know much about, but I gather it’s a crappy proprietary centralised VCS which is worse than Subversion in pretty much every way. [Edit: I got a lot of heat for the Perforce comments. I admit, I know nothing about it so disregard my comments.] So WGBTX does a lot of Perforce bashing. Unfortunately, the two other major DVCSes, Bazaar (bzr) and Mercurial (hg) get a lot of heat too, and in my opinion, for no good reason other than that they don’t behave exactly like Git. (Perhaps the site should be titled “Why Git is more like Git than things that are not Git.”)

So WGBTX isn’t an entirely bad site. It makes a good case for DVCSes. I just think that its comparisons between Git and Bzr/Hg have almost no merit, and therefore the site should be called “Why distributed version control is better than centralised.” Not quite so catchy, though.

So here we go, a point-by-point rebuttal.

Cheap local branching

This is my biggest complaint. The website claims that Git and only Git has “cheap local branching”, while Bazaar, Mercurial, Subversion and Perforce do not. The fact is that cheap local branching is part of being a DVCS, and they all have ways of doing it. They just aren’t the same as Git’s. So I’ll assume that Scott was unaware of Bazaar’s “shared repository” feature (ignorance rather than malice).

In Git, the basic unit of revision control is the “repository”. When you do a git clone, you clone all or part of a remote repository (you get some or all of its branches). Branches live inside a repository. When you do a git branch within the repository, it’s very cheap because all of the branches inside a repo share their revision history. That’s all well and good, but it isn’t the only way to do it.

In Bazaar, the basic unit of revision control is the “branch”. When you do a bzr branch, you clone a single branch. Projects on Launchpad aren’t repositories, they’re branches (related branches are organised on Launchpad into projects, while on my personal computer, I organise them by directory — that’s outside of the VCS). The problem is that these “branches” don’t share data, so making a bzr branch, even on the same file system, copies all of the branch history. The solution in Bazaar is to create a “shared repository”. It’s very simple — bzr init-repo <dir> makes <dir> into a shared repository. Any branches created in a descendent of that directory share revision history with one another. I like this because Bazaar has no conceptual repository, it’s just a transparent cache. So I don’t need to mentally deal with repositories vs branches, I just store my large projects in a shared repo, and it’s all good.
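
As a rough sketch of that setup (the directory names `myproject`, `trunk` and `feature1`, and the placeholder identity, are made up for illustration; behaviour may differ slightly between Bazaar versions):

```shell
# Assumed layout: one shared repository holding two branches.
export BZR_EMAIL="Example <you@example.com>"   # placeholder identity for the demo
work=$(mktemp -d)             # scratch directory so the sketch touches nothing real
cd "$work"

bzr init-repo myproject       # turn myproject/ into a shared repository
cd myproject
bzr init trunk                # a new branch whose history is stored in the shared repo
echo "hello" > trunk/README
(cd trunk && bzr add README && bzr commit -m "Initial commit")

bzr branch trunk feature1     # revisions already live in the shared repo, so none are copied
```

Both `trunk` and `feature1` end up as ordinary directories under `myproject`, each with its own working tree, while the revision data is stored once at the top level.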

So Bazaar does not have cheap local branching by default, but it’s very easy to achieve. Effectively, the only difference is that Git forces a repository on every branch, even a tiny one, whereas in Bazaar the shared repository is optional. I prefer Bazaar’s approach because I don’t want shared repositories by default, only on large projects where I am likely to have a lot of branches. On my PC, I probably have around 50 Bazaar branches for tiny projects. I have two major projects which have a shared repository each, and a number of branches under that.

[Edit: Jakub points out that the cost of the repository is only one factor — there’s also the cost of the working tree, since every Bazaar branch has a separate working tree as well. I explain in the comments below how to work around this in Bazaar, but I admit it would be nicer if Bazaar had a simpler way of handling this.]

I haven’t used Mercurial, but this page indicates that the basic “clone” command does a full expensive branch, while the “branch” command does a cheap local branch. So this argument only applies to non-distributed VCSes.

Everything is local

No issue — only an argument against non-distributed VCSes.

Git is fast

Yes it is, but... is this even worth mentioning? Local operations in any VCS are practically instantaneous. In my own experiments, I found that Git was about 10 times faster than Bazaar. 10 times in this case meaning 0.02 seconds versus 0.2 seconds. I’m perfectly happy with a 0.2 second local commit time.

Remote operations are what take the real time, and these are governed by network latency, not the speed of the implementation. The claim that Subversion is so slow it isn’t worth measuring assumes that all work is done locally. Of course, even DVCS users need to run remote commands sometimes.

The real issue here is with the ridiculous times listed for Bazaar. Maybe this was measured a long time ago when Bazaar was slower (I gather it’s been improved a lot). But really, 14 seconds for a bzr status or bzr diff? What changeset was this run on?? The page gives no details.

I repeated all of the experiments myself as closely as I could (I used the same Django repository). My results are attached to this post. I tried, as Scott suggested, adding 2000 files for testing the add, status, diff and commit commands. For Bazaar, add took 1.5 seconds, status and diff took around 0.5 seconds, and commit took around 3 seconds. In Git, add took 2.8 seconds (as Scott noted, Git add is slow for some reason [Edit: Jakub points out that git add is actually copying the files into the repository, not just intending to add them later]), status took around 0.3 seconds, diff didn’t show the added files, and commit took around 0.4 seconds. So Bazaar is slower, but it really won’t affect your work.

Also, I suspect the branch figures suffer from the same problem described under “Cheap local branching”: I assume the git branch of 1.16 seconds was a cheap local branch, while the bzr branch of 82 seconds was an expensive copy, because Scott didn’t use the “shared repository” feature. So this is comparing apples with oranges. I ran a branching test. Without using shared repositories, git clone took 5.5 seconds; bzr branch took 31 seconds (though I should note that one can also copy a non-shared bzr branch using cp -a, which took 3.2 seconds). For creating a branch within a shared repository, git took 0.01 seconds, while bzr took 7.3 seconds. Again, Git is still faster, but Bazaar is nowhere near as slow as Scott is reporting.

[Edit: The bzr branch command does more than Git; it creates a separate working tree. You can avoid this by running bzr branch --no-tree, and work in the branch by ‘switching’ some other checkout to it, if you like. That should be faster.]
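
A sketch of that arrangement, pieced together from the commands mentioned here and in the comments below. The directory names (`project`, `trunk`, `feature1`, `work`) and the identity are hypothetical, and no timings are claimed for it:

```shell
export BZR_EMAIL="Example <you@example.com>"   # placeholder identity for the demo
scratch=$(mktemp -d) && cd "$scratch"

bzr init-repo --no-trees project       # shared repo; branches get no working tree by default
cd project
bzr init trunk
bzr checkout --lightweight trunk work  # the single working tree, pointing at trunk
(cd work && echo "v1" > file.txt && bzr add file.txt && bzr commit -m "First commit")

bzr branch --no-tree trunk feature1    # new branch, no second working tree to build
(cd work && bzr switch ../feature1)    # point the existing working tree at feature1, in place
```

After the switch, commits made in `work` land on `feature1` rather than `trunk`.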

Git is small

[Edit: Scott has removed “bzr” from this category, but still reports the old Bazaar figures.]

This figure may have changed since Scott reported it, but Bazaar now has better branching formats. When I ran this test, I found Bazaar to be (just) smaller than Git.

For Git, the Django repository is now 27MB (Scott reported 24MB, presumably a while ago). The entire directory is 53MB.

For Bazaar, the Django repository is 24MB (Scott reported 45MB). The entire directory is 50MB.

So Bazaar is now the title holder for smallest repository format.

I also wish to point out that because Bazaar offers a number of workflows, you can also use it in a “lightweight checkout” mode (i.e., non-distributed VCS). If you just want to do a quick checkout, you can use bzr checkout --lightweight, which creates a Subversion-like branch. You can do anything you can do with Bazaar, but like Subversion, any log, revert or commit actions are performed remotely. In this mode, the Bazaar metadata alone is 672KB, and the entire directory is 27MB.

The staging area

Here is another case of “Git is better than everything else because they don’t do it Git’s way.” Git has this “staging area” which means before you commit, you have to explicitly add each file (not just the first time you create a file, but every time you commit to a file). The advantage here is that you don’t have to commit your entire working changeset, you can just commit some of the files. You won’t commit changes to a file by accident, but then again, you could accidentally not commit a change you thought you were committing. Also handy is that you can stage only part of your changes to a file, so you don’t accidentally commit your debug prints, for example.

I personally think this is a very bad idea — I’m already prone to forgetting to add files I just created. I’m sure if I used git I would often forget to add every file I wanted to commit. But I can see why people like it. As Scott points out, you can opt-out using git commit -a. I like the idea of a) being able to selectively commit files, and b) being able to commit only part of a file’s changeset, but I think it should be opt-in, not opt-out.

Firstly, obviously, all VCSes, even CVS and Subversion, let you selectively commit files. You just have to explicitly list them on the ‘commit’ command-line. So that’s easy.

As for committing part of a file, Bazaar has a (relatively new) feature called the “shelf”. I can type “bzr shelve” which brings up an interactive screen, very similar to git add, which lets me say Y/N for each change to each file. Anything which I “shelve” is completely reverted (no longer in the physical file), but it’s stored on the “shelf” for later. So if I want to commit only part of the file, I write “bzr shelve”, shelve all the things I don’t want, commit, then write “bzr unshelve” to get them back. The unshelve is great because it does a proper merge with the file as it is now. This is more roundabout for Git users used to just selectively adding, but it’s more powerful, because I will often wish to shelve a change for a long time, perhaps even days. If I’ve got a “mini-feature” which doesn’t warrant a branch, but isn’t finished, I might just kick it out the way onto the shelf, do more work, then unshelve later (giving me a merge). In that sense, it’s almost like a mini-branch.

The point is, Git isn’t the only VCS with this feature, it’s just implemented differently elsewhere. Mercurial doesn’t seem to have shelving built into it, but there is a ShelveExtension which you can add to Mercurial to give it the same feature as Bazaar.

[Edit: I got some comments which show that the Git community sees partial commits as much more natural than shelve/commit/unshelve. I will state clearly that I strongly disagree with that view.

In my view committing something which is not *exactly* the current working tree is asking for trouble. If you run your code through a test suite and it passes, then you commit some but not all of your changes in the current working tree, then you are *committing untested code*. You may think that you are including only the important changes, but programs are complex. Some part of the code which you didn’t explicitly commit may actually be necessary for the other changes to work.

It’s not as easy, doing it the shelve/stash way, but this workflow is the only way to ensure you are committing tested code:
1. bzr shelve
2. Run test suite
3. bzr commit
4. bzr unshelve

Also, Git does have a shelve command as well – git stash. I would recommend that over partial commits.]
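
For Git users, roughly the same discipline can be sketched with `git stash --keep-index` (suggested by a commenter below), which sets aside everything that is not staged so the tests run against exactly what will be committed. The file names, the identity and the `run_tests` function are placeholders for illustration, not part of any real project:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "you@example.com"    # placeholder identity for the demo repo
git config user.name "Example"

echo "feature v1" > feature.txt
echo "notes v1" > notes.txt
git add . && git commit -q -m "Initial commit"

echo "feature v2" > feature.txt            # the change we want to commit
echo "debug scribbles" >> notes.txt        # a change we want to keep out of the commit

run_tests() { grep -q "v2" feature.txt; }  # stand-in for a real test suite

git add feature.txt                        # 1. stage only the intended change
git stash --keep-index                     # 2. set aside the rest; the tree now matches the index
run_tests && git commit -q -m "Update feature"   # 3. test, then commit what was tested
git stash pop                              # 4. bring the set-aside changes back
```

The commit contains only the tested change to `feature.txt`; the edit to `notes.txt` returns to the working tree, uncommitted.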

Distributed

No issue — only an argument against non-distributed VCSes.

Any workflow

This is probably my biggest gripe with the site. It only claims this is an advantage Git has over Subversion and Perforce, so it isn’t bashing Bazaar/Mercurial. But in my opinion, this is a weakness of Git. Git users have told me they are proud of their distributed-only model. In my experience, “Any workflow” is a major strength of Bazaar, which it wields over Git, Mercurial, Subversion and Perforce.

In the Bazaar manual is a list of workflows. This is the really cool thing about Bazaar. Basically, being distributed is awesome. Being able to work locally is great, being able to commit locally and send changes to a server later is great, not even having a server is great also. But those are just some of the workflows which I have in my everyday job. It turns out that, for me at least, most of the time I am using Bazaar like Subversion. I am working in a close team of a handful of people on a project, and we are all making close changes which could, from one minute to another, conflict.

We can’t afford to each be working on our own separate branches, only occasionally pushing our changes to the server to see if they conflict. We all want to be working with the latest version at all times. That’s the good old fashioned Subversion model. Does this make me a bad DVCS citizen? I don’t think so… because at any time I can whip out a new branch, do a bunch of local commits, merge from trunk, then push. Or be working on the train, and do a merge when I get to the office. I do all of these things. With Bazaar, I can effortlessly switch between workflows, and I love it.

The basic feature Bazaar offers here is bound mode. If I enter bound mode (either by doing a “bzr checkout” instead of “bzr branch”, or by typing “bzr bind” at any time), my local branch is synched to a remote branch. I still have the full history locally, but if I do a commit, it will first check that my branch is up to date, then commit remotely first, and finally apply the commit locally. This “lock-step” development model is often perfect because we never have to do merges.
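
A small sketch of moving in and out of bound mode. A local branch called `server-trunk` stands in for the remote branch here, and all names and the identity are invented for the example:

```shell
export BZR_EMAIL="Example <you@example.com>"   # placeholder identity for the demo
scratch=$(mktemp -d) && cd "$scratch"

bzr init server-trunk                  # stands in for the branch on the server
bzr checkout server-trunk work         # bound branch with full local history
cd work
echo "one" > a.txt && bzr add a.txt
bzr commit -m "Bound commit"           # applied to server-trunk first, then locally

bzr unbind                             # now an ordinary independent branch
echo "two" >> a.txt
bzr commit -m "Local-only commit"      # server-trunk does not see this yet
bzr push ../server-trunk               # publish it explicitly
bzr bind ../server-trunk               # back to lock-step mode
```

At the end, `server-trunk` holds both commits and the working branch is bound again.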

Despite what WGBTX says, Git doesn’t really offer a “Subversion-style” workflow. If you want to work that way, you have to commit locally, then try a push, and if someone has pushed since you last pulled, you must merge and (as style dictates) rebase.
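
A minimal sketch of that commit, pull, rebase, push cycle using two local clones. The names `central.git`, `alice` and `bob` are invented for the demonstration, and a local bare repository stands in for the server:

```shell
scratch=$(mktemp -d) && cd "$scratch"
git init -q --bare central.git                   # stands in for the shared server

# Seed the central repository with one commit, then give Bob his own clone.
git clone -q central.git alice
git -C alice config user.email "alice@example.com"
git -C alice config user.name "Alice"
(cd alice && echo "base" > base.txt && git add base.txt \
   && git commit -q -m "Base" && git push -q origin HEAD)
git clone -q central.git bob
git -C bob config user.email "bob@example.com"
git -C bob config user.name "Bob"

# Alice publishes a change while Bob is still working.
(cd alice && echo "a" > a.txt && git add a.txt \
   && git commit -q -m "Add a" && git push -q origin HEAD)

# Bob commits locally; a plain push would be rejected until he integrates Alice's commit.
cd bob
echo "b" > b.txt && git add b.txt && git commit -q -m "Add b"
git pull -q --rebase                             # replay Bob's commit on top of Alice's
git push -q origin HEAD
```

The central history stays linear, but Bob needed an extra integration step that a bound Bazaar branch would have checked for at commit time.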

GitHub

GitHub is great, but every open VCS has its own free development communities. Bazaar has Launchpad (apparently Scott took Bazaar off the list of things Git is better than for this category because Launchpad has a large community — all of Ubuntu for starters). Mercurial has BitBucket. Subversion has heaps — SourceForge and Google Code for starters. Perforce is proprietary crap so who cares.

I know Scott has already come under fire about bashing Bitbucket, and he retracted some comments. But Hg is still listed as a “Git is better than” in this category. Apparently not because it’s easier to get Git hosting than Mercurial hosting, but because “This social aspect of GitHub [that it has a larger community] is the killer.”

I personally don’t see that as a significant advantage — if people want to develop for your project, they won’t care what hosting service you’re using. And it isn’t an advantage of Git at all.

Easy to learn

Apparently a reason why Git is better than Perforce.

I disagree, but I’m very biased. As someone who switched from Subversion to Bazaar, I must say it was extremely easy. Getting my head around the branching and so on was tricky, but at least Bazaar gave me the gentle learning curve, since I could operate in bound mode and it felt exactly like Subversion.

This learning curve should not be underestimated. If you’re working with Bazaar, it’s very easy for someone with Subversion skills to join your project. You can just tell them, “do a checkout, stay bound, and just do everything the same way as Subversion.”

This simply isn’t true with Git, because you have to learn all about branching, merging, rebasing, using SHA-1 hashes for revision IDs, etc, on your first day. Showing that Git has the same commands as Mercurial is a pathetic argument, since they behave quite differently.

That’s it

Apologies if some of these words are harsh. It’s just that I’ve spent a lot of time defending Bazaar against the Gitsters. It really annoyed me seeing this frankly quite ignorant, or at the very least, out of date, website. I find that most people who’ve used Git try to convince me of how much better it is than Subversion. I tell them that I agree, but I use Bazaar. The problem is, they’ve never tried it.

I haven’t really said much about why Bazaar is better than Git in this post. I plan to do some follow-ups which are hopefully more technical than argumentative.

Please comment if you think I’ve got anything wrong. I’m not a Git/Mercurial user, so I’d be happy if you taught me something.

92 Responses to “Why Git Ain’t Better Than X”

  1. Disclaimer: I am user of Git.

    Cheap local branching

    Sharing data is only one part of “cheap local branching”. Another issue is in-place branch switching, although some would state that it is a matter of taste. Does Bazaar allow for this? Git and Mercurial both do. Without this, each additional branch brings the weight of a full checkout, even if repository data is shared.

    Also at issue is cloning and fetching multiple branches, and injecting the names of remote branches into the local namespace. Is it possible to fetch / push multiple branches at once (in one go) in Bazaar?

    Git is fast

    ‘git add file’ does more than add does in other version control systems. It doesn’t just mark the file to be included in the next commit (to start tracking the given file), it adds the contents of said file to the repository. The equivalent of what other version control systems understand as “add” would be ‘git add -N’, i.e. ‘git add --intent-to-add’. This would also solve the problem of ‘git diff’ not showing newly added files (‘git diff HEAD’ would show newly added files even without using the ‘-N’ option).

    Creating a branch in 7.3 seconds in Bazaar, as compared with a fraction of a second in Git, means that branching in Bazaar is not cheap, at least in the sense of performance. Anything more than a second is slow.

  2. Hi Jakub. Thanks for commenting.

    > Cheap local branching
    > Another issue is in-place branch
    > switching …
    > Does Bazaar allow for this?

    It isn’t the normal “Bazaar way”; the normal idea is that if you want to work on another branch, check it out in a different location.

    You can do it with a bit of effort, because Bazaar can have branches without working trees. So if you want to work like this, check out all of the branches you are interested in without working trees (bzr branch --no-tree, or if you already have a branch, bzr remove-tree). Then, for your main working tree, make a lightweight checkout of one of the branches (bzr checkout --lightweight). Now commits to the working tree will be pushed to the branch. Use bzr switch to make an in-place branch switch. This even works with uncommitted changes; they are merged with the target branch.

    I understand that’s a bit of work to set up. Any Bazaar people have a better way of doing it?

    > Git is fast
    Good point. That explains why add takes so long in Git. Anyway the purpose of the tests was just to show Bzr isn’t that slow; the author already showed large times for git add.

    > Creating a branch in 7.3 seconds in
    > Bazaar, as compared with fraction of
    > second in Git, means that branching
    > in Bazaar is not cheap

    Well it’s because you are making a new working tree as well. I assume if you made a bzr branch --no-tree, as I suggested above, it would be much cheaper.

    I don’t consider 7 seconds to be a significant overhead for something as relatively rare as creating a branch.

    Thanks for your comments. I’ll edit the blog post to reflect.

    • > I don’t consider 7 seconds to be a significant overhead for something as relatively rare as creating a branch.

      Well, creating a new branch is not so rare in the topic branch / feature branch workflow (where for example maintainer creates new branch for each patch series sent by email). I find 7.3 seconds a bit excessive. BTW is it 7.3 with repository data sharing between branches (repository data doesn’t need to be copied, even using hardlinking)?

      • Yes, unfortunately. That was 31 seconds to branch without a shared repository. 7 seconds with the repository data sharing between branches.

        I suspect this is as you pointed out (but I didn’t test before deleting my test checkout) due to re-checking out the working tree. So I suspect that a bzr branch --no-tree (or a bzr branch in a shared repo with --no-trees) would be much faster, and that’s the Git-style workflow.

        I found some official docs on how to use Bzr “Git-style”.
        http://wiki.bazaar.canonical.com/GitStyleBranches

      • Have you tried the new bzr-colo plugin?
        It creates a repository in .bzr/branches and gives a few commands that let you use it quite easily (+integration with qbzr and others)

    • Branching is rare in most VCSs because it’s painful and tedious. In git it’s so fast and easy to create and merge branches that it doesn’t have to be rare at all. 7 seconds becomes a pretty big difference then.

      But maybe this is more a matter of git allowing a different kind of workflow.

      • “Branching is rare in most VCSs because it’s painful and tedious.”
        Come on. We were talking about Bazaar’s branching time in a very large project, including taking a separate checkout of all the files. If you are happy to work colocated (Git style; only one checkout at a time), then it will be much faster. But I am happy to wait the 7 seconds, and that is assuming a very large repository.

        On all projects I have worked on, all Bazaar operations have been instantaneous. I have spent longer typing this response to you than the entire sum total of all the times I have ever waited for a Bazaar branch operation to complete in my lifetime. Saying “ah well, I guess non-Git VCSes are just not cut out for branching that often” is a ludicrously Git-centric viewpoint. It’s exactly the reason I wrote this post: Git is not the only DVCS, nor is it the only good one.

  3. Also you can run bzr init-repo --no-trees, which will cause all branches in the repo by default to not have working trees. That seems to be geared to this git-style “switch branch” approach.

    So I still contend Bzr suits any workflow (though it can be trickier to set up).

    Maybe there’s a way to get Bzr-style “multiple branches checked out at the same time” with Git — anybody know? Checking out multiple git repositories, one per branch, isn’t a valid solution because there’s no equivalent of a “shared repository”. Which is why Bazaar separates the concept of a repo and a branch.

    • Maybe there’s a way to get Bzr-style “multiple branches checked out at the same time” with Git — anybody know? Checking out multiple git repositories, one per branch, isn’t a valid solution because there’s no equivalent of a “shared repository”. Which is why Bazaar separates the concept of a repo and a branch.

      First, there is the “git-new-workdir” script in ‘contrib/workdir’, which uses git-core worktree mechanism to have multiple working directories (multiple checkouts) from a single repository. It is recommended that each checkout correspond to a different branch.

      Second, in Git you can share repository data among repositories (like Bazaar can share repository data among branches) using the alternates mechanism. OTOH you would then have to take care when garbage collecting (e.g. using refs/borrows or a similar mechanism).

    • Thanks for writing the article, Matt.

      Since you asked this question, Bazaar developers have been working on a solution; the chosen term for this style of working is “co-located branches”.

      It’s currently best implemented in a popular plugin, ‘bzr-colo’, which is in beta testing for inclusion in a future Bazaar release.

      • Thanks Ben. I have actually used bzr-colo (must have been after I wrote the above comment). It worked pretty well, even though it was slightly hackish. It’s been in testing for years. I wonder when it will get included? (I personally wouldn’t usually use it — I really like working on multiple branches simultaneously, but it’s good to satisfy the git-style use case.)

  4. His comments beat up the “cheap local branching” pretty well. The amount of time it takes me to make a branch in git is not measurable… therefore I do it very many times a day.

    Switching between branches doesn’t require me to change… everything. IDEs, editors, working directories for the multiple terminals I have running, copy configs around, etc…

    Everything is local in git unless you explicitly need otherwise. Here’s a real-world case I have: I want to build a patch that includes the difference between what I believe the upstream revision to be and the local changes. If it’s possible to do that without talking to an upstream repository, someone please please submit a buildbot patch better than this one — in git, I can make this whole thing happen on all builders without any git network communication (just buildbot passing around refs and a patch).

    Regarding speed, my data might be a little old, too, but my experience is that it’s different enough to be noticed.

    Size is a silly comparison. The weird branchy stuff makes it a bit difficult to do apples-to-apples, but someone set up an lp mirror of memcached just tracking a couple individual branches out of git. I asked git to clone only the master (1.4) branch and asked bzr to do the same thing from lp (assuming they know what they’re doing). That gives me 1.1MB of .git and 2.1MB of .bzr, though the original git repo has 828 revisions and the .bzr one only found 767. Not sure what they lost in their conversion.

    The staging section really shows a broken mental model. I don’t commit files, I commit changes. As such, I want to commit the things that make up the actual logical change I’ve got. Whether I use add -p or magit or a GUI tool to select the parts that are interesting, I craft the relevant parts into my commit and omit the irrelevant.

    If the guy’s worried about times he forgets to add files, that’s just as bad, but it’s OK. In git, we don’t consider our mistakes permanent; we fix them before we publish them to the world so other people don’t have to deal with them. (deal with them == have a harder time understanding why something was done a certain way, or when a change was introduced, or when a build broke during a bisect, etc…)

    git has github, gitorious, git.or.cz, beanstalk, and a few others. hg has bitbucket, google code, kiln, and a few others. bzr has launchpad… I’ve not seen it used elsewhere. That’s worrisome.

    Easy to learn is relative. Subversion is hard to unlearn. I find bzr really strange to work with. It reports sequence numbers and people like to brag about them, but you can’t actually communicate them without referring to a particular location (similar to hg). But nobody does. They say, “hey, there’s a bug in version 852”. Your 852 ≠ my 852 (though there’s magic here in the case of bzr). So you end up with these, so you have the revno (e.g. vcs-imports@canonical.com-20090930201056-85t5lnr6ais9no5g is my current version of twisted which is local revno 15303 — I made a commit and now I have dustin@spy.net-20100327054152-f3ga1ktezgfdcceb or 15304 — now you try…).

    The hashes can appear obscure, but they do not lie.

    • Thanks for your comment Gunni.

      First, as we’ve discussed a bit above, you can switch branches in Bzr without making a separate checkout. It’s just a bit tricky and isn’t the default workflow.

      In Bzr you can certainly make a patch against an upstream version without any network communication. Everything is local in all DVCSes.

      Lastly about the version numbering in Bzr. This is a common complaint, but it usually means the complainer is not “getting” Bzr. The version numbers are deliberately per-branch — remember in Bzr a branch is the highest unit of work, not the repository. So if I have a trunk and feature1 branch, I can’t just say “Look at revision 72,” I have to say “look at revision 72 trunk” or “revision 72 feature1.”

      It is *not* usually the case that different authors working on the same branch (with their own local copies) have different revision IDs. Whenever you synchronise with the server (with bzr push if unbound or bzr update if bound), it will also synchronise the revision IDs on the local and remote branches. So everyone working on the one branch can use the same revision IDs.

    • > The staging section really shows a
      > broken mental model. I don’t commit
      > files, I commit changes. As such, I
      > want to commit the things that make up
      > the actual logical change I’ve got.

      I really disagree here (and I can see this is just one of those fundamental things that Git people like and I don’t).

      In my view committing something which is not *exactly* the current working tree is asking for trouble. If you run your code through a test suite and it passes, then you commit some but not all of your changes in the current working tree, then you are *committing untested code*. You may think that you are including only the important changes, but programs are complex. Some part of the code which you didn’t explicitly commit may actually be necessary for the other changes to work.

      It’s not as easy, doing it the shelve/stash way, but this workflow is the ONLY way to ensure you are committing tested code:
      1. bzr shelve / git stash
      2. Run test suite
      3. commit
      4. bzr unshelve / git unstash (or however stash works in git)

      I am aware that Git people like to correct their mistakes by rewriting history. But that’s really not a good answer to a workflow which invites making mistakes in the first place.

      • The main advantage of the index isn’t making commits by selecting which changes from the working area to include (note that “git stash --keep-index” allows you to check staged changes). The advantages are:

        1. Marking a file (conflict) as resolved during merge/am/rebase conflict resolution
        2. Incrementally committing, so you don’t see in “git diff” parts of changes which are already ready to commit (but still see them in “git diff HEAD”)
        3. Committing with a dirty tree, e.g. with Makefile changed to contain the next version, or with debugging turned on
        4. Partial commits, i.e. committing only parts of the changes in the working area

        See also In praise of Git’s index on Aristotle Pagaltzis blog.

        • OK I think that the git index PLUS git stash --keep-index is a nifty feature. But if you’re going to the effort of stash anyway, it’s not really any more effort than doing an explicit stash. Your (3) and (4) are only safe if you stash and test first.

      • Not everything stored in a repository is code. (And not every code has tests.)

        I kept my master’s thesis in a git repository, including code and text. The staging area allowed me to commit code and text separately or allowed me to make a commit that just fixed typos in the text that was separate from a new section I wrote.

  5. Perforce may be proprietary, but there is nothing in the open source world that can touch it in a few different areas. We use it in game development because it’s fast as hell and can deal with huge repos with very little slowdown. Try hg, bzr, git, or svn on anything more than a few gigs including quite a bit of binary data and you’ll go insane. Yet with p4 this works very well.

    You admit to not knowing a thing about p4 yet readily bash it. If you’re anywhere near San Diego I’ll gladly give you a quick tour at the office.

    That said, for smaller personal projects I use hg. Git is a UI nightmare. Like a view into the brain of a madman. No thanks. Hg does the same shit and is simple and elegant. Nice.

    • Is the problem a large repository, or large binary files in said repository? With large repositories, one should take into account that what would be a single humongous “everything” repository with partial checkouts in a centralized version control system should be split into many smaller repositories in a distributed one (in some cases using submodules). Note that partial (subtree) checkouts have their disadvantages with respect to whole-tree commits.

      If it is about large binary files, take a look at the git-bigfiles project. I think that Mercurial and Bazaar have extensions / plugins to better deal with large binary files.

      • Repos shall not be split, for then we couldn’t sync back in time and get matching code+data.

        It’s a huge repo, and it’s a bunch of binary files.

        Out of curiosity I tried both Hg and Git on our repo. In both cases it was pathetically slow compared to p4.

        My only point here is that git/hg/bzr don’t solve all problems. There are still a few things that p4 does considerably better.

        As a point of reference, my current game’s code+data is 7.5GB. I just did a “p4 sync” and it was about one second. I’ve seen games on modern consoles with 300GB repos in p4, and it handles it very well.

        BTW, the OP is ***way*** off-base in assuming p4 is as shitty as svn. Though even svn satisfied the needs of many projects, p4 is considerably better in almost every facet.

        Should I expect the author to make a correction about these false assumptions?

    • Hi Stephen. Thanks — you’re right. I shouldn’t be commenting on it since I haven’t used it. I didn’t say anything specific about it though. My intention was just that it’s proprietary and therefore it’s not relevant to me and the open source community. Someone later said that it’s free for open source projects, though, so… I don’t know.

      I edited the OP, striking out the offending lines.

  6. @Stephen Waits

    Perforce has been optimized for holding large amounts of binary data in addition to code. The recent DVCSs are designed quite specifically for code, and Git specifically for the Linux kernel, which is not huge but still pretty big, has a large number of contributors and changes very rapidly. Probably a lot of the DVCS developers would take principled exception to the idea of holding massive amounts of graphics data (game development) in the version control system; in their mind, it’s the wrong thing to do in the first place. Perforce has been developed in a different direction.

    To the OP: The Why Git Is Better Than X site was indeed made at a time when the latest Bazaar was slower than it is now. Git’s data structure is unique among the DVCSs, and there are some nice properties and some limitations that follow from that. Fortunately it seems that all three, Git, Hg and Bzr, now have sufficient momentum that their further development seems to be secured.

    I followed the Emacs project’s switch from CVS to Bzr in some detail last year. They chose Bzr ultimately for political reasons with the leadership of Richard Stallman. He argued that the three major DVCSs were similar enough in function that they should pick the GNU project regardless of the technical details. The switch was held back for a time as a result, because Bzr had some performance problems and was about to change repository formats at the time. I thought Git would rather obviously have been a technically more mature and better option for a large project such as Emacs. The CVS repo was already being successfully converted into the Git format, and I believe (I may be mistaken) the final conversion to Bzr went via Git, which should tell you something. That said, Bzr seems to now be serving the project reasonably well.

    • Oh. I didn’t realise Emacs used Bzr. But I suppose RMS seems to have something against Linus.

      I am aware of Bzr’s speed issues and data format instability in the past — I never used it then. But I gather those are all in the past.

      • “I suppose RMS seems to have something against Linus.”

        I doubt that was much of a factor. Bzr simply appeared good enough and, out of the options, it is the official GNU project, which makes it Stallman’s choice by default. The central Emacs developers didn’t seem to be particular DVCS enthusiasts at the time, so there was relatively little attachment to any of the options beforehand.

        As I understand, RMS wasn’t happy with the way Linus participated in the GPL3 process (giving negative comments in interviews without actually formally taking part) and they disagree on the GNU/Linux naming thing, but RMS probably has bigger disagreements with any number of other people.

      • “But I suppose RMS seems to have something against Linus.”

        You know, that comment says more about you than it says about RMS. RMS has nothing against Linus. That much should be obvious to anybody who has followed his work and political evangelism.

        • http://www.efytimes.com/e1/fullnews.asp?edid=31990
          “Well, you can see that [Linus Torvalds] is a person who doesn’t believe in freedom. You can tell that from his writings… We are trying to stop a practice where a manufacturer can change it, but the user can’t. Well, this is what Torvalds objects to. He is in favour of tivoization. He doesn’t care if the user of, in this case Linux, is free to change it.”

  7. Listen to Stephen — bashing Perforce w/o trying it just makes you sound dumb. Give a tool an honest go before bashing it.

    (And doesn’t Perforce give a free license to open-source projects? They may not be open-source, but they’re not hostile to open-source either!)

    Calling SVN the decade’s champion assigns undue status to SVN. It barely started to displace CVS before BZR, GIT, and HG came along, and it /never/ had the technical sophistication of Perforce.

    I’ve used RCS (much better than tarballs), CVS (absolutely amazing at the time), Perforce (very impressive, if somewhat complicated), SourceSafe (an abomination), ClearCase (bleah), SVN (bleah), Hg (aside from the Python headache, okay), and Git (at first unusable w/o Cogito, but now pretty good). I was never able to get arch to work, and so I’ve never bothered to check out its offspring BZR. I shall have to remedy that, I see.

    Can’t offer a decent opinion until I’ve used it ‘in anger’, even if only for a couple of months. 🙂

  8. Your post is begging for a similar post, “Why Bzr Ain’t better than Perforce”.

    The “proprietary crap” comment is childish and very closed-minded.

    The fact that it has a commercial license is orthogonal to the quality of the software.

    You don’t have experience with this VCS, so calling it outright “crap” shows that you’re no better than the author of “Git better than X”. You make the same mistake of bashing anything that isn’t familiar to you.

  9. The shelve feature sounds more like git stash
    http://www.gitready.com/beginner/2009/03/13/smartly-save-stashes.html

  10. > diff didn’t show the added files

    Sounds like you are after git diff --cached, to see changes between the index and your last commit. By default, git diff compares your working copy to the index; once you’ve added all your changes, there is no difference between the index and the working copy, so git diff returns nothing. The other two interesting things to diff are working copy against head (git diff HEAD) or any other commit (git diff ).
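
    To make the three comparisons concrete, here is a toy Python model of the state right after a full “git add”. It is only a sketch: plain dictionaries stand in for HEAD, the index and the working copy, the `changed` helper and the file names are made up for illustration, and it only covers tracked files. It is not how Git is implemented.

```python
# Toy model: each state maps path -> content. Names and contents are invented.
head = {"a.txt": "v1"}                      # last commit
working = {"a.txt": "v2", "new.txt": "hi"}  # files on disk
index = dict(working)                       # state after "git add" of everything


def changed(old, new):
    """Return the sorted paths whose content differs between two states."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


# Right after the add, the index matches the working copy.
diff_plain = changed(index, working)   # models "git diff"
diff_cached = changed(head, index)     # models "git diff --cached"
diff_head = changed(head, working)     # models "git diff HEAD"

# A further edit that is not staged shows up in plain "git diff" again.
working["a.txt"] = "v3"
diff_plain_after_edit = changed(index, working)

print("git diff:", diff_plain)
print("git diff --cached:", diff_cached)
print("git diff HEAD:", diff_head)
print("git diff after another edit:", diff_plain_after_edit)
```

    In this model the plain diff is empty straight after the add, while the other two comparisons still list both files, which matches the behaviour described above.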

  11. I haven’t used Mercurial, but this page indicates that the basic “clone” command does a full expensive branch, while the “branch” command does a cheap local branch. So this argument only applies to non-distributed VCSes.

    Hg’s “branch” command permanently stamps the commits with the branch name. They’re not totally local; they end up on the server when you push. They have something else called “bookmarks” to achieve git-style references.

    Also, about bzr’s “shelve” command: Git has one called “stash” that does about the same thing.

  12. Git’s equivalent to “shelve” is called “stash”. 🙂

  13. These are mostly good points. The performance issues are because the site is pretty old – Dec 08. I need to update it, I would love to do the tests again, but more likely I will just put the versions I used (if I can figure out what they were – I have all the data around here somewhere).

    I’m sorry it bugs you – these days I would probably not write the site, or it would more likely just be SVN focused. If I revamp the site, I will probably take all the DVCS references out. It was not a focus to piss off Hg and Bzr people, I just wanted to defend Gitsters from people thinking there were no good reasons. The problem, as you point out, is that most of these are subjective, not objective. I highly dislike Bzr and Hg branching options, and many do, as evidenced by both projects having Git-style branch plugins (Hg Bookmarks) or pages on how to do Git style branches (http://wiki.bazaar.canonical.com/GitStyleBranches). It doesn’t mean they are objectively better, just that I am (and many others are) addicted to them and that’s partially why I choose Git. It’s a valid argument that I like them better, not that they are objectively technically superior.

    The point of the site was an apologia against people who thought devs switched to Git just because it’s popular or some other shallow reason, when there are subjectively good reasons to like it over other systems (as you like Bzr over Git – it’s also entirely subjective, neither is an absolutely superior system).

    Finally, I take exception that the commands are similar is a “pathetic” argument, as I have talked to many people who used Git easily because they use Hg, and I was able to use Hg pretty easily because I used Git. The commands are very similar in how they work – the differences are minimal and generally pretty easy to figure out (I had some issues with Hg stuff that was confusing to me or didn’t work how I expected – they both have somewhat unintuitive stuff, but the point is that it’s not overly difficult).

    • Hi Scott. Thanks for your considered reply. I definitely understand that a) this site was written awhile ago when Bzr and Hg were not as featureful/fast/compact and b) it’s mostly about selling DVCSes over other ones. As I said, I agree that DVCSes are the way to go, and I think the site makes a great case for them.

      It’s clear from these discussions that the Bzr/Hg branch model does not sit well with Git people. They prefer to work in one place, and switch their working tree between branches. I don’t work that way, but it’s a matter of opinion. My point was just that other DVCSes let you work that way too. I disagree that the fact that Bzr lets you work “Git style” is a sign of people’s dislike for the Bzr-style branching model. Bazaar tries to offer as many workflows as possible.

      This discussion has convinced me about the merits of the Git style workflow — in some situations. I would like Bazaar to support it less hackishly in future (others have pointed out the bzr colo plugin but it seems like work in progress).

      Sorry about the last point. The “pathetic” comment was aimed at the fact that the argument seemed to rest entirely on the commands having the same names. From my limited experience with Git, I gather that commands such as “branch”, “merge” and “checkout” have entirely different semantics to other VCSes. The same goes for Bzr vs Svn, for example. The point is just that having the same set of commands doesn’t make it easy to learn.

      I appreciate your taking the time to address my criticism.

  14. Git’s concept of the staging area has the advantage that you can commit *parts* of the changes in a file: ‘git add -p’ asks you about each hunk separately, which often comes in quite handy. Likewise, on merges, the unconflicted parts are put into the index, while the conflicted ones aren’t. That way, you can diff the conflicts more easily.

    As for the other part: In git the unit of revision control is the branch as well (or rather: the commit including its ancestors). While git clone fetches all branches from the source, you can work around that to only pull a single branch. Push, too.

    I was very sceptical of in-place branch switching, too, as I always used separate sandboxes with cvs. But git makes switching much easier (just temporarily commit dirty files and switch), and until one of the preceding comments, I didn’t even notice that it means that I just refresh my eclipse workspace instead of setting up and opening another (I never use eclipse on cvs multibranch projects).

    But what impresses me with git are the many helping tools like clean, bisect, describe, archive, or the pickaxe. Not needed for version control, but very, very useful. (Although I don’t know how other DVCS fare in that department; I probably will try merc one day; bzr doesn’t cause enough blips on my radar.)

    @Stephen: git has a very simple data model; once that is understood the commands become pretty obvious. But not before that. 😉

    > Git’s concept of the staging area has the advantage that you can commit
    > *parts* of the changes in a file: ‘git add -p’ asks you about each hunk
    > separately, which often comes in quite handy.

    That is exactly how bzr shelve works. It’s interactive, and lets you shelve each hunk individually. I have updated the blog post to address why I think partial commits are a Very Bad idea.

    > But git makes switching much easier (just temporarily commit dirty files
    > and switch),
    “Temporarily commit”? That sounds like a disaster…

    I agree that switching is good for not having to relocate your IDE and so on. But it seems like if you want to switch branches while you have no uncommitted changes, go ahead. If you want to have a set of working changes in multiple branches at once, you are *begging* to have the branches checked out in separate directories (then you can even have multiple IDE sessions open at once).

    As I’ve been saying all along, Bzr isn’t better because it has a better branching model. Bzr is better because it offers any branching model you like.

    • >“Temporarily commit”? That sounds like a disaster…

      It only sounds that way; it is exactly what git stash does.

      And with your stash/unstash sequence to ensure tested commits: One of the points of git is that you *can* correct commits afterwards, so there is no point in checking everything beforehand. (Correction becomes much harder once you have pushed it elsewhere.) That’s why merges automatically commit when there are no conflicts: Semantic conflict resolution can easily be amended, as can typo fixes in the commit message itself.

      • Stash is fine; a “temporary commit” sounds very bad.

        As I said above:
        “I am aware that Git people like to correct their mistakes by rewriting history. But that’s really not a good answer to a workflow which invites making mistakes in the first place.”

        • Just to be clear. There is no difference to git between a commit and a stash. A stash is implemented in terms of a commit, with some UI to let you stash/unstash very quickly.

          Continuously amending your latest commit can be a very common workflow, and there’s nothing bad or dangerous about it.

  16. While Git works on Windows, its speed is much slower than on Linux, still faster than Bazaar though.

    But Git’s support on Windows is still shaky. I hate Cygwin. There’s a second way to run Git on Windows, I forgot the name, but it’s not complete as yet.

    People, if you do not support Windows, I take it to mean that you are writing software for yourself rather than others. How else could you ignore the majority?! Thanks very much Torvalds!

    Regarding Mercurial:

    It may be better than Bazaar in some respects. But I hate all those guys who cannot even get the user interface right. I take it to mean that they do not even understand user requirements properly. If you can draw diagrams on white-board properly, you must be able to get user interfaces right, else please get back to the white-boards!

    And even though they say they handle renames correctly while blaming other VCSes, it does not work right even in the latest version. I cannot diff across renames.

    • There’s a second way to run Git on Windows, I forgot the name

      It’s msysGit (see http://git-scm.com/download)

      People, if you do not support Windows, I take it to mean that you are writing software for yourself rather than others. How else could you ignore the majority?! Thanks very much Torvalds!

      Git was written originally with the specific purpose of being the DVCS for the Linux kernel, so supporting MS Windows wasn’t even a question (originally). It makes heavy use of POSIX features (e.g. mmap and pread) for performance. Also, because of the way it was developed, some high-level commands are still written as shell scripts, which causes problems on MS Windows.

    • “How else could you ignore the majority?! Thanks very much Torvalds!”

      If you think Linus is worried about Windows users, think again. He probably wouldn’t have cared even if Git was never used outside of kernel development.

      Git was originally written by a Linux kernel programmer specifically to be a fast VCS. Obviously it is going to take advantage of the features of the Linux kernel. The essential feature of Git is the data structure, and as Linus himself has noted, Git can be re-implemented independently the same way Unix has been (there are projects attempting this). I wouldn’t be surprised if a more native Windows version appears.

      • I know. Git was designed by Torvalds for himself. This is why Git is better than X for him, but not for me.

        If and when a native Windows version appears, I’ll be happy to reconsider it. For Windows users, the question of whether Git is better than X would be a valid question only then.

    • I use msysGit daily. It works great. It is fast. If it is still ‘shaky’, I don’t notice it.

      • Josh,

        Thanks for letting me know that msysGit is working great for you. I was told by someone who tried that it is shaky so I did not try myself. I’ll give it a shot.

        Have you also used some GUI with it, like TortoiseGit? I am not a command line guy, so need the GUI functional. They said the GUI does not support all functions, but I can use command-line once in a while if the need be.

        Ray commented on the other hand that Bazaar had some issue with big repos (4 GB) while Git worked just fine. I had tested with about 200 MB without any issues but not with 4 GB. I do need something that can handle much larger sizes very well. So I’ll definitely give Git a try.

      • (Odd… you can’t reply to a message more than two deep…)

        I mainly use the shipping git gui and gitk. I also enjoy using Git Cola.

        I don’t like Tortoise* products. I prefer a GUI that can stay resident and has most/all of the functionality I need in menus and such.

      • @Josh & Alok

        Give Git Extensions a try for Git on the Windows platform (http://code.google.com/p/gitextensions/)

        I’m with Josh, I really don’t like the way the Tortoise* products work, so I was quite happy when I found Git Extensions.

      • I should read further down before commenting, I see you did give Git Extensions a try 🙂

  17. Side note: with the help of Steve Losh’s blog post comparing branching in Mercurial and Git (mentioned here), and with the help of people on the #mercurial IRC channel on FreeNode, I have written Git and Mercurial – Compare and Contrast (an answer on the StackOverflow Q&A site).

  18. […] Why Git Ain’t Better Than X I’ve been aware of the website Why Git is Better Than X for some time, and it’s always irritated me. This […] […]

  19. FWIW, mercurial has a record extension which allows you to commit only certain changes from a file(s).

  20. Just a note on your comparison experiment: git diff wouldn’t show anything after an add because it diffs the working tree with the index. After a full add the index has been matched to the working tree, and is set for a commit of the current working tree state. To diff the working tree with the head of the current branch, you need to specify `git diff HEAD`.

    On a more general note, it would be great to see a comparison of these systems done by someone who is knowledgeable in each one’s use.

  21. Two months ago, I was comparing hg, bazaar and git for my startup. We were using Subversion, and I wanted to switch to a DVCS.
    Bazaar was chosen. At first it ran smoothly, and the learning curve was not so curvy :). But when we wanted to store one of our 4GB projects in a bazaar repository, problems began to arise. Bazaar simply crashed in the middle of the add operation. After the seventh failed attempt, we began to look at other options, and that meant git.
    On our first attempt, the git repository was successfully created for that project.
    Maybe I was doing things wrong with bazaar, or I missed something in those attempts. Even though I still prefer bazaar’s way of doing things, we’ve been using git ever since.

    • Ray,

      I had tested Bazaar recently with a 200 MB working tree (but only about 200 files). I never tried 4 GB projects or having a very large number of small files.

      I wonder if you followed up the bug with Bazaar support. They generally reply within a day. I will be test-driving Git on recommendations from people on this blog. Hope its GUI is better than Mercurial, and diffs work across renames.

      • I will be test-driving Git on recommendations from people on this blog. Hope its GUI is better than Mercurial, and diffs work across renames.

        There are two things that you should remember about handling renames in Git:

        1. Git uses rename detection and not rename tracking. You need to explicitly request rename detection when running diff… or configure Git to do this automatically with ‘diff.renames = true’ (or ‘diff.renames = copies’ to also detect copies).

        2. Because Git does history simplification before rename detection, “git log -p” (or equivalent) would show renames, but “git log -p file” would not: you need to use “git log -p --follow file” to follow the history of a single file across renames (and the “--follow” implementation is not perfect, yet ;-)).

        See this patch in gitweb to see that in Git diff works across renames.
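
        To illustrate what “detection, not tracking” means in point 1: nothing about the rename is recorded in the commit, so a tool has to guess afterwards by comparing the content of deleted and added files. Below is a rough Python sketch of that idea. It is not Git’s algorithm: the `detect_renames` function and the file names are invented for illustration, and `difflib`’s ratio stands in for Git’s own similarity measure. The 0.5 cut-off mirrors Git’s documented default similarity threshold of 50%.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.5  # assumed cut-off for this sketch


def detect_renames(deleted, added, threshold=SIMILARITY_THRESHOLD):
    """Pair each deleted path with the most similar added path, if similar enough.

    `deleted` and `added` map path -> file content for a single commit.
    """
    renames = {}
    unclaimed = dict(added)
    for old_path, old_text in deleted.items():
        best_path, best_score = None, threshold
        for new_path, new_text in unclaimed.items():
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            renames[old_path] = best_path
            del unclaimed[best_path]  # each added file can match only once
    return renames


# Hypothetical commit: util.py disappears, two new files appear.
deleted = {"util.py": "def add(a, b):\n    return a + b\n"}
added = {
    "helpers.py": "def add(a, b):\n    return a + b  # sum\n",
    "README": "Project notes\n",
}
renames = detect_renames(deleted, added)
print(renames)
```

        Here util.py is paired with helpers.py because their contents are nearly identical, while README is left alone as a genuinely new file.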

      • Thanks for the information Jakub. This is appreciated.

        I am trying to download Git+msysgit and found the following:

        1. Here is a quote from the Git Wiki page: “Running Git on Windows is working already, but it is recommended only to users who can fix issues themselves”

        2. All installers I find there (http://code.google.com/p/msysgit/downloads/list) are labeled either beta or deprecated. Are there any stable or release versions?

      • I was able to get Git running on Windows with msysgit. I think I used the info on this site: http://www.lostechies.com/blogs/jason_meridth/archive/2009/06/01/git-for-windows-developers-git-series-part-1.aspx

      • Thanks James, this was helpful. I do not intend to use Git from the command line. But nevertheless this helped.

      • I couldn’t get GitExtensions to even detect Git after five different tries.

        TortoiseGit does not seem to allow any options to diff across renames. I tried putting diff.renames=true in config file inside .git folder. But this makes git misbehave.

  22. Having been convinced by comments from people on this blog, I test-drove Git on Windows. Here is a summary of what I found:

    1. MSysGit was tested with both GitExtensions and TortoiseGit. Command-line was not tested. Latest version of everything was used.

    2. Ray had commented that Bazaar had issues with a 4 GB repo, because of which they use Git now. I tested Bazaar with a newly created 16 GB repo (~10,000 files). The Bazaar GUI seemed to freeze for several minutes on commit, but came back clean after the job was done. The add operation took about 8 minutes. I did not time the commit but expect it to be another eight minutes (see below).

    3. Here is a speed comparison on a smaller repo (350 MB, ~4000 files):

    Bazaar: (Add+Commit) = ~5 min
    GitExtensions: (Stage+Commit) = ~5 min.
    TortoiseGit: Add itself took 13 minutes. Did not test commit.

    Bazaar took about the same amount of time for Add and Commit. GitExtensions took about 4 minutes for Stage and about 1+ for Commit.

    TortoiseGit was updating its GUI after processing each and every file during Add. This is simply bad GUI design and not a slowdown because of GUI. Bazaar does not update the GUI in real-time like this — which on the negative side makes the GUI freeze for some time, but on the positive side, retains high performance.

    So no, on Windows and using GUI, Git is not faster than X.

    4. TortoiseGit does not support diff across renames. Jakub had suggested command-line method for this, but it is not supported by TortoiseGit.

    5. Even though I chose KDiff3 as the default diff tool with GitExtensions, I cannot get it to show me a visual diff with two panes showing old and new versions of the files. The GUI says that it can follow renames (Experimental feature), but I could not figure out how to make it work.

    6. Bazaar GUI is undeniably the best. With GitExtensions, the working tree is shown as a long list of files with their full path names. The “tree” part is gone.

    7. Repo size came out the same for all cases. But my test here is incomplete as I only tested the initial add+commit. The true test of repo size would come when file modifications become a sizeable fraction of the total repo.


  23. I tried putting diff.renames=true in config file inside .git folder. But this makes git misbehave.

    Alok, please note that .git/config file has ini-like syntax, so it would be key ‘renames’ in the ‘[diff]’ section… or use “git config diff.renames true” to set this config variable.
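
    As a small illustration of how the dotted name maps onto the file layout, the Python sketch below reads an ini-style snippet and looks up “diff.renames” as key “renames” in section “diff”. The file contents are a made-up example of what .git/config might contain, and Python’s configparser is only standing in for Git’s own parser here; it handles simple cases like this one, not every Git config feature.

```python
import configparser

# Hypothetical .git/config contents after running: git config diff.renames true
config_text = """\
[core]
repositoryformatversion = 0
[diff]
renames = true
"""

parser = configparser.ConfigParser()
parser.read_string(config_text)

# A two-part name such as "diff.renames" is section "diff", key "renames".
section, key = "diff.renames".split(".", 1)
value = parser[section][key]
print(f"[{section}] {key} = {value}")
```

    Writing “diff.renames=true” as a single line outside any section, by contrast, does not fit this layout, which is the likely reason git misbehaved.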

  24. Some discussion about the relative merits of Bzr on the emacs-devel mailing list:

    http://lists.gnu.org/archive/html/emacs-devel/2010-04/msg00195.html

  25. I am interested in trying Git Cola on Windows. Is there an easy way to install it? Or alternatively any step-by-step guide would help too.

    Here is what I find scary:

    The Git Cola download page provides binaries for Windows but needs Python 2.6 and pyqt4, in addition to Git, to be pre-installed. To get pyqt4, you must previously have installed SIP (whatever that is) from source. (They supply binaries for developers that include the whole QT4 development environment, which additionally I am sure would be an older version.) I am sure this will be a minimum of a two day project, plus any Google searches and forums for the problems encountered. I already heard somewhere that Python and PyQt4 both must be at the default installation locations “C:\Something” for things to work.

    Here is by the way the installation guide for Git Cola: http://cola.tuxfamily.org/install.html
    It would have been better to write the above in Python; I could then at least run it!

    Some referenced links:
    Git Cola Download: http://cola.tuxfamily.org/downloads.html
    Pyqt4: http://www.riverbankcomputing.co.uk/software/pyqt/download

  26. There is obviously a lot to know about this. I think you made some good points in Features also.
    Keep working ,great job!

  27. http://lucumr.pocoo.org/2010/4/3/april-1st-post-mortem/
    “As much as I love bitbucket and mercurial, but there is an immense difference between having your project on github or bitbucket, and I’m afraid that no matter what bitbucket does or what the mercurial people do, they will never even come close to github in terms of user base people following your code and contributing.” — Armin

    • In my experience, you go where the code is. If you hear about some interesting project, it doesn’t matter whether it’s on github, bitbucket, launchpad or sourceforge. You just git/hg/bzr/svn clone/clone/branch/checkout the source, play with it locally, discuss bugs on the project’s bug tracker, push changes to the appropriate website, and discuss them there. Once you have discovered a project it shouldn’t matter where it is hosted (as far as the community size is concerned).

      Therefore, if I am publishing a project, the only difference the community size of the website makes is how discoverable the project will be on that site, and it seems like on a large site like github it will be hard for projects to be seen anyway. Aside from that, any other marketing effort will lead to the same results no matter where it is hosted. But I have not used GitHub.

  28. […] Why Git ain’t better than X, a reply to the above article by Bazaar user Matt Giuca. […]

  29. “Distributed version control systems (DVCSes) are awesome. We all know that, and if you disagree, you’re living in the 90s”
    Wrong and too harsh. The fact that some people might not like DVCSes does NOT imply they still live in the 90s.
    I don’t like these DVCSes, Git first of all. These “DVCSes” are far too abstract; moreover, they are a chore to use when a dev team is strictly unable to separate the different things of a web project (like html templates and php).

    DVCSes are not useful for everything, man!

    • Well I said that because a proper DVCS is a superset of a centralised VCS. They have all the features of a normal one, plus the advantages of working offline, proper branching, etc. That isn’t to say that all DVCSes are better than all centralised VCSes. You say you don’t like Git in particular — that was really the entire point of my post. Git is turning people off DVCSes because it is often marketed as an “upgrade” from Subversion. I don’t like Git. I find it far more confusing in many ways than Subversion: no revision numbers and no way to automatically commit to a central repository for starters. The point of my post is “don’t let Git turn you off DVCSes — try another one”. Bazaar, my preferred DVCS, lets you work exactly like Subversion if you want to.

      The one argument Subversion has over all known DVCSes is that it lets you check out part of a repo, whereas DVCSes force you to check out the whole thing. Maybe that’s what you’re alluding to when you say “when a dev team is strictly unable to separate the different things of a web project.” This is tricky to do in a DVCS since each client needs a full copy of the repo (by definition), and it would be messy to let clients commit to part of a repo. But do you have any good reason to check out only the templates and not the PHP, or vice versa? In the vast majority of cases, you want to check out everything. The only reason you may want to check out part of a repo is if you have multiple projects stored in the same repository, as is common in Subversion. That’s easy to solve: don’t do that. Create a separate branch for each project.

    • There are reasons for one to prefer a centralized version control system over a distributed one, but being “unable to separate the different things of a web project (like html templates and php)” should not be one of them. Just put everything together in one repository, as it is one project (and changing a template might require changing PHP code), and if the DVCS supports it (a sufficiently modern version of Git does), check out only the required part (a so-called partial checkout).

  30. You don’t sya nything about your 0.02 vs 0.2 seconds measure. Not a word about how many files, their nesting level and the total capacity.

    • Not to mention that it’s a full order of magnitude difference.
      Maybe it doesn’t seem like a lot with 0.02s vs 0.2s (thanks to our brains’ perception of such small time measurements), but try making that argument with 20s vs 200s.

      Also to the author about “forgetting” to add changes to the commit (something I’d fall victim to often): there’s the ‘-a’ flag.

      • My point was that you don’t see 20s vs 200s scenarios (or at least, not with enough frequency to base the decision of which VCS to use on). In day-to-day life, your operations are so fast that it doesn’t matter.

        I am quite aware of the -a flag. But my point was, why not make it the default? You admit you fall victim to it often. Wouldn’t you rather not fall victim to it by having a VCS that automatically commits all of your changes? A friend of mine came all the way into work one day, sat down at his computer, typed in “git pull” and then went “fucking git. Going home, see you guys.” He had committed his changes at home, but forgotten -a, so he had to go all the way home to get the changes off his computer. This is not a good default.

  31. I don’t want it to commit all of my changes. I want it to commit only those changes that I want in that specific commit. I’m very likely to have local changes that are not meant to be committed.

    It’s very easy to get a list of all changes that you might want to commit. And if you happen to be in a situation where you want to blindly commit everything, then you can easily do so with -a. But personally, I prefer commits to be small and meaningful, and not a big pile of unrelated crap.

    • You can have small and meaningful commits with any VCS. The problem with the git index is that it introduces another layer of complexity, which takes useful brain cells away from solving the problem at hand.

      Example 1:

      You modified 5 files (2 git add’ed), created 3 files (1 git add’ed). If you issue git commit --all now, what does it do?

      Example 2:

      You carefully crafted the index, and are now reviewing it with git diff --cached. You notice a brainfuck that should be fixed in this commit. You go and make the change. Now what?

      There are reasons why the output of git status contains a bunch of git-related tips, while the output of hg status doesn’t. The rest of the world does just fine without the --all/--cached flags.

  32. Example 1: What you request it to do. The same as with ‘vi file; rm file’.

    Example 2: ‘git add -p’, which is also the showcase for the index. Meaningful commits are easy as long as items that need to be committed separately are in separate source files. As soon as they aren’t, you’re lost in any other VCS and need to do manual shuffling.

    The reason that git status shows tips and hg status doesn’t is that hg is a hammer, while git is a toolset. 🙂
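
    For what it’s worth, the “Now what?” step in Example 2 can be sketched roughly as below. This is only an illustration under assumptions: the throwaway repository, the file name notes.txt and the demo identity are all made up, and a plain ‘git add’ stands in for the interactive ‘git add -p’ so the script runs unattended.

    ```shell
    # Throwaway repository; notes.txt and the demo identity are made-up names.
    repo=$(mktemp -d)
    cd "$repo"
    git init -q
    git config user.name "Demo User"
    git config user.email "demo@example.com"

    printf 'one\n' > notes.txt
    git add notes.txt
    git commit -q -m "initial commit"

    # Craft the index: stage a one-line addition.
    printf 'one\ntwo\n' > notes.txt
    git add notes.txt

    # While reviewing, make a late fix in the working tree.
    printf 'one\ntwo\nthree\n' > notes.txt

    # The index still holds the version staged earlier (one added line) ...
    staged_before=$(git diff --cached --numstat)
    # ... and the late fix shows up as an unstaged change.
    unstaged=$(git diff --numstat)

    # Stage the fix as well ('git add -p notes.txt' would let you pick hunks).
    git add notes.txt
    staged_after=$(git diff --cached --numstat)

    echo "staged before: $staged_before"
    echo "unstaged:      $unstaged"
    echo "staged after:  $staged_after"
    ```

    The index keeps whatever was staged first; the late fix stays an unstaged change until it is added again.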

  33. I am very glad to read this post. Not only because it replies to an immature text, but because I discovered a lot more about Bzr, too. I am a “mercurial guy” now but my first version control system was bzr – even before subversion! – and I really like its ease of use.

    More importantly, I am really, really happy to discover the shelve functionality. When I learned about git’s staging area I looked for an equivalent in Mercurial – the record extension – and found it great. However, before long I discovered the big risk of committing untested changes. Now I have discovered the shelve feature and will use it a lot, as well as git stash.

    OTOH, I would say you start to get annoying when criticizing SVN and (especially) Perforce, but I understand your situation. Anyway, you avoided turning this into a “Why Bzr is better than X” kind of post, which is enough.

    BTW, it is great to have this Scott Chacon comment in hand when Gitsters start to annoy us.

    Congratulations, and thanks!

    • Hey Brandizzi. Thanks, I’m glad you got something out of it. You’re right, I shouldn’t have been so critical about other platforms. I do feel like SVN is a generation behind the DVCSes (not its fault — it was built a generation earlier). I shouldn’t have criticised Perforce since I haven’t used it. But I gathered that it has no DVCS (local branch) capabilities.

  34. Nit: It is sort of silly to think of Named Branches in Hg as anything more than a *name permanently associated with a branch when the branch is committed*. A branch is a branch (identified with a Node ID) and nothing more; it just happens to be associated with a “name”. While some of the commands operate depending on the “current named branch” one is in, it is important to take a step back and realize the name is only for us — humans — as Hg really doesn’t care otherwise. (I have unfortunately just started with Git and do not have Bzr experience). In summary: a “named branch” in Hg is cheap because it just changes the embedded tag that is committed as part of a branch.

    • Note that so-called “named branches”, which are in fact “commit labels”, can form a disjoint and unconnected coloring of the DAG of revisions, because a DVCS is distributed and has no global naming authority.

      Just saying…

      • Oh, I’m not defending their current implementation (I find them useful, but as pointed out, they are not as awesome as they could be) …

        … I’m just saying that using “named branches” is *different* from creating a “cheap copy”, which would be a same-filesystem Hg clone (that *initially* uses hard-links), so the article is misleading on this point. (A branch is a branch in Hg, no need to complicate it: I have no idea how bzr handles branching.)

        I really like the term “commit labels” as the term “named branches” is loaded and can make it hard to explain Hg due to underlying implicit assumptions. (e.g. “Oh, like SVN branches?” No, not at all 🙂)

        • I don’t know what you mean when you say “a branch is a branch” — that doesn’t add much. A Subversion user could say “a branch is a branch” and they would be referring to something completely different to what you are.

          What I dislike about branches in Git (and I’m not sure how Mercurial handles them) is that they are inside the repository, which means that they can’t be treated independently (such as having one branch in one directory, or having another in another). Git branches don’t have unique URLs — the repository has a URL and then there is some extra syntax to make sure you are referring to the correct branch name.

          In Bazaar, branches are simple: they are the top-level unit of revision control. There is nothing (conceptually) “outside of” a branch to worry about. Each branch has a separate URL, and resides in a separate directory. When you pull from Bazaar, you are pulling one branch. Thus it is absolutely clear when you say “bzr push ” that you are pushing the current branch you are in to the branch named by . I am still scared in Git when I type “git push” that it is going to push all of my private branches to the server that I am not ready to make public.

          On top of this, Bazaar has what Jakub calls “commit labels” (and I too like that terminology) — each commit permanently records the name that the branch had when it was committed, but it is nothing more than a simple text string, and doesn’t really have anything to do with the branch structure.

          • First, with the help of the git-new-workdir script from contrib, you can have different branches checked out in different directories in Git… though you had better not have multiple non-readonly checkouts of the same branch.

            Second, I’d rather have a way to clone and fetch the whole repository, with all branches or a selected subset of branches. This is more important. Unique URLs are not that important (Cogito, a former interface / frontend / alternate porcelain, supported such URLs, with ‘#’ to separate the branch from the repository, but that didn’t carry over when it got absorbed into Git itself).

            If you want to give someone a single branch or a single signed tag, send a pull request via git-request-pull. The recipient can just copy and paste a single line into their “git pull” command line.

            Third, the default in Git is to push “matching” branches. This means that if you do “git push”, you would push only those branches that are already made public because you pushed them explicitly (“git push /repo/ /branch/”) earlier. You can always configure Git behavior so that e.g. it pushes only current branch.

            Fourth, branches as the top-level unit of revision control don’t make sense. It is the repository that contains a collection of branches. Different clones / forks of the same repository can have different sets of branches, and the same branch can have different names in different repositories: for example ‘master’ in Joe Contributor’s repository will be the remote-tracking branch ‘joe/master’ in the maintainer’s repository.

            Fifth, “commit labels” as a way to mark branches don’t actually make sense. Registering branch names, hardcoding them for the future in the commit object, is idiotic – in most cases branch names are transient. If you really want to record the branch name or equivalent, do it during the merge like Git can do (merging of signed tags, just in).
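
            To illustrate the third point, here is a rough sketch of configuring Git to push only the current branch. It assumes throwaway repositories and made-up branch names; the setting involved is push.default. The “matching” default described above was Git’s behaviour at the time, while newer Git releases default to “simple” instead.

            ```shell
            # Throwaway repositories; every name below is made up for the demo.
            work=$(mktemp -d)
            git init -q --bare "$work/remote.git"
            git init -q "$work/local"
            cd "$work/local"
            git config user.name "Demo User"
            git config user.email "demo@example.com"
            git remote add origin "$work/remote.git"

            echo "hello" > file.txt
            git add file.txt
            git commit -q -m "public work"
            git branch -m public              # name the current branch explicitly
            git branch private-experiment     # a second, private branch

            # Push only the branch that is currently checked out.
            git config push.default current
            git push -q origin

            # List the branch refs that now exist on the remote.
            remote_heads=$(git ls-remote --heads origin | cut -f2)
            echo "$remote_heads"
            ```

            Only the ‘public’ branch should reach the remote; ‘private-experiment’ stays local until it is pushed by name.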

  35. Every time I try to use Git, I come away hating it even more. I think it instills a lax attitude towards repository management, which may be related to what you say about the staging area.

    One of the pluses of Mercurial, in my opinion, is that you very quickly learn that what you do has only a limited window in which you can undo it. You learn to check carefully before you do what comes next. Git, on the other hand, goes ahead and does everything, perhaps blazingly fast, then gives you a hint of how to maybe undo what you just did blazingly fast if it was an accident. If anything, we need to slow down. I mean, really, just drink less Mountain Dew. Take your time and make sure it is right.

    The real killer is mentioned on the Bazaar homepage, which is comparing the help messages. Git’s look like lengthy old typewritten mathematical monographs, and I think they are saying the same thing, though heck if I know. Bazaar’s help messages are really, really badly written, but at least are much simpler than Git’s. Subversion, Mercurial, and Darcs are better; they understand what “help” means: conciseness.

  36. I find Git overkill for the “1 team” projects I am typically involved in. With Git I spend a lot of time sorting out problems during push and pull (keeping repositories in sync.) Everyone quickly gets into the habit of regular pulls to avoid the problems – how “distributed” is a system that requires regular communication with an upstream repository anyway?

    Git is great for “distributed development” like the teams that develop Apache, etc. If you’re a “1 team” kind of place or you can organize yourselves in some general way, avoid the headaches that come from keeping multiple repositories in lock-step and use one of the simpler systems like Subversion.

  37. Good article. Glad a few people in some places are standing up for Bazaar!

    It’s not a rare thing that I have to explain to some Git fanboy what Bzr actually is, and that it’s actually been around for roughly the same time as Git (its predecessor a little longer, actually). I mean, Bzr is only the official VCS of Ubuntu and a good deal of Linux software. In fact, it’s also the official VCS of the GNU project these days, lest people forget!

    You might like this article too:

    http://steveko.wordpress.com/2012/02/24/10-things-i-hate-about-git/

    I don’t like to bash other people’s preferences for the sake of it, but this post provides a number of excellent reasons why I eschew Git in favour of Bzr, and have for some time, despite understanding pretty well what Git is, how it works, and why it’s so popular in the general community.

  38. I agree with your complaint about the staging area, it’s the single most braindead idea in the git-world. If you want to selectively commit, just specify what, works with every VCS as you mentioned.

    The -a flag to the rescue… lol… not. It is a lousy hack, and most people use it all the time (so shouldn’t it at least be the default, then?). Also, to make -a useful, meaning in order to not add absolutely everything, including brand new files, a .gitignore is then usually created and checked in, everywhere. That is, sometimes even in projects that don’t use git, because someone else wants you to as they use git as a client. Isn’t it kinda pointless to have files like .gitignore checked in in the first place, especially when others just ‘export’ your repo and don’t even intend to work with it?

    Also, the staging area is basically what p4 does with ‘edit’, a huge pain in the neck – welcome back, git, to the style of a program of the 90s. I’d rather have my CPU figure out what files changed than do that manually all the time.

    • Selective commit (i.e. “git commit file1 file2”) is only a limited subset of what the staging area (the index) allows. See e.g. http://tomayko.com/writings/the-thing-about-git for how “git add --interactive” (and “git stash save --keep-index” to test) lets you deal with the tangled working copy problem.

      You are wrong, “git commit -a” doesn’t add new files – it means that git figures out which tracked files changed, and stages them. With “git commit -a” you can forget about the staging area… till you need it.
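
      A quick way to check this for yourself is sketched below. It assumes a throwaway repository, and the file names and demo identity are made up.

      ```shell
      # Throwaway repository; the file names are made up for this demo.
      repo=$(mktemp -d)
      cd "$repo"
      git init -q
      git config user.name "Demo User"
      git config user.email "demo@example.com"

      echo "v1" > tracked.txt
      git add tracked.txt
      git commit -q -m "initial commit"

      echo "v2" > tracked.txt      # modify a file git already tracks
      echo "new" > untracked.txt   # create a file git has never seen

      git commit -q -a -m "commit with -a"

      # Files recorded in the new commit, and what is still uncommitted.
      committed=$(git ls-tree --name-only HEAD)
      leftover=$(git status --porcelain)
      echo "committed: $committed"
      echo "leftover:  $leftover"
      ```

      The commit picks up the change to tracked.txt, while untracked.txt is still reported as untracked afterwards.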

    • I hate to feed trolls, but I want to clear up active disinformation. You either don’t know what you’re talking about or you don’t care.

      First of all, the commit command’s ‘-a’ flag will not add ‘absolutely everything’. That’s made quite clear in the reference manual. It will only stage files that git already knows about that have been changed or deleted. So it doesn’t touch anything it doesn’t know about. You still have to add unknown files explicitly, just like in any other VCS.

      Second, if no one in the project is using git you can, obviously, decide to delete the .gitignore file. If someone then wants to use git privately on the project, they can have their own private ignore file on their own copy of the repository, which is stored away from and never committed into the working tree.[1] So again, if you never intend to work with git, you never have to check in a .gitignore file.

      Third, the staging area doesn’t make you ‘manually’ ‘figure out what files changed’; it gives you the power to decide which changes will go into the next commit, down to line-level granularity. Which files changed has _already_ been figured out automatically by git; you use the staging area to compose the next commit. Which, of course, brings us back to the ‘-a’ switch for when you don’t need to carefully compose the commit and just want to commit all the changes.

      [1] https://help.github.com/articles/ignoring-files#explicit-repository-excludes
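
      As a rough sketch of the private ignore file mentioned in the second point (a throwaway repository is assumed, and the ‘*.log’ pattern and file name are made up), the rules go into .git/info/exclude instead of a checked-in .gitignore:

      ```shell
      # Throwaway repository; the pattern and file name are made up.
      repo=$(mktemp -d)
      cd "$repo"
      git init -q

      # Private ignore rules live inside .git and are never committed.
      mkdir -p .git/info
      echo "*.log" >> .git/info/exclude

      echo "scratch output" > build.log

      # build.log is ignored even though no .gitignore file exists.
      status=$(git status --porcelain)
      if [ -e .gitignore ]; then has_gitignore=yes; else has_gitignore=no; fi
      echo "status: '$status' (.gitignore present: $has_gitignore)"
      ```

      The status output should be empty, and nothing ignore-related ever appears in the working tree.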
