Quantcast
Channel: Planet Object Pascal
Viewing all articles
Browse latest Browse all 1725

Delphi Code Monkey: What am I missing out on if I'm not using Distributed Version Control?

$
0
0

As requested, I'm posting my thoughts about Centralized version control versus Distributed version control.

 Centralized version control was, in its day, a wonderful thing.   It is dated now, and for many teams, the benefits of moving to a distributed version control system could be huge.

But I Want One Repo To Rule Them All!

Fine. You can still have that.  You can even prevent changes you don't want to get into that One Repo from getting in there, with greater efficiency.  Distributed Version control systems actually provide greater centralized control  than central version control systems.    That should sound to you like heresy at first, if you don't know DVCS, and should be obviousto you, once you've truly grasped DVCS fundamentals.

Even if having a real dedicated host server is technically optional, you can still build your workflow around centralized practices, when they suit you.

Disclaimer: There Are No Silver Bullets.  DVCS may not be right for you.

Note that I am not claiming a Silver Bullet role for this type of tool.  Also, for some specific teams, or situations, Centralized version control may still be the best thing.  There is no one-size-fits-all solution out there, and neither Mercurial nor Git are a panacea.  They certainly won't fix your broken processes for you.   Expectations often swirl around version control systems as if they could fix a weak development methodology.  They can't.  Nor can any tool.

However, having a more powerful tool or set of tools, expands your capabilities. Here are some details on how you can expand your capabilities using a better, and more powerful set of tools.

1.  Centralized version control makes the default behavior of your system that it inflicts your changes on others as soon as you commit them yourself.   This is the single greatest flaw in central version control. This in turn can lead to people committing less often, because they either have to (a) not commit, or (b) decided to just go ahead and commit and inflict their work on others, or (c) create a branch (extra work) and then commit, and then merge it later (yet more extra work).   Fear of branching and merging is still widespread, even today, even with Subversion and Perforce, especially when changes get too large and too numerous to ever fall back on manual merging.  I recently had a horrible experience with Subversion breaking on a merge of over 1800 modified files, and I still have no idea why it broke.  I suspect something about '--ignore-ancestry" and the fact that Subversion servers permit multiple client versions to commit inconsistent metadata into your repository, because Subversion servers are not smart middle tier server, they're basically just dumb HTTP-DAV stores. I fell back to manually redoing my work, moving changes using KDiff3, by hand.    With Distributed Version control, you can take control,  direct when changes land in trunk, without blocking people's work, or preventing them from committing, or forcing them to create a branch on the server and switch their working copy onto that branch, which in some tools, like Subversion, is a painful, slow, and needlessly stupid process.

2.  Centralized version control limits the number of potential workflows you could use, in a way that may prevent you from using version control to give your customers, and the business, the best that your team can give it. Distributed Version Control Systems encourage lightweight working copy practices.  It is easy to make a new local clone, and very very fast, and requires no network traffic.  Every single new working copy I have to create from Subversion uses compute and network resources that are shared with the whole team.  This creates a single point of failure, and a productivity bottleneck.


3.  Centralized version control systems generally lack the kind of merging and branching capabilities that distributed version control systems provide.  For example, Mercurial has both "clones as branches", and "branching inside a single repo".    I tend to use clones and merge among them, because that way I have a live working copy for each and don't need to use any version control GUI or shell or command line commands to switch which branch I'm on.     These terms won't make sense to you until you try them, but you'll find that having more options opens up creative use of the tools.  Once you get it, you'll have a hard time going back to the bad old days.

4.  For geographically distributed teams, Distributed Version control systems have even greater advantages.  A centralized version control system is a real pain over a VPN.  Every commit is a network operation, like it or not.  DVCS users can sync using a public or private BitBucket repository at very low cost, and don't even have to host their own central servers.


5. For open source projects, Distributed Version control permits the "pull request" working model.  This model can be a beneficial model for commercial closed source projects too.   Instead of a code review before each commit, or a code review before merging from a heavyweight branch, you could make ten commits, and then decide to sync to your own central repository.   Once the new code is sitting there, it's still not in the trunk until it is reviewed and accepted by the Code-Czar.

6.  For working on your own local machine, if you like to develop in virtual machines, having the ability to "clone" a working copy quickly from your main physical machine down into a VM, or from VM to VM, using a simple HTTP-based clone operation, can really accelerate your use of VMs.  For example, my main home Delphi PC is a Dell Workstation that has Windows 8.1 on the Real Machine, and runs Hyper-V with a whole bunch of VMs inside it. I have most of the versions of Windows that I need in there. If I need to reproduce a TWAIN DLL bug that only occurs in Terminal Server equipped Windows Server 2008 R2 boxes, I can do it. I can have my repos cloned and moved over in a minute or two.  And I'm off.

7.  Rebasing is a miracle.  I'll leave this for later. But imagine this:  I want to commit every five minutes, and leave a really detailed history while I work on some low level work.  I want to be able to see the blow-by-blow history, and commit things in gory detail.  When I commit these changes to the master repo, I want to aggregate these changes before they leave my computer, and give them their final form.  Instead of having 80 commits to merge, I can turn it into one commit before I send that up to the server.

8.  Maintaining an ability to sync fixes between multiple branches that have substantial churn in code over time, is possible, although really difficult.  By churn, I mean that there are both "changes that need merging and changes that don't".  This is perhaps the biggest source of pain for me with version control, with or without distributed version control systems..  Imagine I'm developing Version 7 and Version 8 of AmazingBigDelphiApp.   Version 7 is running in Delphi 2007, and Version 8 is running in Delphi XE5, let's say.  Even with Distributed Version Control (git or mercurial), this is still really really hard.   So hard that many people find it isn't worth doing.  Sure, it's easy to write tiny bug fixes in version 7 and merge them up to version 8, unless the code for version 8 has changed too radically. But what happens when both version 7 and version 8 have heavy code churn? No version control system on earth does this well. But I will claim that Mercurial (and Git) will do it better than anybody else.   I have done fearsome merge ups and merge downs from wildly disparate systems, and I will take Mercurial any day, and Git if you force me, but I will NOT attempt to sync two sets of churning code in anything else.  I can't put this in scientific terms. I could sit down with you and show you what a really wicked merge session looks like, and you would see that although Git and Mercurial will do some of the work for you, you have to make some really hard decisions, and you have to determine what some random change is that landed in your code, how it got there, and whether it was intentional, or a side effect, and if it was intentional, if it's necessary in the new place where it's landing.  If it all compiles, you could go ahead and commit it. If you have good unit test coverage, you might even keep your sanity and your remaining hair, and your customers.

9.  Mercurial and Git have "shelves" or "the stash".  This is worth the price of admission all by itself. Think of it as a way of "cleaning up my working copy without creating a branch or losing anything permanently". It's like that Memory button in your calculator, but it can hold various sets of working changes that are NOT ready to commit yet, without throwing them away either.

10.  Mercurial and Git are perfect for creating that tiny repo that is just for that tiny tool you just built.  Sometimes you want to do version control (at work) for your own temporary or just-started-and-not-sure-if-the-team-needs-this utility projects without inflicting them on anybody else. Maybe you can create your own side folder somewhere on your subversion server where it won't bother anybody, or maybe you can't do that.  Should you be forced to put every thing you want to commit and "bookmark" up on the server as a permanent thing for everybody to wonder "why is this in our subversion server?". I don't think so.

11.  Mercurial and Git are perfect for sending a tiny test project to your buddy at work in a controlled way.  You can basically run a tiny local server, send a url to your co-worker and they can pull your code sample across the network.   They can make a change, and then they can push their change back. This kind of collaboration can be useful for training, for validation of a concept you are considering trying in the code, or for any of dozens of other reasons.    When your buddy makes a
change and sends it back, you don't even have to ask "what did you change" because the tool tells you.


Bonus: Some Reasons to Use Mercurial Rather than Anything Else

TortoiseHG in particular is the most powerful version control GUI I have ever used, and it's so good that I would recommend switching just so you get to use TortoiseHG.  Compared to TortoiseSVN, it's not even close.  For example, TortoiseSVN's commit dialog lacks filter capabilities.    SVN lacks any multi-direction synchronization capabilities, and can not simplify your life in any way when you routinely need to merge changes up from 5.1-stable to to 6.0-trunk, it's the same old "find everything manually and do it all by hand and hope you do it right" thing everytime.

Secondly, the command line. The Mercurial (HG) command line kicks every other version control's command line's butt.  It's easy to use, it's safe, and it's sane.

SVN's command line is pathetic, it lacks even a proper "clean my working copy" command.  I need a little Perl script to do what the SVN 1.8 commandline still can't do. (TortoiseSVN's gui has a reasonable clean feature, but not svn.exe). Git's command-line features are truly impressive. That's great if you're a rocket scientist and a mathematician with a PhD in Graph and Set Theory, and less great if you're a mere human being.   The HG (mercurial) command line is clean, simple, easy to learn, and even pretty easy to master.  It does not leak the implementation details out and all you should need to evaluate this yourself is to read the "man pages" (command line help) for Git and Mercurial. Which one descends into internal implementation jargon at every possible turn? Git.
 I've already said why I prefer HG to GIT,   I could write more about that, but I must say I really respect almost everything about GIT.  Everything except the fact that it does allow you to totally destroy your repository if you make a mistake. That seems so wrong to me, that I absolutely refuse to forgive GIT for making it not only possible but pretty easy to destroy history. That's just broken.

Side Warning


Let me point out that you need a backup system that makes periodic full backups of your version control server, whether it is centralized or distributed.  Let me further point out that a version control system is not a backup system.  You have been warned. If you use Git, you may find that all your distributed working copies have been destroyed by pernicious Git misfeatures.  If you choose to use Git, be aware of its dark corners, and avoid the combinations of commands that cause catastrophic permanent data loss. Know what those commands are, and don't do those things.

 Get Started: Learn Mercurial

If you want to learn Mercurial,  I recommend this tutorial: http://hginit.com/

I really really recommend you learn the command line first, whether you choose to learn the Git or the Mercurial one. Make your first few commits with the command line.  Do a few clone commands, do a few "show my history" commands (hg log), and a few other things. If you don't know these, you will never ever master your chosen version control system.  GUIs are great for merges, but for just getting started, learn the command-line first.  You're a programmer, darn it.  This is a little "language", it will take you a day to learn it.  You will reap benefits from that learning time, forever.






Viewing all articles
Browse latest Browse all 1725

Trending Articles