Wednesday, March 4, 2009

Subversion Best Practices: Branching and Merging (because we've been doing it wrong all along)

Despite having read the chapter on branching and merging in the SVN book and spent a lot of time thinking about how it should be done, the Subversion repository structure at both my current job and my previous one has left something to be desired. Branching and merging never worked as smoothly as they were supposed to. Eventually we looked at how the Apache project did it, and the correct method became clear to us. We are not suggesting exactly what they're doing (they have active development happening in several branches), but it took looking at their project to make the light go on.


The Old Way

  1. Trunk was supposed to mirror what was in production, although we were never actually able to achieve this, which sparked the many discussions about how else we could do it.
  2. All development happened in a numbered branch, and a new branch was created for development every time we wanted to push to testing in preparation for staging. This made for messy SVN logs and difficult merges caused by confusing ancestry.
  3. Bug fixes would be made to the old development branch, which then required us to make the same fix in the new branch, merge the old branch up and then back down into the new branch, or merge directly across between the two branches. Every option added to the mess in some way, causing conflicts or ambiguous ancestry.
  4. Tags were made when something was pushed to staging/production but it was unclear how/when these tags should be created.
  5. It was confusing to explain the branching procedures to new people and difficult to enforce/maintain.

The New Way

  1. Most development occurs in the trunk. Trunk is effectively the development branch itself.
  2. Branching occurs when:
    1. We want to create a release of the current feature set. This can be done as a tag, or as a branch and a tag. The latter method creates a branch for doing bug fixes, and a new tag needs to be created from that branch every time a set of bug fixes is pushed as a release.
    2. Development of a new feature would seriously disrupt normal development.
  3. Release branches will be pushed to testing, then to staging and finally to production. Bug fixes to these branches will be done within the branch and then merged up into the trunk. This way bug fixes are always included in the trunk (dev) and there is no messy merging across or down from trunk into a current dev branch.
  4. Every set of fixes/changes made to a release branch is tagged so we know which version contains which set of fixes.
  5. Easy to explain to new people.
  6. Easy to maintain.
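
For anyone who wants to see the new way as actual commands, here is a rough sketch in Python that only builds and prints the `svn` invocations for cutting a release branch, tagging it, and merging a bug fix from the release branch up into trunk. The repository URL, the `release-<version>` and tag naming, the revision number, and the function names are all assumptions made up for illustration, so adapt them to your own layout. Nothing is run against a repository.

```python
import shlex

# Hypothetical repository root; replace with your own URL.
REPO = "https://svn.example.com/repo"


def release_branch_cmd(version):
    """Build the command that copies trunk to a release branch.

    The branches/release-<version> naming is an assumption, not a rule.
    """
    return [
        "svn", "copy",
        f"{REPO}/trunk",
        f"{REPO}/branches/release-{version}",
        "-m", f"Create release branch {version}",
    ]


def tag_cmd(version, tag):
    """Build the command that tags the current state of a release branch."""
    return [
        "svn", "copy",
        f"{REPO}/branches/release-{version}",
        f"{REPO}/tags/{tag}",
        "-m", f"Tag {tag} from release branch {version}",
    ]


def merge_fix_to_trunk_cmds(version, revision, trunk_wc="."):
    """Build the commands that merge one bug-fix revision up into trunk.

    trunk_wc is the path of a trunk working copy (assumed to be clean).
    """
    return [
        ["svn", "update", trunk_wc],
        ["svn", "merge", "-c", str(revision),
         f"{REPO}/branches/release-{version}", trunk_wc],
        ["svn", "commit", trunk_wc,
         "-m", f"Merge r{revision} from release-{version} into trunk"],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them.
    steps = [release_branch_cmd("1.0"), tag_cmd("1.0", "1.0.0")]
    steps += merge_fix_to_trunk_cmds("1.0", 1234)
    for cmd in steps:
        print(shlex.join(cmd))
```

Each later round of fixes on the same release branch would get its own tag, for example `tag_cmd("1.0", "1.0.1")`, which is how we know which version contains which set of fixes.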

10 comments:

  1. This is great! We have just come to similar conclusions with regards to keeping the flow on the trunk and branching code meant for testing.

    Since you wrote this nearly 2 years ago, how did it pan out?

  2. It has worked out quite well. We do most development in the trunk but create branches for features requiring major changes. We create tags for each release and if bug fixes are needed in an older release we branch off of the release tag and make the fix there and then if we also need that fix in trunk we merge those changes into trunk to keep it up to date. Occasionally we have to merge changes between branches to prevent the branches diverging too much when development on a feature takes a long time.

  3. Ya, I find the trick is to really think about it backwards. So rather than thinking about how you're gonna get your code from trunk to your different environments, like dev, qa, prod, first think about where you need that code. Is this a bug fix that we need in production? Then put it in there and merge it backwards. Is this a new feature that's not in production yet? Then put it in the trunk or maybe a certain tag version and merge it from there.

    Typically your tagged versions are "clean" versions, meaning they only contain the latest clean code, so you can usually safely add code there and merge it outwards without conflicts.

    A good description of SVN best practices.

    http://www.duchnik.com/tutorials/vc/svn-best-practices

  4. Thanks for the link. It's a nice resource that covers a lot of different cases in SVN branch management.

  5. IMO/IME there are several things you are missing here.

    Developers on any size team should be branching EVERY feature and bug they're currently working on, and giving each branch a name that not only they understand but that is the name of the feature or the bug number or whatever, something others can figure out if they're browsing the repo.

    And so another thing I disagree on. IMO/IME the decision to branch should not be an "if the changes or feature or bug is big enough to disrupt the existing codebase/trunk" mentality. When you branch --everything-- you work on, you don't have to worry then about for example if a lead comes to you midstream and asks you to switch tasks and work on a different feature or bug. You've already got the others you're concurrently working on isolated into individual branches locally so all you have to do is naturally create a new branch for whatever new bug or feature they're asking you to switch to. And you don't have to worry about ok now that small feature or bug I thought was only gonna take a few lines of code ended up disrupting more than I thought or ended up being more code than I thought..because it's already in a branch, you have nothing to worry about.

    You can't trust your assumptions no matter how much you are sure that bug fix or whatever is definitely gonna be only x lines of code and you're 100% confident it's not gonna break anything when you make the changes. Just branch what you do, big or small and you don't have to worry later if your assumption was wrong, ok now I have to get my trunk copy into a branch....that's a bunch of extra unneeded work that you could have prevented if you just sucked it up and branched it in the first place as a habit / best practice.

    As long as you follow best practices which means you update trunk changes up to your branch DAILY or even more per day, you never have to worry about any merge hell in the end after code complete and you're ready to merge back into the trunk finally with the completed product.

    Might sound like a lot of work to keep your branches (you may have 2-4 features or bugs you are working on concurrently) updated from trunk daily but it becomes a habit and it's really no big deal after you force yourself to do it.

    It's as simple as that. If you don't do these things, you'll have a wrong view about branching and are doing it wrong.

    It's called distributive development and I've done this even on teams of just 2 developers (forced my boss to :) at one company) and we wouldn't have done anything different because it rocks if you do it right. Saves developers and PMs so much time and prevents so much headache for many things.

    So if you branch every feature and bug you work on and are a --responsible-- developer because you are a professional, you use best practices (update your branches daily) and then end up with a lot of productivity gains both as a developer and a team.

    The expectation has to be set forth from top down by the architect or lead as a team standard to adopt and be expected...and then you as a developer need to adopt it, believe in it, and ensure you follow the standard/best practice which is once and again, I will repeat UPDATE TRUNK TO BRANCH DAILY.

    Simple as that. There's a lot of misconception on what branching is, why you need it, why you IMO should branch everything you work on (segregate assignments), and ultimately what you will gain and the team gains in productivity. You won't see it --until-- you start doing both of these as a team...then you will see the light.

    When the team branches everything and keeps their branches up to date, the Project Managers or leads become much more relaxed and lose that fear of "Monster Merge Syndrome".

    Again, you have to do this before you will see the light.

    Replies
    1. Actually, I couldn't agree more. What we thought was working fine is now causing problems. Initially the application was built for internal use and we only had to worry about deploying it internally. Now we have external clients and we're finding we're unable to deploy a feature because it was done in the same branch or even in trunk along with some other features that are not yet done. This is partly due to developers getting out of the habit of creating feature branches when they should have. So it's partially the branching model we were using and partly people not following our decided upon practices.

      As a result I'd been thinking about what we were doing wrong and how we might address it and I was actually going in much the same direction as you have described here. You did fill in a few gaps that I hadn't considered yet but I fully agree with your described approach.

      The thing about the approach you describe is that it can be applied regardless of the versioning software used.

      And I agree that the key to success with the model is to be pedantic about creating a branch for everything.

      Thanks for the comments.

    2. I hope I don't sound too stupid, but what do you do with your branches after you've merged them back into the trunk? I'm a bit new to Subversion and have not worked in large development environments, but I would guess that those "feature" branches are gonna start to pile up if you don't do something about them.

    3. You're right, they will pile up. If you follow a nice naming convention it's not too bad and you can sort through them easily. Once you get to the point where there are just too many, you can always "prune" them by deleting them. The branches will still exist in the history but won't show up anymore after the revision in which you removed them.

    4. >>"As long as you follow best practices which means you update trunk changes up to your branch DAILY or even more per day, you never have to worry about any merge hell in the end after code complete and you're ready to merge back into the trunk finally with the completed product"
      Oh boy! You make a lot of assumptions about the nature of human beings :) You make the developers face merge hell every day. IMO developers should focus on code and not on merging nuances with versioning software, which wastes valuable coding time. If trunk is the dumping ground for code that works on a developer's machine and there is a branch for every release, it is easier on the developer. Bad code dumped into trunk should be written up...be harsh on the developer.

      So all these guys work on separate branches for bugs, and when do you ask them to put it back? The day they put it back, it WILL be merge hell :). So conclusion...follow the "New way" the owner-blogger suggested. Less merge hell, and you won't make developers merge experts, but instead hone their developer skills...which is what is needed.

  6. It's a good idea to delete feature branches right after they have been reintegrated, because they become "unusable for further work", as explained in the SVN Book.

    The approach delineated in the last few comments is the one explained in the SVN Book too. Very good.
