Version Control Systems and Open Source

Fernando

If you haven't looked at the available Version Control Systems lately, you'd better look again.

About 5 years ago, CVS was pretty much the only open source Version Control System in use, and it's still very popular.

But in the last few years, a surprising number of new and really good open source version control systems have come to life. Like most open source projects, their authors started them to scratch a personal itch. In this case, the itch was caused by some important limitations in CVS:

  • The directory structure is not versioned (it only keeps history of files)
  • No way to rename, move or copy files (without losing history)
  • Operations (like commits) are not atomic
  • No concept of "change sets"
  • Very expensive (inefficient) branching mechanism
  • Limited merging capabilities
  • No support for decentralized repositories (distributed development).

Don't get me wrong though: CVS is a very respectable piece of software. It was first released more than 20 years ago!

Among the new open source alternatives I found these to be quite popular:

  • Subversion -- Explicitly designed as a replacement for CVS. I would say it's achieving its goal. Many big projects, like KDE, GCC and Apache, have moved from CVS to Subversion.

  • SVK -- Built on top of Subversion's libraries. It offers additional functionality, like distributed repositories and better merging. It integrates with existing Subversion repositories, so it's more an extension than an alternative to Subversion.

  • Arch -- Very powerful and decentralized. The current version 1 received many complaints about usability, which spawned new projects like Bazaar. But version 2 promises many improvements, including ideas from Bazaar and other systems.

  • Git -- Developed by Linus Torvalds and other Linux Kernel hackers when they were forced to stop using the commercial BitKeeper because of licensing issues.

  • Darcs -- Written in Haskell.

  • Codeville -- Apparently has an advanced merging algorithm without the problems of 3-way-merge.

  • Monotone

Subversion, like CVS, is designed around the concept of a centralized repository: all developers work against one single repository. All the other systems mentioned above are decentralized, allowing for more distributed development: several repositories may exist (say, one per developer) and they are synchronized in a peer-to-peer way.
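
To make the peer-to-peer idea a bit more concrete, here is a toy sketch in Python. It is not how any of the systems above actually work: the Repo class, its commit and pull methods, and the "name-number" changeset IDs are all made up for illustration. Each repository is just a bag of changesets, and synchronizing means copying over whatever the other side has that you lack.

```python
class Repo:
    """Toy repository: a named collection of changesets (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.changesets = {}  # changeset id -> description
        self._next = 1        # counter for locally created changesets

    def commit(self, description):
        # Invented ID scheme: "<repo name>-<sequence number>".
        changeset_id = f"{self.name}-{self._next}"
        self._next += 1
        self.changesets[changeset_id] = description
        return changeset_id

    def pull(self, other):
        """Copy every changeset that `other` has and this repo lacks."""
        missing = {
            cid: desc
            for cid, desc in other.changesets.items()
            if cid not in self.changesets
        }
        self.changesets.update(missing)
        return sorted(missing)


# Two developers, each with their own repository and no central server.
alice = Repo("alice")
bob = Repo("bob")
alice.commit("fix parser")
bob.commit("add docs")
bob.commit("add tests")

# Each side pulls from the other; afterwards both hold the same changesets.
pulled_by_alice = alice.pull(bob)
pulled_by_bob = bob.pull(alice)
in_sync = set(alice.changesets) == set(bob.changesets)
```

Real systems also have to order and merge the changes, which is where most of the hard work is. The sketch only shows that no repository needs to be "the" repository.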

If you have never used a decentralized system, it may sound like a crazy idea. But for some project teams, like the Linux kernel hackers, a decentralized system is a requirement. They cannot rely on a centralized repository (even the Subversion team understands this).

Some projects use decentralized systems but still define a main repository against which all developers synchronize often. Only a few maintainers (main developers) have write permission to the main repository, but anyone who wants to contribute is free to create a local (personal) branch off the main repository, without even notifying the maintainers. When contributors are ready to share their new code, they send a patch to the maintainers.
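
For readers who have never seen one, a patch is just a textual description of the difference between two versions of a file. As a small illustration, Python's standard difflib module can produce one in the common "unified diff" format. The file names and contents below are invented for the example, and each version control system has its own command for generating patches.

```python
import difflib

# Hypothetical file contents: the version in the main repository
# and the version in a contributor's local branch.
main_version = [
    "def greet():\n",
    "    print('hello')\n",
]
local_version = [
    "def greet(name):\n",
    "    print('hello', name)\n",
]

# unified_diff yields the patch line by line; the file labels are illustrative.
patch = list(
    difflib.unified_diff(
        main_version,
        local_version,
        fromfile="main/greet.py",
        tofile="local/greet.py",
    )
)
patch_text = "".join(patch)
print(patch_text)
```

The maintainers can review this text, and if they like it, apply it to the main repository.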

I think the big Open Source projects are the most demanding users of Version Control Systems. And companies can learn a lot from them.

Maybe most in-house software development teams do not have such strong requirements on source control because they don't face all the obstacles found in distributed Open Source projects: people working on and off, at different times with no clear schedules, from different time zones, with little (if any) face-to-face communication and few chances to hold "all-hands" meetings.

But, with the increasing number of companies outsourcing part of their development teams, many of these same problems arise. So I bet many companies would benefit from a more decentralized approach.

I only have experience with Subversion and SVK. That's what I'm currently using. The good thing about SVK is that it provides a decentralized system that is compatible with Subversion repositories. This allows you to create your own local branches off any Subversion repository you have access to.

So, if you are still using CVS, you should take a serious look at the new alternatives. If you think a decentralized approach is not for your team, then go with Subversion.

I like SVK because I need to interact with Subversion repositories. But if you are defining a new project and want it to be decentralized, you should probably look into some of the naturally decentralized systems.

For more information: