Distributed version control

From Citizendium
Revision as of 17:11, 5 August 2010 by imported>Yuvi Masory (Undo revision 100697677 by Yuvi Masory (Talk))
All unapproved Citizendium articles may contain errors of fact, bias, grammar etc. A version of an article is unapproved unless it is marked as citable with a dedicated green template at the top of the page, as in this version of the 'Biology' article. Citable articles are intended to be of reasonably high quality. The participants in the Citizendium project make no representations about the reliability of Citizendium articles or, generally, their suitability for any purpose.

This article is currently being developed as part of an Eduzendium student project. The course homepage can be found at CZ:Special_Topics_2010.
To provide students with experience in collaboration, you are warmly invited to join in here, or to leave comments on the discussion page. The anticipated date of course completion is 13 August 2010. One month after that date at the latest, this notice shall be removed.
Besides, many other Citizendium articles welcome your collaboration!


This article is a stub and thus not approved.
This editable Main Article is under development and subject to a disclaimer.

Distributed version control systems such as Git and Mercurial have emerged in the last few years as competitors to older centralized version control systems such as Subversion and CVS.

Overview

Terminology

A tool that manages and tracks different versions of software or other content is referred to generically as a version control system (VCS), a source code manager (SCM), a revision control system (RCS), and with several other permutations of the words "revision," "version," "code," "content," "control," "management," and "system." [...] [E]ach system addresses the same issues: develop and maintain a repository of content, provide access to historical editions of each datum, and record all changes in a log.[1]
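The three duties named in the quotation (keep a repository of content, give access to historical editions, and log every change) can be illustrated with a toy sketch. The class and method names below are hypothetical and chosen only for illustration; this is not how Git, Mercurial, or any real system stores data.

```python
from dataclasses import dataclass, field


@dataclass
class TinyRepository:
    """Hypothetical toy repository; real systems store data very differently."""
    versions: list = field(default_factory=list)  # every historical edition of the content
    log: list = field(default_factory=list)       # one message per recorded change

    def commit(self, content: str, message: str) -> int:
        """Store a new edition, record the change, and return its revision number."""
        self.versions.append(content)
        self.log.append(message)
        return len(self.versions) - 1

    def checkout(self, revision: int) -> str:
        """Return the content exactly as it was at the given revision."""
        return self.versions[revision]


repo = TinyRepository()
first = repo.commit("hello\n", "Initial version")
second = repo.commit("hello, world\n", "Expand greeting")
```

After these two commits, repo.checkout(first) still returns the original text, and repo.log lists both changes in order.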


History

Linux kernel crisis

Popular DVCSes

Git

Git was started by Linus Torvalds for use in the development of the Linux kernel. Originally, Torvalds had been using the commercial BitKeeper software, but this caused controversy among advocates of free software. Git was developed as an alternative. It is written in C with some modules written in Perl.

Git has become popular with many developers, particularly in the Ruby community, in part because of GitHub, a commercial Git hosting site that provides free hosting for open source projects, and Gitorious, an open source Git hosting system. Git integrations exist for TextMate, Vim, Redmine and many other systems.

In addition to the main implementation in C, there are implementations of Git in other languages: JGit (Java) and Dulwich (Python, named after the area of London in which "Mr and Mrs Git" live in a Monty Python sketch). Dulwich can be used for interoperability between Git and Mercurial. JGit is used for Java IDE integration and to push and pull Git repositories to and from Amazon's S3 cloud storage platform.

Mercurial

Mercurial is an open source DVCS written in Python.

Others

Other DVCSes include Bazaar (bzr; used heavily by Ubuntu's developers and by their development platform, Launchpad.net), Darcs (written in Haskell) and Monotone (written in C++).

Comparison with centralized version control

Merging

Workflow

Adoption

Open source software

Open source DVCSes have swept the open source community. Within just a few years of their release, Git and Mercurial in particular count some of the largest open source projects as partial or complete adopters. These include Android, Debian, Eclipse, GNOME, GTK+, Mozilla, NetBeans, the Linux kernel, OpenJDK, Perl, Qt, and Ruby on Rails. Likewise, all of the major open source code hosting services now support multiple DVCSes.

DVCSes were designed with open source development in mind, so many of their features mesh nicely with the open source workflow. For example, the emphasis on change sets (instead of versions) allows patterns of non-linear development that are less widespread in proprietary software.
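One way to picture non-linear development is to treat each change set as a node that points back to its parent change sets, so that two lines of work can diverge from a common root and later be merged. The sketch below is only an illustration under that assumption; the ChangeSet class, the ancestors function, and the change set names are all hypothetical and do not reflect any particular system's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeSet:
    """Hypothetical change set: a name plus links to its parent change sets."""
    name: str
    parents: tuple = ()


def ancestors(change):
    """Collect the names of every change set reachable through parent links."""
    seen = set()
    stack = list(change.parents)
    while stack:
        current = stack.pop()
        if current.name not in seen:
            seen.add(current.name)
            stack.extend(current.parents)
    return seen


# Two developers start from the same root and work independently.
root = ChangeSet("root")
feature = ChangeSet("feature", (root,))
bugfix = ChangeSet("bugfix", (root,))
# A merge change set has two parents, joining the diverged lines of work.
merge = ChangeSet("merge", (feature, bugfix))
```

Here the history of "merge" includes both "feature" and "bugfix", while the history of "feature" alone never sees "bugfix": the two lines of work are independent until someone merges them.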

GitHub and BitBucket

GitHub.com was launched in 2008 as a hosting service for Git repositories. Improving upon many of the perceived flaws of older hosting services, GitHub has captured the vast majority of the Git hosting market. GitHub's popularity has been partly responsible for the rising popularity of Git itself. BitBucket.org was also launched in 2008, offering a GitHub-like interface for Mercurial repositories. GitHub, BitBucket, and similar sites have been widely adopted, with many of the largest open source projects migrating to a DVCS and to one of the new DVCS hosts at the same time. Many of these hosts offer free public repositories for open source projects, charging only for private repositories.

Barriers to adoption

Support

Many companies are wary of using software that doesn't offer vendor support. While users of open source version control systems will turn to mailing lists, Internet Relay Chat, forums, and the like for help, this type of informal support may be perceived as risky by large companies. This perception has fueled some of the demand for proprietary VCSes like Perforce and Microsoft Visual SourceSafe.

Since proprietary version control systems are mostly centralized (BitKeeper is a notable exception), and DVCSes are nearly all open source, the lack of vendor support may slow industry adoption. At present, companies seeking a commercially supported DVCS can purchase BitKeeper's Pro or Enterprise licenses.[2] Kiln, from Fog Creek Software, offers a Mercurial-based DVCS that simplifies deployment and code review. It too includes technical support.[3]

Auditing

Reliable auditing of the major DVCSes is generally impossible. DVCSes allow users to permanently delete data and alter saved history. In some cases it may be impossible to recover data or to determine which user introduced a given change. These abilities have little to do with the distributed model per se, but they do separate the popular centralized systems from the popular decentralized ones. Some source control users may require reliable auditing in order to protect intellectual property or comply with record-keeping laws. git-svn is a tool that offers some Git features on top of an existing Subversion repository, and as such may offer some of Subversion's auditing abilities.
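The effect of altering saved history can be sketched with a toy model in which each change is identified by a hash of its author, its content, and the identifier of the change before it. This chaining scheme is an assumption made for illustration, and the function names are hypothetical; the sketch does not reproduce the storage format of any real DVCS.

```python
import hashlib


def commit_id(parent_id: str, author: str, content: str) -> str:
    """Hypothetical identifier: a SHA-1 hash over the parent id, author, and content."""
    data = f"{parent_id}\n{author}\n{content}".encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def build_history(changes):
    """Return the identifiers for a chain of (author, content) changes."""
    ids = []
    parent = ""
    for author, content in changes:
        parent = commit_id(parent, author, content)
        ids.append(parent)
    return ids


# The history as first recorded.
original = build_history([("alice", "v1"), ("bob", "v2"), ("alice", "v3")])
# The same history after someone rewrites the author of the second change.
rewritten = build_history([("alice", "v1"), ("carol", "v2"), ("alice", "v3")])
```

In this model the rewritten chain is just as internally consistent as the original one. The rewrite shows up only as different identifiers from the second change onward, so an auditor would need an independently kept record of the original identifiers to notice it.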

Access controls are difficult to enforce in a DVCS, since DVCSes are designed for each user to have a complete history of the repository stored locally on his or her machine. Gitolite and Gitosis were developed to offer per-repository and per-branch/tag access controls, but it is unclear what success, if any, they have had in driving corporate adoption of Git.

Platform support

Git, the most widely used DVCS, was developed specifically for Linux kernel development. A Mac OS X port followed later, but Windows lagged behind. Today Git can be run on Windows either under Cygwin, which provides POSIX emulation, or through a native port called msysGit. Both tools have greatly improved since their early releases, but the Git experience on Windows may still lag behind that on Linux and Mac OS X. Mercurial and Bazaar do not suffer from the same platform fragmentation problems, probably owing to their multi-platform histories and largely Python (as opposed to C) implementations.

Development tool integration

Centralized systems like Subversion and CVS are widely used through graphical user interface (GUI) tools, especially in the Windows community. TortoiseSVN offers tight integration with Windows Explorer, and AnkhSVN brings Subversion to Microsoft Visual Studio through a plugin. GUIs are also widely used in the Java community, for example through the Subclipse and Subversive plugins for Eclipse. GUIs and IDE integration are standard for proprietary systems like Microsoft Visual SourceSafe or Team Foundation Server.

Many DVCSes, however, were born from the Linux community, which places rather less value on GUIs and Integrated Development Environments (IDEs). Those accustomed to working with IDE GUI plugins may have difficulty transitioning to a terminal-based workflow. Efforts are ongoing to bring GUI support to the popular DVCSes.

TortoiseHg has succeeded in replicating many of the TortoiseSVN features on Windows, but its GNOME (Linux) offerings are less mature. An OS X port has not yet been released. JGit is a pure Java implementation of Git used by the Eclipse EGit plugin. EGit and JGit are officially supported by the Eclipse Foundation, so JGit-based plugins may become common in Java-based IDEs like Eclipse, NetBeans, and IntelliJ IDEA. To date, however, these tools have not been widely used. Early versions were blamed for corrupting Git repositories, and they are not supported by GitHub.[4] Many other GUIs are being actively developed, but they are generally not as mature as GUIs for CVS and Subversion.

References

  1. Jon Loeliger, Version Control with Git, chapter 1, ISBN 0596520123
  2. BitKeeper Sales
  3. Kiln Support
  4. GitHub Help - Fixing egit corruption
