Version control systems
Web Design & Development Guide
Revision control (also known as version control, source control or (source)
code management (SCM)) is the management of multiple revisions of the same unit
of information. It is most commonly used in engineering and software
development to manage ongoing development of digital documents such as
application source code, art resources such as blueprints or electronic models,
and other critical information that may be worked on by a team of people.
Changes to these documents are identified by incrementing an associated number
or letter code, termed the "revision number", "revision level", or simply
"revision", and each change is recorded together with the person who made it.
In a simple form of revision control, for example, the initial issue of a
drawing is assigned the revision number "1". When the first change is made,
the revision number is incremented to "2", and so on.
Software tools for revision control are increasingly recognized as being
necessary for the organization of multi-developer projects.[1]
Overview
Engineering revision control developed from formalized processes based on
tracking revisions of early blueprints or bluelines. Implicit in this control
was the ability to return to any earlier state of the design, for cases in which
an engineering dead-end was reached in the development of the design. Likewise,
in computer software engineering, revision control is any practice that tracks
and provides control over changes to source code. Software developers sometimes
use revision control software to maintain documentation and configuration files
as well as source code. Version control is also widespread in business and law.
Indeed, "contract redline" and "legal blackline" are some of the earliest forms
of revision control, and are still employed with varying degrees of sophistication. An
entire industry has emerged to service the document revision control needs of
business and other users, and some of the revision control technology employed
in these circles is subtle, powerful, and innovative. The most sophisticated
techniques are beginning to be used for the electronic tracking of changes to
CAD files (see Product Data Management), supplanting the "manual" electronic implementation
of traditional revision control.
As software is designed, developed and deployed, it is extremely common for
multiple versions of the same software to be deployed in different sites, and
for the software's developers to be working simultaneously on updates. Bugs
and other issues with software are often only present in certain versions
(because of the fixing of some problems and the introduction of others as the
program develops). Therefore, for the purposes of locating and fixing bugs, it
is vitally important to be able to retrieve and run different versions of the
software to determine in which version(s) the problem occurs. It may also be
necessary to develop two versions of the software concurrently (for instance,
where one version has bugs fixed, but no new features, while the other version
is where new features are worked on).
At the simplest level, developers could simply retain multiple copies of the
different versions of the program, and number them appropriately. This simple
approach has been used on many large software projects. While this method can
work, it is inefficient as many near-identical copies of the program have to be
maintained. This requires a lot of self-discipline on the part of developers,
and often leads to mistakes. Consequently, systems to automate some or all of
the revision control process have been developed.
Moreover, in software development and other environments, including legal and
business practice, it is increasingly common for a single document or snippet
of code to be edited by a team whose members may be geographically dispersed
and may pursue different or even contrary interests. Sophisticated revision
control that tracks and accounts for ownership of changes to documents and code
may be extremely helpful or even necessary in such situations.
Another use for revision control is to track changes to configuration files,
such as those typically stored in /etc or /usr/local/etc on Unix systems. This
gives system administrators another way to easily track changes to configuration
files and a way to roll back to earlier versions should the need arise.
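As a rough illustration of tracking and rolling back a configuration file, the following Python sketch keeps every saved version of one file and can restore an earlier one. The class name `ConfigHistory`, its methods, and the file name `example.conf` are hypothetical, and the history is held in memory only; a real revision control tool would store it persistently.

```python
from pathlib import Path
import tempfile


class ConfigHistory:
    """Keep every saved version of one configuration file (in memory only)."""

    def __init__(self, path):
        self.path = Path(path)
        self.versions = []  # revision n is stored at index n - 1

    def snapshot(self):
        """Record the current file contents if they changed; return the revision."""
        text = self.path.read_text()
        if not self.versions or self.versions[-1] != text:
            self.versions.append(text)
        return len(self.versions)

    def roll_back(self, revision):
        """Overwrite the file on disk with the contents of an earlier revision."""
        self.path.write_text(self.versions[revision - 1])


# Usage: a temporary directory stands in for /etc in this sketch.
with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "example.conf"
    conf.write_text("port = 80\n")
    history = ConfigHistory(conf)
    first = history.snapshot()
    conf.write_text("port = 8080\n")
    second = history.snapshot()
    history.roll_back(first)
    restored = conf.read_text()
```

After the roll-back, the file again holds the contents recorded in the first snapshot, while both versions remain available in the history.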
Compression
Most revision control software can use delta compression, which retains only
the differences between successive
versions of files. This allows more efficient storage of many different versions
of files.
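The idea can be sketched in Python with the standard `difflib` module. This is a minimal illustration, not the storage format of any particular tool: the function names `make_delta` and `apply_delta` and the tuple layout of a delta are assumptions made for the example.

```python
import difflib


def make_delta(old_lines, new_lines):
    """Return only the edits needed to turn old_lines into new_lines."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Keep the replaced range of the old version and the new lines.
            delta.append((i1, i2, new_lines[j1:j2]))
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the newer version from the older version plus its delta."""
    result = list(old_lines)
    # Apply edits from the end so that earlier indices remain valid.
    for i1, i2, replacement in reversed(delta):
        result[i1:i2] = replacement
    return result


v1 = ["alpha", "beta", "gamma", "delta"]
v2 = ["alpha", "BETA", "gamma", "delta", "epsilon"]

delta = make_delta(v1, v2)      # stored instead of a full copy of v2
rebuilt = apply_delta(v1, delta)
```

Only the first version and the delta need to be stored; the second version is reconstructed on demand. Unchanged lines ("alpha", "gamma", "delta") never appear in the delta.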
Source management models
Traditional revision control systems use a centralized model, in which all
revision control functions are performed on a shared server. If two developers
try to change the same file at the same time and access is not managed in some
way, they may end up overwriting each other's work. Centralized revision control
systems solve this problem with one of two "source management models": file
locking or version merging.
File locking
The simplest method of preventing "concurrent access" problems is to lock files
so that only one developer at a time has write access to the central
"repository" copies of those files. Once one developer "checks out" a file,
others can read that file, but no one else is allowed to change it until that
developer "checks in" the updated version (or cancels the checkout).
File locking has merits and drawbacks. It can provide some protection against
difficult merge conflicts when a user is making radical changes to many sections
of a large file (or group of files). But if the files are left exclusively
locked for too long, other developers can be tempted to simply bypass the
revision control software and change the files locally anyway. That can lead to
more serious problems.
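The check-out and check-in cycle described in this section might be modelled as follows. This is a toy sketch under stated assumptions: the class `LockingRepository`, its method names, and the developer and file names are hypothetical and do not correspond to any real tool's interface.

```python
class LockingRepository:
    """Toy central repository that allows one writer per file at a time."""

    def __init__(self):
        self.files = {}  # path -> current contents
        self.locks = {}  # path -> developer currently holding the lock

    def read(self, path):
        """Anyone may read a file, locked or not."""
        return self.files.get(path, "")

    def check_out(self, path, developer):
        """Lock the file for one developer and return its contents."""
        holder = self.locks.get(path)
        if holder is not None and holder != developer:
            raise PermissionError(f"{path} is locked by {holder}")
        self.locks[path] = developer
        return self.read(path)

    def check_in(self, path, developer, contents):
        """Store the new contents and release the lock."""
        if self.locks.get(path) != developer:
            raise PermissionError(f"{developer} does not hold the lock on {path}")
        self.files[path] = contents
        del self.locks[path]

    def cancel_check_out(self, path, developer):
        """Release the lock without changing the file."""
        if self.locks.get(path) == developer:
            del self.locks[path]


repo = LockingRepository()
repo.check_out("main.c", "alice")
try:
    repo.check_out("main.c", "bob")  # refused while alice holds the lock
    bob_blocked = False
except PermissionError:
    bob_blocked = True

repo.check_in("main.c", "alice", "int main(void) { return 0; }\n")
bob_copy = repo.check_out("main.c", "bob")  # succeeds once the lock is released
```

The second developer is blocked only for as long as the first holds the lock, which is why long-held locks create the pressure to bypass the system described above.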
Version merging
Most version control systems, such as CVS, allow multiple developers to edit
the same file at the same time. The first developer to "check in" changes to the
central repository always succeeds. The system provides facilities to merge
later changes into the central repository, so the improvements from the first
developer are preserved when the other programmers check in.
The concept of a reserved edit can provide an optional means to
explicitly lock a file for exclusive write access, even though a merging
capability exists.
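A heavily simplified sketch of version merging is given below. It assumes that developers only edit lines in place, so every version of the file has the same number of lines; real merge tools also handle inserted and deleted lines. The names `merge3`, `MergeConflict`, and `MergingRepository` are hypothetical, and the merge is folded into the check-in step for brevity, whereas real tools often perform it in the working copy first.

```python
class MergeConflict(Exception):
    """Raised when both sides changed the same line differently."""


def merge3(base, mine, theirs):
    """Three-way merge of line lists (sketch: equal lengths only)."""
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged = []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:
            merged.append(m)   # both sides agree
        elif m == b:
            merged.append(t)   # only the other side changed this line
        elif t == b:
            merged.append(m)   # only my side changed this line
        else:
            raise MergeConflict(f"line {number} changed on both sides")
    return merged


class MergingRepository:
    """Toy central repository that merges concurrent check-ins."""

    def __init__(self, lines):
        self.revisions = [list(lines)]  # index 0 is the initial revision

    def check_out(self):
        """Return the head revision index and a private copy of its lines."""
        head = len(self.revisions) - 1
        return head, list(self.revisions[head])

    def check_in(self, base_rev, lines):
        """Merge the edited lines with whatever was checked in meanwhile."""
        base = self.revisions[base_rev]
        head = self.revisions[-1]
        self.revisions.append(merge3(base, lines, head))
        return len(self.revisions) - 1


repo = MergingRepository(["a = 1", "b = 2", "c = 3"])
rev_a, alice = repo.check_out()
rev_b, bob = repo.check_out()
alice[0] = "a = 10"
bob[2] = "c = 30"
repo.check_in(rev_a, alice)  # first check-in: nothing to merge
repo.check_in(rev_b, bob)    # second check-in: merged with alice's change
merged = repo.revisions[-1]
```

Because the two developers touched different lines, the final revision contains both changes. Had both edited the same line differently, `merge3` would raise `MergeConflict` and a person would have to resolve it.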
Distributed revision control
Distributed revision control takes a peer-to-peer approach, as opposed to the
client-server approach of centralized systems. Rather than a single, central
repository on which clients synchronize, each peer's working copy of the
codebase is a bona fide repository.[2]
Synchronization is conducted by exchanging patches (change-sets) from peer to
peer. This results in some striking differences from a centralized system:
- No canonical, reference copy of the codebase exists by default; only
working copies.
- Common operations such as commits, viewing history, and reverting
changes are fast, because there is no need to communicate with a central
server.[3]
- Each working copy is effectively a remote backup of the codebase and
change history, providing natural security against data loss.[3]
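The peer-to-peer exchange of change-sets can be sketched as follows. In this toy model a change-set is just an identifier and a description rather than a real patch, peer names are assumed to be unique, and the class `Peer` with its methods is hypothetical.

```python
class Peer:
    """Toy peer whose working copy is also a complete repository."""

    def __init__(self, name):
        self.name = name
        self.history = []   # ordered list of (change_id, description)
        self.known = set()  # identifiers of change-sets already held
        self._counter = 0

    def _record(self, change_id, description):
        self.history.append((change_id, description))
        self.known.add(change_id)

    def commit(self, description):
        """Record a new change-set locally; no server is contacted."""
        self._counter += 1
        change_id = f"{self.name}-{self._counter}"
        self._record(change_id, description)
        return change_id

    def pull(self, other):
        """Copy every change-set from another peer that this peer lacks."""
        pulled = []
        for change_id, description in other.history:
            if change_id not in self.known:
                self._record(change_id, description)
                pulled.append(change_id)
        return pulled


alice, bob = Peer("alice"), Peer("bob")
alice.commit("add parser")
bob.commit("fix typo")

bob_received = bob.pull(alice)      # bob fetches alice's change-set
alice_received = alice.pull(bob)    # alice fetches bob's change-set
```

After the two pulls each peer holds the full set of changes, in its own local order, and either could serve as the source for a third peer.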
There are two types of distributed systems: open and closed. Open systems are
tuned more to open-source development, and closed systems to traditional,
single-baseline development.
Open Systems
An open system of distributed revision control is characterized by its
support for independent branches, and its heavy reliance on merge operations.
Its general characteristics are:
- Every working copy is effectively a branch.
- Each branch is actually implemented as a working copy, with merges
conducted by ordinary patch exchange, from branch to branch.
- It may be possible to "cherry-pick" single changes, selectively pulling
them from peer to peer.
- New peers can freely join, without applying for access to a server.
One of the first open systems was BitKeeper, notable for its use in the
development of the Linux kernel. A later decision by the makers of BitKeeper to
restrict its licensing led the Linux developers to search for a free
replacement.[4]
Common open systems now in free use are:
- Bazaar
- Darcs
- Git
- Mercurial
Closed Systems
A closed system of distributed revision control is based on a replicated
database. A check-in is equivalent to a distributed commit. Successful commits
create a single baseline. An example of a closed distributed system is Code
Co-op.
Integration
Some of the more advanced revision control tools offer many other facilities,
allowing deeper integration with other tools and software engineering processes.
Plugins are often available for IDEs such as IntelliJ IDEA, Eclipse and Visual
Studio. NetBeans IDE comes with integrated version control support.
Common vocabulary
Terminology can vary from system to system, but here are some terms in common
usage.[5][6]
- Baseline
- An approved revision of a document or source file from which subsequent
changes can be made.
- Branch
- A set of files under version control may be branched or forked
at a point in time so that, from that time forward, two copies of those
files may be developed at different speeds or in different ways,
independently of each other.
- Check-out
- A check-out (or checkout or co) creates a local
working copy from the repository. Either a specific revision is specified,
or the latest is obtained.
- Commit
- A commit (check-in, ci or, more rarely, install
or submit) occurs when a copy of the changes made to the working copy
is written or merged into the repository.
- Conflict
- A conflict occurs when two changes are made by different parties to the
same document, and the system is unable to reconcile the changes. A user
must resolve the conflict by combining the changes, or by selecting
one change in favour of the other.
- Change
- A change (or diff, or delta) represents a specific
modification to a document under version control. The granularity of the
modification considered a change varies between version control systems.
- Change list
- On many version control systems with
atomic multi-change commits, a changelist, change set, or
patch identifies the set of changes made in a single commit.
This can also represent a sequential view of the source code, allowing
source to be examined as of any particular changelist ID.
- Dynamic stream
- A stream (a data structure that implements a configuration of the
elements in a particular repository) whose configuration changes over time,
with new versions promoted from child workspaces and/or from other dynamic
streams. It also inherits versions from its parent stream.
- Export
- An export is similar to a check-out except that it creates
a clean directory tree without the version control metadata used in a
working copy. Often used prior to publishing the contents.
- Head
- The most recent commit.
- Import
- An import is the action of copying a local directory tree (that
is not currently a working copy) into the repository for the first time.
- Mainline
- Similar to Trunk, but there can be a Mainline for each branch.
- Merge
- A merge or integration brings together two sets of changes
to a file or set of files into a unified revision of that file or files.
- This may happen when one user, working on those files, updates
their working copy with changes made, and checked into the repository,
by other users. Conversely, this same process may happen in the
repository when a user tries to check-in their changes.
- It may happen after a set of files has been branched, then a
problem that existed before the branching is fixed in one branch and
this fix needs merging into the other.
- It may happen after files have been branched, developed
independently for a while and then are required to be merged back into a
single unified trunk.
- Repository
- The repository is where the current and historical file data are
stored, often on a server. Sometimes also called a depot (e.g. with
SVK, AccuRev and Perforce).
- Reverse integration
- The process of merging different team branches into the main trunk of
the versioning system.
- Revision
- A revision or version is one version in a chain of
changes.
- Tag
- A tag or release refers to an important snapshot in time,
consistent across many files. These files at that point may all be tagged
with a user-friendly, meaningful name or revision number.
- Trunk
- The unique line of development that is not a branch (sometimes also
called Baseline or Mainline).
- Resolve
- The act of user intervention to address a conflict between different
changes to the same document.
- Update
- An update (or sync) merges changes that have been made in
the repository (e.g. by other people) into the local working copy.
- Working copy
- The working copy is the local copy of files from a repository, at
a specific time or revision. All work done to the files in a repository is
initially done on a working copy, hence the name. Conceptually, it is a
sandbox.
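Several of the terms above (commit, head, branch, tag, check-out, working copy) can be tied together in a small Python model. It is a sketch only: it stores a full snapshot per revision, uses a single revision counter shared by all branches, and every name in it (`ToyRepository`, `release-1.0`, `bugfix`) is hypothetical.

```python
class ToyRepository:
    """Toy repository illustrating commit, head, branch, tag and check-out."""

    def __init__(self):
        self.revisions = []  # revision n is stored at index n - 1
        self.branches = {}   # branch name -> head revision of that branch
        self.tags = {}       # tag name -> revision number

    def commit(self, branch, files):
        """Store a snapshot and make it the head of the given branch."""
        self.revisions.append(dict(files))
        revision = len(self.revisions)
        self.branches[branch] = revision
        return revision

    def branch(self, source, name):
        """Start a new branch at the current head of an existing branch."""
        self.branches[name] = self.branches[source]

    def tag(self, name, revision):
        """Give a revision a user-friendly, meaningful name."""
        self.tags[name] = revision

    def check_out(self, ref):
        """Return a working copy for a revision number, tag, or branch head."""
        if isinstance(ref, int):
            revision = ref
        elif ref in self.tags:
            revision = self.tags[ref]
        else:
            revision = self.branches[ref]
        return dict(self.revisions[revision - 1])


repo = ToyRepository()
r1 = repo.commit("trunk", {"app.py": "print('v1')"})
repo.tag("release-1.0", r1)
repo.branch("trunk", "bugfix")
r2 = repo.commit("bugfix", {"app.py": "print('v1.1')"})
r3 = repo.commit("trunk", {"app.py": "print('v2')"})

release_copy = repo.check_out("release-1.0")  # working copy of the tagged revision
trunk_copy = repo.check_out("trunk")          # working copy of the trunk head
```

The tag keeps pointing at the first revision while the trunk and the bugfix branch each advance to their own head, which mirrors the definitions of Tag, Trunk, Branch and Head given above.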
References
- [1] Rapid Subversion Adoption Validates Enterprise Readiness and Challenges
Traditional Software Configuration Management Leaders. EETimes (2007-05-17).
Retrieved on 2007-06-01.
- [2] Wheeler, David A. Comments on Open Source Software / Free Software
(OSS/FS) Software Configuration Management (SCM) Systems. Retrieved on
2007-05-08.
- [3] O’Sullivan, Bryan. Distributed revision control with Mercurial.
Retrieved on 2007-07-13.
- [4] "Bitmover ends free Bitkeeper, replacement sought for managing Linux
kernel code", Wikinews, 2005-04-07.
- [5] Collins-Sussman, Ben; Fitzpatrick, B.W. and Pilato, C.M. (2004). Version
Control with Subversion. O'Reilly. ISBN 0-596-00448-6.
- [6] Wingerd, Laura (2005). Practical Perforce. O'Reilly.
ISBN 0-596-10185-6.
Web Design & Development Guide, made by MultiMedia
This guide is licensed under the GNU
Free Documentation License. It uses material from the Wikipedia.