SvnMigration

From Git SCM Wiki
Jump to: navigation, search

OBSOLETE CONTENT

This wiki has been archived and the content is no longer updated. Please visit git-scm.com/doc for up-to-date documentation.

Table of contents:

Contents


Moving an organisation from SVN to git involves convincing people to unlearn a lot of SVN stuff they're comfortable with, then learn a lot of new git stuff they don't understand. It takes several months of advocacy and education to move even a small team, and productivity will probably go down before it comes back up. This page presents some tools and tips to help migrate an organisation from SVN to git. If you just want to migrate yourself, you might prefer the crash course.

At present, this page concentrates on command-line SVN and git users - contributions are welcome from people that have made the switch in a GUI environment.

Advocacy

A move to git generally starts with advocating it to the rest of your team. Directly recommending git to other people will usually meet with resistance until people are convinced that it will make them happier and more productive. The first step in migrating to git is politely demonstrating how you are happier and more productive using git as an SVN client.

This stage is as much about promoting a healthy level of envy as making a logical argument. Saying “you should get git-svn and bisect that bug” won't have much effect even if you're right, whereas “I bisected your bug over lunch - turns out Tony introduced it in revision 1234” will impress people even if it's no help in that instance.

SVN users usually look at git's features with confusion or disinterest, except for the one thing they immediately identify as their killer app. Trying to guess who will be interested in which feature turns out to be very inefficient, because the things that SVN users react to are almost never the things that git users expect. It's far more effective to publicly use as many git features as possible, without making a big deal about them. This lets people find their own killer app and ignore the rest - people that like git add -p mention it if they spot you using it, and people that like git commit --amend will ask when they hear you talk about uncommitting things.

Git as an SVN client

Main page: git-svn

It's recommended to move people one at a time to git as an SVN client before thinking about git on the server. As well as ensuring you're never overrun by a whole team making newbie errors, git-svn lets you defer decisions about your git architecture until you have a better feel for how git works with your particular team.

The first thing you'll need to explain is the variety of places where git stores files. SVN just has local files and the remote repository; git has local files, the index, the local repository and the remote repository. These are quite hard for SVN users to grasp, so you might want to just say the index is “a secondary set of local files that only git can access”, the local repository is “a copy of the server's repository”, and that both are “designed to make git faster”. You can explain the power of these two stores later on, when people have had a few weeks to get used to them. Oliver Steele created a useful diagram that nicely illustrates git's various stores (taken from an article on his blog).

People will need to know about git stash when they have merge conflicts, but it's best not to explain ahead of time. Explaining git stash involves advanced concepts like commits that aren't on a branch, whereas demonstrating it live teaches them everything they need to know and lets them fill in the details as they come to understand git.

You should warn people not to use git rebase, because it isn't compatible with git-svn (git svn rebase is fine). For similar reasons, you should ensure people always use the --squash option to git-merge. Unfortunately, git merge --squash doesn't set a useful commit message, so you have to manually add something like "Merged branch '<from>' to <to>".

When people use git merge without --squash, they can undo it by immediately typing git reset --hard HEAD@{1}. You might want to set the rerere.enabled option in git-config for everyone, so that any resolutions they made will be reused when they re-merge with --squash.

You might also want to caution people about creating new git branches, because the results can be surprising at first. For example, a git svn dcommit from a git branch will commit your changes to the SVN branch it was based on.

You will need to think a lot about how much to modify your git environment to suit your team, and how much to educate your team about git. New migraters have no preconceptions about git, so are quite willing to accept that things are different than they were with SVN. But new migraters are also overwhelmed by the little differences (like why their commit numbers are hidden in the body of the commit message), and it'll be weeks or months before they're ready to start learning some of git's deeper differences. Consider adding aliases, enabling bash completion for git, and handing out a git-svn cheatsheet (.doc). Many excellent guides are also available for people that prefer to learn by reading.

Git on the server

Once your team has largely migrated to git-svn, you've developed a migration plan for all your scripts, and you've developed workarounds for the people that really don't want to switch, you can move to git on the server. This is the last chance for your team to unlearn old SVN stuff - e.g. to remove SVN-compatibility aliases created during the move to git-svn.

Most of the issues you face during this stage will be specific to your team - do you host your code on GitHub or roll your own with gitosis? Do you try to keep all your old post-commit hooks, or convert them to post-receive hooks? A general guide such as this can't help much with these issues.

Philosophical differences

One of the hardest issues in migrating from SVN to git is as subtle as it is pervasive: the change of focus from the lines in your commit graph to the points.

SVN users focus on branches as the centre of their development history, and see commits as these funny little things that punctuate the progress of their branches. Git users focus on commits (or even trees) as the centre of their history, and see branches as just one of many handy labels to track them.

This subtle distinction affects many of the expectations people have about version control. For example, SVN users like to think of a commit as being “on a branch”, meaning that it marks an event in the lifetime of one SVN directory. Whereas git users would say a commit is “reachable from the tip of zero or more branches”, meaning that it informs you about the state of various other commits.

It's very hard to make SVN users grok this change of emphasis. It can help to explain how git internally stores whole trees instead of just deltas (so you can e.g. diff two arbitrary commits without examining the ones in between). It can also help to describe SVN as using “directory-based branches”, compared to git's “label-based branches”. Using different names makes it easier to explain how label-branches let you do things that directory-branches don't, like making commits that aren't associated with any branch. This will still only get you part of the way - if you find something better, please contribute to this guide!

Traps for the unwary

You can expect to face various problems during your migration that you didn't see coming, or that you had assumed wouldn't be a problem. By getting people to skip-read the following list when they start using git, you can tackle more of these problems earlier than you otherwise would have done.

SVN properties (e.g. svn:ignore)

SVN makes extended use of properties - human-readable messages attached to files and repositories. git-svn supports the propget and proplist commands, but properties aren't supported natively in git. There are equivalents for the most common uses of properties, however.

The most common SVN property is svn:ignore, which lets you ignore specified files. Git uses gitignore files, which can either be called .gitignore and version-controlled in the normal way, or .git/info/exclude and be maintained by each user.

Git uses files in the .git directory for most of the jobs where SVN would use properties. For example, merge information is stored in files instead of the svn:mergeinfo property.

Git has a long and proud scripting tradition, so many of the problems that SVN would encourage you to solve with properties are better solved in git by parsing the output of git commands. For example, web developers wanting to manage the datestamps for their files might want to use the date of the relevant commit. Most git commands are designed to be usable from scripts - see the git manual for a complete list.

Empty directories

SVN makes you manually svn add and svn rm your directories, whereas git automatically adds and removes directories when you add and remove files in them. While this is often convenient, it means that git can't track empty directories. Git-svn will create empty directories when you git svn rebase, but can't really track them after that. It's recommended to create an empty .gitignore file in any directories you want to track with git.

Git and SVN branches

Because git uses label-based branches instead of SVN's directory-based branches, scripts which expect stable code to be in the trunk directory will need to be rewritten. The root directory of a git repository always contains the current branch, and it's usually enough for your scripts to git checkout <branch-you-want> or create a zipfile with git archive <branch-you-want>.

See also

This page was originally based on a thread on the git mailing list.


Personal tools