Defying Classification

by Malcolm Tredinnick

Tue 17 Jul 2007

Version Control Storage Sizes

Posted at 21:44 +1000

A slightly more technical post today. Apologies to my non-IT readers. Normal interest will resume tomorrow.

Without wishing to start any kind of "my version control system choice is better than yours" debate — because I really, truly, don't care — here are two interesting numbers:

35M     django_working
25M     django.git

The top one (django_working) is a normal subversion checkout, completely up to date (to r5722) and with all .pyc files removed. The second directory is the same thing, except that I also imported all the subversion trunk changes since Django was released to the public (so, from r1 up to r5722).
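For anyone wanting to reproduce this sort of comparison, here is a minimal sketch in Python. The post does not say how the two figures were produced, so this is an assumption about method, not a record of it. The names dir_size and human are made up for illustration. The sketch sums apparent file sizes rather than disk blocks, so its totals may differ somewhat from what a tool like du reports.

```python
import os


def dir_size(root, skip_suffixes=(".pyc",)):
    """Return the total size in bytes of regular files under root.

    Files whose names end in one of skip_suffixes are ignored, which
    mirrors the "all .pyc files removed" condition without deleting anything.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(skip_suffixes):
                continue
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Skip symlinks so their targets are not counted.
                continue
            total += os.path.getsize(path)
    return total


def human(nbytes):
    """Format a byte count rounded to whole mebibytes, e.g. '35M'."""
    return f"{round(nbytes / (1024 * 1024))}M"


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python sizes.py django_working django.git
    for root in sys.argv[1:]:
        print(f"{human(dir_size(root))}\t{root}")
```

Passing skip_suffixes=() counts every file, which is what you would want for the git directory.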

As the name suggests, the second repository is a git repository, created with git-svn so that I can write back changes to the main subversion repository as well. The initial import was around 140M, but I ran git-repack on it to compress it. There are trade-offs here, obviously; packs take a bit longer to access. By and large, though, I rarely need access to all the history and checkins. When I do and I'm offline, this will be invaluable. The ability to use git-bisect on a local copy to find problems will come in handy, too, I suspect. I didn't really care too much how large the storage requirement was (although my laptop is getting a little full, so it might have been an issue there), but the small size was a little unexpected: the full history takes up less space than a single subversion checkout.

(Yes, I know subversion storage is fairly large in general. Yes, I know there are other distributed version control systems out there. Really, I know this. You don't have to email me to tell me. It's not that important.)

Topics: software/django, software/version control