reduce the disk space used by a subversion working copy
scord reduces the disk space used by a subversion working copy with large and/or many files; ignoring other subversion overhead, scord reduces the size of a working copy to just the size of the working files, instead of 2x their size. Subversion users may find the disk savings useful for large source code or media repositories. scord stands for "Subversion Check Out, Reduced Disk" and is under the MIT license. scord currently only reduces disk space usage for Subversion ≤1.6.
A subversion working copy (without scord) uses 2x the disk space of the working files because it contains two copies of each file; the second is a pristine version used by subversion to compute diffs and report status without contacting the repository [SVN Book]. While this saves bandwidth (sometimes significantly, eg for large working copies!) and permits offline operation, it is unfortunate when most working versions are unmodified (rendering the pristine versions unnecessary). The subversion developers and other users have long understood this drawback, but difficult issues have postponed, and continue to postpone, solutions (see Why not ... for the reasons).
scord solves this issue by interposing between filesystem access and the filesystem itself, where it detects when the working version becomes modified or is reverted and is able to respond by recreating or eliminating the pristine version. scord is thus able to store explicit pristine versions for only locally modified files.
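To gauge how much of a plain (non-scord) checkout is taken up by unnecessary pristine copies, something like the following sketch can help. It assumes the Subversion ≤1.6 layout, where the pristine copy of DIR/NAME is stored as DIR/.svn/text-base/NAME.svn-base; the function name redundant_pristine_bytes is made up for this example and is not part of scord or subversion. It only counts pristine copies that are byte-identical to their working file, so files with keyword expansion or end-of-line translation are not counted even when unmodified, and the result is a lower bound rather than the exact saving scord would achieve.

```shell
#!/bin/sh
# Sketch only: estimate the bytes held by pristine copies that are
# byte-identical to their working files in a Subversion <= 1.6 working copy.
# Assumed layout: DIR/.svn/text-base/NAME.svn-base is the pristine copy of DIR/NAME.
# Paths containing newlines are not handled.
redundant_pristine_bytes() {
    find "$1" -type f -path '*/.svn/text-base/*.svn-base' |
    while IFS= read -r pristine; do
        dir=${pristine%/.svn/text-base/*}    # directory holding the working file
        name=${pristine##*/}                 # NAME.svn-base
        working=$dir/${name%.svn-base}       # DIR/NAME
        # An unmodified working file makes its pristine copy redundant.
        if [ -f "$working" ] && cmp -s "$pristine" "$working"; then
            wc -c < "$pristine"
        fi
    done | awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Example use (prints a byte count):
#   redundant_pristine_bytes photos/
```

The function prints a single number of bytes; comparing it with the output of du for the same directory gives a rough idea of the fraction of the checkout that consists of duplicated, unmodified content.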
To use scord you must be able to trust it with your local modifications; scord's internal file updates use the same techniques to safeguard against crashes as subversion uses for working copy updates.
scord serves a filesystem in which you keep your subversion working copy(ies). A scord filesystem is no different from any other filesystem; scord merely detects disk space reduction opportunities in subversion working copies. To start using scord:
    $ mkdir .scord_photos/ photos/
    # scord will expose the filesystem in photos/ and store scord data in .scord_photos/:
    $ scord .scord_photos/ photos/
    $ svn checkout <URL> photos/
    $ svn info photos/
    Path: ...
    # see the disk savings:
    $ du -sh .scord_photos/ photos/
    20G .scord_photos/
    40G photos/
    # stop scord, on Linux, via fusermount or kill -SIGTERM:
    $ fusermount -u photos/
    # (or on Mac OS X, by unmounting the file system or using kill -SIGTERM)

You may also add startscord to your startup programs and create a ~/.startscordrc to automatically start scord for each scord checkout. StartScord.app is a Mac OS X frontend to startscord. See the startscord man page for details.
    $ sudo automount -c

You should then see the directory in mount's output:

    $ mount
    ...
    map -static on /Users/bob/photos (autofs, automounted)

automount support requires MacFUSE >= 2.0.
Current release: 1.1.2 (RSS Feed of Release Announcements):
Development (in subversion): https://scord.svn.sourceforge.net/svnroot/scord/trunk/scord/ (Browse)
While scord can save substantial disk space for a working copy, it (currently) also has the following drawbacks:
The first three of these four drawbacks are primarily implementation limitations; scord development assistance is welcome!
There were two reasons to create scord instead of modifying subversion:
A rewrite of the working copy API appears to be a required step to enhance subversion to reduce working copy disk usage, and even a solid design for this large change has remained elusive.
Subversion developers documented the pristine version storage tradeoff at least as far back as 2001. The tradeoff persists because all discussed (subversion!) solutions appear to require a working copy API rewrite (breaking compatibility) and because, while several solutions have been discussed, none has emerged as the obvious end solution. Many of these solutions include benefits scord does not even try to obtain, so there is most definitely reason to continue the pursuit! But scord achieves its goal now and with complete compatibility (...on the comparatively few platforms it supports).
Someday, it's theoretically possible to improve subversion such that it does not rely on the existence of cached, pristine files in .svn/text-base/. It's a big change, though. ...
Source: Ben Collins-Sussman: Subversion issue 525, 2001-10-11
Related subversion issues and discussions (and if you know of other useful information, I would be most interested):
scord detects file edits itself (no explicit, prior user notification).
Whereas several subversion design ideas (and many other SCMs, eg Perforce and BitKeeper) require the user to inform the SCM client before editing a file, scord detects file changes itself as they are made. This simplifies normal user interaction and also easily supports mass file regeneration (eg by the build system). Unlike SCMs with explicit edit commands, scord does not give subversion the ability to find all modified files without crawling the entire working copy (though perhaps scord could!).
An existing solution to obtain smaller working copies is to export the working copy and manually track local and repository changes. scord achieves the same disk savings (ignoring other subversion overhead) while maintaining the features of a subversion working copy.
The subversion-related SVK lists its disk space savings as one of the two reasons to use it instead of subversion [WhySVK]. scord-housed subversion working copies use even less disk space than SVK working copies (which store the entire repository's file and history set) and scord does not come with a change in development model. At the same time, SVK provides an entire, local repository mirror.
fsvs is an alternative subversion client targeted at backing up directory trees. As a client replacement, fsvs uses its own working copy metadata format and stores only the working version of each file. As such, fsvs provides many benefits similar to and beyond scord's, at the cost of replacing the subversion client and breaking compatibility with subversion working copy tools.