Monday, March 19, 2007

How to back up a SourceSafe database

Why back up the SourceSafe databases?

People think that by adding their sources to a VSS database they are "safe" from losing their sources. They are wrong, and here are a couple of examples:
  • Often, the SourceSafe database is stored on the same machine as the project sources. A hard drive failure will then make you lose both the project sources and the VSS database! People discover this only when they try to restore the sources from the VSS database and realize it is gone, too. This is more of a problem for individual developers, because in a team you can recover at least the latest sources from a teammate's hard drive; however, you lose the source history in that case. Having a database backup on a different machine/tape/DVD solves this problem.
  • SourceSafe is not transactional. A crash of the VSS application in the middle of a write operation can corrupt the database. The Analyze tool provided with VSS can fix some errors, but sometimes analyze.exe can't fix the corruption (and sometimes analyze.exe can even cause more corruption). Having a database backup can save you a call to product support and lets you restore the database to a known good state.

How often should I back up the SourceSafe database?

Well, that depends on you, but I'd recommend a daily backup. Thus, if something really bad happens you can restore the database to the previous day's backup and only lose a day's work...

Deciding the backup strategy

Before writing a backup script you have to decide on a backup strategy. First, decide what exactly you are trying to back up:

1) You can choose to back up only the latest version of the sources. You'll find a lot of scripts on the Internet that claim to do "SourceSafe backup"; they simply do a GetLatest into a folder. Personally I don't consider this a real backup solution: the reason you add the sources to a VSS database in the first place, instead of creating daily archives, is to have a history of changes, and this method does not preserve history. If you're interested in the latest sources only, you can probably use the Shadow Folders feature of SourceSafe directly instead of such a "backup" process...

2) You can choose to back up only a portion of the database containing a project of interest. SourceSafe comes with two utilities, SSArc.exe and SSRestor.exe (Archive and Restore), that can help with this task. For their usage please see the MSDN documentation.
SSArc/SSRestor use a proprietary binary format for the backup file, and the archive size cannot be larger than 2GB (4GB in VSS2005). VSS6 does not warn when this limit is reached, causing silent corruption of the archive. Because of this limit I wouldn't recommend this method for backing up large amounts of data. A possible advantage of this method is that you can tell the archive tool to delete the old versions of the files that you just archived; this is a nice way to reduce the database size in case you don't care about file history. While you can specify the root of the database as the project to be backed up, when you restore such an archive the files will end up in a subfolder. If you're interested in backing up the database root I'd strongly recommend using the 3rd method.

3) You can choose to back up the whole database. The VSS database is just a collection of files stored on your local disk or on a network share, so any file backup utility or directory replication service/program should work just fine.
The "poor man's solution" is to use the "xcopy /s" command to simply copy the whole database to the backup target location.
However, as a single developer I prefer using robocopy.exe (Robust File Copy utility). For older operating systems this utility is distributed with the Windows Resource Kit; if you have Vista or later you already have this utility installed. Robocopy only copies newer/different files, so it saves bandwidth when copying the database to a network machine. Robocopy also retries the operation (in case you're backing up to a network location and the network goes down), and supports a /MIR parameter that is useful if you're using the same target location every day (it also deletes files in the target that would otherwise be orphaned).
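As a minimal sketch of the robocopy approach, the Python helper below builds a mirror command and interprets the exit code. The function names, the retry values and the example paths are assumptions for illustration only; /MIR, /R and /W are standard robocopy switches, and robocopy reports failures with exit codes of 8 and above.

```python
import subprocess


def robocopy_mirror_command(source, target, retries=3, wait_seconds=30):
    """Build a robocopy command line that mirrors source into target."""
    return [
        "robocopy", source, target,
        "/MIR",                # copy new/changed files, delete orphans in target
        f"/R:{retries}",       # number of retries for a failed copy
        f"/W:{wait_seconds}",  # seconds to wait between retries
    ]


def robocopy_succeeded(exit_code):
    """robocopy uses exit codes below 8 for success, 8 and above for failure."""
    return 0 <= exit_code < 8


def mirror_database(source, target):
    """Run the mirror copy (Windows only) and report whether it succeeded."""
    result = subprocess.run(robocopy_mirror_command(source, target))
    return robocopy_succeeded(result.returncode)
```

For example, mirror_database(r"C:\VSS", r"\\backupserver\vss") would mirror a hypothetical local database to a hypothetical backup share. Note that a plain "exit code is not zero" check would wrongly treat a normal copy as a failure.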

Running Analyze.exe during backup

A good practice for a SourceSafe database administrator is to run Analyze.exe (SourceSafe Analyze Tool) on the database regularly to identify and fix possible corruption. The sooner corruption is identified, the easier it is to fix; waiting longer can only cause more things to become corrupted. Thus, it is a good practice to run analyze.exe at the same time you back up the database.
There is one catch though: sometimes analyze.exe may cause corruption, too, or may fix things but not in the way you'd expect. While analyze.exe creates backups of the files it touches, it is often easier to simply have a copy of the database made before running analyze (so you can easily revert to it in case something bad happens with analyze).
I'd recommend that in your nightly script you first back up the VSS database location, then run analyze on the database.

Things to take care of before running backup/analyze

Before running backup or analyze you'd want to make sure no user accesses the VSS database. Letting users access and modify the database during backup could leave the backup copy only half updated. Letting users access and modify the database during analyze could cause analyze to fail, to report false-positive errors, or to "fix" errors that don't really exist.

You can prevent VSS users from opening new connections to the VSS database by locking the database. From a batch script this can be done easily by creating a 0-byte file data\loggedin\admin.lck in the database. (Don't forget to unlock the database by deleting this file when you're done with the backup).
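The lock/unlock pair can be sketched as a small Python context manager, so the lock file is removed even if the backup fails halfway. The name locked_database and the database_root parameter are assumptions; the only detail taken from the text above is the 0-byte data\loggedin\admin.lck file.

```python
import os
from contextlib import contextmanager


@contextmanager
def locked_database(database_root):
    """Lock a VSS database for the duration of the with-block."""
    lock_path = os.path.join(database_root, "data", "loggedin", "admin.lck")
    # Creating an empty file is enough; its content is not used here.
    with open(lock_path, "w"):
        pass
    try:
        yield lock_path
    finally:
        # Always unlock, even if the backup or analyze step raised.
        os.remove(lock_path)
```

A nightly script would then wrap its copy and analyze steps in "with locked_database(r"C:\VSS"):" (the path is hypothetical). One design caveat: if an administrator had already locked the database by hand, this sketch would unlock it on exit.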

However, closing existing connections to the database may be a tricky task. The best time to run backup/analyze is during the night, when the chance of someone being connected and writing to the VSS database is small. While you should remind your colleagues to close VS and VSS before going home, it is likely that sooner or later someone will still leave applications open with connections to the database.

If you have a shared VSS database you should worry about remote connections. SourceSafe is a client-only application: there is no server process, and remote machines access the database network share directly. There is no good way of closing these connections. You can, for instance, delete and recreate the database network share (using the "net share <sharename> /delete" command), but then you'll have a hard time restoring the user permissions on the share. A better way may be to close the open shared files directly (use the "net file" command, parse its output to extract the IDs of the files that belong to the VSS database, then use "net file <id> /close" for each of them). Of course, disconnecting open files or the file share can cause database corruption, too (one more reason to run analyze frequently on the database). A small mitigation is to disconnect the files at night, when they are likely to be open for reading rather than writing, and when the chance of causing corruption is smaller.
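The "net file" parsing step could look roughly like the Python sketch below. It assumes that each open file is listed on one line that starts with a numeric ID followed by the path, and that the path is not truncated in the listing; verify this against the output on your own server. The function names and the sample path are hypothetical.

```python
def vss_open_file_ids(net_file_output, database_path):
    """Return the IDs of open files whose path lies under database_path."""
    ids = []
    needle = database_path.lower()
    for line in net_file_output.splitlines():
        fields = line.split(None, 1)  # first column, then the rest of the line
        if len(fields) == 2 and fields[0].isdigit() and needle in fields[1].lower():
            ids.append(int(fields[0]))
    return ids


def close_commands(ids):
    """Build one 'net file <id> /close' command per open file ID."""
    return [["net", "file", str(file_id), "/close"] for file_id in ids]
```

The text to parse would come from running "net file" on the machine that hosts the share (for example with subprocess.run and capture_output=True), and each command returned by close_commands would then be run in turn. Header lines and the trailing status message are skipped because they do not start with a number.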

And then you have to worry about processes running on the machine hosting the VSS database. These can be your own processes (e.g. Visual Studio, VSS Explorer, SSAdmin, SS.exe, etc.) or other users' processes (if the machine is a Terminal Services server). Here you can choose to simply kill the applications (devenv.exe, ssexp.exe, ssadmin.exe, ss.exe, etc.) using tools like pskill.exe from the Sysinternals (now Microsoft) PsTools suite. Again, killing processes may introduce database corruption (or with VSS2005 you may see warning messages about possible corruption); as much as possible, try to close the applications manually instead of killing them.

If you're running SourceSafe 2005 you may also have services on the database machine that keep connections to the database. The VSS Lan Service should be stopped before the backup (net stop ssservice) and restarted after backup is complete (net start ssservice). Also, if your database is enabled for Internet access, the IIS host process and VSS service may access the database so you should stop and restart IIS (net stop w3svc / net start w3svc) after locking the VSS database (such that further incoming VSS Internet connections will be denied).

Stopping and restarting the IIS host process may be required even for SourceSafe 6.0 server installations if you're using VSS-controlled FrontPage web projects. On the server, FrontPage uses ssapi.dll to open connections to the local SourceSafe database, therefore you must restart the web server to force FrontPage to close its connections.

Similar considerations apply for 3rd party services and applications that provide remote/Internet access to the VSS database.

When you're done with the backup

Don't forget to restart any services you may have stopped during the backup. Also, don't forget to unlock the database by deleting the data\loggedin\admin.lck file.
You may also want to parse Analyze's output and mail yourself a summary report, so that the next day you can review any fixes analyze may have made to the database.
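To make the ordering of all these steps concrete, here is a hedged Python sketch of the sequence described in this post: lock, stop services, back up, analyze, then restart services and unlock no matter what happened. It is not a ready-made backup script: every step is passed in as a callable, and the function and parameter names are assumptions. Only the service names ssservice and w3svc come from the text above.

```python
def nightly_backup(lock, unlock, stop_service, start_service,
                   backup, analyze, services=("ssservice", "w3svc")):
    """Run the nightly steps in order and always undo the lock and stops."""
    stopped = []
    lock()  # deny new connections before touching services
    try:
        for name in services:
            stop_service(name)
            stopped.append(name)
        backup()   # copy the database first...
        analyze()  # ...so the copy predates any changes made by analyze
    finally:
        # Restart only what was actually stopped, in reverse order.
        for name in reversed(stopped):
            start_service(name)
        unlock()
```

In a real script stop_service and start_service might wrap "net stop" and "net start", while backup and analyze would wrap whatever copy tool and analyze.exe command line you settled on. The try/finally is the point of the sketch: a failed copy must not leave the database locked or the services stopped until morning.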

Backup scripts

Sorry, I won't provide you here with any backup scripts. Just search the Internet for "SourceSafe backup" and you'll find a couple of free scripts. Note however that not all these scripts will take care of all the situations described above, so you may need to tweak and improve them yourself to better suit your needs.
