Friday, 28 December 2012

Repairing Corrupt SVN Revisions With "Can't set position pointer in file: Invalid argument"

Leaving aside the pros and cons of large binary files in Subversion... there is a really nasty 32-bit signed/unsigned defect in the default Mac OS X svn shipped with Xcode 4.5 / 4.6 (and probably on other platforms) that can corrupt revisions when a combined commit exceeds 2GB.

svn --version
svn, version 1.6.18 (r1303927)
   compiled Aug 4 2012, 19:46:53

The defect appears when you have a very large combined commit that creates a revision file greater than 2GB.

Due to a 32-bit signed/unsigned defect, subsequent revisions that reference any part of the original revision beyond the 2GB mark are written with negative offsets, and reading those revisions then fails.
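To see where the negative numbers come from, here is a minimal Python sketch. It is not the actual svn code, and as_signed_32 is a name made up for illustration. It assumes the offset is truncated to 32 bits and then read back as a signed value, which is consistent with the numbers in the example in this post.

```python
import struct


def as_signed_32(offset: int) -> int:
    """Reinterpret the low 32 bits of an offset as a signed 32-bit integer."""
    # Pack as unsigned 32-bit, then unpack the same four bytes as signed.
    return struct.unpack("<i", struct.pack("<I", offset & 0xFFFFFFFF))[0]


print(as_signed_32(2684354560))  # -1610612736
print(as_signed_32(2147483647))  # 2147483647 (just under 2GB, unaffected)
```

Anything at or above 2147483648 (2GB) comes out negative, which is why only references into the upper part of a large revision file are affected.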

Example

r3072 is a 3GB combined commit and r3080 references an offset at 2684354560 (>2GB) in r3072.

svnadmin verify -r 3080 /pathto/Repository/
svnadmin: Can't set position pointer in file 'db/revs/3/3072': Invalid argument


Even though the error references r3072, the negative offsets are in r3080... r3072 itself should verify cleanly.

svnadmin verify -r 3072 /pathto/Repository/
* Verified revision 3072.


Luckily you can easily fix the negative offsets in these corrupt revisions :-)

Repairing Negative Offsets in Corrupt SVN Revisions

Make sure you have a backup of your svn repository. You'll also need an editor that can edit large files directly, such as UltraEdit.

If you're using UltraEdit then change the following preferences:
  1. Display > Misc. > Disable line numbers 
  2. Display > File Handling > Temp Files > Disable 
These changes will allow you to view and edit large revision files directly using the Hex Editor.

In the above example, if you look inside r3080 (db/revs/3/3080) you will find a negative offset in the key/value pairs for one of the entries that references r3072.

K 7
example
V 30
dir 0-3071.0.r3072/-1610612736


The -1610612736 should be 2684354560, i.e. 4294967296 - 1610612736 (add 2^32 to the negative value).
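The same arithmetic as a small Python sketch, in case you have several offsets to convert. The function name repair_offset is hypothetical, and it assumes every negative offset is a 32-bit wraparound as described above.

```python
def repair_offset(offset: int) -> int:
    """Undo a signed 32-bit wraparound by adding 2**32 to negative values."""
    if offset >= 0:
        # Non-negative offsets were never wrapped, so leave them alone.
        return offset
    return offset + 2**32


print(repair_offset(-1610612736))  # 2684354560
```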

You can check the validity of this offset in r3072 (db/revs/3/3072) using the UltraEdit Hex Editor and Search > Goto > 2684354560.

id: 0-3071.0.r3072/2684354560

To avoid changing the value length (the V 30 above) and the file size, replace the -1610612736 in r3080 with 02684354560 (a leading 0 in place of the minus sign), so the entry stays 30 bytes long.
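If you would rather script the replacement than do it by hand, the following is a rough Python sketch of the same idea. The function name and the regular expression are my own assumptions, not anything from svn. It looks for text shaped like .rNNNN/-NNNN and writes the corrected offset zero-padded to the same number of bytes. Revision files also contain binary data, so a pattern like this could in principle match something that is not an offset. Run it on a copy and review what it changed.

```python
import re

# Assumed pattern: a revision reference followed by a negative offset,
# e.g. b".r3072/-1610612736". This is a guess based on the entry shown above.
NEGATIVE_OFFSET = re.compile(rb"(\.r\d+/)(-\d+)")


def fix_negative_offsets(data: bytes) -> bytes:
    """Replace negative offsets with same-length, zero-padded positive ones."""

    def replace(match):
        prefix, text = match.group(1), match.group(2)
        corrected = int(text.decode("ascii")) + 2**32
        padded = str(corrected).encode("ascii").rjust(len(text), b"0")
        if len(padded) != len(text):
            # The corrected value does not fit in the same number of bytes,
            # so this entry needs a manual fix (including its V length).
            raise ValueError(f"cannot fix {text!r} without changing file size")
        return prefix + padded

    return NEGATIVE_OFFSET.sub(replace, data)


entry = b"dir 0-3071.0.r3072/-1610612736\n"
print(fix_negative_offsets(entry))  # b'dir 0-3071.0.r3072/02684354560\n'
```

The sketch refuses to touch an offset whose corrected value would not fit in the original width, since that would shift everything after it in the file.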

In the UltraEdit Hex Editor you can use Search > Replace with the Find ASCII option.

Once all the negative offsets have been corrected you'll then need to fix the checksums for each entry in the revision.

svnadmin verify -r 3080 /pathto/Repository/
svnadmin: Checksum mismatch while reading representation:
   expected: b613a54458b83800bc44c567d3c8fa2d
   actual:   51a324f9303212e8605232569a367adc

Replace the expected checksum in the revision file with the actual checksum and verify again.
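The checksum swap can be scripted in the same spirit. This Python sketch pulls the two values out of the error text shown above and swaps one for the other in the file contents. The function names are hypothetical, and it assumes the expected checksum appears in the revision file as the same 32-character hex string that svnadmin prints.

```python
import re

# Matches the "expected:" and "actual:" lines of the error shown above.
CHECKSUM_LINE = re.compile(r"(expected|actual):\s*([0-9a-f]{32})")


def parse_mismatch(verify_output: str) -> tuple[str, str]:
    """Return (expected, actual) from a 'Checksum mismatch' error message."""
    found = dict(CHECKSUM_LINE.findall(verify_output))
    return found["expected"], found["actual"]


def swap_checksum(data: bytes, expected: str, actual: str) -> bytes:
    """Replace the expected checksum with the actual one, keeping the size."""
    old, new = expected.encode("ascii"), actual.encode("ascii")
    if len(old) != len(new):
        raise ValueError("checksums differ in length; file size would change")
    if old not in data:
        raise ValueError("expected checksum not found in revision data")
    return data.replace(old, new)
```

Because both checksums are the same length, the file size does not change. You still need to run svnadmin verify again afterwards, since it only reports one mismatch at a time.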

svnadmin verify -r 3080 /pathto/Repository/
* Verified revision 3080.

There may be multiple negative offsets and checksums to fix in each corrupt revision.

You'll need to repeat this process for each corrupt revision that references r3072.
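To get a list of candidates up front, something like the Python sketch below could scan the revision files for negative offsets. The function name and pattern are assumptions of mine. It reads each file fully into memory, which is slow for multi-GB revisions, and a match inside binary content could be a false positive. Treat the result as a list of revisions to check with svnadmin verify, not as proof of corruption.

```python
import re
from pathlib import Path

# Assumed pattern: a reference to some revision followed by a negative offset.
NEGATIVE_REF = re.compile(rb"\.r\d+/-\d+")


def find_suspect_revisions(revs_dir: str) -> dict[str, int]:
    """Map revision file name -> number of negative offsets found in it."""
    suspects = {}
    # Revision files live in shard directories, e.g. db/revs/3/3080.
    for path in sorted(Path(revs_dir).glob("*/*")):
        if not path.is_file():
            continue
        hits = NEGATIVE_REF.findall(path.read_bytes())
        if hits:
            suspects[path.name] = len(hits)
    return suspects
```

For the repository in this post you would point it at /pathto/Repository/db/revs and then verify each revision it reports.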