Discussion:
SHA-1 collision in repository?
Myria
2018-02-22 20:30:32 UTC
When we try to commit a very specific version of a very specific
binary file, we get a SHA-1 collision error from the Subversion
repository:

D:\confidential>svn commit secret.bin -m "Testing broken commit"
Sending secret.bin
Transmitting file data .svn: E160000: Commit failed (details follow):
svn: E160000: SHA1 of reps '604440 34 134255 136680
c9f4fabc4d093612fece03c339401058
db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' and '-1 0
134255 136680 c9f4fabc4d093612fece03c339401058
db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' matches
(db11617ef1454332336e00abc311d44bc698f3b3) but contents differ


What can cause this? This file is a binary pixel shader compiled from
a build process. It's most certainly not Google's SHA-1 collision PDF
files. We also scanned the repository to confirm that nobody has
committed Google's collision files.

Occam's Razor suggests that something is wrong with our repository or
Subversion itself, rather than this being a true SHA-1 collision. In
that case, what is wrong with our repository?

If this really is a SHA-1 collision, it would be major cryptography
news that someone randomly ran into a second collision without even
trying. In that case, is there a method by which we could recover the
two files that supposedly have the same SHA-1? The collision doesn't
appear to be in the file itself, but in some sort of diff or revision
output?

Thanks,

Melissa
Matt Simmons
2018-02-22 22:04:27 UTC
Hi Melissa,

That definitely is interesting.

I assume you have read
http://blogs.collab.net/subversion/subversion-sha1-collision-problem-statement-prevention-remediation-options


If you do an svnsync to another location and attempt the commit there, does
the problem replicate itself?

--Matt
--
"Today, vegetables... Tomorrow, the world!"
Myria
2018-02-22 22:29:29 UTC
That was one document we ran into when searching, yes.

We can do an svnsync, but this will take about a week to run--the
repository is 43 GB with 600,000 commits. I guess we'll start it now.
Matt Simmons
2018-02-22 22:45:56 UTC
I would get more advice from people here before you invest that time. I'm a
relative amateur and would listen to people with more experience than
myself.

--Matt
--
"Today, vegetables... Tomorrow, the world!"
Myria
2018-02-23 21:06:36 UTC
I'm not subscribed to this mailing list, so I have no standard way to
reply to Philip's email. I don't even know his email address.
Post by Philip Martin
That pattern, all of MD5, SHA1 and size matching, is exactly what
happens if a SHA1 collision is committed using an old version of
Subversion where the rep-cache does not detect collisions. The first
part of the collision would have been committed in r604440 and the
second part in r605556.
svnadmin verify -r604440 path/to/repository
svnadmin verify -r605556 path/to/repository
will fail with an MD5 checksum error.
If this is what you see then unfortunately the colliding r605556 content
has been elided and the r605556 revision is corrupt.
The revision 605556 is simply the current revision number of the
repository at the time of the attempted commit, and is unrelated to
the problem. If I attempt the commit now, it's a higher number, but
otherwise the same error message.

Something I did notice is that the commit I'm trying to do is a
reversion to an older version of the same file. The revision of the
file throwing the error at 604440 is identical to the file I'm trying
to commit, but the file currently in the repository is different.

If I commit a dummy version of the file, then commit the version I
actually want, the latter commit works. Could the collision be in a
"diff" instead of the files themselves?

Melissa
Stefan Sperling
2018-02-23 21:25:14 UTC
Hi Melissa,

What is the output of the 'svnadmin verify' commands which Philip
wrote about above?

I think the cause of the problem is still unclear, and we probably
won't find a good answer without more information such as this.

Stefan
Philip Martin
2018-02-23 22:50:30 UTC
Post by Matt Simmons
Post by Myria
The revision 605556 is simply the current revision number of the
repository at the time of the attempted commit, and is unrelated to
the problem. If I attempt the commit now, it's a higher number, but
otherwise the same error message.
Yes, sorry, I misinterpreted the error message.

svn: E160000: SHA1 of reps '604440 34 134255 136680 c9f4fabc4d093612fece03c339401058 db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' and '-1 0 134255 136680 c9f4fabc4d093612fece03c339401058 db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' matches (db11617ef1454332336e00abc311d44bc698f3b3) but contents differ

The commit is attempting to create r605556 by reusing data from r604440
but this fails the final stage of the commit when the fulltext in the new
revision doesn't match the fulltext in the transaction.
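
For readers following along, here is a minimal sketch that splits the two rep strings from the error message into fields and shows where they differ. The field names follow the layout described elsewhere in this thread (revision, offset, size, expanded size, then the hashes); the `Rep` type and `parse_rep` helper are hypothetical names used only for illustration, and reading revision -1 as "not yet committed" is an assumption.

```python
from typing import NamedTuple


class Rep(NamedTuple):
    revision: int        # -1 is assumed to mean "still in the transaction"
    offset: int
    size: int
    expanded_size: int
    md5: str
    sha1: str
    uniquifier: str


def parse_rep(text: str) -> Rep:
    """Split one quoted rep string from the E160000 message into fields."""
    rev, offset, size, expanded, md5, sha1, uniq = text.split()
    return Rep(int(rev), int(offset), int(size), int(expanded), md5, sha1, uniq)


# The two reps quoted in the error message above.
old = parse_rep("604440 34 134255 136680 c9f4fabc4d093612fece03c339401058 "
                "db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8")
new = parse_rep("-1 0 134255 136680 c9f4fabc4d093612fece03c339401058 "
                "db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8")

# List the fields on which the two reps disagree.
differing = [name for name in Rep._fields if getattr(old, name) != getattr(new, name)]
print(differing)  # ['revision', 'offset']
```

Only the location differs: sizes and both checksums are identical, which is consistent with the commit trying to reuse the r604440 data for the new content.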
Post by Matt Simmons
Post by Myria
Something I did notice is that the commit I'm trying to do is a
reversion to an older version of the same file. The revision of the
file throwing the error at 604440 is identical to the file I'm trying
to commit, but the file currently in the repository is different.
In that case there is probably no SHA1 collision.
Post by Matt Simmons
Post by Myria
If I commit a dummy version of the file, then commit the version I
actually want, the latter commit works. Could the collision be in a
"diff" instead of the files themselves?
The SHA1 reuse doesn't rely on the history of the file, it is a global
mapping across the repository of file content to checksum. The
collision is between fulltexts of the files, not diffs.

The only way I can explain the dummy commit is if the real file being
committed had svn:keywords or svn:eol-style properties and you did not
have the same properties when you committed the dummy file.

I think this might be the case since you mentioned earlier that you
could not find a file with the given checksum. The checksums apply to
the repository format, i.e. before keyword/eol transformation, and if
you were calculating checksums from working copy files the values would
be different.

One way to check repository format checksums is to use "svn info" on
working copy files. The checksum reported for one of the files modified
in r604440 should be db11617ef1454332336e00abc311d44bc698f3b3
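
To illustrate why a checksum taken from a working copy file can differ from the repository-format checksum, here is a small sketch. The file content is made up, and the LF to CRLF conversion stands in for what an eol-style translation might do on Windows; it is not taken from the actual repository.

```python
import hashlib


def digests(data: bytes) -> tuple:
    """Return the (md5, sha1) hex digests of the given bytes."""
    return hashlib.md5(data).hexdigest(), hashlib.sha1(data).hexdigest()


# Hypothetical content in repository form (LF line endings).
repo_form = b"line one\nline two\n"

# The same content as it might look in a working copy after eol translation.
working_copy_form = repo_form.replace(b"\n", b"\r\n")

print(digests(repo_form))
print(digests(working_copy_form))
print(digests(repo_form) == digests(working_copy_form))  # prints False
```

A binary file with no svn:keywords or svn:eol-style properties would not be transformed, so both forms would hash identically in that case.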
Post by Matt Simmons
Post by Myria
Melissa
Hi Melissa,
What is the output of the 'svnadmin verify' commands which Philip
wrote about above?
That does require server access.

Since you refer to the r604440 content does that mean you can
successfully checkout, or update to, that revision? If so that would
indicate that the revision is not corrupt in the repository.
--
Philip
Myria
2018-02-24 00:09:20 UTC
Post by Philip Martin
I think this might be the case since you mentioned earlier that you
could not find a file with the given checksum. The checksums apply to
the repository format, i.e. before keyword/eol transformation, and if
you were calculating checksums from working copy files the values would
be different.
One way to check repository format checksums is to use "svn info" on
working copy files. The checksum reported for one of the files modified
in r604440 should be db11617ef1454332336e00abc311d44bc698f3b3
db11617ef1454332336e00abc311d44bc698f3b3 is the SHA-1 of the actual
pixel shader file I'm trying to commit; in other words, it doesn't
seem to be a hash of a special format. Similarly,
c9f4fabc4d093612fece03c339401058 is its MD5.
Post by Philip Martin
Post by Matt Simmons
Post by Myria
Melissa
Hi Melissa,
What is the output of the 'svnadmin verify' commands which Philip
wrote about above?
That does require server access.
I had our server admin run the commands:

[***@meow ~]# svnadmin verify -r604440 /srv/subversion/repositories/meow/
* Verifying repository metadata ...
* Verified revision 604440.
[***@meow ~]# svnadmin verify -r605556 /srv/subversion/repositories/meow/
* Verifying repository metadata ...
* Verified revision 605556.

("meow" replacing confidential names. No I don't know why the server
admin is running svnadmin as root >.<)
Post by Philip Martin
Since you refer to the r604440 content does that mean you can
successfully checkout, or update to, that revision? If so that would
indicate that the revision is not corrupt in the repository.
I was able to branch (svn copy) the affected branch to a new branch,
and committing the same file to the new branch has the same error.
Checking out that revision works fine; only that commit is affected.

I started an svnsync yesterday to clone the repository to my desktop
machine. It's at revision ~150000 now, so maybe on Monday or Tuesday
I'll be able to try things safely on my local machine.

Once it's on my local machine, I'll be able to compile TortoiseSVN and
debug it while pointing to a file:// repository. (TortoiseSVN instead
of command-line svn because TortoiseSVN is compiled with Visual C++
and is therefore many times easier to debug.)

Melissa
Philip Martin
2018-02-24 01:13:31 UTC
Post by Myria
I was able to branch (svn copy) the affected branch to a new branch,
and committing the same file to the new branch has the same error.
Checking out that revision works fine; only that commit is affected.
I suspect the problem is that the repository revision files are OK but
that the rep-cache mapping is corrupt. You would need server-side
access to verify this.

There are a couple of options:

A) disable rep-caching by editing fsfs.conf inside the repository

B) reset the mapping by deleting/renaming the file db/rep-cache.db
inside the repository (but please rename rather than delete if you
want to help us identify the corruption)

Doing either of these should allow the commit to succeed.
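
A minimal sketch of option B, renaming rather than deleting so the old file stays available for analysis. The helper name `set_aside_rep_cache` and the backup naming scheme are hypothetical, and the sketch assumes it is run by the repository owner (not root) while no commit is in progress.

```python
import time
from pathlib import Path


def set_aside_rep_cache(repo_root: str) -> Path:
    """Rename db/rep-cache.db inside the repository and return the new path."""
    cache = Path(repo_root) / "db" / "rep-cache.db"
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = cache.with_name(f"rep-cache.db.{stamp}.bak")
    cache.rename(backup)  # the original file is kept under the new name
    return backup
```

It would be called as `set_aside_rep_cache("/path/to/repository")`. For option A, the relevant setting is believed to be `enable-rep-sharing` in the `[rep-sharing]` section of `db/fsfs.conf`, but the comments in your own fsfs.conf are the authority on that.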
--
Philip
Philip Martin
2018-02-24 01:42:02 UTC
Post by Philip Martin
A) disable rep-caching by editing fsfs.conf inside the repository
B) reset the mapping by deleting/renaming the file db/rep-cache.db
inside the repository (but please rename rather than delete if you
want to help us identify the corruption)
Doing either of these should allow the commit to succeed.
To verify the corruption start with the rep-cache:

sqlite3 db/rep-cache.db "select * from rep_cache where hash='db11617ef1454332336e00abc311d44bc698f3b3'"

That should give you five numbers: the hash, the revision (604440), the
offset, the size and the expanded size.

Then examine the revision file for r604440. It could be unpacked:

grep -a "^text: 604440.*/_" db/revs/604/604440

or packed:

grep -a "^text: 604440.*/_" db/revs/604.pack/pack

One of the lines from grep should contain the hash and that line should
start:

text: 604440

followed by three more numbers then hashes and other stuff. The three
numbers are the offset, size and expanded size and should match the
values from the rep-cache but I suspect the rep-cache has the wrong
offset.
--
Philip
Myria
2018-02-26 21:41:05 UTC
-bash-4.1$ sqlite3 rep-cache.db "select * from rep_cache where
hash='db11617ef1454332336e00abc311d44bc698f3b3'"
db11617ef1454332336e00abc311d44bc698f3b3|604440|34|134255|136680

The line from the grep -a command containing that hash is below. They
all match.
text: 604440 34 134255 136680 c9f4fabc4d093612fece03c339401058
db11617ef1454332336e00abc311d44bc698f3b3 604439-cyqm/_13
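
The comparison can also be scripted. The sketch below uses an in-memory SQLite table as a stand-in for db/rep-cache.db, filled with the row shown above; the helper names are hypothetical, and the column names are an assumption based on the five values the query returns.

```python
import sqlite3

SHA1 = "db11617ef1454332336e00abc311d44bc698f3b3"


def rep_cache_row(conn, sha1):
    """Return (revision, offset, size, expanded_size) for a hash, or None."""
    row = conn.execute("select * from rep_cache where hash = ?", (sha1,)).fetchone()
    return None if row is None else tuple(row[1:5])


def text_line_fields(line):
    """Parse 'text: REV OFFSET SIZE EXPANDED MD5 SHA1 UNIQ' into (numbers, sha1)."""
    parts = line.split()
    if len(parts) < 7 or parts[0] != "text:":
        raise ValueError("not a 'text:' line with a SHA1 field")
    return tuple(int(p) for p in parts[1:5]), parts[6]


# Stand-in for db/rep-cache.db, populated with the row from the query output.
conn = sqlite3.connect(":memory:")
conn.execute('create table rep_cache (hash text primary key, revision integer, '
             '"offset" integer, size integer, expanded_size integer)')
conn.execute("insert into rep_cache values (?, ?, ?, ?, ?)",
             (SHA1, 604440, 34, 134255, 136680))

# The line reported by grep -a, joined onto one line.
line = ("text: 604440 34 134255 136680 c9f4fabc4d093612fece03c339401058 "
        "db11617ef1454332336e00abc311d44bc698f3b3 604439-cyqm/_13")

numbers, sha1 = text_line_fields(line)
print(sha1 == SHA1 and rep_cache_row(conn, SHA1) == numbers)  # True
```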


In other news, unknown whether related to the current problem, my
attempt to clone the repository to my local computer is failing:

D:\>svnsync sync file:///d:/svnclone
Transmitting file data
.....................................................................................................................................................svnsync:
E160000: SHA1 of reps '227170 153 193 57465
bb52be764a04d511ebb06e1889910dcf
e6291ab119036eb783d0136afccdb3b445867364 227184-4vap/_4o' and '-1 0
193 57465 bb52be764a04d511ebb06e1889910dcf
e6291ab119036eb783d0136afccdb3b445867364 227184-4vap/_4o' matches
(e6291ab119036eb783d0136afccdb3b445867364) but contents differ
svnsync: E160004: Filesystem is corrupt
svnsync: E200014: Checksum mismatch while reading representation:
expected: bb52be764a04d511ebb06e1889910dcf
actual: 80a10d37de91cadc604ba30e379651b3

This is odd, because revision 227185 (the revision it's trying to
commit) verifies fine on the originating server:

-bash-4.1$ sudo svnadmin verify -r227170 /srv/subversion/repositories/meow
* Verifying repository metadata ...
* Verifying metadata at revision 227170 ...
* Verified revision 227170.
-bash-4.1$ sudo svnadmin verify -r227185 /srv/subversion/repositories/meow
* Verifying repository metadata ...
* Verified revision 227185.
Branko Čibej
2018-02-27 07:45:17 UTC
Post by Myria
-bash-4.1$ sudo svnadmin verify -r227170 /srv/subversion/repositories/meow
* Verifying repository metadata ...
* Verifying metadata at revision 227170 ...
* Verified revision 227170.
-bash-4.1$ sudo svnadmin verify -r227185 /srv/subversion/repositories/meow
* Verifying repository metadata ...
* Verified revision 227185.
It is a very, *very* bad idea to perform any operations on the
repository as root! You should not have to do that.

Please check file ownership and permission throughout the repository;
none of the files should be owned by root.

-- Brane
Philip Martin
2018-02-27 13:54:23 UTC
Post by Myria
-bash-4.1$ sqlite3 rep-cache.db "select * from rep_cache where
hash='db11617ef1454332336e00abc311d44bc698f3b3'"
db11617ef1454332336e00abc311d44bc698f3b3|604440|34|134255|136680
The line from the grep -a command containing that hash is below. They
all match.
text: 604440 34 134255 136680 c9f4fabc4d093612fece03c339401058
db11617ef1454332336e00abc311d44bc698f3b3 604439-cyqm/_13
The rep-cache looks correct. There doesn't seem to be any corruption in
the repository: you confirmed that you could retrieve the revision in
question, and that you could verify the revision, and the rep-cache
looks OK. So why is the commit that attempts to reuse the data in the
revision failing? I don't know :-(
Post by Myria
In other news, unknown whether related to the current problem, my
D:\>svnsync sync file:///d:/svnclone
Transmitting file data
E160000: SHA1 of reps '227170 153 193 57465
bb52be764a04d511ebb06e1889910dcf
e6291ab119036eb783d0136afccdb3b445867364 227184-4vap/_4o' and '-1 0
193 57465 bb52be764a04d511ebb06e1889910dcf
e6291ab119036eb783d0136afccdb3b445867364 227184-4vap/_4o' matches
(e6291ab119036eb783d0136afccdb3b445867364) but contents differ
svnsync: E160004: Filesystem is corrupt
expected: bb52be764a04d511ebb06e1889910dcf
actual: 80a10d37de91cadc604ba30e379651b3
This is odd, because revision 227185 (the revision it's trying to
That's an error committing to the new repository on your local machine,
i.e. the problem is in the new repository not the repository on the
originating server. Can you run "svnadmin verify" on the new
repository? You may want to use -M to increase the cache size for the
verify command as the default is small.

It would be odd for svnsync to create a corrupt repository, so I half
expect verify to report no problems. If that is the case it seems to be
the original problem again: an apparently valid repository with a
checksum error only on commit. So this problem is happening on two
repositories, on two machines with different OS.
--
Philip
Myria
2018-02-27 21:09:17 UTC
Post by Philip Martin
It would be odd for svnsync to create a corrupt repository, so I half
expect verify to report no problems. If that is the case it seems to be
the original problem again: an apparently valid repository with a
checksum error only on commit. So this problem is happening on two
repositories, on two machines with different OS.
Not to mention that the two revisions complained about are unrelated, and
2/3 the repository history apart.

One thing that's interesting is that the commit the svnsync failed on is a
gigantic commit. It's 1.8 GB. Maybe that svnsync is failing because of a
Subversion bug with huge files...?

I started an svnadmin verify on my incomplete local copy last night, and no
problems were reported when it finished this morning. I'll try again with
this -M option you mention.

I'll also start an svnsync from a Linux machine.

I'm going to see how hard it would be to just copy the 43 GB repository
directly. We'd have to shut down Subversion service during the copy, so it
might be a while before I have a chance to.
Johan Corveleyn
2018-02-27 22:22:41 UTC
What version of SVN server are you using actually? AFAICS you never
mentioned this.

I'm wondering whether this is related to the bug that was fixed for 1.8.x here:

http://svn.apache.org/viewvc?view=revision&revision=1803435

... or a similar problem.
I'm actually not sure whether that bugfix was released already (it's
not mentioned in CHANGES).

See also the users@ thread it references (a false positive of a SHA-1
collision occurred during 'svnadmin load'):
https://lists.apache.org/thread.html/b475d74442bdf93b21c8656ab2289b4c61e0d90efdafc8a16ddca694@%3Cusers.subversion.apache.org%3E

OTOH there was also another report of a SHA-1 collision during
'svnadmin load', this time with 1.9.7. We never got to the bottom of
that one either:
https://svn.haxx.se/users/archive-2018-01/0062.shtml
--
Johan
Johan Corveleyn
2018-02-28 07:57:40 UTC
Permalink
[ Please keep the users list in cc. ]
Post by Johan Corveleyn
What version of SVN server are you using actually? AFAICS you never
mentioned this.
I'm wondering whether this is related to the bug that was fixed for 1.8.x here:
http://svn.apache.org/viewvc?view=revision&revision=1803435
... or a similar problem.
I'm actually not sure whether that bugfix was released already (it's
not mentioned in CHANGES).
OTOH there was also another report of a SHA-1 collision during
'svnadmin load', this time with 1.9.7. We never got to the bottom of
that one either:
https://svn.haxx.se/users/archive-2018-01/0062.shtml
Both the server and my desktop system are on Subversion 1.9.7.
--
Johan
Philip Martin
2018-02-28 11:31:43 UTC
Permalink
Post by Johan Corveleyn
http://svn.apache.org/viewvc?view=revision&revision=1803435
... or a similar problem.
I'm actually not sure whether that bugfix was released already (it's
not mentioned in CHANGES).
That fix has been nominated in STATUS but not yet approved or released.
It affects 1.8 only.
--
Philip
Nico Kadel-Garcia
2018-02-28 14:17:17 UTC
Permalink
Post by Myria
Not to mention that the two revisions complained about are unrelated, and
2/3 the repository history apart.
One thing that's interesting is that the commit the svnsync failed on is a
gigantic commit. It's 1.8 GB. Maybe that svnsync is failing because of a
Subversion bug with huge files...?
Hmm. Could 2 GB filesize limits be involved?

When someone starts encountering this kind of issue with such large
commits, it leads me to think "what the heck was in that commit"?
There are various tools more likely to break when hammered that hard,
such as pre-commit hooks written carelessly in Python that try to
preload a hash with the contents of the file and just say "holy son
of a !@#$, I'm out of resources!!!". Been there, done that, had to
explain the concept of reading a text file with a loop to the
programmer in question.
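
The "read it with a loop" idea can be sketched roughly as below. This is only an illustration: the function name `sha1_of_stream` and the 64 KB chunk size are assumptions made for the example, not taken from any real hook script.

```python
import hashlib
import io


def sha1_of_stream(stream, chunk_size=64 * 1024):
    """Hash a binary stream chunk by chunk so memory use stays bounded."""
    digest = hashlib.sha1()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means end of stream.
            break
        digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Any file-like object opened in binary mode works the same way.
    print(sha1_of_stream(io.BytesIO(b"x" * 1_000_000)))
```

Only one chunk is held in memory at a time, so the size of the committed file stops mattering to the hook.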

Also, I'd like to think outside the box at such a point and say "can
that commit be skipped? is there anything in it that we actually need?
can we just do an export/import to a new repo, discard the old repo's
history, and get back to work?" And, since it has been a useful tool for
rearranging and discarding undesired branches, tags, and history, "can
we do a git-svn export, flush the history we don't need, and publish it
back up to the new canonical Subversion repository?" I've used that
approach effectively for several Subversion upgrades now, especially
to allow some sanitization of the Subversion history. I know such
history clean up is often discouraged, that the history is considered
the critical component of the source control system, but there are
many environments where legacy history is no longer needed. I wonder
if this is one, and the questionable commit itself could be dumped.
Post by Myria
I started an svnadmin verify on my incomplete local copy last night, and no
problems were reported when it finished this morning. I'll try again with
this -M option you mention.
I'll also start an svnsync from a Linux machine.
I'm going to see how hard it would be to just copy the 43 GB repository
directly. We'd have to shut down Subversion service during the copy, so it
might be a while before I have a chance to.
Would you? If you can use a "rsync" based operation, such as mounting
the share on the Linux system via CIFS and using "rsync", you should
be able to verify that no operations occur during the filesystem based
replication and re-run the "rsync" command when completed, to catch
any dangling operations. If the Subversion server is busy, you might
have to block "write" operations for a while to support consistent
replication.

With tools like CygWin, the files could also be rsynced and copied
locally on the Windows box, then sent over to the Linux box, or moved
with an external USB drive, or with many other tools.
Myria
2018-03-02 02:45:38 UTC
Permalink
Post by Nico Kadel-Garcia
Post by Myria
Not to mention that the two revisions complained about are unrelated, and
2/3 the repository history apart.
One thing that's interesting is that the commit the svnsync failed on is a
gigantic commit. It's 1.8 GB. Maybe that svnsync is failing because of a
Subversion bug with huge files...?
Hmm. Could 2 GB filesize limits be involved?
When someone starts encountering this kind of issue with such large
commits, it leads me to think "what the heck was in that commit"?
There are various tools more likely to break when hammered that hard,
such as pre-commit hooks written carelessly in Python that try to
preload a hash with the contents of the file and just say "holy son
of a !@#$, I'm out of resources!!!". Been there, done that, had to
explain the concept of reading a text file with a loop to the
programmer in question.
The error with the 2 GB file occurred when trying to replicate the
repository in order to diagnose the original problem. The original
problem does not involve large files.

Also, I have no control over what was in the repository five years
ago. The huge files were compiled versions of WebKit libraries. The
alternative to committing these very large files would have been to
quadruple the build times, because it takes four times longer to build
WebKit than it does to build our project.


In other news, I can now reproduce the huge file problem in
TortoiseSVN committing to my "file:" partial copy of the repository.
However, with SourceForge down due to a DDoS, I cannot get the source
code to TortoiseSVN in order to debug it.

This does mean that this is very likely to be a Subversion bug,
probably something in 1.8.x or 1.9.x. The commit that prevented
"svnsync" from working was probably during 1.6 or 1.7, which
succeeded.

Melissa
Myria
2018-03-02 03:25:09 UTC
Permalink
I just found out that the file causing the error from the large commit
is not the large file - it's one of the smaller files, about 55 KB.
If I commit that single smaller file from the large commit, it errors
the same way as the original 227185 would. This is exactly like the
original problem with committing the pixel shader.

I managed to get the db/transactions/227184-4vb2.txn directory by
breakpointing kernel32!DeleteFileW in TortoiseSVN (so I could get the
contents before TortoiseSVN deleted them at failure). I don't know
how they're useful, though.

The only way I know how to proceed is to wait until the source code to
TortoiseSVN is available so that I can debug it in Visual Studio. Is
there something else I can do?
Nico Kadel-Garcia
2018-03-02 05:09:46 UTC
Permalink
Post by Myria
I just found out that the file causing the error from the large commit
is not the large file - it's one of the smaller files, about 55 KB.
If I commit that single smaller file from the large commit, it errors
the same way as the original 227185 would. This is exactly like the
original problem with committing the pixel shader.
I managed to get the db/transactions/227184-4vb2.txn directory by
breakpointing kernel32!DeleteFileW in TortoiseSVN (so I could get the
contents before TortoiseSVN deleted them at failure). I don't know
how they're useful, though.
The only way I know how to proceed is to wait until the source code to
TortoiseSVN is available so that I can debug it in Visual Studio. Is
there something else I can do?
Sorry that I've not been paying attention to every detail. Do you see
the same issues if you use the Subversion from CygWin, which is
probably a lot easier to recompile?
Myria
2018-03-02 20:28:47 UTC
Permalink
The problem is identical on Windows command line, Windows TortoiseSVN,
Ubuntu-Linux, Ubuntu-Linux on Windows, and macOS. I'm just bad at
GDB.
Philip Martin
2018-03-02 23:16:34 UTC
Permalink
Post by Myria
I just found out that the file causing the error from the large commit
is not the large file - it's one of the smaller files, about 55 KB.
If I commit that single smaller file from the large commit, it errors
the same way as the original 227185 would. This is exactly like the
original problem with committing the pixel shader.
If I understand correctly you are committing a single file using a
file:// URL and getting the error. If so then you may be able to
produce a much smaller testcase, see later.
Post by Myria
I managed to get the db/transactions/227184-4vb2.txn directory by
breakpointing kernel32!DeleteFileW in TortoiseSVN (so I could get the
contents before TortoiseSVN deleted them at failure). I don't know
how they're useful, though.
The only way I know how to proceed is to wait until the source code to
TortoiseSVN is available so that I can debug it in Visual Studio. Is
there something else I can do?
Are you able to share your repository, either in public or privately
with me? How big is the repository now it has fewer revisions?



The file:// commit does not have any cache from previous commits, unlike
svnsync or apache, so the error must involve data read explicitly during
the failed commit. That means the ancestors of the file being
committed, the parent directories, and possibly some other file (not an
ancestor) referenced by SHA1 in the rep-cache.

If you want to debug it yourself then as well as the transaction
directory db/transactions/227184-4vb2.txn/ there is also the protorev
file db/txn-protorevs/227184-4vb2.rev which contains the new content of
the committed files. When the content is sent by the client it gets
written as a delta to the txn-protorev file:

DELTA
SVN....
ENDREP

Since the file being committed matched a SHA1 in the rep-cache the
commit process will attempt to remove this delta but will first verify
that the fulltext obtained by expanding the delta in the protorev file
matches the fulltext in the repository, see get_shared_rep() in
subversion/libsvn_fs_fs/transaction.c.

    /* Compare the two representations.
     * Note that the stream comparison might also produce MD5 checksum
     * errors or other failures in case of SHA1 collisions. */
    SVN_ERR(svn_fs_fs__get_contents_from_file(&contents, fs, rep, file,
                                              offset, scratch_pool));
    SVN_ERR(svn_fs_fs__get_contents(&old_contents, fs, &old_rep_norm,
                                    FALSE, scratch_pool));
    err = svn_stream_contents_same2(&same, contents, old_contents,
                                    scratch_pool);

Normally they compare equal and the protorev file is truncated to remove
the delta, but in your case they do not match and the commit fails.
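
As a rough illustration of that check, here is a toy model in Python. It is not Subversion's code: the dictionary standing in for the rep-cache, the function `store_with_rep_sharing`, and the exception name are all made up for this sketch. It only shows the decision being described: a SHA1 hit is trusted only after the fulltexts compare equal.

```python
import hashlib


class ShaCollisionError(Exception):
    """Raised when the SHA-1 matches a cached entry but the content differs."""


def store_with_rep_sharing(rep_cache, new_content):
    """Toy model: rep_cache maps a SHA-1 hex digest to stored fulltext bytes.

    Returns True if the new content was shared with an existing entry,
    False if it had to be stored as a new entry.
    """
    sha1 = hashlib.sha1(new_content).hexdigest()
    old_content = rep_cache.get(sha1)
    if old_content is None:
        # No SHA-1 hit: keep the new representation.
        rep_cache[sha1] = new_content
        return False
    if old_content != new_content:
        # SHA-1 hit but the fulltexts differ: refuse the commit.
        raise ShaCollisionError(f"SHA1 matches ({sha1}) but contents differ")
    # SHA-1 hit and identical fulltext: the new copy can be dropped.
    return True
```

In this model the error fires whenever the "old" side of the comparison reads back differently from what was hashed, whether from a real collision or from a fault in how the old content is read.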



As far as producing a smaller testcase: it may be possible to trim out
all the files and directories in other parts of the repository. For
example if the repository path to the parent directory of the commit is
/project/branch/foo/bar then you can use

svndumpfilter include /project/branch/foo/bar

in an

svnadmin dump ... | svndumpfilter ... | svnadmin load

pipeline to produce a smaller repository and this smaller repository may
reproduce the error. Dump and load tend to go faster with a larger
than default -M parameter.

There are some reasons why this may not work:

- it may be necessary to expand the included tree to cope with copies.

- the rep-cache might refer to a totally different file that happens
to have the same SHA1/content, in which case the included tree may
need to include this file as well.

- the reduced repository will have the same files but the directories
will be different/smaller and their content may be necessary to
trigger the bug

- the reduced repository will contain less data meaning smaller file
offsets and the larger offsets may be necessary to trigger the bug

To determine whether the rep-cache SHA1 refers to a different file you
first need the repository form of the file being committed, i.e. with
svn:keywords and svn:eol-style detranslated. Then calculate the SHA1
and lookup the hash in the rep-cache:

sqlite3 db/rep-cache.db "select revision from rep_cache where hash='xxxxxxx'"

This tells you which revision is involved, then you look in the revision
file db/revs/nnn/nnnnn to find the hash and determine the file path.
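
The hash-and-lookup step could be scripted along these lines. This is a sketch under assumptions: the helper name `rep_cache_revision` is invented here, the only query used is the one shown above, and the bytes passed in are assumed to already be the repository form of the file.

```python
import hashlib
import sqlite3


def rep_cache_revision(conn, repo_form_bytes):
    """Return (sha1, revision); revision is None if the hash is not cached.

    conn is an open sqlite3 connection to a rep-cache database.
    """
    sha1 = hashlib.sha1(repo_form_bytes).hexdigest()
    row = conn.execute(
        "select revision from rep_cache where hash = ?", (sha1,)
    ).fetchone()
    return sha1, (row[0] if row else None)


if __name__ == "__main__":
    # Hypothetical paths; point these at a copy of the repository.
    with open("secret.bin", "rb") as f:
        content = f.read()
    connection = sqlite3.connect("db/rep-cache.db")
    try:
        print(rep_cache_revision(connection, content))
    finally:
        connection.close()
```

Running it against a copy of rep-cache.db rather than the live file avoids any interaction with a running server.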


Thank you for persisting with the investigation!
--
Philip
Nathan Hartman
2018-03-04 02:08:41 UTC
Permalink
Post by Philip Martin
Since the file being committed matched a SHA1 in the rep-cache the
commit process will attempt to remove this delta but will first verify
that the fulltext obtained by expanding the delta in the protorev file
matches the fulltext in the repository, see get_shared_rep() in
subversion/libsvn_fs_fs/transaction.c.
Does this mean that content being committed to the repository is never elided based on the SHA hash alone but only after a fulltext verification that the content actually already exists in the repository?
Stefan Sperling
2018-03-04 10:28:05 UTC
Permalink
Post by Nathan Hartman
Post by Philip Martin
Since the file being committed matched a SHA1 in the rep-cache the
commit process will attempt to remove this delta but will first verify
that the fulltext obtained by expanding the delta in the protorev file
matches the fulltext in the repository, see get_shared_rep() in
subversion/libsvn_fs_fs/transaction.c.
Does this mean that content being committed to the repository is never elided
based on the SHA hash alone but only after a fulltext verification that the
content actually already exists in the repository?
Yes. And if the content differs, it must be rejected, because an FSFS
repository can only store one content per SHA1 checksum.
Philip Martin
2018-03-04 11:12:00 UTC
Permalink
Post by Stefan Sperling
Yes. And if the content differs, it must be rejected, because an FSFS
repository can only store one content per SHA1 checksum.
To be accurate the server-side code can handle the files perfectly well
if rep-caching is disabled. One can retrieve either file, dump/load
into another repository, etc.

# Import 2 files from shattered.io
svnadmin create repo1
echo [rep-sharing] >> repo1/db/fsfs.conf
echo enable-rep-sharing=false >> repo1/db/fsfs.conf
svn import -mm shattered-1.pdf file://`pwd`/repo1/f
svn import -mm shattered-2.pdf file://`pwd`/repo1/g

# dump/load
svnadmin create repo2
echo [rep-sharing] >> repo2/db/fsfs.conf
echo enable-rep-sharing=false >> repo2/db/fsfs.conf
svnadmin dump repo1 | svnadmin load repo2

# verify files have same SHA1 but different MD5
$ svnlook cat repo2 f | sha1sum
38762cf7f55934b34d179ae6a4c80cadccbb7f0a -
$ svnlook cat repo2 g | sha1sum
38762cf7f55934b34d179ae6a4c80cadccbb7f0a -
$ svnlook cat repo2 f | md5sum
ee4aa52b139d925f8d8884402b0a750c -
$ svnlook cat repo2 g | md5sum
5bd9d8cabc46041579a311230539b8d1 -

On the client side, the working copy code doesn't handle the files as
the pristines collide.
--
Philip
Stefan Sperling
2018-03-04 11:37:10 UTC
Permalink
Post by Philip Martin
Post by Stefan Sperling
Yes. And if the content differs, it must be rejected, because an FSFS
repository can only store one content per SHA1 checksum.
To be accurate the server-side code can handle the files perfectly well
if rep-caching is disabled.
Yes, that's true. But turning the rep-cache off is *not* recommended.
Post by Philip Martin
On the client side, the working copy code doesn't handle the files as
the pristines collide.
The network protocol over HTTP will have problems as well: ra_serf won't
fetch files it believes the working copy already has in the pristine store.
Nathan Hartman
2018-03-04 12:48:23 UTC
Permalink
Post by Stefan Sperling
Post by Nathan Hartman
Does this mean that content being committed to the repository is never elided
based on the SHA hash alone but only after a fulltext verification that the
content actually already exists in the repository?
Yes. And if the content differs, it must be rejected, because an FSFS
repository can only store one content per SHA1 checksum.
Just a thought here: perhaps in Myria's case no collision was ever stored, but rather one of the repository's files on disk became corrupt for some reason. Bit rot? Bad disk sector? Now it doesn't match its SHA sum and throws an error when trying to commit a file that is identical to the one committed previously but which doesn't match the fulltext of that earlier commit because of the bit rot / corruption.
Philip Martin
2018-03-04 10:29:09 UTC
Permalink
Post by Nathan Hartman
Does this mean that content being committed to the repository is never
elided based on the SHA hash alone but only after a fulltext
verification that the content actually already exists in the
repository?
That's correct. Fulltext matching was added in 1.9.6 and 1.8.18, older
versions of Subversion relied on the SHA1 match alone.
--
Philip
Myria
2018-03-04 10:31:24 UTC
Permalink
How can I dump out the two things that Subversion thinks have the same
SHA-1 checksum but don't match? This seems to be rather difficult to do.

That said, it's far more likely that there's a bug in Subversion than that
we randomly collided SHA-1.
Philip Martin
2018-03-04 11:16:24 UTC
Permalink
Post by Myria
How can I dump out the two things that Subversion thinks have the same
SHA-1 checksum but don't match? This seems to be rather difficult to do.
On the server side:

svnlook cat repository path-in-repository
svnlook cat -r N repository path-in-repository
svnlook cat -t TXN repository path-in-repository
--
Philip
Myria
2018-03-06 02:56:59 UTC
Permalink
GMail keeps doing reply instead of reply all. I'm having to manually
add the users list back now.

Below is the thread I sent.


---------- Forwarded message ----------
From: Myria <***@gmail.com>
Date: Mon, Mar 5, 2018 at 6:37 PM
Subject: Re: SHA-1 collision in repository?
To: Philip Martin <***@codematters.co.uk>


I now know where the checksum error happens, but not why.

svn: E200014: Checksum mismatch while reading representation:
expected: bb52be764a04d511ebb06e1889910dcf
actual: 80a10d37de91cadc604ba30e379651b3

It's calculating the MD5 of only the first 16 KB of the input file and
comparing against the MD5 of the entire file. The 16 KB number seems
to be SVN__STREAM_CHUNK_SIZE.

bb52be764a04d511ebb06e1889910dcf is the MD5 of the entire file.
80a10d37de91cadc604ba30e379651b3 is the MD5 of the first 16384 bytes.
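
For anyone wanting to repeat this comparison on their own file, a small Python sketch follows. The helper name `md5_full_and_prefix` is made up for the example; the 16384 default is simply the 16 KB figure mentioned above.

```python
import hashlib

CHUNK = 16384  # the 16 KB figure reported above


def md5_full_and_prefix(data, prefix_len=CHUNK):
    """Return (MD5 of all of data, MD5 of its first prefix_len bytes)."""
    full = hashlib.md5(data).hexdigest()
    prefix = hashlib.md5(data[:prefix_len]).hexdigest()
    return full, prefix


if __name__ == "__main__":
    # Hypothetical file name; use the repository form of the failing file.
    with open("Redacted.cpp", "rb") as f:
        print(md5_full_and_prefix(f.read()))
```

If the two values match the "expected" and "actual" lines of the error, the mismatch comes from a truncated read rather than from differing content.
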
I managed to compile a subversion command line client with debugging
information and optimizations disabled, and can reproduce the problem
with GDB attached.
Here is a backtrace at the time at which the error occurs. A few line
numbers in stream.c will be wrong by a few lines due to a few printf's
I added.
#0 svn_checksum_mismatch_err (expected=0x7ffffffdcf00,
actual=0x7ffffa0700a0, scratch_pool=0x7ffffa070028,
fmt=0x7ffffc259ac0 "Checksum mismatch while reading
representation") at subversion/libsvn_subr/checksum.c:638
#1 0x00007ffffc2123de in rep_read_contents (baton=0x7ffffa1f6190,
buf=0x7ffffa1f66a8 "// <redacted>"..., len=0x7ffffffdcf88)
at subversion/libsvn_fs_fs/cached_data.c:2062
#2 0x00007ffffe5645fd in svn_stream_read_full (stream=0x7ffffa1f6470,
buffer=0x7ffffa1f66a8 "// <redacted>"..., len=0x7ffffffdcf88)
at subversion/libsvn_subr/stream.c:193
#3 0x00007ffffe5653f3 in svn_stream_contents_same2
(same=0x7ffffffdd01c, stream1=0x7ffffa1f6470,
stream2=0x7ffffa1f6650, pool=0x7ffffa1e0028) at
subversion/libsvn_subr/stream.c:589
#4 0x00007ffffc247226 in get_shared_rep (old_rep=0x7ffffffdd188,
fs=0x7fffff601030, rep=0x7ffffa0e20b8,
file=0x7ffffa1e0390, offset=0, reps_hash=0x0,
result_pool=0x7fffff5e0028, scratch_pool=0x7ffffa1e0028)
at subversion/libsvn_fs_fs/transaction.c:2280
#5 0x00007ffffc247734 in rep_write_contents_close
(baton=0x7ffffa232ff0) at subversion/libsvn_fs_fs/transaction.c:2370
#6 0x00007ffffe56492b in svn_stream_close (stream=0x7ffffa233140) at
subversion/libsvn_subr/stream.c:274
#7 0x00007ffffe841001 in apply_window (window=0x0,
baton=0x7ffffa1000a0) at subversion/libsvn_delta/text_delta.c:732
#8 0x00007ffffc2520d2 in window_consumer (window=0x0,
baton=0x7fffff5f1ab8) at subversion/libsvn_fs_fs/tree.c:2935
#9 0x00007ffffe8405ef in svn_txdelta_run (source=0x7fffff5f1a18,
target=0x7fffff5f1298,
handler=0x7ffffc25209f <window_consumer>,
handler_baton=0x7fffff5f1ab8, checksum_kind=svn_checksum_md5,
checksum=0x7ffffffdd458, cancel_func=0x0, cancel_baton=0x0,
result_pool=0x7fffff5e0028,
scratch_pool=0x7fffff5e0028) at subversion/libsvn_delta/text_delta.c:454
#10 0x00007ffffee98a57 in svn_wc__internal_transmit_text_deltas (tempfile=0x0,
new_text_base_md5_checksum=0x7ffffffdd5b0,
new_text_base_sha1_checksum=0x7ffffffdd5b8, db=0x7fffff6c17d8,
local_abspath=0x7fffff672d08
"/mnt/d/svntest/repository/directory/Redacted.cpp",
fulltext=0, editor=0x7fffff673700, file_baton=0x7fffff510110,
result_pool=0x7fffff6c0028,
scratch_pool=0x7fffff5e0028) at subversion/libsvn_wc/adm_crawler.c:1109
#11 0x00007ffffee98d68 in svn_wc_transmit_text_deltas3
(new_text_base_md5_checksum=0x7ffffffdd5b0,
new_text_base_sha1_checksum=0x7ffffffdd5b8, wc_ctx=0x7fffff6c17c0,
local_abspath=0x7fffff672d08
"/mnt/d/svntest/repository/directory/Redacted.cpp",
fulltext=0, editor=0x7fffff673700, file_baton=0x7fffff510110,
result_pool=0x7fffff6c0028,
scratch_pool=0x7fffff5e0028) at subversion/libsvn_wc/adm_crawler.c:1199
#12 0x00007fffff18eb12 in svn_client__do_commit (
base_url=0x7fffff6142c0 "file:///mnt/d/svntest/repository/directory",
commit_items=0x7fffff672c48, editor=0x7fffff673700,
edit_baton=0x7fffff6300a0,
notify_path_prefix=0x7fffff672900 "/mnt/d/svntest/repository",
sha1_checksums=0x7ffffffdd750,
ctx=0x7fffff6c16f0, result_pool=0x7fffff6c0028, scratch_pool=0x7fffff650028)
at subversion/libsvn_client/commit_util.c:1920
#13 0x00007fffff18a5f9 in svn_client_commit6 (targets=0x7fffff670a18,
depth=svn_depth_infinity, keep_locks=0,
keep_changelists=0, commit_as_operations=1,
include_file_externals=0, include_dir_externals=0,
changelists=0x7fffff6c0780, revprop_table=0x0,
commit_callback=0x42c6a0 <svn_cl__print_commit_info>,
commit_baton=0x0, ctx=0x7fffff6c16f0, pool=0x7fffff6c0028) at
subversion/libsvn_client/commit.c:901
#14 0x000000000040b744 in svn_cl__commit (os=0x7fffff6c0520,
baton=0x7ffffffddc60, pool=0x7fffff6c0028)
at subversion/svn/commit-cmd.c:171
#15 0x000000000042b351 in sub_main (exit_code=0x7ffffffddf3c, argc=5,
argv=0x7ffffffde038, pool=0x7fffff6c0028)
at subversion/svn/svn.c:3041
#16 0x000000000042b5ee in main (argc=5, argv=0x7ffffffde038) at
subversion/svn/svn.c:3126
Post by Philip Martin
Post by Myria
I just found out that the file causing the error from the large commit
is not the large file - it's one of the smaller files, about 55 KB.
If I commit that single smaller file from the large commit, it errors
the same way as the original 227185 would. This is exactly like the
original problem with committing the pixel shader.
If I understand correctly you are committing a single file using a
file:// URL and getting the error. If so then you may be able to
produce a much smaller testcase, see later.
Post by Myria
I managed to get the db/transactions/227184-4vb2.txn directory by
breakpointing kernel32!DeleteFileW in TortoiseSVN (so I could get the
contents before TortoiseSVN deleted them at failure). I don't know
how they're useful, though.
The only way I know how to proceed is to wait until the source code to
TortoiseSVN is available so that I can debug it in Visual Studio. Is
there something else I can do?
Are you able to share your repository, either in public or privately
with me? How big is the repository now it has fewer revisions?
The file:// commit does not have any cache from previous commits, unlike
svnsync or apache, so the error must involve data read explicitly during
the failed commit. That means the ancestors of the file being
committed, the parent directories, and possibly some other file (not an
ancestor) referenced by SHA1 in the rep-cache.
If you want to debug it yourself then as well as the transaction
directory db/transactions/227184-4vb2.txn/ there is also the protorev
file db/txn-protorevs/227184-4vb2.rev which contains the new content of
the committed files. When the content is sent by the client it gets
DELTA
SVN....
ENDREP
Since the file being committed matched a SHA1 in the rep-cache the
commit process will attempt to remove this delta but will first verify
that the fulltext obtained by expanding the delta in the protorev file
matches the fulltext in the repository, see get_shared_rep() in
subversion/libsvn_fs_fs/transaction.c.
/* Compare the two representations.
* Note that the stream comparison might also produce MD5 checksum
* errors or other failures in case of SHA1 collisions. */
SVN_ERR(svn_fs_fs__get_contents_from_file(&contents, fs, rep, file,
offset, scratch_pool));
SVN_ERR(svn_fs_fs__get_contents(&old_contents, fs, &old_rep_norm,
FALSE, scratch_pool));
err = svn_stream_contents_same2(&same, contents, old_contents,
scratch_pool);
Normally they compare equal and the protorev file is truncated to remove
the delta, but in your case they do not match and the commit fails.
As far as producing a smaller testcase: it may be possible to trim out
all the files and directories in other parts of the repository. For
example if the repository path to the parent directory of the commit is
/project/branch/foo/bar then you can use
svndumpfilter include /project/branch/foo/bar
in an
svnadmin dump ... | svndumpfilter ... | svnadmin load
pipeline to produce a smaller repository and this smaller repository may
reproduce the error. Dump and load tend to go faster with a larger
than default -M parameter. Some caveats apply:
- it may be necessary to expand the included tree to cope with copies.
- the rep-cache might refer to a totally different file that happens
to have the same SHA1/content, in which case the included tree may
need to include this file as well.
- the reduced repository will have the same files but the directories
will be different/smaller and their content may be necessary to
trigger the bug
- the reduced repository will contain less data meaning smaller file
offsets and the larger offsets may be necessary to trigger the bug
To determine whether the rep-cache SHA1 refers to a different file you
first need the repository form of the file being committed, i.e. with
svn:keywords and svn:eol-style detranslated. Then calculate the SHA1
of that form and look it up in the rep-cache:
sqlite3 db/rep-cache.db "select revision from rep_cache where hash='xxxxxxx'"
This tells you which revision is involved, then you look in the revision
file db/revs/nnn/nnnnn to find the hash and determine the file path.
Thank you for persisting with the investigation!
--
Philip
Myria
2018-03-06 03:41:46 UTC
Permalink
When Subversion gets to this part of rep_read_contents, rb->len is
16384. It thinks it is then done reading the entire file, and can
compare the checksum, but it's not done with the file yet.

rb->rep.expanded_size is correct at the error point, 57465.
rep_read_get_baton sets rb->len to rb->rep.expanded_size, so I don't
know why the value changed by the time rep_read_contents got its paws
on the baton. I saw that rb->len might be getting clobbered by
rep_read_contents' call to build_rep_list, which has the following
line of code:

*expanded_size = first_rep->expanded_size;

expanded_size is &rb->len. I haven't had a chance to debug this area
yet, so it might be fine.

I verified with sqlite3 that the rep-cache.db has the correct size (57465):

$ sqlite3 /mnt/d/svnclone/db/rep-cache.db "select * from rep_cache
where hash='e6291ab119036eb783d0136afccdb3b445867364'"
e6291ab119036eb783d0136afccdb3b445867364|227170|153|193|57465
Post by Myria
GMail keeps doing reply instead of reply all. I'm having to manually
add the users list back now.
Below is the thread I sent.
---------- Forwarded message ----------
Date: Mon, Mar 5, 2018 at 6:37 PM
Subject: Re: SHA-1 collision in repository?
I now know where the checksum error happens, but not why.
expected: bb52be764a04d511ebb06e1889910dcf
actual: 80a10d37de91cadc604ba30e379651b3
It's calculating the MD5 of only the first 16 KB of the input file and
comparing against the MD5 of the entire file. The 16 KB number seems
to be SVN__STREAM_CHUNK_SIZE.
bb52be764a04d511ebb06e1889910dcf is the MD5 of the entire file.
80a10d37de91cadc604ba30e379651b3 is the MD5 of the first 16384 bytes.
I managed to compile a subversion command line client with debugging
information and optimizations disabled, and can reproduce the problem
with GDB attached.
Here is a backtrace at the time at which the error occurs. A few line
numbers in stream.c will be wrong by a few lines due to a few printf's
I added.
#0 svn_checksum_mismatch_err (expected=0x7ffffffdcf00,
actual=0x7ffffa0700a0, scratch_pool=0x7ffffa070028,
fmt=0x7ffffc259ac0 "Checksum mismatch while reading
representation") at subversion/libsvn_subr/checksum.c:638
#1 0x00007ffffc2123de in rep_read_contents (baton=0x7ffffa1f6190,
buf=0x7ffffa1f66a8 "// <redacted>"..., len=0x7ffffffdcf88)
at subversion/libsvn_fs_fs/cached_data.c:2062
#2 0x00007ffffe5645fd in svn_stream_read_full (stream=0x7ffffa1f6470,
buffer=0x7ffffa1f66a8 "// <redacted>"..., len=0x7ffffffdcf88)
at subversion/libsvn_subr/stream.c:193
#3 0x00007ffffe5653f3 in svn_stream_contents_same2
(same=0x7ffffffdd01c, stream1=0x7ffffa1f6470,
stream2=0x7ffffa1f6650, pool=0x7ffffa1e0028) at
subversion/libsvn_subr/stream.c:589
#4 0x00007ffffc247226 in get_shared_rep (old_rep=0x7ffffffdd188,
fs=0x7fffff601030, rep=0x7ffffa0e20b8,
file=0x7ffffa1e0390, offset=0, reps_hash=0x0,
result_pool=0x7fffff5e0028, scratch_pool=0x7ffffa1e0028)
at subversion/libsvn_fs_fs/transaction.c:2280
#5 0x00007ffffc247734 in rep_write_contents_close
(baton=0x7ffffa232ff0) at subversion/libsvn_fs_fs/transaction.c:2370
#6 0x00007ffffe56492b in svn_stream_close (stream=0x7ffffa233140) at
subversion/libsvn_subr/stream.c:274
#7 0x00007ffffe841001 in apply_window (window=0x0,
baton=0x7ffffa1000a0) at subversion/libsvn_delta/text_delta.c:732
#8 0x00007ffffc2520d2 in window_consumer (window=0x0,
baton=0x7fffff5f1ab8) at subversion/libsvn_fs_fs/tree.c:2935
#9 0x00007ffffe8405ef in svn_txdelta_run (source=0x7fffff5f1a18,
target=0x7fffff5f1298,
handler=0x7ffffc25209f <window_consumer>,
handler_baton=0x7fffff5f1ab8, checksum_kind=svn_checksum_md5,
checksum=0x7ffffffdd458, cancel_func=0x0, cancel_baton=0x0,
result_pool=0x7fffff5e0028,
scratch_pool=0x7fffff5e0028) at subversion/libsvn_delta/text_delta.c:454
#10 0x00007ffffee98a57 in svn_wc__internal_transmit_text_deltas (tempfile=0x0,
new_text_base_md5_checksum=0x7ffffffdd5b0,
new_text_base_sha1_checksum=0x7ffffffdd5b8, db=0x7fffff6c17d8,
local_abspath=0x7fffff672d08
"/mnt/d/svntest/repository/directory/Redacted.cpp",
fulltext=0, editor=0x7fffff673700, file_baton=0x7fffff510110,
result_pool=0x7fffff6c0028,
scratch_pool=0x7fffff5e0028) at subversion/libsvn_wc/adm_crawler.c:1109
#11 0x00007ffffee98d68 in svn_wc_transmit_text_deltas3
(new_text_base_md5_checksum=0x7ffffffdd5b0,
new_text_base_sha1_checksum=0x7ffffffdd5b8, wc_ctx=0x7fffff6c17c0,
local_abspath=0x7fffff672d08
"/mnt/d/svntest/repository/directory/Redacted.cpp",
fulltext=0, editor=0x7fffff673700, file_baton=0x7fffff510110,
result_pool=0x7fffff6c0028,
scratch_pool=0x7fffff5e0028) at subversion/libsvn_wc/adm_crawler.c:1199
#12 0x00007fffff18eb12 in svn_client__do_commit (
base_url=0x7fffff6142c0 "file:///mnt/d/svntest/repository/directory",
commit_items=0x7fffff672c48, editor=0x7fffff673700,
edit_baton=0x7fffff6300a0,
notify_path_prefix=0x7fffff672900 "/mnt/d/svntest/repository",
sha1_checksums=0x7ffffffdd750,
ctx=0x7fffff6c16f0, result_pool=0x7fffff6c0028, scratch_pool=0x7fffff650028)
at subversion/libsvn_client/commit_util.c:1920
#13 0x00007fffff18a5f9 in svn_client_commit6 (targets=0x7fffff670a18,
depth=svn_depth_infinity, keep_locks=0,
keep_changelists=0, commit_as_operations=1,
include_file_externals=0, include_dir_externals=0,
changelists=0x7fffff6c0780, revprop_table=0x0,
commit_callback=0x42c6a0 <svn_cl__print_commit_info>,
commit_baton=0x0, ctx=0x7fffff6c16f0, pool=0x7fffff6c0028) at
subversion/libsvn_client/commit.c:901
#14 0x000000000040b744 in svn_cl__commit (os=0x7fffff6c0520,
baton=0x7ffffffddc60, pool=0x7fffff6c0028)
at subversion/svn/commit-cmd.c:171
#15 0x000000000042b351 in sub_main (exit_code=0x7ffffffddf3c, argc=5,
argv=0x7ffffffde038, pool=0x7fffff6c0028)
at subversion/svn/svn.c:3041
#16 0x000000000042b5ee in main (argc=5, argv=0x7ffffffde038) at
subversion/svn/svn.c:3126
Myria
2018-03-06 03:54:22 UTC
Permalink
Final email for the night >.<

What's clobbering the expanded_size is this in build_rep_list:

/* The value as stored in the data struct.
0 is either for unknown length or actually zero length. */
*expanded_size = first_rep->expanded_size;

first_rep->expanded_size here is zero for the last call to this
function before the error. In every other case before the error, the
two values are equal.

Then this code executes:

  if (*expanded_size == 0)
    if (rep_header->type == svn_fs_fs__rep_plain || first_rep->size != 4)
      *expanded_size = first_rep->size;

first_rep->size is 16384, and this is why rb->len becomes 16384,
leading to the error.

I don't know what all this code is doing, but that's the proximate
cause of the failure.

Melissa
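The fallback described in this message can be modelled in isolation. The sketch below is an assumption-laden reduction, not the FSFS code: `rep_model`, `rep_type` and `resolve_expanded_size` are hypothetical names, and only the two quoted statements are reproduced. The sample values (193, 57465, 16384) come from the numbers reported in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for the FSFS types named in the thread. */
typedef enum { REP_PLAIN, REP_SELF_DELTA, REP_DELTA } rep_type;

typedef struct {
    uint64_t size;           /* on-disk (possibly deltified) length */
    uint64_t expanded_size;  /* fulltext length; 0 = unknown or truly empty */
} rep_model;

/* Models the quoted logic: take the stored expanded size, and fall back
 * to the on-disk size when the stored value is zero. */
static uint64_t resolve_expanded_size(const rep_model *first_rep, rep_type type)
{
    assert(first_rep != NULL);
    uint64_t expanded = first_rep->expanded_size;
    if (expanded == 0)
        if (type == REP_PLAIN || first_rep->size != 4)
            expanded = first_rep->size;
    return expanded;
}
```

With `{ 193, 57465 }` the function returns 57465, while a rep whose `expanded_size` is zero and whose `size` is 16384 yields 16384, which matches the truncated length observed in rb->len.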
Nathan Hartman
2018-03-07 17:02:26 UTC
Permalink
Post by Myria
Final email for the night >.<
/* The value as stored in the data struct.
0 is either for unknown length or actually zero length. */
*expanded_size = first_rep->expanded_size;
first_rep->expanded_size here is zero for the last call to this
function before the error. In every other case before the error, the
two values are equal.
if (*expanded_size == 0)
if (rep_header->type == svn_fs_fs__rep_plain || first_rep->size != 4)
*expanded_size = first_rep->size;
first_rep->size is 16384, and this is why rb->len becomes 16384,
leading to the error.
I don't know what all this code is doing, but that's the proximate
cause of the failure.
Melissa
Has it been possible to determine what is setting expanded_size to 0 before that last call? I wonder if there is specific logic that decides (perhaps incorrectly?) to do that?

Alternatively is it being clobbered by some out-of-bounds access, use-after-free, or another such issue?

Is it possible in your debugger setup to determine the address of that variable and set a breakpoint that triggers when that memory is written? (It may be called a watchpoint?)

Which leads me to another thought: if you can set such a breakpoint / watchpoint and it does not trigger, then this expanded_size might not be the same instance in that final call. Perhaps a shallow copy of an enclosing structure is made which leaves out the known size and sets it to 0 for some reason, and that final call is given that (incomplete) copy.

Caveat: I am not familiar with the codebase but these are my thoughts based on adventures in other code bases.
Myria
2018-03-07 20:58:26 UTC
Permalink
During rep_write_contents_close, there is a call to get_shared_rep.
get_shared_rep calls svn_fs_fs__get_contents_from_file, which has the
code pasted below.


  /* Build the representation list (delta chain). */
  if (rh->type == svn_fs_fs__rep_plain)
    {
      rb->rs_list = apr_array_make(pool, 0, sizeof(rep_state_t *));
      rb->src_state = rs;
    }
  else if (rh->type == svn_fs_fs__rep_self_delta)
    {
      rb->rs_list = apr_array_make(pool, 1, sizeof(rep_state_t *));
      APR_ARRAY_PUSH(rb->rs_list, rep_state_t *) = rs;
      rb->src_state = NULL;
    }
  else
    {
      representation_t next_rep = { 0 };

      /* skip "SVNx" diff marker */
      rs->current = 4;

      /* REP's base rep is inside a proper revision.
       * It can be reconstructed in the usual way. */
      next_rep.revision = rh->base_revision;
      next_rep.item_index = rh->base_item_index;
      next_rep.size = rh->base_length;
      svn_fs_fs__id_txn_reset(&next_rep.txn_id);

      SVN_ERR(build_rep_list(&rb->rs_list, &rb->base_window,
                             &rb->src_state, &rb->len, rb->fs, &next_rep,
                             rb->filehandle_pool));


The bug is occurring because build_rep_list is being called with
first_rep->expanded_size set to zero. Well, the reason it's zero is
because first_rep is the second to last parameter to build_rep_list,
and the above code initialized expanded_size to zero:

representation_t next_rep = { 0 };

Does the code just need this? I don't know this call >.<

next_rep.expanded_size = rb->rep.expanded_size;

Melissa
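For readers less familiar with C initializers: `= { 0 }` sets the first member to zero and zero-initializes every remaining member, so any field that is not assigned afterwards stays at zero. The sketch below illustrates just that point. `rep_sketch` and `make_base_rep` are hypothetical names with a reduced field set, not the real representation_t, and the sketch takes no position on what the right fix is.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced version of a representation record. */
typedef struct {
    long     revision;
    uint64_t item_index;
    uint64_t size;
    uint64_t expanded_size;
} rep_sketch;

/* Builds a base rep the way the quoted code does: zero-fill the struct,
 * then assign only revision, item_index and size. */
static rep_sketch make_base_rep(long base_revision,
                                uint64_t base_item_index,
                                uint64_t base_length)
{
    assert(base_revision >= 0);
    rep_sketch next_rep = { 0 };   /* all members start at zero */
    next_rep.revision = base_revision;
    next_rep.item_index = base_item_index;
    next_rep.size = base_length;
    /* expanded_size is never assigned, so it is still 0 here */
    return next_rep;
}
```

Calling `make_base_rep(227170, 153, 16384)` (illustrative values only) returns a record whose `expanded_size` is 0, which is the state that reaches build_rep_list in the analysis above.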
Daniel Shahaf
2018-03-02 23:00:09 UTC
Permalink
Post by Myria
Also, I have no control over what was in the repository five years
ago. The huge files were compiled versions of WebKit libraries.
Note that in 2017, WebKit intentionally committed a SHA-1 collision into their
repository. If you have the WebKit sources in your tree too, that'd be something
to watch out for, but it can't explain the r227185 instance.

Cheers,

Daniel
Daniel Shahaf
2018-03-02 22:57:51 UTC
Permalink
Post by Myria
In other news, unknown whether related to the current problem, my
D:\>svnsync sync file:///d:/svnclone
Transmitting file data
E160000: SHA1 of reps '227170 153 193 57465
bb52be764a04d511ebb06e1889910dcf
e6291ab119036eb783d0136afccdb3b445867364 227184-4vap/_4o' and '-1 0
193 57465 bb52be764a04d511ebb06e1889910dcf
e6291ab119036eb783d0136afccdb3b445867364 227184-4vap/_4o' matches
(e6291ab119036eb783d0136afccdb3b445867364) but contents differ
svnsync: E160004: Filesystem is corrupt
expected: bb52be764a04d511ebb06e1889910dcf
actual: 80a10d37de91cadc604ba30e379651b3
When this error happens, could you print the first lines of the two reps
and check whether they are identical? The first line is "PLAIN\n" or
"DELTA\n" or "DELTA 42 43 44\n".
(I wonder whether we have some stray whitespace that's transparent to parsing
but breaks checksums.)

Do you happen to have a copy of the repository lying around that you can run
'grep -a 80a10d37de91cadc604ba30e379651b3 db/revs/{0,1,2,...,227}' on?
Admittedly that's a bit of a shot in the dark.

Cheers,

Daniel
Daniel Shahaf
2018-03-02 23:07:57 UTC
Permalink
Post by Daniel Shahaf
Post by Myria
In other news, unknown whether related to the current problem, my
D:\>svnsync sync file:///d:/svnclone
Transmitting file data
E160000: SHA1 of reps '227170 153 193 57465
bb52be764a04d511ebb06e1889910dcf
e6291ab119036eb783d0136afccdb3b445867364 227184-4vap/_4o' and '-1 0
193 57465 bb52be764a04d511ebb06e1889910dcf
e6291ab119036eb783d0136afccdb3b445867364 227184-4vap/_4o' matches
(e6291ab119036eb783d0136afccdb3b445867364) but contents differ
svnsync: E160004: Filesystem is corrupt
expected: bb52be764a04d511ebb06e1889910dcf
actual: 80a10d37de91cadc604ba30e379651b3
When this error happens, could you print the first lines of the two reps
identical? The first line is "PLAIN\n" or "DELTA\n" or "DELTA 42 43 44\n".
(I wonder whether we have some stray whitespace that's transparent to parsing
but breaks checksums.)
On second thought I'm not sure this makes sense. A better question is: can we
obtain the fulltext whose checksum is 80a10d37de91cadc604ba30e379651b3?
Post by Daniel Shahaf
Do you happen to have a copy of the repository lying around that you can run
'grep -a 80a10d37de91cadc604ba30e379651b3 db/revs/{0,1,2,...,227}' on?
Admittedly that's a bit of a shot in the dark.
Cheers,
Daniel
Myria
2018-03-07 21:09:58 UTC
Permalink
The fulltext whose checksum is 80a10d37de91cadc604ba30e379651b3 I
found out is the first 16384 bytes of the file (see other parts of
this thread). 16384 is SVN__STREAM_CHUNK_SIZE.
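The effect of checksumming only a prefix can be shown with a small stand-in. This is a sketch under assumptions: FNV-1a is used in place of MD5 purely to keep the example free of external libraries, and `fnv1a` and `digest_as_read` are hypothetical helper names. The lengths 57465 and 16384 are the ones reported in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a (64-bit), used here only as a small stand-in for MD5. */
static uint64_t fnv1a(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 1099511628211ULL;               /* FNV prime */
    }
    return h;
}

/* Digest of what a reader sees if it believes the content is
 * `believed_len` bytes long, regardless of the real length. */
static uint64_t digest_as_read(const unsigned char *data,
                               size_t actual_len, size_t believed_len)
{
    size_t n = believed_len < actual_len ? believed_len : actual_len;
    return fnv1a(data, n);
}
```

A reader that believes the content is 16384 bytes long produces the digest of the first 16384 bytes, which will not match a digest recorded over all 57465 bytes even though no byte of the stored data is wrong.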
Branko Čibej
2018-02-24 07:10:34 UTC
Permalink
Post by Myria
Once it's on my local machine, I'll be able to compile TortoiseSVN and
debug it while pointing to a file:// repository. (TortoiseSVN instead
of command-line svn because TortoiseSVN is compiled with Visual C++
and is therefore many times easier to debug.)
In fact, on Windows, the command-line Subversion is also compiled with
Visual C++ and should be quite as easy to debug. What probably is harder
is setting up the initial build environment ...


-- Brane
Branko Čibej
2018-02-22 22:13:39 UTC
Permalink
Post by Myria
When we try to commit a very specific version of a very specific
binary file, we get a SHA-1 collision error from the Subversion
D:\confidential>svn commit secret.bin -m "Testing broken commit"
Sending secret.bin
svn: E160000: SHA1 of reps '604440 34 134255 136680
c9f4fabc4d093612fece03c339401058
db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' and '-1 0
134255 136680 c9f4fabc4d093612fece03c339401058
db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' matches
(db11617ef1454332336e00abc311d44bc698f3b3) but contents differ
What can cause this?
The simplest explanation would be a corruption of the existing
representation on disk. Note that both the MD5 and the SHA1 checksums
appear to match, as do the sizes; which makes it even more likely that
it's the same file but the copy in the repository is somehow corrupted.

-- Brane
Philip Martin
2018-02-23 01:33:33 UTC
Permalink
Post by Branko Čibej
Post by Myria
When we try to commit a very specific version of a very specific
binary file, we get a SHA-1 collision error from the Subversion
D:\confidential>svn commit secret.bin -m "Testing broken commit"
Sending secret.bin
svn: E160000: SHA1 of reps '604440 34 134255 136680
c9f4fabc4d093612fece03c339401058
db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' and '-1 0
134255 136680 c9f4fabc4d093612fece03c339401058
db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' matches
(db11617ef1454332336e00abc311d44bc698f3b3) but contents differ
What can cause this?
The simplest explanation would be a corruption of the existing
representation on disk. Note that both the MD5 and the SHA1 checksums
appear to match, as do the sizes; which makes it even more likely that
it's the same file but the copy in the repository is somehow corrupted.
That pattern, all of MD5, SHA1 and size matching, is exactly what
happens if a SHA1 collision is committed using an old version of
Subversion where the rep-cache does not detect collisions. The first
part of the collision would have been committed in r604440 and the
second part in r605556.

If that is the case, and a SHA1 collision did occur, then:

svnadmin verify -r604440 path/to/repository

will succeed while:

svnadmin verify -r605556 path/to/repository

will fail with an MD5 checksum error.

If this is what you see then unfortunately the colliding r605556 content
has been elided and the r605556 revision is corrupt.

You should be able to retrieve the first part of the collision from
r604440, it will be one of the files given by:

svn log -v -r604440

The second part in r605556 is missing :-( but it will be one of the
files given by:

svn log -v -r605556

However your failing commit would also be a SHA1 collision with the
r604440 content (it might be identical to the missing content in
r605556).
--
Philip