Chris
2018-10-30 13:36:09 UTC
Hi,
I just wanted to say that I finally managed to get the dump-filter-load cycle done and deploy the filtered repo. It did get rid of about 90% of the repository size so that's good for us. A big thanks to all who helped out with information in this mail thread! I would definitely have been stranded somewhere without you.
One thing that was a bit annoying: svndumpfilter threw an error because the source of a copied file was missing. I had filtered out a certain path, and it turned out that a file under it had been copied to another location. The error message only prints the missing source and not the destination, so I had to go into the repo and check the revision it crashed on to find the copy destination and add it to my filter list. It would have been nice if the error message listed both the source and the destination.
/Chris
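For anyone hitting the same missing-copy-source error, one possible workaround is to pre-scan the dump file before filtering and list every copy whose source lies under an excluded path while its destination does not. The following is only a minimal sketch and not part of Subversion: the function names (`iter_records`, `find_copies`) are made up for illustration, and it assumes a dump file as written by `svnadmin dump`, relying only on the `Revision-number`, `Node-path`, `Node-copyfrom-path` and `Content-length` headers of the dump format.

```python
def _is_under(path, prefixes):
    """True if path equals one of the prefixes or lies below it."""
    path = path.strip("/")
    return any(path == p or path.startswith(p + "/") for p in prefixes)


def iter_records(stream):
    """Yield the header dict of each record in a dump stream.

    `stream` must be a binary file object. Record bodies (properties and
    file contents) are skipped using the Content-length header, so text
    inside a file can never be mistaken for a header line.
    """
    while True:
        line = stream.readline()
        while line == b"\n":              # blank lines separate records
            line = stream.readline()
        if not line:                      # end of file
            return
        headers = {}
        while line and line != b"\n":     # header block ends at a blank line
            key, _, value = line.rstrip(b"\n").partition(b": ")
            headers[key.decode("utf-8", "replace")] = value.decode("utf-8", "replace")
            line = stream.readline()
        stream.read(int(headers.get("Content-length", 0)))
        yield headers


def find_copies(stream, excluded):
    """Yield (revision, copy source, copy destination) for every copy that
    leads from an excluded path to a path that is not excluded."""
    prefixes = [p.strip("/") for p in excluded]
    revision = None
    for headers in iter_records(stream):
        if "Revision-number" in headers:
            revision = int(headers["Revision-number"])
            continue
        source = headers.get("Node-copyfrom-path")
        if source is None:
            continue
        destination = headers.get("Node-path", "")
        if _is_under(source, prefixes) and not _is_under(destination, prefixes):
            yield revision, source, destination
```

Opening the dump with `open(dump_file, "rb")` and passing the same paths that are in the filter list to `find_copies` would print both ends of each problematic copy, so the destinations can be added to the filter list before svndumpfilter is run.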
--------------------------------------------
On Wed, 10/10/18, Johan Corveleyn <***@gmail.com> wrote:
Subject: Re: svndumpfilter and svnsync?
To: "Chris" <***@yahoo.se>
Cc: "Daniel Shahaf" <***@daniel.shahaf.name>, "Ryan Schmidt" <subversion-***@ryandesign.com>, "Subversion" <***@subversion.apache.org>
Date: Wednesday, October 10, 2018, 12:11 PM
On Wed, Oct 10, 2018 at 11:18 AM Chris <***@yahoo.se> wrote:
...
>>> The syntax I used:
>>> svnadmin dump -q MYREPO | svndumpfilter exclude --targets filterfile > filterdump
>>> svnadmin load -q --no-flush-to-disk --force-uuid -M 2048 --bypass-prop-validation ./NEWREPO < filterdump
>>> (I had to use the bypass-prop-validation due to some newline issues in old log message, similar to this one https://groups.google.com/forum/#!topic/subversion_users/P3ohZ-hKhCA, don't know why they have wrong newlines, but the repo works as it is now...)
>> Instead of ignoring wrong newlines, you could fix them using svndumptool (using its eolfix-revprop command),
>> http://svn.borg.ch/svndumptool/
>> https://github.com/jwiegley/svndumptool
>> Also, as of version 1.10, svnadmin finally has an option to normalize these on-the-fly during load:
>> http://subversion.apache.org/docs/release-notes/1.10.html#normalize-props
>> It's a lot better to normalize these (either with the --normalize-props option for 'svnadmin load' or by using svndumptool) than to "bypass" them. Otherwise you'll run into this again later (if you would dump+load again sometime in the future).
> I tried --normalize-props and I still got the same error, which is why I switched over to bypass. Maybe I've run into some bug with --normalize-props. Unfortunately, I don't think I'll be able to create a script for reproducing the error since it happens far into a monster dump load. So I'll stick with the bypass for now or try the tool that Ryan suggested.
In that case the culprit might be another property than svn:log (or it might be something like "non UTF-8 encoded" but not EOL-related in svn:log). Possibly a "versioned" property like svn:ignore or some other property in the svn: namespace. This is more difficult to fix, but still it might be best to get rid of it or you'll run into it again in the future.
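To make the two kinds of problem concrete, here is a small Python sketch that classifies a raw property value the way the paragraph above distinguishes them: wrong line endings versus invalid encoding. It is only an illustration under the assumption that values of svn: properties are expected to be UTF-8 text with LF-only line endings; the function name `diagnose_prop_value` is made up and this is not the validation code Subversion itself runs.

```python
def diagnose_prop_value(value: bytes) -> list:
    """Return a list of problems found in a raw svn: property value."""
    problems = []
    # A carriage return means CRLF or CR line endings instead of plain LF.
    if b"\r" in value:
        problems.append("non-LF line endings")
    # The value must also decode cleanly as UTF-8.
    try:
        value.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("not valid UTF-8")
    return problems
```

An svn:ignore value saved with Windows line endings would be reported as an EOL problem only, while a log message in a legacy 8-bit encoding would be reported as an encoding problem, which EOL fixing alone would not repair.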
See the very last bullet in:
http://subversion.apache.org/faq.html#dumpload
If that's indeed the problem, then you'll have to use that svndumptool that Ryan pointed you to.
Quoting from that last bullet in the FAQ entry above:
"This is more difficult to repair, because 'svn:ignore' is not a revision property (unlike svn:log, which can be manipulated with svnadmin setrevprop), but a versioned property (so it's part of history). Again, you can ignore this with --bypass-prop-validation. But since this is a corruption "in history", this can only be repaired with a dump+load, so this might be a good time to try and fix this (or you'll run into this again in the future).
To repair it you can use a tool like svndumptool. But it only works on dump files, not as part of a pipe. So a possible way to go about it is: dump that single (corrupt) revision to a file, repair it ('svndumptool.py eolfix-prop svn:ignore svn.dump svn.dump.repaired'), load that single dumpfile, and then continue with a new "piped" command (like step (6) above)."
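The sequence described in the FAQ quote could be spelled out roughly as follows. This Python sketch only builds the shell command lines and does not execute anything. The helper name `repair_plan` and the repository arguments are hypothetical, the file names and the svndumptool invocation are taken from the FAQ text, and it assumes that all revisions before the corrupt one have already been loaded into the new repository. The `--incremental` flag is used so that each dump contains only the changes of the revisions named.

```python
import shlex


def repair_plan(old_repo, new_repo, bad_rev, prop="svn:ignore"):
    """Return the shell commands for repairing one corrupt revision.

    Nothing is run here; the strings are meant to be reviewed and then
    executed by hand, one after the other.
    """
    old = shlex.quote(old_repo)
    new = shlex.quote(new_repo)
    return [
        # 1. Dump only the corrupt revision to a file.
        f"svnadmin dump -q --incremental -r {bad_rev} {old} > svn.dump",
        # 2. Repair the property in that dump file (command from the FAQ).
        f"svndumptool.py eolfix-prop {shlex.quote(prop)} svn.dump svn.dump.repaired",
        # 3. Load the repaired single-revision dump.
        f"svnadmin load -q {new} < svn.dump.repaired",
        # 4. Continue with the remaining revisions as a pipe.
        f"svnadmin dump -q --incremental -r {bad_rev + 1}:HEAD {old}"
        f" | svnadmin load -q {new}",
    ]
```

Printing the result of `repair_plan("MYREPO", "./NEWREPO", 1234)` line by line gives a checklist for a corrupt revision 1234 (the revision number here is just an example). If filtering is also needed, svndumpfilter would have to be added to the commands where appropriate.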
I should note here that svnsync is more powerful in this regard: it does have the ability to normalize all of these on the fly. It's a real pity that 'svnadmin load' doesn't (except for the svn:log EOL fixing). Doesn't *yet* that is, until a volunteer comes along that submits a patch for it ;-).
Anyway, I hope you succeed in cleaning this up eventually :-).
--
Johan