Discussion:
Ignore date changes to files
Rob Brandt
2005-11-21 17:40:46 UTC
Hi. I'm new to subversion. I'm using it with TortoiseSVN.

One of the projects I want it to manage is a desktop app being designed in a 4GL
programming tool. Its project files are saved as binary. Each time the
project is loaded *each* of its libraries is touched and the file date is
updated, even if no changes are made to the code. This causes svn to update
each of the libraries in the commit, which is incredibly wasteful. Several
megabytes of files are uploaded with each commit even if I opened the project
just to change one line of code in a 25k file.

Is there any way around this? I previously used a commercial scm tool called
NG3, and it worked nicely for this project as it ignored dates and looked only
for actual binary changes in the files.

Rob
Gale, David
2005-11-21 18:13:16 UTC
Post by Rob Brandt
Hi. I'm new to subversion. I'm using it with TortoiseSVN.
One of the projects I want it to manage is a desktop app being
designed in a 4GL programming tool. Its project files are saved as
binary. Each time the project is loaded *each* of its libraries is
touched and the file date is updated, even if no changes are made to
the code. This causes svn to update each of the libraries in the
commit, which is incredibly wasteful. Several megabytes of files are
uploaded with each commit even if I opened the project just to change
one line of code in a 25k file.
Is there any way around this? I previously used a commercial scm
tool called NG3, and it worked nicely for this project as it ignored
dates and looked only for actual binary changes in the files.
I'm not sure I understand. Have you verified that there're several megs
of data sent to the repository per commit? Subversion uses a binary
diff, and only sends the differences, so if the only change is really to
the file modification date, there shouldn't be any appreciable network
traffic at all. (It may flag the files as having been changed, but
that's relatively minor.) Have you watched the size of your repository
grow?

-David
Mark Phippard
2005-11-21 18:21:47 UTC
Post by Gale, David
Post by Rob Brandt
Hi. I'm new to subversion. I'm using it with TortoiseSVN.
One of the projects I want it to manage is a desktop app being
designed in a 4GL programming tool. Its project files are saved as
binary. Each time the project is loaded *each* of its libraries is
touched and the file date is updated, even if no changes are made to
the code. This causes svn to update each of the libraries in the
commit, which is incredibly wasteful. Several megabytes of files are
uploaded with each commit even if I opened the project just to change
one line of code in a 25k file.
Is there any way around this? I previously used a commercial scm
tool called NG3, and it worked nicely for this project as it ignored
dates and looked only for actual binary changes in the files.
I'm not sure I understand. Have you verified that there're several megs
of data sent to the repository per commit? Subversion uses a binary
diff, and only sends the differences, so if the only change is really to
the file modification date, there shouldn't be any appreciable network
traffic at all. (It may flag the files as having been changed, but
that's relatively minor.) Have you watched the size of your repository
grow?
Subversion does not consider a file as changed simply because its date has
changed. Go ahead and try it. Open a file, insert a space, hit backspace,
and then save it.

Subversion looks at the file date/time and if it has changed:

1) Subversion then looks at the file size. If that has changed, then it
considers the file as modified.

2) If the file size is still the same, then Subversion does a byte by
byte comparison of the two files to see if they are different.
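
The check described in the two steps above can be sketched roughly as follows. This is only an illustration of the heuristic as stated in this message, not Subversion's actual code: the function name, its parameters, and the idea of passing in a recorded timestamp and a path to an unmodified copy are all assumptions, and it ignores details such as keyword or end-of-line translation.

```python
import os

CHUNK = 64 * 1024  # read size for the content comparison


def is_modified(working_path, pristine_path, recorded_mtime):
    """Hypothetical helper: apply the timestamp / size / content checks in order."""
    st = os.stat(working_path)

    # Timestamp unchanged: assume the file is unchanged and do no further work.
    if st.st_mtime == recorded_mtime:
        return False

    # Step 1: a different size means the content must differ.
    if st.st_size != os.path.getsize(pristine_path):
        return True

    # Step 2: same size, so compare the contents, stopping at the first difference.
    with open(working_path, "rb") as work, open(pristine_path, "rb") as base:
        while True:
            chunk_work = work.read(CHUNK)
            chunk_base = base.read(CHUNK)
            if chunk_work != chunk_base:
                return True
            if not chunk_work:  # both files exhausted without a difference
                return False
```

In this sketch a file whose date changed but whose bytes did not is read in full and then reported as unmodified, which is the expensive case for large files.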

As was pointed out, when you do a commit it then only sends the delta over
the wire. With some binary formats, this can wind up being a very large
delta.

Mark


_____________________________________________________________________________
Scanned for SoftLanding Systems, Inc. and SoftLanding Europe Plc by IBM Email Security Management Services powered by MessageLabs.
_____________________________________________________________________________
Rob Brandt
2005-11-21 19:47:22 UTC
Post by Mark Phippard
1) Subversion then looks at the file size. If that has changed, then it
considers the file as modified.
2) If the file size is still the same, then Subversion does a byte by
byte comparison of the two files to see if they are different.
Is this really true? If I change an "a" to a "b", then the file size wouldn't
have changed, and the byte comparison wouldn't be done.

Regardless, your comments now make me wonder what was going on with NG3. If
there really are binary changes to the libraries when the 4GL is run, it's
interesting that it somehow knew it wasn't a significant change from a
programming point of view. I never noticed any "missed" updates, where a
commit did not include a file that had really been changed.

Rob
Mark Phippard
2005-11-21 20:07:01 UTC
Post by Rob Brandt
Post by Mark Phippard
1) Subversion then looks at the file size. If that has changed, then it
considers the file as modified.
2) If the file size is still the same, then Subversion does a byte by
byte comparison of the two files to see if they are different.
Is this really true? If I change an "a" to a "b", then the file size wouldn't
have changed, and the byte comparison wouldn't be done.
No, read what I wrote closer. The byte comparison WOULD be done because
the file date/time has changed. The point was that if the file size has
changed then Subversion can very quickly just say ahh yes, this file has
been modified. In general the byte comparison is something you do not
want to have happen if you can avoid it because it slows down all of the
status operations while they do the byte by byte compares. Of course the
compares do stop on the first difference, but if the files are really the
same, and large, it does hurt performance.

Mark




Rob Brandt
2005-11-21 19:32:44 UTC
Post by Gale, David
I'm not sure I understand. Have you verified that there're several megs
of data sent to the repository per commit? Subversion uses a binary
diff, and only sends the differences, so if the only change is really to
the file modification date, there shouldn't be any appreciable network
traffic at all. (It may flag the files as having been changed, but
that's relatively minor.) Have you watched the size of your repository
grow?
-David
Interesting. No, I hadn't verified; what I saw was the TortoiseSVN window
showing that each file was being sent at the commit. It is good to know that
this isn't then a network traffic or storage size problem; however, it is still
a versioning problem for each file. For example, when I browse the repository
right now, it says that each library was updated at revision 11, which isn't
really true from a programming point of view.

Rob
Gale, David
2005-11-21 20:09:02 UTC
Post by Rob Brandt
Post by Gale, David
I'm not sure I understand. Have you verified that there're several
megs of data sent to the repository per commit? Subversion uses a
binary diff, and only sends the differences, so if the only change
is really to the file modification date, there shouldn't be any
appreciable network traffic at all. (It may flag the files as
having been changed, but that's relatively minor.) Have you watched
the size of your repository grow?
-David
Interesting. No, I hadn't verified; what I saw was the TortoiseSVN
window showing that each file was being sent at the commit. It is
good to know that this isn't then a network traffic or storage size
problem; however, it is still a versioning problem for each file. For
example, when I browse the repository right now, it says that each
library was updated at revision 11, which isn't really true from a
programming point of view.
Just did some quick tests. Under linux/unix, touching a checked-out
file to change the modified time does not cause subversion to flag it as
modified. Under windows, because there is no touch command, the test was a
little more involved. Here're my steps:
Checkout (using TortoiseSvn)
Modify
- The file icon changes to indicate a modified file
Modify back
- The file icon changes back to a "pristine" file
Try to check in
- TortoiseSvn reports no modifications to check in.
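
For a Windows test closer to the unix `touch` case, the modification time can be changed without altering any bytes. A minimal sketch, assuming Python is installed; the `touch` function here is a hypothetical helper written for this test, not part of Subversion or TortoiseSVN:

```python
import os
import time


def touch(path, when=None):
    """Set the access and modification times of an existing file.

    The file's contents are not opened or rewritten.
    """
    if when is None:
        when = time.time()
    os.utime(path, (when, when))
```

After calling `touch` on a checked-out file, `svn status` (or the TortoiseSVN icon overlay) shows whether the file is flagged as modified.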

So, the only thing I can conclude is that the files that you thought
were only changed by modification time are, in fact, different. Perhaps
they've got a last modified time-stamp encoded in them, or something.
If this is the case, I can understand the diffs being noticeable, and
thus your original complaint. I've no experience with NG3--is it
designed to understand the format of the files you're dealing with? If
so, it may be intelligent enough to open the files and check to see what
the differences between revisions actually are, and filter accordingly.

Anyhow, that's my best guess.

-David
Rob Brandt
2005-11-21 21:26:36 UTC
Post by Gale, David
Just did some quick tests. Under linux/unix, touching a checked-out
file to change the modified time does not cause subversion to flag it as
modified. Under windows, because there is no touch command, the test was a
little more involved. Here're my steps:
Checkout (using TortoiseSvn)
Modify
- The file icon changes to indicate a modified file
Modify back
- The file icon changes back to a "pristine" file
Try to check in
- TortoiseSvn reports no modifications to check in.
I tried this after your last message as well. Same results.
Post by Gale, David
So, the only thing I can conclude is that the files that you thought
were only changed by modification time are, in fact, different. Perhaps
they've got a last modified time-stamp encoded in them, or something.
If this is the case, I can understand the diffs being noticeable, and
thus your original complaint. I've no experience with NG3--is it
designed to understand the format of the files you're dealing with? If
so, it may be intelligent enough to open the files and check to see what
the differences between revisions actually are, and filter accordingly.
I've come to the same conclusion as well, tentatively. No, NG3 doesn't have a
clue about my file formats. I no longer have access to a Windows server to do
immediate testing of NG3, but I do have an archive of it on one of my machines
at home. I'll do some experimentation and see what I can figure out. Are
there any diff tools that you are aware of that will show binary changes?
I've got a few here, but they all refuse to work on binaries.

Rob
Duncan Murdoch
2005-11-21 22:30:18 UTC
Post by Rob Brandt
Post by Gale, David
Just did some quick tests. Under linux/unix, touching a checked-out
file to change the modified time does not cause subversion to flag it as
modified. Under windows, because there is no touch command, the test was a
little more involved. Here're my steps:
Checkout (using TortoiseSvn)
Modify
- The file icon changes to indicate a modified file
Modify back
- The file icon changes back to a "pristine" file
Try to check in
- TortoiseSvn reports no modifications to check in.
I tried this after your last message as well. Same results.
Post by Gale, David
So, the only thing I can conclude is that the files that you thought
were only changed by modification time are, in fact, different. Perhaps
they've got a last modified time-stamp encoded in them, or something.
If this is the case, I can understand the diffs being noticeable, and
thus your original complaint. I've no experience with NG3--is it
designed to understand the format of the files you're dealing with? If
so, it may be intelligent enough to open the files and check to see what
the differences between revisions actually are, and filter accordingly.
I've come to the same conclusion as well, tentatively. No, NG3 doesn't have a
clue about my file formats. I no longer have access to a Windows server to do
immediate testing of NG3, but I do have an archive of it on one of my machines
at home. I'll do some experimentation and see what I can figure out. Are
there any diff tools that you are aware of that will show binary changes?
I've got a few here, but they all refuse to work on binaries.
The hex viewer in Beyond Compare (http://www.scootersoftware.com/,
Windows only) does a good job of displaying diffs on binary files.
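
If no such tool is at hand, a rough picture of where two binaries differ can be had from a short script. This is only a sketch: `differing_ranges` is a made-up name, it reads both files fully into memory (fine for files of a few megabytes), and it compares bytes position by position, so an inserted byte shows up as one long differing run rather than as an insertion.

```python
def differing_ranges(path_a, path_b):
    """Return (start, end) byte-offset ranges where two files differ; end is exclusive."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a, b = fa.read(), fb.read()

    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a new differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, common))

    # A length mismatch is reported as one trailing range.
    if len(a) != len(b):
        ranges.append((common, max(len(a), len(b))))
    return ranges
```

A handful of short ranges at the same offsets in every library, between a copy saved before opening the project and one saved after, would be consistent with an embedded timestamp or similar field being rewritten on load.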

Duncan Murdoch
