Allow --files-cache=size #5686
Comments
Can you tell us more about the use cases where size-only makes sense? |
When backing up a dataset where filename plus size guarantees uniqueness (e.g. photos taken by a smartphone). Example: backing up photos from a smartphone accessed via a FUSE filesystem with unreliable dates. Without this feature, borg would detect a change because the date/time differs and would re-read the file, which can be costly in bandwidth and time. |
Well, in that given use case it might usually be sufficient, but not always. Imagine you take a photo and your photo app decides to put a timestamp and GPS coordinates into the metadata. You make a backup of this, but later you decide to use some special software to remove such metadata from the file, and the software just overwrites this information "in-place" with a fake date and fake GPS coordinates. The file size will not change. Then you make a backup again, and the modified file will not be detected as modified: it will be silently skipped and you won't have a backup of it. |
Also, in case there is some other file type in the same directory as your photos, like some fixed-size records database from some photo software, any modification of the db records will not trigger a fresh backup of the db file (only if the db file size changes e.g. by adding/removing records). |
Indeed, files could be modified in-place and keep the same size, hence not be detected as modified. On datasets where files are 1) only added and 2) remote, allowing size-only detection would save a lot of bandwidth and time, because the file would not have to be re-transferred (e.g. MTP or FUSE on a slow remote fs as source). |
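To make the trade-off discussed above concrete, here is a minimal sketch of metadata-based change detection. It is not borg's actual implementation: `FileMeta`, `looks_unchanged`, and the single-letter mode codes other than `s` are assumptions made for illustration only.

```python
from collections import namedtuple

# Hypothetical snapshot of the metadata a files cache might remember.
# This is NOT borg's real data structure.
FileMeta = namedtuple("FileMeta", "size mtime_ns ctime_ns inode")


def looks_unchanged(cached, current, mode):
    """Return True if every attribute selected by `mode` is identical.

    `mode` is a string of letters (assumed codes): 's' size, 'm' mtime,
    'c' ctime, 'i' inode. A file that "looks unchanged" would be skipped
    instead of being re-read.
    """
    checks = {
        "s": cached.size == current.size,
        "m": cached.mtime_ns == current.mtime_ns,
        "c": cached.ctime_ns == current.ctime_ns,
        "i": cached.inode == current.inode,
    }
    return all(checks[letter] for letter in mode)


# Same photo seen twice through a filesystem that reports unreliable dates.
before = FileMeta(size=4_194_304, mtime_ns=1_000, ctime_ns=1_000, inode=42)
after = FileMeta(size=4_194_304, mtime_ns=9_999, ctime_ns=9_999, inode=42)

size_only = looks_unchanged(before, after, "s")    # True: file is skipped
with_mtime = looks_unchanged(before, after, "ms")  # False: file is re-read
```

The same comparison also shows the risk raised earlier in the thread: an in-place edit that keeps the size would still look unchanged in size-only mode, so the modified file would not be backed up.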
OK, we can add this as a (non-default) option. There also needs to be some docs about it warning users not to use this except when they specifically know that it will work for them. "if it breaks, you will own the parts." OTOH, maybe users wanting to use this rather need a bugfix in the filesystem they use: having either a valid ctime or mtime should be something a user can expect from a filesystem. |
Thank you! I totally agree about the warnings. |
Hi, I am a beginner contributor and I'd like to take this up. Please brief me on the changes to be made. |
A good starting point in the code is Also, you can search for Then just navigate the source and use your global search function to find all places dealing with that. |
Hello @ThomasWaldmann, I did as you said and looked for files-cache in archiver.py. As far as I understand, we want the user to be able to pass --files-cache=size when using the borg create command. However, I can't seem to understand what changes need to be made. |
dest='files_cache_mode', type=FilesCacheMode,
This is the attribute name and type - just search the code for that and
you will see how / where it is used.
|
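For readers unfamiliar with the `dest=` / `type=` pair quoted above: in Python's `argparse`, `type` is a callable that converts the raw option string, and `dest` is the attribute name under which the converted value is stored. The sketch below is self-contained and uses a hypothetical converter `toy_mode` and a hypothetical program name as stand-ins; it is not borg's `FilesCacheMode` or borg's argument parser.

```python
import argparse


def toy_mode(text):
    """Hypothetical stand-in for a converter such as FilesCacheMode."""
    # Assumed mapping of entry names to one-letter codes, for illustration.
    letters = {"ctime": "c", "mtime": "m", "size": "s", "inode": "i"}
    try:
        return "".join(sorted(letters[name] for name in text.split(",")))
    except KeyError as exc:
        # argparse turns a ValueError from a type callable into a usage error.
        raise ValueError(f"unknown files cache entry: {exc}") from None


parser = argparse.ArgumentParser(prog="toy-create")
parser.add_argument("--files-cache", dest="files_cache_mode", type=toy_mode)

args = parser.parse_args(["--files-cache=size"])
print(args.files_cache_mode)  # prints: s
```

Searching a code base for the `dest` name then finds every place that reads the parsed value, which is the navigation strategy suggested in the comment.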
Hi @ThomasWaldmann, I've looked up the FilesCacheMode() function. Also, |
Yes, add 's' to VALID_MODES. Not sure if we should change the autocompletions, the use case for size-only is very special. Also, please check if changes at other places are required, just check all places using this stuff. |
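A rough sketch of what "add 's' to VALID_MODES" could look like follows. Only the name `VALID_MODES` and the letter `s` come from this thread; the function name `parse_files_cache_mode`, the entry names, and the other mode strings are assumptions, so check borg's source for the real definitions before changing anything.

```python
# Hypothetical stand-ins; the real names and values live in borg's source.
ENTRIES = {"ctime": "c", "mtime": "m", "size": "s", "inode": "i"}

# Letters within each mode are kept in alphabetical order.
# "s" is the newly allowed size-only mode discussed in this issue.
VALID_MODES = {"cis", "ims", "cs", "ms", "s"}


def parse_files_cache_mode(text):
    """Turn e.g. 'ctime,size,inode' into 'cis', rejecting unknown combinations."""
    names = {part.strip() for part in text.split(",")}
    unknown = names - set(ENTRIES)
    if unknown:
        raise ValueError(
            "unknown entries: %s (expected a comma-separated list of: %s)"
            % (", ".join(sorted(unknown)), ", ".join(sorted(ENTRIES)))
        )
    mode = "".join(sorted(ENTRIES[name] for name in names))
    if mode not in VALID_MODES:
        raise ValueError("unsupported combination: %r" % mode)
    return mode
```

With a whitelist like this, allowing size-only detection is a one-entry change, while combinations that were never listed (for example inode alone) are still rejected.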
Have you checked borgbackup docs, FAQ, and open Github issues?
Yes
Is this a BUG / ISSUE report or a QUESTION?
Feature suggestion
Describe the problem you're observing.
There are use cases where using size only for file change detection makes sense.
rsync supports this feature (--size-only), and it would be convenient to have it in borg too (--files-cache=size)