Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of delta-sync support on server-side. #29378

Closed
wants to merge 1 commit into from
Closed

Implementation of delta-sync support on server-side. #29378

wants to merge 1 commit into from

Conversation

ahmedammar
Copy link
Contributor

@ahmedammar ahmedammar commented Oct 27, 2017

Description

This commit adds server-side support for delta-sync, this extends the core files app.

The basic approach is to store zsync metadata files in a folder called files_zsync/ which mirrors the structure and layout of the files/ folder but appends .zsync to the metadata files. These files can be requested by the client via a new route api/v1/zsync/$path.

Filesystem hooks are used to mirror any move/copy/delete operation on the base file onto the metadata file.

The upload path is implemented by modifying the ChunkingPlugin, the chunk files are now assumed to be named as the offsets into the original file. The implemenation copies adds a new class AssemblyStreamZsync which extends AssemblyStrem with support to fill in the missing data in between chunk offsets from a backingFile.

A new zsync capability is added to the files app, which can be checked by the client to know if delta-sync is supported or not.

Related Issue

#16162.

How Has This Been Tested?

Mostly tested by using the owncloud command line client and macOS application. With some randomly generated files using dd and various modifications to those files:

dd if=/dev/urandom of=1g.rand.img bs=$[1024*1024] count=1024

cp 1g.rand.img 1g.rand.add300m.img
dd if=/dev/urandom of=1g.rand.add300m.img bs=$[1024*1024] count=300 oseek=1024

cp 1g.rand.img 1g.rand.mod200m.img
dd if=/dev/urandom of=1g.rand.mod200m.img bs=$[1024*1024] count=100 oseek=123 conv=notrunc 
dd if=/dev/urandom of=1g.rand.mod200m.img bs=$[1024*1024] count=100 oseek=532 conv=notrunc

cp 1g.rand.img 1g.rand.mov200m.img
dd if=1g.rand.img of=1g.rand.mov200m.img bs=$[1024*1024] count=200 iseek=118 oseek=418 conv=notrunc

cp 1g.rand.img 1g.rand.del200m.img
dd if=1g.rand.img of=1g.rand.del200m.img bs=$[1024*1024] count=622
dd if=1g.rand.img of=1g.rand.del200m.img bs=$[1024*1024] iseek=822 oseek=622

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@CLAassistant
Copy link

CLAassistant commented Oct 27, 2017

CLA assistant check
All committers have signed the CLA.

@codecov
Copy link

codecov bot commented Oct 29, 2017

Codecov Report

Merging #29378 into master will increase coverage by 0.09%.
The diff coverage is 90.35%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #29378      +/-   ##
============================================
+ Coverage     60.51%   60.61%   +0.09%     
- Complexity    17193    17253      +60     
============================================
  Files          1031     1034       +3     
  Lines         57288    57474     +186     
============================================
+ Hits          34667    34836     +169     
- Misses        22621    22638      +17
Impacted Files Coverage Δ Complexity Δ
apps/files/appinfo/routes.php 100% <ø> (ø) 0 <0> (ø) ⬇️
apps/files/lib/Capabilities.php 0% <0%> (ø) 2 <0> (ø) ⬇️
apps/files/appinfo/app.php 25% <0%> (ø) 0 <0> (ø) ⬇️
apps/dav/lib/Server.php 45.96% <0%> (-2.37%) 11 <0> (ø)
apps/dav/lib/Upload/FutureFile.php 90.24% <100%> (+1.67%) 20 <2> (+4) ⬆️
apps/dav/lib/Upload/AssemblyStream.php 76.47% <100%> (+0.56%) 32 <8> (+1) ⬆️
apps/files/lib/Hooks.php 100% <100%> (ø) 14 <14> (?)
apps/dav/lib/Upload/ChunkingPlugin.php 100% <100%> (+3.7%) 14 <2> (+6) ⬆️
apps/files/lib/Controller/ZsyncApiController.php 100% <100%> (ø) 6 <6> (?)
apps/files/lib/AppInfo/Application.php 30.43% <16.66%> (-2.07%) 2 <0> (ø)
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 08ba341...7a2927f. Read the comment docs.

@jvillafanez
Copy link
Member

Question to everyone involved: as far as I know, this is another chunking algorithm (or at least it involves modifying it); why isn't this in another app that can be enabled / disabled and why there isn't a switch to choose the algorithm?

I understand that there will be changes in core to support this, but those changes should be oriented to support more chunking algorithms. The algorithm itself should be moved to an application.

(I haven't been in the discussions so there might be important things that I missed)

@ahmedammar
Copy link
Contributor Author

ahmedammar commented Oct 31, 2017

@jvillafanez well I'm here trying to start a discussion :)

I assumed that delta-sync should be part of the core server implementation and not a separate app because it's integrated with the chunked upload path, because there is no other way to do that without some form of persistent locking.

The changes to the chunking algorithm are very minimal and shouldn't break any existing client code.

@jvillafanez
Copy link
Member

Let's suppose we merge this PR now. What "business value" add this feature to the product and who can benefit from it now?

From my point of view, only the desktop client can benefit from this PR, which I guess it will help for faster syncing, but mobile clients nor the web UI will gain anything. In fact they might perform slower. Note that we also need to take into account external storages and where those ".zsync" files will be created.

This is why I think it's better to move it to an app so the admin can choose what he want to do.

@ahmedammar
Copy link
Contributor Author

ahmedammar commented Oct 31, 2017

Well I don't agree to be honest, only desktop will potentially benefit instantly, but nothing is in the way of a mobile implementation too.

Why would anything perform slower because of this PR? I fail to understand that at all.

Nothing in the PR affects existing code paths or breaks any legacy code. Without a supported client, nothing changes.

The final TODO (if you looked at the related issue) is related to disabling for external storage.

This commit adds server-side support for delta-sync, this extends the
core files app.

The basic approach is to store zsync metadata files in a folder called
`files_zsync/` which mirrors the structure and layout of the `files/`
folder but appends `.zsync` to the metadata files. These files can be
requested by the client via a new route `api/v1/zsync/$path`.

Filesystem hooks are used to mirror any `move/copy/delete` operations on
the base file onto the metadata file.

The upload path is implemented by modifying the `ChunkingPlugin`, the
chunk files are now assumed to be named as the offsets into the original
file. The implemenation copies adds a new class `AssemblyStreamZsync`
which extends `AssemblyStrem` with support to fill in the missing data
in between chunk offsets from a `backingFile`.

A new `zsync` capability is added to the files app, which can be checked
by the client to know if delta-sync is supported or not.

This commit closes #1612.
@ahmedammar
Copy link
Contributor Author

ahmedammar commented Oct 31, 2017

I also fail to see how an 'app' can modify the ChunkingPlugin or even override it. So the small amount of supporting app code may as well be part of the existing files app.

@jvillafanez
Copy link
Member

My point is that mobiles don't have this support implemented now, and it will take time to implement it (if ever, due to priorities). Meanwhile, the client will just upload the files as they have been doing all this time.

Why would anything perform slower because of this PR?

In the ChunkingPlugin file, performMove function you've wrapped the "old" code $this->server->tree->move($path, $destination); with your new code. This will increase the cost of the regular uploads without benefit except for the desktop client (yes, mobiles can implement whatever is needed to gain performance with these changes, but meanwhile it's like this). How much cost is added there? Without measuring, I can't say for sure. Maybe 1 ms, 100 ms, 1 sec... maybe I'm raising unneeded alarms (I haven't reviewed the code in depth so maybe it just adds a couple of ms and it's nothing to worry about).

The final TODO (if you looked at the related issue) is related to disabling for external storage.

Disabling or switching? Chunking is still needed, which reinforces the idea of 2 different implementations.

I also fail to see how an 'app' can modify the ChunkingPlugin or even override it. So it may as well be part of the existing files app.

Probably it isn't possible at the moment, but this support can be added somehow.

@ahmedammar
Copy link
Contributor Author

ahmedammar commented Oct 31, 2017

In the ChunkingPlugin file, performMove function you've wrapped the "old" code $this->server->tree->move($path, $destination); with your new code

I've re-ordered the code so it's a bit clearer, it's basically a string modification and two if statements. For anyone not using zsync based chunking that should be a few ms maximum. I'd be surprised otherwise. the regex string replace could be replaced with something faster if that's really a performance issue.

ahmedammar@f525f38#diff-a3fb8f83598c3f5e1d514a34ca578b1eR149

Disabling or switching? Chunking is still needed.

No one said anything about disabling Chunking at all ... Disabling zsync metadata upload and fetch for external storage is what I meant.

Probably it isn't possible at the moment, but this support can be added somehow.

I personally find it hard to believe that multiple chunking algorithms/plugins is even something with enough developer need. The one important reason this modification of ChunkingPlugin needed to happen for zsync metadata is to get around the fact that there is no persistent file locking available in owncloud. If the .zsync is not copied in the same MOVE request as the base file then this leaves the door open for issues with multiple uploaders.

@ahmedammar ahmedammar closed this Oct 31, 2017
@ahmedammar
Copy link
Contributor Author

ahmedammar commented Oct 31, 2017

I closed this because I accidentally deleted my repo on GitHub 😞 and it says from unknown repository at the top now, and none of my pushes are triggering CI.

I opened a new PR #29404.

@jvillafanez Please continue the discussion there

@lock
Copy link

lock bot commented Aug 2, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Aug 2, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants