Do not cache file ids in FileSystemTags inside group folders #28774
👍 🐘
@nickvergessen do you remember which motivation the younger @nickvergessen had when doing so in 2016? Performance (significant?) or side effects (loops?), or other?
Yes basically. The paths are called X times for every request and repeated with the same path over and over again, because we do things like:

    $file = $folder->get('path'); // Runs it once (2+ queries per nesting level)
    if ($file->isWriteable()) { // Runs it again (2+ queries per nesting level)
        $file->putContent('c'); // Runs it again (2+ queries per nesting level)
    }
    // etc.

Building the system tag list again and again is very bad performance-wise. But we also only do it on the owner level, so it worked with shares even though CacheJail wrappers were applied. But then again, for a groupfolder the parents should also be the same for everyone? Or can they be mounted at any nesting level? In that case I wouldn't even know which user's tags should be considered. So maybe just ignore the cached fileIds when it's a group folder for now, so it continues to work for the rest, while we try to find a performant solution for groupfolders with the expected result.
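To make the performance argument concrete, here is a minimal sketch of per-request memoization, written in Python as a language-neutral model rather than as the server's PHP code. `ParentIdCache`, `fake_lookup` and the `lookups` counter are hypothetical names made up for this illustration; the real lookup would run the database queries mentioned above.

```python
from typing import Callable, Dict, List


class ParentIdCache:
    """Memoizes an expensive path -> file ids lookup for the duration of one request."""

    def __init__(self, lookup: Callable[[str], List[int]]) -> None:
        self._lookup = lookup                  # hypothetical expensive lookup (DB queries)
        self._cache: Dict[str, List[int]] = {}
        self.lookups = 0                       # how often the expensive lookup actually ran

    def get_file_ids(self, path: str) -> List[int]:
        if path not in self._cache:
            self.lookups += 1
            self._cache[path] = self._lookup(path)
        return self._cache[path]


def fake_lookup(path: str) -> List[int]:
    # Stand-in for the per-nesting-level queries: one fake id per path segment.
    return [len(segment) for segment in path.strip("/").split("/")]


cache = ParentIdCache(fake_lookup)
for _ in range(3):                             # models get(), isWriteable(), putContent()
    ids = cache.get_file_ids("docs/reports/q1.txt")
```

With the cache in place the three calls trigger a single lookup; without it, each call would pay the full cost again.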
I refactored it to behave this way. File ids will be cached for any storage other than group folders.
/backport to stable22
/backport to stable21
/backport to stable20
Signed-off-by: Richard Steinmetz <richard@steinmetz.cloud>
In #28774 we disabled the caching for the groupfolder application because, in groupfolders, getFileIds could be called with the same $cacheId and path for actually different groupfolders. This reverts that change and instead adds the folderId from the groupfolder to the cacheId. This solves the issue of the uniqueness of the cacheId inside groupfolders. The downside is that we introduce a groupfolder-specific implementation inside the server repo.

The second optimization is to not consider paths starting with __groupfolders in executeCheck. This is because files in the groupfolder application call executeCheck twice: once with the path __groupfolders/<folderId>/<path> and once with <path>. The first call will always return an empty systemTags array, while the second call will return the correct system tags.

Signed-off-by: Carl Schwan <carl@carlschwan.eu>
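The two ideas in this commit message can be sketched roughly as follows. This is a Python model under stated assumptions, not the actual PHP change: `make_cache_id`, `should_check` and the `::` separator are hypothetical, and only the `__groupfolders` prefix and the idea of mixing the folder id into the cache id come from the message itself.

```python
from typing import Optional

GROUPFOLDER_PREFIX = "__groupfolders"


def make_cache_id(numeric_id: int, folder_id: Optional[int]) -> str:
    """Build a cache id that stays unique per group folder (key format is an assumption)."""
    if folder_id is None:
        return str(numeric_id)                 # regular storage: numeric id is enough
    return f"{numeric_id}::{folder_id}"        # group folder: add the folder id


def should_check(path: str) -> bool:
    """Skip the internal group folder path; the plain <path> call carries the tags."""
    return not path.lstrip("/").startswith(GROUPFOLDER_PREFIX)


key_a = make_cache_id(7, 1)                    # group folder 1
key_b = make_cache_id(7, 2)                    # group folder 2, same numeric id
```

Two group folders that report the same numeric id now get distinct cache ids, and the redundant check on the internal path is skipped before any tag lookup happens.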
The numerical storage id returned by a cache might not be unique because it is passed down the CacheWrapper chain, so multiple caches may return the same numerical id. Additionally, when combined with the CacheJail wrapper, multiple files may be mapped to the same numerical id + path combination even if they represent separate entries in the file cache.
In this case the first inserted file id for the combination of numerical id and path "wins" and blocks any subsequent insertions. This will lead to invalid tags being returned.
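The "first insertion wins" behaviour can be modelled in a few lines. This is a simplified Python sketch, not the server's PHP code; `remember_file_ids`, the numeric id `7` and the file ids `101` and `202` are made-up values for illustration.

```python
from typing import Dict, List, Tuple

CacheKey = Tuple[int, str]


def remember_file_ids(cache: Dict[CacheKey, List[int]], numeric_id: int,
                      path: str, file_ids: List[int]) -> List[int]:
    """Store file ids under (numeric id, path); an existing entry is never replaced."""
    return cache.setdefault((numeric_id, path), file_ids)


cache: Dict[CacheKey, List[int]] = {}

# Two hypothetical group folders whose jailed caches report the same numeric id (7)
# and the same relative path, although their file cache entries differ.
first = remember_file_ids(cache, 7, "folder-1", [101])
second = remember_file_ids(cache, 7, "folder-1", [202])
```

Here `second` comes back as the ids of the first folder, so any tag check for the second folder would run against the wrong files.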
How to reproduce:
Expected: The access should be blocked.
Actual: You can navigate to "Groupfolder/folder-1" and list its contents.