Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: previews are physically deleted from disk and db in an async ma… #40045

Merged
merged 1 commit into from
Jul 1, 2022

Conversation

DeepDiver1975
Copy link
Member

@DeepDiver1975 DeepDiver1975 commented May 4, 2022

…nner

ToDos

Description

All preview hooks are gone now. Previews are cleaned up on disk and db in a background job (alternative occ command)

Related Issue

How Has This Been Tested?

Scenario I

  • disable trashbin
  • upload file which can have previews (e.g. image)
  • wait for the preview to be generated
  • note the fileid (e.g.query filecache)
  • look up the thumbnails for this file (data/$user/thumbnails/$fileid)
  • delete the file
  • make sure the thumbnail is still there
  • run occ p:c --all
  • make sure thumbnail got removed from disk and failcache

Scenario II

  • same as Scenario II - but with trashbin enabled
  • file needs to be deleted from trashbin

Scenario III

  • create new test file
  • enter some text via texteditor
  • see the text on the preview
  • edit the file again aka change the contents
  • make sure the preview holds the new file contents

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Database schema changes (next release will require increase of minor version instead of patch)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Technical debt
  • Tests only (no source changes)

Checklist:

  • Code changes
  • Unit tests added
  • Acceptance tests added
  • Documentation ticket raised:
  • Changelog item, see TEMPLATE

@update-docs
Copy link

update-docs bot commented May 4, 2022

Thanks for opening this pull request! The maintainers of this repository would appreciate it if you would create a changelog item based on your changes.

@ownclouders
Copy link
Contributor

ownclouders commented May 4, 2022

💥 Acceptance tests pipeline apiProvisioningGroups-v2-mariadb10.2-php7.4 failed. The build has been cancelled.

https://drone.owncloud.com/owncloud/core/35664/57

@DeepDiver1975 DeepDiver1975 force-pushed the feat/async-preview-delete-2 branch from 94f9b39 to 9c92a52 Compare May 4, 2022 13:55
@DeepDiver1975
Copy link
Member Author

DeepDiver1975 commented May 4, 2022

  • oracle hates this

Doctrine\DBAL\Exception\DriverException: An exception occurred while executing 'select "fileid", "name", "user_id" from "oc_filecache" "fc"

		join "oc_mounts" on "storage" = "storage_id"

		where "parent" in (select "fileid" from "oc_filecache" where "storage" in (select "numeric_id" from "oc_storages" where "id" like 'home::%' or "id" like 'object::user:%') and "path" = 'thumbnails')

		  and not exists(select "fileid" from "oc_filecache" where CAST("fc"."name" as INT) = "oc_filecache"."fileid")

		  and "fc"."fileid" > ?

		  order by "user_id" limit 1000' with params [0]:


ORA-00933: SQL command not properly ended

@DeepDiver1975

This comment was marked as resolved.

@DeepDiver1975 DeepDiver1975 force-pushed the feat/async-preview-delete-2 branch 2 times, most recently from d8b7718 to fda123a Compare May 12, 2022 13:55
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 5 Code Smells

56.6% 56.6% Coverage
0.0% 0.0% Duplication

@mmattel
Copy link
Contributor

mmattel commented May 12, 2022

This is docs relevant, hooking myself in for early notice

@DeepDiver1975 DeepDiver1975 requested a review from jvillafanez June 2, 2022 09:03
@@ -242,5 +243,8 @@ public function __construct(array $urlParams= []) {
$container->registerService('TwoFactorAuthManager', static function (SimpleContainer $c) {
return $c->query('ServerContainer')->getTwoFactorAuthManager();
});

# add preview cleanup job
$container->getServer()->getJobList()->add(new PreviewCleanupJob());
Copy link
Member

@jvillafanez jvillafanez Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any better placement?
We'll be hitting the DB on any request even if it just checks whether the job is there or not. In addition, if for whatever reason we want to remove the job, it will be added again, which might be confusing specially for support.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not aware of any other way to add cron jobs ....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to files app - job is added via info.xml

});

$output->writeln("$count orphaned previews deleted");
return 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should return the exit code of the command, which should be 0 if it's successful

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taken care of

$progress = static function () {
};
}
$root = \OC::$server->getLazyRootFolder();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be injected.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lazy root cannot be injected

$lastFileId = $row['fileid'];

$userFiles = $root->getUserFolder($userId);
if ($userFiles->getParent()->nodeExists('thumbnails')) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems better to check for the thumbnails folder directly from the root, something like $root->get("$userId/thumbnails");

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the API allow to use a path like that? I am not aware of this - but happy to adopt if this works ...

}

private function cleanFileCache($name) {
$sql = "delete from `*PREFIX*filecache` where path like 'thumbnails/$name/%' or path = 'thumbnails/$name'";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we guarantee somehow that we're deleting from our thumbnails directory? I mean, there could be an external storage containing paths such as "SFTP/thumbnails/plans/January", which might get accidentally deleted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taken care of

}

public function run($arguments) {
$cmd = new Cleanup(\OC::$server->getDatabaseConnection());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any chance to inject the service?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the call nature of jobs they cannot have arguments in ctor ..... afaik ....

@mmattel
Copy link
Contributor

mmattel commented Jun 20, 2022

Any progress? would be great if it can go into 10.11

@DeepDiver1975 DeepDiver1975 force-pushed the feat/async-preview-delete-2 branch from fda123a to 19b4034 Compare June 27, 2022 15:54
@mmattel
Copy link
Contributor

mmattel commented Jun 28, 2022

CI reports one error: Doctrine\DBAL\Driver\PDO\Exception: SQLSTATE[22P02]: Invalid text representation: 7 ERROR: invalid input syntax for integer: "thumbnails"

@DeepDiver1975
Copy link
Member Author

Really having a hard time here to reproduce the postgres failure .... but observed it in other places:

Test\Comments\ManagerTest::testDeleteReferencesOfActorWithUserManagement -

$status = $manager->save($comment);

Doctrine\DBAL\Exception\DriverException : An exception occurred while executing 'SELECT "storage", "path" FROM "oc_filecache" WHERE "fileid" = ?' with params ["file64"]:

SQLSTATE[22P02]: Invalid text representation: 7 ERROR:  invalid input syntax for integer: "file64"

@DeepDiver1975 DeepDiver1975 force-pushed the feat/async-preview-delete-2 branch from 5e916fd to d93b3b1 Compare June 30, 2022 12:52
@mmattel
Copy link
Contributor

mmattel commented Jun 30, 2022

Q: in preparation for docs:
--all means to process all of them, independent how many
chunk_size= limits the process to the number of items given

but…, what happens if you set ˋ--all chunk_size=50ˋ ?
does all overwrite any value set for the chunk size option?

@DeepDiver1975
Copy link
Member Author

chunk size defined the number of files being processed in one loop
if --all is set the loop will loop until all are processed
if --all is not set the loop will run once

chunk_size=50: run once and process 50 files
--all chunk_size=50: process all files in blocks of 50 files

@DeepDiver1975 DeepDiver1975 force-pushed the feat/async-preview-delete-2 branch from 5336c52 to cda67af Compare June 30, 2022 15:59
@mmattel
Copy link
Contributor

mmattel commented Jun 30, 2022

if --all is not set the loop will run once

What is the limiting factor of the loop when it runs only once?
Means if --all and chunk_size are not set?

@mmattel
Copy link
Contributor

mmattel commented Jun 30, 2022

Additional Q:
There is a BG job and an occ command.

  1. When to use which one?
  2. Is the BG job automatically active with the standard BG jobs (when eg. triggered via cron) or what is it necessary to add/run it.

@DeepDiver1975
Copy link
Member Author

Addotional Q: There is a BG job and an occ command.

1. When to use which one?

2. Is the BG job automatically active with the standard BG jobs (when eg. triggered via cron) or what is it necessary to add/run it.

the background job is added by default and suits perfect for smaller aka community installations.
the occ command is to be used in regular system cron on bigger installations - the bg job can stay enabled in such a setup - it will not hurt

@DeepDiver1975 DeepDiver1975 force-pushed the feat/async-preview-delete-2 branch from cda67af to ee94c75 Compare July 1, 2022 12:20
@sonarqubecloud
Copy link

sonarqubecloud bot commented Jul 1, 2022

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 4 Code Smells

41.0% 41.0% Coverage
0.0% 0.0% Duplication

@DeepDiver1975 DeepDiver1975 merged commit 5298002 into master Jul 1, 2022
@delete-merged-branch delete-merged-branch bot deleted the feat/async-preview-delete-2 branch July 1, 2022 13:08
@mmattel
Copy link
Contributor

mmattel commented Sep 12, 2022

During writing the docs, I tried the command and an error raised:

sudo -uwww-data ./occ previews:cleanup -- chunk_size=1
An unhandled exception has been thrown:
TypeError: Argument 2 passed to OC\PreviewCleanup::process() must be of the type int, string given, called in /var/www/owncloud/core/Command/Previews/Cleanup.php on line 63 and defined in /var/www/owncloud/lib/private/PreviewCleanup.php:39
Stack trace:
#0 /var/www/owncloud/core/Command/Previews/Cleanup.php(63): OC\PreviewCleanup->process()
#1 /var/www/owncloud/lib/composer/symfony/console/Command/Command.php(255): OC\Core\Command\Previews\Cleanup->execute()
#2 /var/www/owncloud/core/Command/Base.php(159): Symfony\Component\Console\Command\Command->run()
#3 /var/www/owncloud/lib/composer/symfony/console/Application.php(1009): OC\Core\Command\Base->run()
#4 /var/www/owncloud/lib/composer/symfony/console/Application.php(273): Symfony\Component\Console\Application->doRunCommand()
#5 /var/www/owncloud/lib/composer/symfony/console/Application.php(149): Symfony\Component\Console\Application->doRun()
#6 /var/www/owncloud/lib/private/Console/Application.php(165): Symfony\Component\Console\Application->run()
#7 /var/www/owncloud/console.php(116): OC\Console\Application->run()
#8 /var/www/owncloud/occ(11): require_once('/var/www/ownclo...')

@DeepDiver1975 @jnweiger

@DeepDiver1975
Copy link
Member Author

@mmattel #40361

@jnweiger
Copy link
Contributor

jnweiger commented Sep 12, 2022

Nobody said a string "chunksize=1" should be legal. Casting it to (int) does not return the integer part. It returns always 0.

The syntax description is

occ previews:cleanup -h
Description:
  Remove unreferenced previews

Usage:
  previews:cleanup [options] [--] [<chunk_size>]

Arguments:
  chunk_size              [default: 1000]

Options:
...

Try occ previews:cleanup -- 345

@mmattel
Copy link
Contributor

mmattel commented Sep 12, 2022

@jnweiger is right, my command was wrong, sorry for the noise 😢

@phil-davis
Copy link
Contributor

PR #40362 adds validation of the chunk_size so that the command gives a cleaner error message if an invalid value is entered.

@DeepDiver1975
Copy link
Member Author

PR #40362 adds validation of the chunk_size so that the command gives a cleaner error message if an invalid value is entered.

THX for the effort - but from a consistency point of view this should be added to all commands or none .... 🤷

@phil-davis
Copy link
Contributor

Lots of occ commands validate their inputs before proceeding, user:add:

		// Validate email before we create the user
		if ($input->getOption('email')) {
			// Validate first
			if (!$this->mailer->validateMailAddress($input->getOption('email'))) {
				// Invalid! Error
				$output->writeln('<error>Invalid email address supplied</error>');
				return 1;
			} else {
				$email = $input->getOption('email');
			}
		} else {
			$email = null;
		}

and yes, all inputs should be validated everywhere. All starfish on the beach should be saved, but saving one is better than saving none.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants