[BUG] assets may become orphaned when moved by the storage template service #2877
Comments
Following some internal discussion on Discord, we believe it may be beneficial to queue asset migrations individually rather than as a single job which migrates all assets at once. This has some advantages:
The current all-in-one migration will be marked as successful even if some assets weren't moved correctly, which isn't ideal. |
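To illustrate the point about individual jobs, here is a minimal sketch of per-asset migration where every outcome is recorded on its own. All names (`Asset`, `MigrationResult`, `migrateEach`, `allSucceeded`) are hypothetical and not taken from the Immich codebase, and the sketch is synchronous for brevity where a real job queue would be asynchronous.

```typescript
// Hypothetical types; none of these names come from the Immich codebase.
interface Asset {
  id: string;
  path: string;
}

interface MigrationResult {
  assetId: string;
  ok: boolean;
  error?: string;
}

// Runs one migration per asset and records each outcome separately,
// so a single failure neither aborts the rest nor gets hidden.
function migrateEach(
  assets: Asset[],
  migrateOne: (asset: Asset) => void,
): MigrationResult[] {
  return assets.map((asset) => {
    try {
      migrateOne(asset);
      return { assetId: asset.id, ok: true };
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      return { assetId: asset.id, ok: false, error };
    }
  });
}

// The run as a whole only counts as successful if every asset succeeded.
function allSucceeded(results: MigrationResult[]): boolean {
  return results.every((r) => r.ok);
}
```

With this shape, failed assets can be retried or reported individually instead of disappearing behind one "successful" job.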
Further, it looks like 163 files failed to move. https://gist.github.com/uhthomas/650ff9f08f83f8f0ffb3008653df395d |
We had a long discussion about the right way to approach this and concluded that the best thing to do, in addition to individual jobs, is to move some logic into the database. We don't have specifics yet, but here is some pseudocode:
move asset
const dst = await calculateDestination(asset);
if (!checkFileExists(dst)) {
await moveFile(src, dst);
}
await save({});
calculate destination
const tx = new Transaction(); // idk how this works with typeorm, this is how it works in Go
const results = tx.find({ dst });
if (!results.length) {
tx.save({ dst });
tx.commit();
return;
}
tx.save({ dst, suffix: `-${results.length}` });
tx.commit();
This solution essentially moves the location of an asset from the existing asset table to a new table and abstracts the location away. The schema of the table could look like:
CREATE TABLE asset_location (
id SERIAL PRIMARY KEY,
asset_id INT,
location VARCHAR(255) NOT NULL,
suffix VARCHAR(255),
CONSTRAINT fk_asset
FOREIGN KEY(asset_id)
REFERENCES assets(id)
); |
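As a rough illustration of the "calculate destination" step, the sketch below uses an in-memory array as a stand-in for the proposed `asset_location` table. `LocationTable`, `reserve` and `resolvePath` are hypothetical names, and placing the suffix before the file extension is an assumption, since the comment above does not say how the suffix is applied. A real implementation would need the lookup and insert to run inside one database transaction (or rely on a uniqueness constraint), which a single-threaded in-memory version sidesteps.

```typescript
// In-memory stand-in for the proposed asset_location table (hypothetical).
interface AssetLocation {
  assetId: number;
  location: string;
  suffix: string | null;
}

class LocationTable {
  private rows: AssetLocation[] = [];

  // Mirrors the pseudocode: count existing rows for this location and
  // store a numeric suffix when the location is already taken.
  reserve(assetId: number, location: string): AssetLocation {
    const existing = this.rows.filter((r) => r.location === location);
    const suffix = existing.length === 0 ? null : `-${existing.length}`;
    const row: AssetLocation = { assetId, location, suffix };
    this.rows.push(row);
    return row;
  }
}

// Assumption: the suffix goes before the file extension, if there is one.
function resolvePath(row: AssetLocation): string {
  if (row.suffix === null) return row.location;
  const dot = row.location.lastIndexOf(".");
  const slash = row.location.lastIndexOf("/");
  if (dot <= slash + 1) return row.location + row.suffix; // no extension
  return row.location.slice(0, dot) + row.suffix + row.location.slice(dot);
}
```

Because the file is only moved after a row has been reserved, two assets that compute the same destination end up with distinct paths rather than one overwriting the other.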
I wrote this script to move assets back to the right place. https://gist.github.com/uhthomas/87cd39d0bbed044800a982556617f8db |
So I've been thinking about this a bit and, in addition to this, it would be really nice to support other file systems like S3. We can do this by dropping support for custom folders and switching to a flat hierarchy of unique directory names. Such |
I'd also like to add that, at present, Immich does not actually preserve the original filename if there are duplicate files on the same day. They are prefixed with a number, like |
We used to do that and got a lot of pushback on it, which prompted the current custom folder support. |
Was the implementation identical, or was it slightly different? Links would be really helpful! I can understand pushback from file names |
How would this potentially conflict with the recently added support for external / read-only assets? I imagine the database entry would simply refer to the filesystem location that doesn't match this proposed convention, though it may not matter if no filesystem operations are taking place against those assets. |
I don't see an issue with S3 support - we don't need to use the same scheme on every storage backend, so we can just take this approach only for S3 if we need. #1098 is the PR that added storage templating, it has some (scattered) links to relevant issues/discussions. |
Thanks for the links @bo0tzz, definitely helpful for gathering context. I think that diverging too much with different implementations will make things difficult, whether because of the technical implementation (where files are initially uploaded, and what happens once the upload is complete) or because storage migrations and this custom folder structure just wouldn't work with something like S3. It's probably wise to keep behaviour as similar as possible to avoid bugs and maintenance overhead. The only real way to do that, imo, is to work with the lowest common denominator, where S3 does not support a move operation.
If designed correctly, it shouldn't conflict at all. Ideally, it would actually make it easier as there would be more effort put into a proper abstraction for file systems. |
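One way to picture the "lowest common denominator" idea is a storage interface that offers only put, get and delete, with relocation built from those. This is a sketch under that assumption: `BlobStore`, `MemoryStore` and `relocate` are hypothetical names, and the in-memory map merely stands in for a local disk or an object store.

```typescript
// Hypothetical minimal interface: only operations an object store also offers.
interface BlobStore {
  put(key: string, data: Uint8Array): void;
  get(key: string): Uint8Array | undefined;
  delete(key: string): void;
}

// In-memory backend used here in place of a real disk or S3 bucket.
class MemoryStore implements BlobStore {
  private blobs = new Map<string, Uint8Array>();

  put(key: string, data: Uint8Array): void {
    this.blobs.set(key, data);
  }

  get(key: string): Uint8Array | undefined {
    return this.blobs.get(key);
  }

  delete(key: string): void {
    this.blobs.delete(key);
  }
}

// "Move" without a native move: copy to the new key, then delete the old one.
function relocate(store: BlobStore, from: string, to: string): void {
  const data = store.get(from);
  if (data === undefined) throw new Error(`missing object: ${from}`);
  if (store.get(to) !== undefined) throw new Error(`destination exists: ${to}`);
  store.put(to, data);
  store.delete(from);
}
```

Code written against such an interface behaves the same on every backend, at the cost of not using a cheap rename where the local file system has one.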
I would have some concern with moving away from an available feature like storage template for a software-specific implementation of managing this type of personal media. I know that Immich came from managing the filesystem structure in the past, but moving back to that makes this type of media highly inaccessible except from within the app itself. Especially in its current state where Immich doesn't (and arguably shouldn't?) fill every role of interacting with these types of files. Example scenario: Immich is used for asset backup from devices, browsing, sharing, etc., but metadata management and photo editing are currently not possible in Immich. Moving the filesystem structure to something that is not human friendly makes these assets extremely difficult to navigate and work with. |
I appreciate and understand that perspective. I can definitely see how a bunch of randomly generated directory names would be hard to work with, regardless of whether or not the original filename is preserved. There is a middle ground, where Immich could persist uploaded items to directories created with the current date, like
I think we have to come to some sort of conclusion on what we do or don't want to support. I am not sure it's possible to have our cake and eat it too. Either we design Immich to be reliable and scalable without regard for some features, or the existing behaviour is preserved with the caveat of disparate storage implementations, complexity and issues like this one.
With respect to one point, I think that if Immich is a backup solution, then users should not be editing their files directly. Thinking further, given the read-only feature, I think that may be a good way to sort of support both at once? The files uploaded to Immich can be in any form they need to be, whereas files that users may want to work with locally can be part of that read-only directory. Would be really interested to hear your further thoughts. |
For read only files and upcoming external libraries, that's not an issue with this proposal. They just stay put where they happened to be, the db points to their filename, and that's it. I think this only relates to uploaded (i.e. internal) assets |
The bug
I uploaded 8.5k files to Immich with the CLI and waited for all the jobs to process. Things look mostly fine, except some assets are missing thumbnails.
A look at the jobs status page shows some did fail.
Any attempt to rerun those jobs fails, and the logs from the microservice show:
A manual storage template migration also shows similar logs:
Further, it's not possible to download the image. Though interestingly there is exif data.
Server logs from this action:
A manual inspection of the filesystem shows the image was successfully moved, but the change was not persisted to the database. The image was supposed to be moved back, but wasn't.
Loki shows there were some issues connecting to the database, which isn't Immich's fault but Immich should handle this better.
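The failure described here (file moved, database write lost, no move back) could be handled along the lines of the sketch below. It is not Immich's actual code: `MoveDeps` and `moveAndRecord` are hypothetical, the dependencies are injected so the example runs without a real file system or database, and it is synchronous for brevity.

```typescript
// Hypothetical dependencies, injected so the sketch stays storage-agnostic.
interface MoveDeps {
  moveFile: (from: string, to: string) => void;
  savePath: (assetId: string, path: string) => void;
}

// Moves the file, then records the new path. If recording fails, the file is
// moved back so disk and database agree again, and the original error is rethrown.
function moveAndRecord(
  deps: MoveDeps,
  assetId: string,
  from: string,
  to: string,
): void {
  deps.moveFile(from, to);
  try {
    deps.savePath(assetId, to);
  } catch (err) {
    try {
      deps.moveFile(to, from);
    } catch (rollbackErr) {
      // Both steps failed: report where the file ended up so it can be repaired.
      throw new Error(`asset ${assetId} orphaned at ${to}: ${String(rollbackErr)}`);
    }
    throw err;
  }
}
```

Even with a rollback, the process can die between the move and the save, which is one reason the discussion above leans towards recording the intended destination in the database before touching the file.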
The OS that Immich Server is running on
Kubernetes
Version of Immich Server
v1.61.1
Version of Immich Mobile App
N/A
Platform with the issue
Your docker-compose.yml content
N/A
Your .env content
Reproduction steps
Additional information
No response