remote: migrate: deduplicate objects between v2 and v3 #9924
Comments
@12michi34 Is this you on the forum as well? https://discuss.dvc.org/t/help-with-upgrading-imported-via-dvc2-x-dvc-data-with-dvc3-0/1750/10 If not, I'm just linking it for the record.
Regarding the question itself, there is no such feature right now. We've thought about it when implementing
Just to clarify here, DVC already supports 2.x/3.x deduplication for the local cache (via dvc cache migrate). Deduplication is currently unsupported only for remotes.
After #9938, let's document how best to handle this: migrate everything to 3.0 and then gc.
@efiop Sorry about the late reply. Yes, this originated on Discord. I somehow missed the notification emails about new comments on this issue.
Added some explanation to the migration guide in the docs. No more action is planned at the moment, so closing this one.
My situation is like this:
a) dataFolderA contains fileA.bin, fileB.bin, and fileC.bin, and I added it to the remote with DVC 2.0 via "dvc add dataFolderA".
b) Then I changed fileB.bin and added that to the remote with DVC 3.0 via "dvc add dataFolderB".
When investigating the remote (and cache), I can see the md5-renamed files for fileA.bin and fileC.bin in both files/md5// and /
They have exactly the same md5 hash, so the data for fileA.bin and fileC.bin is now stored twice in the remote (and cache).
(I am simplifying my case; there are many fileA, fileB, and fileC files involved.)
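To check how much is actually duplicated, a small script along these lines might help. It is only a sketch, not a DVC feature: it assumes the remote (or cache) is reachable as a local directory, that 2.x objects live at `<2 hex chars>/<rest of md5>` directly under the root, and that 3.x objects live at `files/md5/<2 hex chars>/<rest of md5>`. The function name `find_duplicated_objects` is hypothetical. It only reports matches and deletes nothing.

```python
from pathlib import Path


def find_duplicated_objects(root: Path) -> list[tuple[Path, Path]]:
    """Return (legacy_path, v3_path) pairs for objects stored under both layouts.

    Assumed layout (an assumption, not taken from the DVC source):
      legacy (2.x): <root>/<2 hex chars>/<rest of md5>
      3.x:          <root>/files/md5/<2 hex chars>/<rest of md5>
    """
    v3_root = root / "files" / "md5"
    pairs: list[tuple[Path, Path]] = []
    if not v3_root.is_dir():
        # Nothing stored in the 3.x layout, so nothing can be duplicated.
        return pairs

    # Walk every object in the 3.x layout and look for a file with the
    # same prefix directory and name in the legacy layout.
    for v3_path in sorted(v3_root.glob("*/*")):
        if not v3_path.is_file():
            continue
        legacy_path = root / v3_path.parent.name / v3_path.name
        if legacy_path.is_file():
            pairs.append((legacy_path, v3_path))
    return pairs
```

Calling `find_duplicated_objects(Path("/path/to/remote"))` on a local copy of the remote returns one pair per object that exists in both places; summing `legacy_path.stat().st_size` over the pairs gives a rough estimate of the wasted space.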
How can I clean up the remote? I know there is a "dvc cache migrate" command (I have not tried it yet, though).
Kindest regards