Add Path To Shorten ZFS Diff Output #13200

Open
JavaScriptDude opened this issue Mar 12, 2022 · 10 comments
Labels
Type: Feature Feature request or new feature

Comments

@JavaScriptDude

I have found uses for zfs diff in my regular development workflows and lean on it heavily. However, it often returns many changes that are irrelevant to my use case. This feature request would allow the user to pass a path parameter to zfs diff.

If the path is a directory, only that directory and its children should be included in the output. If it is a file, only that file should be included. If the path does not exist on either the left or the right side of the diff, a clear error should be returned, e.g.: Path provided not found in left or right sides of diff

This feature would give ZFS some more VCS (Version Control System) style features that will open up more possible use cases for this excellent system.

Ideally, the path parameter would be handled inside the diff engine to reduce processing time, rather than being applied only as a filter on the output.
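For comparison, the output-only filter that this request hopes to improve on can be sketched in userland. The Python below is a minimal sketch, assuming plain `zfs diff -H` output (tab-separated: change type, path, and a second path for renames) with no `-F` or `-t` columns. The function names `filter_diff_lines` and `run_zfs_diff` are hypothetical and not part of ZFS. It does not perform the "path not found" check described above, since that would require looking inside both snapshots, and it still pays the full cost of the diff.

```python
import subprocess
from pathlib import PurePosixPath


def filter_diff_lines(lines, target):
    """Keep `zfs diff -H` records that touch `target` or anything beneath it.

    Assumes each line is tab-separated: change type, path, and (for
    renames) a second path. Returns a list of field lists.
    """
    target = PurePosixPath(target)
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # blank or unexpected line
        paths = [PurePosixPath(p) for p in fields[1:]]
        # Compare by path component, so /dev/py does not match /dev/python.
        if any(p == target or target in p.parents for p in paths):
            kept.append(fields)
    return kept


def run_zfs_diff(snap_a, snap_b):
    """Hypothetical helper: run `zfs diff -H` and return its output lines."""
    result = subprocess.run(
        ["zfs", "diff", "-H", snap_a, snap_b],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()
```

A call such as `filter_diff_lines(run_zfs_diff(a, b), "/dpool/vcmain/dev/py")` would then return only the records under that directory. An empty result means nothing changed under the path, not that the path is missing.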

@JavaScriptDude JavaScriptDude added the Type: Feature Feature request or new feature label Mar 12, 2022
@rincebrain
Contributor

This would be problematic until the caveat that zfs diff sometimes can't determine an object's filename in isolation is resolved, for example. (This can arise when you have multiple hardlinks to a file: there's only one name field, and if that copy of the file gets deleted, the object itself no longer knows its name; only the forward references from each hardlink do.)
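The hardlink caveat can be illustrated with a toy model. This is only a sketch of the idea described above, not ZFS's actual on-disk structures: the classes `FileObject` and `ToyFS` and the `name_hint` field are hypothetical. Each object remembers a single name, while directory entries hold the forward references.

```python
class FileObject:
    """Toy file object that remembers only the name it was created under."""

    def __init__(self, obj_id, name):
        self.obj_id = obj_id
        self.name_hint = name  # the single back-reference


class ToyFS:
    def __init__(self):
        self.entries = {}  # path -> FileObject (forward references)
        self.next_id = 1

    def create(self, path):
        obj = FileObject(self.next_id, path)
        self.next_id += 1
        self.entries[path] = obj
        return obj

    def link(self, existing, new):
        # A hardlink adds a forward reference; the object is not updated.
        self.entries[new] = self.entries[existing]

    def unlink(self, path):
        del self.entries[path]

    def name_from_object(self, obj):
        """Resolve a name starting from the object alone."""
        hint = obj.name_hint
        return hint if self.entries.get(hint) is obj else None

    def names_from_directory(self, obj):
        """Resolve names by scanning every forward reference."""
        return sorted(p for p, o in self.entries.items() if o is obj)


fs = ToyFS()
obj = fs.create("/a")
fs.link("/a", "/b")
fs.unlink("/a")
# The object still exists via /b, but its own name hint is now stale.
stale = fs.name_from_object(obj)
live = fs.names_from_directory(obj)
```

In this model the object-side lookup fails after `/a` is removed, while the directory-side scan still finds `/b`. A path filter that works from changed objects back to names would hit the same gap.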

@JavaScriptDude
Author

@rincebrain - Is there an open ticket for that issue?

@rincebrain
Contributor

#6335 maybe? There was a talk by a company at the dev summit who fixed most of these in their implementation internally, but they haven't open sourced it (yet?).

@JavaScriptDude
Author

Thanks @rincebrain

I have been using ZFS as a secondary version control system for my development and have written a multi-purpose Python tool for this, called zfsvc, based on my zfslib library. It's internal only, but I will release it eventually. This gives me much more granular VC histories, which I use daily for timekeeping and auditing.

Having more discrete filtering of diffs from zfs diff would greatly improve such tools and workflows.

FYI - Here is an example of zfsvc:
zfsvc diff --discrete -D 8H -p /dpool/vcmain/dev/py
Output:

Dataset: dpool/vcmain (/dpool/vcmain)
     From: 2022-03-17 13:04:48 (as_22-03-17_17:04:47_hr)
       To: 2022-03-17 14:15:03 (as_22-03-17_18:15:03_freq)
-------------------------------------------------------------------------------------------------------------------------------------------------------
      date       |                snapshot     | ? |           file           |                 rpath                       | l_add | l_rem
2022-03-17 13:15 | as_22-03-17_17:15:02_freq   | M | pymssql_test.py          | /dev/py/mycorp/mycorp_sql_stuff             |     4 |     2
2022-03-17 13:30 | as_22-03-17_17:30:28_freq   | M | launch.json              | /dev/py/mycorp/mycorp_sql_stuff/.vscode     |     4 |     2
2022-03-17 13:30 | as_22-03-17_17:30:28_freq   | M | pymssql_test.py          | /dev/py/mycorp/mycorp_sql_stuff             |    17 |     7
2022-03-17 13:45 | as_22-03-17_17:45:12_freq   | M | launch.json              | /dev/py/mycorp/mycorp_sql_stuff/.vscode     |     8 |     4
2022-03-17 13:45 | as_22-03-17_17:45:12_freq   | M | pymssql_test.py          | /dev/py/mycorp/mycorp_sql_stuff             |     8 |     3
2022-03-17 14:00 | as_22-03-17_18:00:03_hr     | M | launch.json              | /dev/py/mycorp/mycorp_sql_stuff/.vscode     |     0 |     0
2022-03-17 14:00 | as_22-03-17_18:00:03_hr     | + | launch.json              | /dev/py/db/pymssql_tester/.vscode           |     - |     -
2022-03-17 14:00 | as_22-03-17_18:00:03_hr     | + | pymssql_tester.py        | /dev/py/db/pymssql_tester                   |     - |     -
2022-03-17 14:15 | as_22-03-17_18:15:03_freq   | M | pymssql_tester.py        | /dev/py/db/pymssql_tester                   |    39 |     8
2022-03-17 14:15 | as_22-03-17_18:15:03_freq   | + | qcorelite.cpython-37.pyc | /dev/py/db/pymssql_tester/__pycache__       |     - |     -
-------------------------------------------------------------------------------------------------------------------------------------------------------
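The l_add and l_rem columns above look like per-file counts of lines added and removed between snapshots. How zfsvc computes them is not shown in this thread; the following is a minimal sketch of one way to get similar numbers, assuming both versions of the file are readable through the dataset's `.zfs/snapshot/<name>/` directory. The helper names `snapshot_file_path` and `count_line_changes` are hypothetical.

```python
import difflib
from pathlib import PurePosixPath


def snapshot_file_path(mountpoint, snapshot, rel_path):
    """Build the path of a file as seen inside a snapshot of a dataset."""
    return (PurePosixPath(mountpoint) / ".zfs" / "snapshot" / snapshot
            / rel_path.lstrip("/"))


def count_line_changes(old_text, new_text):
    """Return (lines_added, lines_removed) between two versions of a text."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    added = removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1  # lines present only in the old version
        if tag in ("replace", "insert"):
            added += j2 - j1    # lines present only in the new version
    return added, removed
```

Reading the two files at the paths returned by `snapshot_file_path` and passing their contents to `count_line_changes` gives one row's worth of counts. A changed line is counted once as removed and once as added, which is a design choice of this sketch rather than anything confirmed about zfsvc.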

@rincebrain
Contributor

You might find using the "punt it to userland and make userland calculate the diff" output from #12837 useful, since you care specifically about the objects changed.

@JavaScriptDude
Author

Sounds interesting. Have you seen a practical example of how this type of userland diff would be done? I've never had to use zfs send or recv so far and #12837 is not super explicit on the technique.

Thanks again for the info!

@rincebrain
Contributor

rincebrain commented Mar 18, 2022 via email

@almereyda

@JavaScriptDude Can you comment on the progress of your organisation's internal discussions about open sourcing zfsvc?

@JavaScriptDude
Author

JavaScriptDude commented Sep 25, 2022

I use it daily, and it's my personal code base with no org to worry about. I will look into putting it on GitHub in the future. The code depends on my internal 'core' Python library, which is pretty huge, so I will need to strip it down into a lighter one first.

@almereyda

Thank you for the insights, and looking forward to the release.
