
Convert share ID during snapshot task restore #2355 #2507

Conversation

@FroggyFlox FroggyFlox commented Mar 3, 2023

Fixes #2355
@phillxnet, @Hooverdan96: ready for testing.

We previously recreated scheduled tasks of type snapshot verbatim from the config backup file. As share IDs are highly likely to differ between system installations, this resulted in scheduled tasks of type snapshot being re-created against the wrong share upon config backup restore.

This PR fixes this by:

  1. fetching the share name from the ID in the backup file (source)
  2. fetching the corresponding share ID in the new (target) system
  3. updating the restore API call accordingly
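The ID-to-name-to-ID remapping in steps 1-3 can be sketched as follows. This is a toy illustration with a hypothetical helper and plain dicts standing in for the backup models and the target database; the real logic lives in the config backup restore view:

```python
def remap_share_id(source_share_id, source_shares, target_shares):
    """Translate a share ID from the backup (source) system to the
    corresponding ID on the restored (target) system, via the share name."""
    # Step 1: share ID -> share name, using the models in the backup file.
    sname = source_shares.get(source_share_id)
    if sname is None:
        raise ValueError(f"share id {source_share_id} not found in backup")
    # Step 2: share name -> share ID on the target system.
    for target_id, target_name in target_shares.items():
        if target_name == sname:
            # Step 3: this ID goes into the restore API payload.
            return target_id
    raise ValueError(f"share {sname!r} not found on target system")


# Toy data mirroring the test scenario below: names match, IDs differ.
source = {2: "test_share01", 3: "test_share02"}
target = {5: "test_share01", 6: "test_share02"}
print(remap_share_id(2, source, target))  # -> 5
```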

This PR also fixes some related issues by:

  • updating the API endpoint for scheduled tasks following #2365
  • continuing the restore process if a taskdefinition fails to be validated (#2355)

Moreover, the current restore_scheduled_tasks() logic does both the preparation/parsing of the API payload and the API call itself, which makes the former difficult to test. This PR thus refactors that part into:

  • parsing/preparation: validate_task_definitions()
  • API call: restore_scheduled_tasks()
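A minimal sketch of this split (the function names follow the PR, but the signatures and bodies here are illustrative assumptions, not the actual implementation):

```python
import json


def validate_task_definitions(ml):
    """Parse smart_manager.taskdefinition models from the backup contents
    (ml) and return ready-to-send API payloads. In the real code, entries
    that fail validation (e.g. an unknown share ID) are skipped."""
    payloads = []
    for m in ml:
        if m.get("model") != "smart_manager.taskdefinition":
            continue
        f = m["fields"]
        payloads.append({
            "name": f["name"],
            "task_type": f["task_type"],
            "crontab": f["crontab"],
            "crontabwindow": f["crontabwindow"],
            "enabled": f["enabled"],
            # json_meta is stored as an escaped JSON string inside the model.
            "meta": json.loads(f["json_meta"]),
        })
    return payloads


def restore_scheduled_tasks(ml, api_call):
    """Only the API interaction lives here, so parsing is testable alone."""
    for payload in validate_task_definitions(ml):
        api_call("https://localhost/api/sm/tasks", data=payload)
```

Because validate_task_definitions() takes plain parsed-backup data and returns plain payloads, it can be unit-tested without mocking any HTTP layer.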

Demonstration and Testing

  1. Create 1 pool with 2 test shares: test_share01 and test_share02.
  2. Create 1 daily snapshot task for each of these shares: 2 scheduled tasks total: snap_daily_ts01 and snap_daily_ts02.
  3. Create a config backup and download to local disk.
  4. Turn off the VM, attach the disk containing the pool from step 1 to a new VM, and import the pool using Rockstor's webUI. This step increases our chances of the two test shares from step 1 having different share IDs in the new system than they had in the first VM; this was verified by inspecting the storageadmin.share model.
  5. Manually edit the config backup saved in step 3 so that snap_daily_ts01 refers to an invalid share ID (one that does not exist in the config backup file). This tests that we do not attempt to restore an invalid share ID, and that a single invalid taskdefinition does not fail the entire config restore process; the latter currently fails in our testing/master branch. In the example below, snap_daily_ts01 now refers to share ID 55, which does not exist:
[
  {
    "model": "smart_manager.taskdefinition",
    "pk": 2,
    "fields": {
      "name": "snap_daily_ts01",
      "task_type": "snapshot",
      "json_meta": "{\"writable\": true, \"visible\": true, \"prefix\": \"snap_daily_ts01\", \"share\": \"55\", \"max_count\": \"4\"}",
      "enabled": false,
      "crontab": "42 3 * * *",
      "crontabwindow": "*-*-*-*-*-*"
    }
  },
  {
    "model": "smart_manager.taskdefinition",
    "pk": 3,
    "fields": {
      "name": "snap_daily_ts02",
      "task_type": "snapshot",
      "json_meta": "{\"writable\": true, \"visible\": true, \"prefix\": \"snap_daily_ts02\", \"share\": \"4\", \"max_count\": \"4\"}",
      "enabled": false,
      "crontab": "42 3 * * *",
      "crontabwindow": "*-*-*-*-*-*"
    }
  }
]
  6. Upload this altered config backup file and start the restore process. Relevant part of the logs:
[03/Mar/2023 10:29:26] INFO [storageadmin.views.config_backup:223] Started restoring scheduled tasks.
[03/Mar/2023 10:29:26] INFO [storageadmin.views.config_backup:273] An unexpected error occurred while trying to restore a task (snap_daily_ts01): local variable 'sname' referenced before assignment
[03/Mar/2023 10:29:26] INFO [storageadmin.views.config_backup:57] Successfully created resource: https://localhost/api/sm/tasks. Payload: {'task_type': u'snapshot', 'name': u'snap_daily_ts02', 'crontabwindow': u'*-*-*-*-*-*', 'enabled': False, 'crontab': u'42 3 * * *', 'meta': {u'writable': True, u'visible': True, u'prefix': u'snap_daily_ts02', u'share': u'4', u'max_count': u'4'}}
[03/Mar/2023 10:29:26] INFO [storageadmin.views.config_backup:227] Finished restoring scheduled tasks.
  • Parsing the taskdefinition of snap_daily_ts01 failed due to the invalid share ID.
  • The restore process moved on despite this failure and restored snap_daily_ts02 successfully.
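This continue-on-failure behavior can be sketched as wrapping each task restore in its own try/except. The snippet below is a simplified illustration with a stand-in API function, not the actual view code:

```python
import logging

logger = logging.getLogger(__name__)


def restore_tasks(payloads, api_call):
    """Attempt each task independently; one bad taskdefinition is logged
    and skipped instead of aborting the whole config restore."""
    restored = []
    for p in payloads:
        try:
            api_call("https://localhost/api/sm/tasks", data=p)
            restored.append(p["name"])
        except Exception as e:
            # Mirrors the log line above for snap_daily_ts01.
            logger.info(
                "An unexpected error occurred while trying to restore "
                "a task (%s): %s", p.get("name"), e
            )
    return restored


# Stand-in API call that fails only for the task with the invalid share ID.
def fake_api(url, data=None):
    if data["name"] == "snap_daily_ts01":
        raise ValueError("invalid share id")


print(restore_tasks(
    [{"name": "snap_daily_ts01"}, {"name": "snap_daily_ts02"}], fake_api
))  # -> ['snap_daily_ts02']
```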

The scheduled tasks webUI confirms the above:
[screenshot: Scheduled Tasks webUI]

Unit testing

Two new unit tests are included:

radmin:/opt/rockstor # cd src/rockstor && poetry run django-admin test -v 2 -p test_config_backup.py
(...)
test_get_sname (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_update_rockon_shares (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_valid_requests (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_validate_install_config (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_validate_service_status (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_validate_task_definitions (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_validate_taskdef_meta (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_validate_update_config (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok

----------------------------------------------------------------------
Ran 8 tests in 0.079s

OK

Overall unit tests:

radmin:/opt/rockstor # cd src/rockstor && poetry run django-admin test
Creating test database for alias 'default'...
Creating test database for alias 'smart_manager'...
System check identified no issues (0 silenced).
..............................................................................................................................................................................................................................................
----------------------------------------------------------------------
Ran 238 tests in 12.619s

OK
Destroying test database for alias 'default'...
Destroying test database for alias 'smart_manager'...

The accompanying commit message additionally notes the related fixes:
  - updating the API endpoint for scheduled tasks following rockstor#2365
  - continuing the restore process if a taskdef fails to be validated (rockstor#2355)
@FroggyFlox FroggyFlox added the needs review Ideally by prior rockstor-core contributor label Mar 3, 2023
@FroggyFlox FroggyFlox linked an issue Mar 3, 2023 that may be closed by this pull request
@phillxnet (Member) left a comment:

Testing

Restore by share name not share id:

VM1

Creating two snap tasks of two shares:
[screenshot: vm1-snap-tasks]

Share IDs
dnsmasq-config: 2
pihole-config: 4

Config backup created and the above IDs confirmed within the file.

Delete existing share with id: 3
VM1 shutdown and data pool devices moved to VM2

VM2 (4.5.7-2507 rpm upgrade to 4.5.6-0 installer instance)

Create share on ROOT pool to take up share id 2 ('home' is share id 1)
Import prior VM1 pool and check target share ids of to-be-restored snapshot tasks:

Share IDs
dnsmasq-config: 3 # mismatch vs VM1 and config save
pihole-config: 4 # same as in VM1 and config save

Upload VM1 config save file
Restore config from uploaded file.

[05/Mar/2023 14:50:02] INFO [storageadmin.views.config_backup:228] Started restoring scheduled tasks.
[05/Mar/2023 14:50:02] INFO [storageadmin.views.config_backup:57] Successfully created resource: https://localhost/api/sm/tasks. Payload: {'task_type': u'snapshot', 'name': u'snap_dnsmasq', 'crontabwindow': u'*-*-*-*-*-*', 'enabled': True, 'crontab': u'*/5 * * * *', 'meta': {u'writable': True, u'visible': True, u'prefix': u'test2507', u'share': u'3', u'max_count': u'5'}}
[05/Mar/2023 14:50:02] INFO [storageadmin.views.config_backup:57] Successfully created resource: https://localhost/api/sm/tasks. Payload: {'task_type': u'snapshot', 'name': u'snap_pihole-config', 'crontabwindow': u'*-*-*-*-*-*', 'enabled': True, 'crontab': u'*/5 * * * *', 'meta': {u'writable': True, u'visible': True, u'prefix': u'test2507', u'share': u'4', u'max_count': u'5'}}
[05/Mar/2023 14:50:02] INFO [storageadmin.views.config_backup:232] Finished restoring scheduled tasks.

5 mins later we have:

Mar 05 14:55:01 buildvm CRON[21032]: (root) CMD (/opt/rockstor/.venv/bin/st-snapshot 2 \*-*-*-*-*-*)
Mar 05 14:55:01 buildvm CRON[21033]: (root) CMD (/opt/rockstor/.venv/bin/st-snapshot 1 \*-*-*-*-*-*)
Mar 05 14:55:02 buildvm CRON[21029]: (root) CMDEND (/opt/rockstor/.venv/bin/st-snapshot 2 \*-*-*-*-*-*)
Mar 05 14:55:02 buildvm kernel: BTRFS warning (device sda): qgroup rescan is already in progress
Mar 05 14:55:02 buildvm kernel: BTRFS info (device sda): qgroup scan completed (inconsistency flag cleared)
Mar 05 14:55:03 buildvm CRON[21030]: (root) CMDEND (/opt/rockstor/.venv/bin/st-snapshot 1 \*-*-*-*-*-*)

And the update in our already restored tasks:

[screenshot: vm2-snap-tasks-restored]

So we have a functional test of restore by share name where id matches (between config save and target db) and where it does not.

And the resulting restored scheduled snapshot task is creating new snapshots:
[screenshot: restored-snap-task-snapshot]

@FroggyFlox Note this last image:
We have an anomaly regarding the visibility component of the imported and restored snapshots. I very much think this is for another issue and relates to our import of these snapshots.

This is a very important fix. Thanks for seeing it through to conclusion. Much appreciated.

I'll merge so we can include in the next testing channel release.

@phillxnet (Member) commented:

@FroggyFlox, by way of the future spin-off issue: when I re-attached the pool to the (still powered-off) VM1 from this review, it successfully picked up where it left off, with the following interpretation of the snapshots taken first by itself, then by VM2, and then by itself again:

[screenshot: snapshot-visibility-re-import-and-return]

@phillxnet phillxnet removed the needs review Ideally by prior rockstor-core contributor label Mar 5, 2023
@phillxnet phillxnet merged commit 4b7cc88 into rockstor:testing Mar 5, 2023
@FroggyFlox FroggyFlox deleted the 2355_restore_scheduled_tasks_incorrect_shares branch May 27, 2023 15:43
Linked issue (may be closed by this pull request): Backup/Restore references incorrect shares for Scheduled Tasks