Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Py3.6 System - Config backups - Backup Current Configuration #2569

Closed
phillxnet opened this issue Jun 6, 2023 · 10 comments · Fixed by #2592
Closed

Py3.6 System - Config backups - Backup Current Configuration #2569

phillxnet opened this issue Jun 6, 2023 · 10 comments · Fixed by #2592
Assignees

Comments

@phillxnet
Copy link
Member

phillxnet commented Jun 6, 2023

Post #2567 in testing branch we have a failure in our Backup Current Configuration button function:

[06/Jun/2023 11:54:56] ERROR [storageadmin.util:45] Exception: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/storageadmin/views/config_backup.py", line 662, in post
    cbo = backup_config()
  File "/opt/rockstor/src/rockstor/system/config_backup.py", line 76, in backup_config
    cbo = ConfigBackup(filename=gz_name, md5sum=md5sum(fp), size=size)
  File "/opt/rockstor/src/rockstor/system/osi.py", line 1444, in md5sum
    for l in tfo.readlines():
  File "/usr/lib64/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

Noting a PyCharm editor warning on: src/rockstor/storageadmin/models/config_backup.py
return os.path.join(self.cb_dir(), self.filename)
"Unexpected Types"

@FroggyFlox
Copy link
Member

That one is due to readlines() now failing when fed a gzipped file. We have 2 options here:

  • uncompress the file and get the md5 disgest from the uncompressed file
  • open the gzipped file in binary mode and get the md5 digest from that

The big thing we may need to account for here, relates to how we use the md5 digest in that area:

  • when we create or upload a new config backup file, we store gzip it if needed, and then create an md5 digest
  • we store that md5 digest in the related model entry
  • when listing all config backups, we verify that the md5 digest of the file on disk matches what is stored in the database and delete the model entry if it doesn't match. We do NOT delete the actual file.

The concern I have here is that the different method of generating the md5 digest may lead to a different digest than before. If that is the case, we would thus have a mismatch for all config backups currently in the database and we would thus delete their entry... Not the greatest user experience. We would thus need to verify whether the digest differ with either method when compared to our previous solution.

@phillxnet
Copy link
Member Author

@FroggyFlox Thanks for taking a look at this one: much apprecaited:

I vote for:

  • open the gzipped file in binary mode and get the md5 digest from that

Re:

We would thus need to verify whether the digest differ with either method when compared to our previous solution.

OK, that's pretty messy.

I wonder if we could just flag as legacy older (non matching) config files - rather than have them wiped from the db completely.

@FroggyFlox
Copy link
Member

Yeah... I'm really not sure how to deal with that one... I'll try to have a clear idea of whether md5 do differ and how we can adapt them accordingly. Hopefully we can indeed find a good solution.

@phillxnet
Copy link
Member Author

@FroggyFlox Noting here our remaining test issues - all of which are around this issues related code I think:

buildvm:/opt/rockstor/src/rockstor # poetry run django-admin test -v 2
...
test_valid_requests (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... FAIL
test_validate_install_config (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_validate_service_status (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_validate_task_definitions (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... FAIL
test_validate_taskdef_meta (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ERROR
test_validate_update_config (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_get_requests (rockstor.storageadmin.tests.test_dashboardconfig.DashboardConfigTests) ... ok
test_post_requests (rockstor.storageadmin.tests.test_dashboardconfig.DashboardConfigTests) ... ok
test_put_requests (rockstor.storageadmin.tests.test_dashboardconfig.DashboardConfigTests) ... ok
...
Ran 253 tests in 26.383s

FAILED (failures=2, errors=1)

@FroggyFlox
Copy link
Member

I tried to test whether the md5 digest would remain identical between an update from current master branch to current testing branch:

  1. Build Rockstor from source using the master branch
  2. Create a config backup and write down the md5 as listed in the ConfigBackup entry
  3. Delete Poetry venv, and src/rockstor, but make sure to keep the static/config-backup intact
  4. Switch to testing branch, and upload to the VM
  5. Change md5sum() to open the file in rb mode (and remove the .encode() bit)
  6. Build Rockstor from that source
  7. Manually recreate the ConfigBackup entry as was created in step 2. We thus have a DB entry with the same MD5 digest as in Python2.
  8. Login to the webUI and visit the Config Backup page. This triggers our current md5 digest verification that would delete the DB entry if the MD5 digest of the file on disk differs from what is in the DB entry. We see the following in the logs:
[07/Jun/2023 18:02:21] DEBUG [storageadmin.views.config_backup:606] Process cbo with /opt/rockstor/static/config-backups/backup-2023-06-07-174643.json.gz
[07/Jun/2023 18:02:21] DEBUG [storageadmin.views.config_backup:654] MD5: file: de08e9eab773190d1455606af90d4cda, DB: de08e9eab773190d1455606af90d4cda

The MD5 is thus still the same in testing and the DB entry is thus not automatically deleted 🙂 .

Now, to further test, delete this config backup using the webUI.

  1. Create a new backup config: Works
  2. Download backup to local disk, and re-upload: Rockstor warns that an entry with the same filename already exists and stops there: Works
  3. Delete the entry using the webUI
  4. Upload the same file: Works. The md5 digest is the same as in step 1 above.
  5. Delete the entry using the webUI
  6. Extract the config backup and upload that: Rockstor gzips the file for us, and the resulting md5 is the same as in steps 1 and 4 above: Works

All of that is reassuring.

@FroggyFlox
Copy link
Member

As briefly mentioned above, the original error reported in this issue can be fixed by opening the gzip file in rb mode before getting its md5 digest. We currently open the file in text mode (I think, as it is the default), and then encode each line manually (md5.update(l.encode()))... Opening in rb mode may thus be a direct equivalent to this manipulation but that needs to be tested as I'm really not sure about that. We can always put a conditional and open in rb mode only for gzip files, but I'd rather not complicate that function is not necessary.

For information, system.osi.md5sum() is used as follows (in addition to the config backup area in question here):

  • scripts.install_or_update_systemd_service(): to verify the target file is already as we need it to be. If md5 digests between target and source files differ, then we overwrite the target file with the source file.
  • smart_manager.views.samba_service._write_smb_service(): same as above.

It thus seems like we could simply switch to rb mode overall without too many consequences? If that does change the md5 digest for these files, they would simply be overwritten again... I still need to verify whether that would be a problem, though.

@FroggyFlox
Copy link
Member

Opening in rb mode may thus be a direct equivalent to this manipulation but that needs to be tested as I'm really not sure about that.

Let's test that using a Python console, based on our newest Py3.6 branch:

>>> import os, hashlib

>>> def md5sum_new(fpath):
>>>     # return the md5sum of the given file
>>>     if not os.path.isfile(fpath):
>>>         return None
>>>     md5 = hashlib.md5()
>>>     with open(fpath, "rb") as tfo:
>>>         for l in tfo.readlines():
>>>             md5.update(l)
>>>     return md5.hexdigest()

>>> def md5sum_old(fpath):
>>>     # return the md5sum of the given file
>>>     if not os.path.isfile(fpath):
>>>         return None
>>>     md5 = hashlib.md5()
>>>     with open(fpath) as tfo:
>>>         for l in tfo.readlines():
>>>             md5.update(l.encode())
>>>     return md5.hexdigest()

>>> ss_src = "/usr/lib/systemd/system/smb.service"

>>> md5sum_new(ss_src) == md5sum_old(ss_src)
True
>>> md5sum_new(ss_src) != md5sum_old(ss_src)
False

>>> md5sum_new(ss_src)
'f96d97329b8314e9662ca24c8484b44d'
>>> md5sum_old(ss_src)
'f96d97329b8314e9662ca24c8484b44d'

Looks like we can safely replace the _old() version with the _new() version...
@phillxnet, @Hooverdan96, what do you think?

@Hooverdan96
Copy link
Member

looks like a "proof" right there, the outcome of your test yields the same md5 sum, so I would tend to say that should work.

@FroggyFlox
Copy link
Member

After fixing...

  • system.osi.md5sum()
  • storageadmin.views.config_backup.validate_taskdef_meta()

... we have...

rockdev:/opt/rockstor # cd src/rockstor/ ; poetry run django-admin test -v 2 -p test_config_backup.py ; cd -
(...)
test_get_sname (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_update_rockon_shares (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_valid_requests (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_validate_install_config (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_validate_service_status (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_validate_task_definitions (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_validate_taskdef_meta (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok
test_validate_update_config (rockstor.storageadmin.tests.test_config_backup.ConfigBackupTests) ... ok

----------------------------------------------------------------------
Ran 8 tests in 0.153s

OK

... and thus...

rockdev:/opt/rockstor # cd src/rockstor/ ; poetry run django-admin test  ; cd -
(...)
.............................................................................................................................................................................................................................................................
----------------------------------------------------------------------
Ran 253 tests in 11.747s

OK

... resulting in a functional restore of a config backup taken on a master branch build, with a few scheduled tasks (snapshot + scrub + reboot):

[08/Jun/2023 17:46:16] INFO [storageadmin.views.config_backup:234] Started restoring scheduled tasks.
[08/Jun/2023 17:46:16] INFO [storageadmin.views.config_backup:57] Successfully created resource: https://localhost/api/sm/tasks. Payload: {'name': 'root_scrub', 'task_type': 'scrub', 'crontab': '42 3 * * 5', 'crontabwindow': '*-*-*-*-*-*', 'enabled': False, 'meta': {'pool_name': 'ROOT', 'pool': '1'}}
[08/Jun/2023 17:46:16] INFO [storageadmin.views.config_backup:57] Successfully created resource: https://localhost/api/sm/tasks. Payload: {'name': 'snap-daily_rs01', 'task_type': 'snapshot', 'crontab': '42 3 * * 5', 'crontabwindow': '*-*-*-*-*-*', 'enabled': True, 'meta': {'writable': True, 'visible': True, 'prefix': 'snap_daily_rs01', 'share': '2', 'max_count': '4'}}
[08/Jun/2023 17:46:16] INFO [storageadmin.views.config_backup:57] Successfully created resource: https://localhost/api/sm/tasks. Payload: {'name': 'reboot_test', 'task_type': 'reboot', 'crontab': '42 3 1 * *', 'crontabwindow': '*-*-*-*-*-*', 'enabled': True, 'meta': {}}
[08/Jun/2023 17:46:16] INFO [storageadmin.views.config_backup:238] Finished restoring scheduled tasks.
[08/Jun/2023 17:46:16] INFO [storageadmin.tasks:64] Task [restore_config], id: cd52c9a2-b5fa-4b0b-bfda-62f95a8f48e0 completed OK

FroggyFlox added a commit to FroggyFlox/rockstor-core that referenced this issue Jun 9, 2023
Following our move to Py3.6, we were unable to create a new config backup
due to a failure to get the md5 digest of a gzip file (such as a config
backup). Our config backup restore was also failing to the presence of a
Py2.7-specific bit of code.

This commit:
- adjusts `md5sum()` to be able to read a compressed file as well.
- ensures the task def validation returns the share/pool id as string
- add type hints to help ensure proper types are used/returned
- minor docstrings adjustments
- Copyright update
@FroggyFlox FroggyFlox linked a pull request Jun 9, 2023 that will close this issue
@FroggyFlox FroggyFlox self-assigned this Jun 9, 2023
phillxnet added a commit that referenced this issue Jun 9, 2023
…ups_Backup_Current_Configuration

Restore config backup functionality #2569
@FroggyFlox
Copy link
Member

Closing as fixed by #2592.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants