Config restore: samba share exports not consistently restored #2847
Comments
Hi @Hooverdan96 , Thanks a lot for yet another critical report... I'm trying to think about this one and wanted to verify something first: the import of the Pool(s) that contained the problematic Samba shares in question went fine? All shares are listed and without error? |
@FroggyFlox, yes, all shares corresponding to the Samba exports were imported correctly. The one that I forgot to recreate was a transcode directory on the root drive. But for that one, I also didn't have a Samba share created (so there wasn't anything in the backup for that). None of the Samba shares were connected to the root drive originally, only to the pool-related shares. |
Thanks for the confirmation... Sorry for all the questions, but that one is puzzling and we indeed do not seem to have good feedback in the error returned by Rockstor here. Would you be able to paste the full payload or was it truncated as well? I'm interested in the following:
Alternatively, do you see more debug information in the logs if you turn ON debug mode? |
I can paste the full payload. It didn't seem important at the time :). Let me find it again in the logs. I now created some of them manually, since I needed to restore my "prod" system. |
Thanks a lot... Just curious in case there is something that jumps out of the ordinary with it. |
Only change I made to the output from the log is to insert line breaks between the shares. In the log it was one long line. |
Could it be that the samba service had not started yet (since there were custom settings to be updated) by the time it gets to the creation of shares? Though, I always felt that a samba server startup doesn't take that long ... |
If I understand correctly your first message...
... I would think that the status of the samba service does not come into play here (or at least is not causing the problem). On the other hand, the intermittent error with the docker service you observed may point to some timeouts or the like. I'd like to focus on the Samba issue first, though, as that seems to be the one consistently problematic. Can you confirm that creating a new Samba export works just fine? |
I can confirm creating a Samba share manually worked just fine (multiple ones in fact). Can you remind me again where I set the debug flag? Is this the one:
I'll see that I can get to that in a little bit and can then report back. |
Yes, that's it. |
@Hooverdan96 , so far my best guess is on that point above. I've indeed had a quick look at the restore code and we don't seem to have a mechanism in place to update the share id from the old system (the one present on the backup being restored) and the share id on the new system. I thought we had that as I remember some issue like that not too long ago. I'll have to dig deeper but so far that's my best guess. |
I implemented that kind of id conversion for pools and shares related to scheduled tasks but that did not touch samba shares, etc... @Hooverdan96, if you can confirm that the share IDs indeed do not match, then it's very likely to be the culprit. One could verify that by manually fixing the share IDs in the config backup file but that's probably too cumbersome for your system as you have a lot of samba shares. It'll need to be confirmed and tested on a test machine so that we can have a nice reproducer. |
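To make the suspected mismatch concrete, here is a minimal sketch of how one might check whether the share IDs referenced by the Samba exports in a backup still point at the same shares on the target system. It assumes the backup is a list of Django `dumpdata`-style records; the model labels (`storageadmin.share`, `storageadmin.sambashare`), the `share` field name, and the sample data are illustrative assumptions, not taken from the actual backup or from Rockstor's code.

```python
from typing import Dict, List

# Hypothetical, simplified records in the style of a Django "dumpdata" export.
# Model labels and field names are assumptions made for illustration only.
backup_records: List[dict] = [
    {"model": "storageadmin.share", "pk": 3, "fields": {"name": "media"}},
    {"model": "storageadmin.share", "pk": 4, "fields": {"name": "photos"}},
    {"model": "storageadmin.sambashare", "pk": 1, "fields": {"share": 3}},
    {"model": "storageadmin.sambashare", "pk": 2, "fields": {"share": 4}},
]

# Hypothetical share name -> share id mapping on the freshly installed system.
target_share_ids: Dict[str, int] = {"media": 7, "photos": 4}


def mismatched_exports(records: List[dict], target_ids: Dict[str, int]) -> List[str]:
    """Return the share names whose id in the backup differs from the target's."""
    # Map each backed-up share id to its share name.
    backup_names = {
        r["pk"]: r["fields"]["name"]
        for r in records
        if r["model"] == "storageadmin.share"
    }
    mismatches = []
    for r in records:
        if r["model"] != "storageadmin.sambashare":
            continue
        old_id = r["fields"]["share"]
        name = backup_names.get(old_id)
        if name is None:
            # The export references a share id that is absent from the backup.
            mismatches.append(f"<unknown id {old_id}>")
        elif target_ids.get(name) != old_id:
            mismatches.append(name)
    return mismatches


print(mismatched_exports(backup_records, target_share_ids))
```

A non-empty result would support the theory that the restore fails because the old IDs are used as-is on the new system.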
Thank you so much for testing all of that! |
For reference, one should look at the restore scheduled tasks logic to update the share id. In particular, we already have a simple helper of interest:
I'm not sure about the quality of error handling here, but it should at least return us to a working state and further improvements to error handling (if/where needed) could be the focus of future PR(s). |
Would this also be necessary for NFS and SFTP exports? |
@Hooverdan96, |
As expected, I can reproduce the failure to restore samba shares if their IDs in the config backup mismatch the existing share ids in the target system. I simply edited the config backup file to change the share ids to something else:
Upon restore, we see the following error in the logs:
|
@FroggyFlox Thanks for the simple reproducer here. I was about to assign myself to this issue and see if I could find the same as a starting point. Shall I proceed (with self-assignment), or were you about to do the same? My vague plan thus far, based on @Hooverdan96's and your exposition to date, was to reuse your prior code, already referenced, and apply the same work-around to help make this restore function more robust. We still have to deal with changing share / pool names. But given we don't yet support that, we can cross that bridge when we get to it. I.e. advise, on every future share/pool name-change, that a fresh config-backup be created. As we have to maintain our config-backup file format backward support capability. Using the id was always going to fail, but it's where we find ourselves and we have already made some in-roads to addressing the short-falls. |
@phillxnet , please go for it if you were about to as I don't know when I would be able to for sure.
That would have been my plan as well. |
@FroggyFlox OK, I'll have a go - as this is a show-stopper for our next Stable release so wanted to help nudge this along if I can. I'll report progress here as usual: assuming I make any :). |
…ckstor#2847 Account for Share ID native transform requirement re SMB share export. Cross-references, within config-backup file, Share ID used in each SMB export to retrieve prior Share name. Share name is then used to retrieve native DB Share ID for SMB share export restore process. Assumes Share name continuity across config-backup/restore process. Includes: - Incidental additions of more type hinting. - Additional docstrings.
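The transform described in that commit message (old Share ID, to Share name, to native Share ID) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the linked PR: the function name `remap_smb_share_ids`, the model labels, the `share` field name, and the sample records are all hypothetical, and share names are assumed to be unchanged between backup and restore.

```python
from typing import Dict, List, Tuple

# Hypothetical dumpdata-style records; labels and field names are assumptions.
old_backup: List[dict] = [
    {"model": "storageadmin.share", "pk": 3, "fields": {"name": "media"}},
    {"model": "storageadmin.share", "pk": 4, "fields": {"name": "photos"}},
    {"model": "storageadmin.sambashare", "pk": 1, "fields": {"share": 3, "comment": "Media"}},
    {"model": "storageadmin.sambashare", "pk": 2, "fields": {"share": 4, "comment": "Photos"}},
    {"model": "storageadmin.sambashare", "pk": 3, "fields": {"share": 9, "comment": "Orphan"}},
]

# Hypothetical share name -> native share id mapping on the new install.
native_ids: Dict[str, int] = {"media": 12, "photos": 13}


def remap_smb_share_ids(
    records: List[dict], native: Dict[str, int]
) -> Tuple[List[dict], List[dict]]:
    """Rewrite each SMB export's share id to the native id of the same-named share.

    Returns (remapped_exports, skipped_exports). Exports whose share cannot be
    resolved by name are skipped rather than restored with a wrong id.
    """
    # Step 1: old share id -> share name, read from the backup itself.
    old_names = {
        r["pk"]: r["fields"]["name"]
        for r in records
        if r["model"] == "storageadmin.share"
    }
    remapped, skipped = [], []
    for r in records:
        if r["model"] != "storageadmin.sambashare":
            continue
        # Step 2: share name -> native share id on the target system.
        name = old_names.get(r["fields"]["share"])
        new_id = native.get(name) if name is not None else None
        if new_id is None:
            skipped.append(r)
            continue
        # Copy the record so the original backup data is left untouched.
        remapped.append({**r, "fields": {**r["fields"], "share": new_id}})
    return remapped, skipped


exports, skipped = remap_smb_share_ids(old_backup, native_ids)
print([e["fields"]["share"] for e in exports], len(skipped))
```

Keying on the share name is what makes the restore independent of database IDs, at the cost of requiring that shares are not renamed between backup and restore.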
@Hooverdan96 I've narrowed the scope a little in the associated PR re comment: #2864 (comment) As your & @FroggyFlox's investigation/exposition/reproducer similarly narrowed to the SMB share exports restore failure.
I think it takes a little while for each Rock-on to be installed and then show up in the Web-UI, but I'm not convinced that is what you saw there. So I'm similarly narrowing the title on this issue, given there were failures in the samba share exports restore that could have, in turn, messed up some other restore element. They are meant to each fail through but alas we have, obviously, a little more robustness work to do as of yet. But all-in I'm pretty sure we have at least now addressed (in the linked PR) the main focus here: SMB exports in the config-backup file failed to restore due to DB dump IDs mismatching the native (new install DB) share IDs. As per @FroggyFlox's prior work in this area, I've extended some of that into what we needed for the SMB export restore. I'd like to restore each SMB export under its own API call, but that also should be for a future PR, was my thinking. I.e. we first need to handle any restores with an associated Share ID mismatch. Later we can then enhance the robustness once we have it actually working under said conditions. As this then enables more robust testing anyway. Bit-by-bit and all in good time: hopefully. |
…res-and-rockons-not-consistently-restored Config restore: samba share exports not consistently restored #2847
Closing as: |
After installing the 5.0.9.0 version and restoring from a configuration exported from a 4.x system, I noticed two things.
Samba Shares were not created:
after the payload this message shows:
The samba service was actually updated with all the parameters from the previous install and was running.
Once the restore was finished, triggering it again, yielded the same error messages and no samba shares created.
The Rockons behaved a bit differently. I had attached the Rockon service to its prior RockonRoot manually (which was on one of the disk pools as a share).
None of the Rockons were showing in the UI (despite multiple refreshes). However, when running
docker ps
all containers were up and running. But the error messages I saw in the log all indicated that docker was not running. Here is an example for the netdata Rockon:
Even though the service seems to be running.
After restarting the restore operation (as described above), the installed Rockons were now visible in the WebUI. So that's good news, but I am stumped as to why the API initially did not recognize that the docker service was already running during the first restore.