(t) replication spawn error #2766 #2772
Update replication code re Py3.*
- Modernise previously missed replication imports re Py3.*
- Force bytes format for replication messages and commands, as zmq requires bytes for these.
- Minor modification re Python 3 behaviour of dict.keys(); we relied on an implicit Python 2 behaviour.
- Move to f-strings.
- Parameter/return type hinting.
- Removed an unused local variable.
- black format update.
- complete fstrings conversion for all files modified.
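The bytes and dict.keys() points in these commits can be illustrated with a minimal sketch. This is not the code from this PR: the `to_bytes` helper, the frame contents, and the `snapshots` dict are hypothetical names chosen for illustration only.

```python
from typing import Union


def to_bytes(value: Union[str, bytes], encoding: str = "utf-8") -> bytes:
    """Return value as bytes: encode str input, pass bytes through unchanged."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


# Hypothetical multipart message parts (identity, command, payload),
# normalised to bytes before they would be handed to a zmq socket.
frames = [to_bytes(part) for part in ("example-appliance-id", b"receiver-ready", "")]

# Python 3: dict.keys() returns a view, not a list, so indexing it fails.
snapshots = {"snap_1": 10, "snap_2": 20}  # hypothetical data
try:
    snapshots.keys()[0]
    keys_indexable = True
except TypeError:
    keys_indexable = False

# Explicit alternatives that behave the same on any Python 3 version.
first_key = next(iter(snapshots))
all_keys = list(snapshots)
```

Normalising at the send boundary keeps the rest of the code free to use str internally.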
Ongoing dev notes. To date we have:
Sender:
Receiver:
- Improve diagnostic content when the receiver fails to retrieve the sender's IP address from the sent appliance ID.
- more debug logging - remove receiver 'latest_snap or b""' to simplify for debug.
Sender
Receiver
- More debug logging. - reduce retry iterations from 10 to 3. - more type hints.
- Yet more debug logging. - Remove use of None from within zmq command/message passing: to help with stricter type hinting.
- return prior latest_snap name, sent as the message in receiver-ready. - remove libzmq socket.set_hwm.
- refactor poll -> poller socks -> events for readability. - additional explanatory comments re sockets etc. - formatting typo re fstrings. - additional typing.
- Enable tracker on receiver's response: to assist debug.
- Enable tracker on listener_broker and sender. - add zmq_version and libzmq_version properties to sender; receiver also now has these.
- Enable trackers on listener_broker. - more type-hinting.
Sender
Receiver
Note: in the following (no Web-UI login) there are no further logs, i.e. we have a likely hard crash of the replication subsystem:
Again we have no further log entries here, but we would expect continued polling log entries on this receiver. Before adding tracking we did not have this hard failure, but we also did not receive (at the sender machine) the 'receiver-ready' command. This initially suggests we have a threading issue with the updated Py3.11 & updated pyzmq. Pyzmq contexts are thread safe, but sockets are not. This situation may relate to:
Not so quick: there was a remaining Py3 byte/str encode issue in the newly added tracking, now fixed. And we have finally progressed past the sender & receiver setup/negotiation-chat and onto the still buggy (Py2.7/Py3.11 transition) send fsdata stage:
Sender
- more str/byte issues. - iostream behaviour differs, in-dev modifications.
With yet more tinkering we have our first Py3.11 'btrfs send' byte stream (captured via a Popen.stdout PIPE instance) sent over Pyzmq:
There is a little more tidying up to do re yet more type issues.
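Capturing a child process's stdout as a byte stream via a Popen.stdout PIPE can be sketched as below. This is an illustrative sketch, not the replication code itself: the `stream_stdout` helper and the 64 KiB read size are assumptions, and a small Python child process stands in for 'btrfs send' so the example runs anywhere.

```python
import subprocess
import sys
from typing import Iterator, List


def stream_stdout(cmd: List[str], max_read: int = 64 * 1024) -> Iterator[bytes]:
    """Yield bytes chunks from cmd's stdout until the pipe reaches EOF."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        assert proc.stdout is not None  # guaranteed by stdout=PIPE
        while True:
            # read1() returns at most max_read bytes and b"" at EOF.
            chunk = proc.stdout.read1(max_read)
            if not chunk:
                break
            yield chunk


# Stand-in for 'btrfs send': a child process writing 200000 raw bytes.
demo_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'x' * 200000)",
]
chunks = list(stream_stdout(demo_cmd))
total_bytes = sum(len(chunk) for chunk in chunks)
```

Each yielded chunk is already bytes, so it could be passed to a zmq send without further encoding.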
- additional type-hinting and fix for very low send byte count. - harmonize on btrfs binary location to fs.btrfs for replication. - readability refactoring. - more byte/str fixes re Py2.7/Py3
So the prior kb_sent-related type handling is now resolved, and we have our first successful, error-free full first send. But on the second replication event (incremental=True this time) we have a remaining type issue:
Sender
I.e. we are likely comparing str with bytes: but otherwise this looks to be working as intended given they are equal otherwise!
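The str-versus-bytes comparison pitfall can be shown with a small sketch. The snapshot names and the `same_name` helper here are hypothetical, not taken from the replication code.

```python
from typing import Union


def as_str(value: Union[str, bytes]) -> str:
    """Decode bytes to str; pass str through unchanged."""
    return value.decode("utf-8") if isinstance(value, bytes) else value


def same_name(left: Union[str, bytes], right: Union[str, bytes]) -> bool:
    """Compare two names after normalising both sides to str."""
    return as_str(left) == as_str(right)


local_name = "example_snap_1"      # hypothetical name held as str
received_name = b"example_snap_1"  # same name as received in bytes

# In Python 3 a str never equals bytes, even when the text matches.
naive_equal = local_name == received_name
normalised_equal = same_name(local_name, received_name)
```

Python 2 treated these two values as equal, which is why such comparisons only surface as bugs after the move to Python 3.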
Next session I will address this, and then we are into testing the full sequence of 3 events prior to replication settling into its stable state. Almost there now.
- minor additional refactoring for clarity. - keep receiver self.share/snap naming as str, encode before send only. - more type-hints.
- During debug logging we only show the first 180 bytes of the message, this avoids log-spamming MBs of btrfs-send stream data. - Send btrfs-send byte stream in 10MB chunks.
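A rough sketch of the two ideas in this commit, truncating what is logged and sending in fixed-size chunks, might look like the following. The function names and the wording of the preview suffix are assumptions for illustration; only the 180-byte and 10MB figures come from the commit note.

```python
from typing import Iterator

LOG_PREVIEW_BYTES = 180        # from the commit note
CHUNK_SIZE = 10 * 1024 * 1024  # 10MB, from the commit note


def log_preview(message: bytes, limit: int = LOG_PREVIEW_BYTES) -> str:
    """Return a loggable repr of at most `limit` leading bytes of message."""
    suffix = f"... ({len(message)} bytes total)" if len(message) > limit else ""
    return f"{message[:limit]!r}{suffix}"


def iter_chunks(data: bytes, size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of data, each at most `size` bytes long."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


# Small sizes used here so the behaviour is easy to inspect.
small_chunks = list(iter_chunks(b"x" * 25, size=10))
short_preview = log_preview(b"abc")
long_preview = log_preview(b"y" * 500)
```

The final chunk is simply whatever remains, so it may be shorter than the others.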
This PR branch now has functional replication, but it has only been tested with extremely small subvol data sizes.
- Avoid logging btrfs data stream contents entirely. - Set read1() bytes read to 100MB max.
This development branch is now basically as-was functional, i.e. we have what looks like our prior function from before the recent Py2.7 to Py3.11 moves. I'll move to squash and re-present so we can merge and get this out to wider testing via the now overdue testing branch release.
Update replication code re Py3.*
Closes #2766
Supersedes prior draft PR #2769 to accommodate required re-base.