handling disconnect/reconnect #9
@dominictarr i don't think you need to worry about the case where the source data of the range query is updated at the end or updated in the middle based on application logic. A stream will never be "real time" consistent. Doesn't levelup/down use read-consistent views (a transient snapshot) for all streams? Even if you put a value that would sort after the current seek position in the stream, it will not be returned (https://github.com/rvagg/node-levelup/blob/master/test/snapshot-test.js). This means that no one could have coded against an api behaviour that depends on the update-in-the-middle case for streams. On the other hand, with levelup it is possible to have constructed an api, however unlikely, that expects to get no values written after the start time of the readStream (as in time series data). But over multilevel the stream start time is non-deterministic, and there is no existing api that sends the stream start time to the client, so you can't code against that. A library should either give a disconnect error or start a reconnect-and-continue-the-range algorithm. anyway just some thoughts. Obviously i'm just starting to get more involved so your thoughts would be highly appreciated. |
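For reference, a rough sketch of that snapshot behaviour against a plain local levelup (the path and keys here are made up, nothing multilevel-specific):

```js
// rough sketch: values written after a read stream is created are not
// visible to that stream, because it iterates over a snapshot
var levelup = require('levelup')
var db = levelup('./snapshot-demo')

db.put('a', '1', function (err) {
  if (err) throw err

  var seen = []
  db.createReadStream()
    .on('data', function (kv) { seen.push(kv.key) })
    .on('end', function () {
      // 'z' was written after the stream (and its implicit snapshot)
      // was created, so it should not show up here
      console.log(seen) // expected: [ 'a' ]
    })

  // write a key that sorts after the stream's current position
  db.put('z', '2', function (err) {
    if (err) throw err
  })
})
```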
Sounds fantastic, I look forward to hearing of your experiences! I am mostly referring to plugins such as https://github.com/dominictarr/level-live-stream/ which do allow real time views! In other cases, you are correct: reconnecting and resuming from the last key is the right thing. |
First thing we could introduce is … A hash per reconnect isn't really necessary I think. When a reconnect occurs the client just tells the server what operations it had going on before (e.g. …).

So, on buffering: both client and server need an in-memory cache for pending operations/results. The client can have a cache that grows without bound, with timeouts that flush entries when they just take too long. The server on the other side should have a cache that doesn't grow beyond a certain size, so that it can't be taken down by clients just firing one request and quickly disconnecting.

@soldair LevelDB can be made to use a Snapshot for iteration but can also iterate without one. See http://leveldb.googlecode.com/svn/trunk/doc/index.html (the snapshot section). |
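A rough sketch of the bounded server-side cache idea (the size and names are made up, just to show the shape):

```js
// a pending-results cache capped in size, so a client that fires a
// request and immediately disconnects cannot grow server memory forever
function BoundedCache (maxEntries) {
  this.max = maxEntries
  this.keys = []       // insertion order, oldest first
  this.entries = {}
}

BoundedCache.prototype.set = function (id, result) {
  if (!(id in this.entries)) {
    this.keys.push(id)
    // evict the oldest entry once the cap is hit
    if (this.keys.length > this.max) {
      delete this.entries[this.keys.shift()]
    }
  }
  this.entries[id] = result
}

BoundedCache.prototype.take = function (id) {
  var result = this.entries[id]
  delete this.entries[id]
  return result
}

// e.g. keep at most 1000 results waiting for reconnecting clients
var pending = new BoundedCache(1000)
```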
What I said about snapshots was wrong, they're created implicitly when iterating. |
how each method should be handled depends greatly on the plugins - for example, scuttlebutt has its own ways to handle reconnect, or if data is append only it's quite easy to reconnect... so, what would be needed is a way to inject a function to handle reconnection for a particular plugin. |
I'm wrapping multilevel methods with this now: https://github.com/soldair/node-when-connected Within the many layers of streams we have I'm sure it's doing some duplicate work, but i don't know if any of this has a place in the multilevel api. This solves issues like the one where multilevel should emit an error, rather than just end, when a stream is cut off by a disconnection. |
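I don't know node-when-connected's exact API, but the general idea of deferring calls until the connection is up could look roughly like this (everything here is made up for illustration):

```js
// rough idea only: defer a call until the connection is up, with a
// timeout so a call issued while disconnected doesn't hang forever
function whenConnected (state, timeoutMs, fn, cb) {
  if (state.connected) return fn(cb)

  function run () {
    clearTimeout(timer)
    fn(cb)
  }

  var timer = setTimeout(function () {
    var i = state.waiting.indexOf(run)
    if (i !== -1) state.waiting.splice(i, 1)
    cb(new Error('not connected within ' + timeoutMs + 'ms'))
  }, timeoutMs)

  state.waiting.push(run)
}

// on every (re)connect:
//   state.connected = true
//   state.waiting.splice(0).forEach(function (run) { run() })

// usage: whenConnected(state, 5000, function (cb) { db.put('a', '1', cb) }, done)
```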
hey guys. to solve resuming disconnected streams on top of the module i mentioned earlier i've published this. it handles a number of retries, resumes and a timeout for any call through to multilevel. You can write programs that are largely unaffected by database process restarts. I don't know if any of these features should be in multilevel considering all things should be moar smaller, but i'm doing a lot of hacking to scale processing for my production app (not there yet but close). i'm probably bothering you guys; i know my input was not exactly solicited and i'm all talk so far with no pull requests to the project, but if anything i'm thinking about sparks an idea of how to make this api better i would love to contribute. |
this should def. happen in multilevel itself, maybe still using a library just for that. It's great that you experiment with approaches already as I don't have time for that at the moment. I'll have a look at your libs. |
@soldair your input is absolutely welcome. Please don't hesitate to suggest things, or show how you are using this module. This is good work. Two questions on async callbacks: …
we could probably break this into two parts, one for handling streams and one for handling async calls. That will make it pretty easy to plug into multilevel. |
thanks =)
accessing any reconnecting stream is really an async operation (it's async sometimes, so i prefer to have one code path that always presents the callback interface), so i found that stream support was 90% overlap
in multilevel-reconnected i wrapped stream methods one additional time if "resume" was specified as an option.
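Roughly the shape of it, as a sketch only (the `conn.connected` flag and the option handling are guesses for illustration, not the actual multilevel-reconnected code):

```js
// wrap createReadStream so that when "resume" is requested, a dropped
// connection restarts the range just after the last key seen,
// instead of silently ending the stream
var through = require('through')

function resumableReadStream (db, conn, opts) {
  var out = through()
  var lastKey = null

  function start () {
    var o = lastKey == null ? opts : { start: lastKey, end: opts && opts.end }
    db.createReadStream(o)
      .on('data', function (kv) {
        // 'start' is inclusive, so skip anything we already emitted
        if (lastKey != null && kv.key <= lastKey) return
        lastKey = kv.key
        out.queue(kv)
      })
      .on('end', function () {
        // only treat this as the real end if we are still connected;
        // otherwise wait for the next 'connect' and pick the range up again
        if (conn.connected) out.queue(null)
      })
  }

  conn.on('connect', start)
  start()
  return out
}
```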
It made perfect sense for this to be separate because of the last-key tracking logic and such. i did not get close to trying to tangle with the createLiveStream business =) anyway. super sleepy, almost 5am here. |
hey guys. i mentioned this to @dominictarr in irc today but here is a test case that shows what i think is a pretty big issue.
Basically, if you don't have an on-disconnect handler that deals with this, your program has a bug (not with live streams, of course). Mostly i would love feedback to clarify the best course of action and to submit pull requests that fix this. If this is not a bug i would love to hear how you are handling it so i can build with your pattern. |
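To make the failure mode concrete, roughly (the setup is as in the multilevel readme; the rest is illustrative):

```js
// nothing here is ever told that the connection went away
var net = require('net')
var multilevel = require('multilevel')

var db = multilevel.client()
var con = net.connect(3000)
con.pipe(db.createRpcStream()).pipe(con)

db.put('key', 'value', function (err) {
  // if the server dies before replying, this callback may simply
  // never fire -- no error, no timeout, the app just waits forever
  console.log('put done', err)
})

db.createReadStream()
  .on('data', function (kv) { console.log(kv) })
  .on('end', function () {
    // a mid-range disconnect looks the same as a normal, complete end
    console.log('stream ended, but was the range complete?')
  })

con.on('close', function () {
  // without handling this yourself, the calls above are left hanging
})
```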
in irc:
|
i'm playing the bind-all-of-the-errors game to make sure it's good for my …
|
posting this mostly as notes, and thinking out loud.
handling disconnections well is fairly important.
when I'm building apps, I often end up with this pattern:
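Roughly this shape (a sketch; `setupStreams` / `teardownStreams` stand in for whatever the app actually wires over the connection):

```js
// every (re)connect means tearing down and re-creating all the streams
// that were piped over the old connection
var net = require('net')

function connect () {
  var con = net.connect(3000)

  con.on('connect', function () {
    // wire up everything that lives on top of the connection:
    // mux-demux channels, scuttlebutt replication streams, etc.
    setupStreams(con)
  })

  con.on('close', function () {
    // tear down, wait a bit, then do it all again
    teardownStreams()
    setTimeout(connect, 1000)
  })

  con.on('error', function () { /* 'close' will follow */ })
}

connect()
```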
This is quite awkward, because whenever the app changes modes/state
there are streams/mux-demux/scuttlebutts to disconnect, and new things to connect.
in rumours, I managed to pave over this problem for scuttlebutts.
https://github.com/dominictarr/rumours#example-using-defaults
Basically, the challenge here is creating an interface to each type of abstraction that works well over reconnections.
Just buffering all requests is not efficient - you don't know if a disconnection is temporary or not.
there are various approaches that could be used for different streaming abstractions.
For example, take a range query over data that is append only - like in twitter, where new tweets are always added to the end of the feed - after disconnecting, you just ask for the changes since the last thing you received.
That only works for cases that are append only - if updates may come into the middle of a range, then you'd have to keep a buffer of, say, the last N updates; then if the disconnection is brief it will be fine, but if it's long, just resend the whole range, or use something like level-master instead.
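A sketch of that last-N buffer idea (N and the shapes here are arbitrary):

```js
// brief disconnects get replayed from memory; longer ones fall back
// to resending the whole range
var N = 1000
var recent = [] // newest last, each entry { key: ..., value: ... }

function record (update) {
  recent.push(update)
  if (recent.length > N) recent.shift()
}

// returns false if the gap is too big and the caller should resend everything
function replaySince (lastSeenKey, send) {
  for (var i = recent.length - 1; i >= 0; i--) {
    if (recent[i].key === lastSeenKey) {
      for (var j = i + 1; j < recent.length; j++) send(recent[j])
      return true
    }
  }
  return false
}
```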
get, put, batch can be buffered like in levelup.
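A sketch of that kind of buffering (assuming `db` is the multilevel client and connection state is tracked elsewhere):

```js
// queue get/put/batch issued while disconnected and replay them once the
// connection is back, similar to how levelup queues operations made
// before the underlying store has opened
var queue = []
var connected = false

function call (method, args) {
  if (connected) return db[method].apply(db, args)
  queue.push([method, args])
}

function onConnect () {
  connected = true
  queue.splice(0).forEach(function (op) {
    db[op[0]].apply(db, op[1])
  })
}

// usage: call('put', ['a', '1', cb]); call('batch', [ops, cb]); etc.
```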
The problem here is that it depends on the particular way that data is updated and replicated.