-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle client disconnection when retrying to assign an Erizo #1241
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think client disconnection issue is not related to the retries, but it's still an issue, so I like the idea.
I think there are a couple of things that might be wrong in the PR, so I added some comments trying to explain the issues.
const WEBSOCKET_NORMAL_CLOSURE = 1000; | ||
const WEBSOCKET_GOING_AWAY_CLOSURE = 1001; | ||
|
||
class Channel extends events.EventEmitter { | ||
constructor(socket, nuve) { | ||
super(); | ||
|
||
this.CONNECTED = Symbol('connected'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit lost here, why do all these symbols are scoped in the object now? wouldn't it be clearer with just a function like isConnected() or isDisconnected()?
@@ -244,6 +244,9 @@ class Client extends events.EventEmitter { | |||
'streamId: ' + id + ', clientId: ' + this.id); | |||
callback(null, null, 'ErizoAgent or ErizoJS is not reachable'); | |||
return; | |||
} else if (signMess === 'client-left') { | |||
log.info('message: Abort publishing, client: ' + | |||
this.id + ' left while publishing'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we return from this function here? otherwise we will send a message to a closed channel, which is harmless but it doesn't make sense.
@@ -341,6 +344,9 @@ class Client extends events.EventEmitter { | |||
'clientId: ' + this.id); | |||
callback(null, null, 'ErizoJS is not reachable'); | |||
return; | |||
} else if (signMess === 'client-left') { | |||
log.info('message: Abort subscription, client: ' + | |||
this.id + ' left while subscribing'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idem
}; | ||
|
||
var maybeDeletePublisher = function(publisher, streamId) { | ||
that.removePublisher(publisher, streamId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this is called maybe... if we always call removePublisher?
}; | ||
|
||
var maybeDeleteSubscriber = function(subscriber, streamId) { | ||
that.removeSubscriber(subscriber, streamId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idem
'streamId: ' + streamId + ', ' + | ||
'erizoId: ' + getErizoQueue(streamId) + ', ' + | ||
logger.objectToLog(options.metadata)); | ||
maybeDeletePublisher(publishers[streamId], streamId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm... I think this call will do nothing, so we might have even worse structures because we won't remove the streamId from the erizos
object since publishers[streamId]
has already been set to undefined in a previous retry.
A nice addition would be subscribers[streamId] = undefined
to the previous code maybe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually they are re-defined before the callRpc("addPublisher").
So if we reached the maximum retry we need to delete these variables, not only set them to undefined.
Also at every retry
erizos.findById(erizoId).publishers.push(streamId);
this function is called, so it's safer to call removePublisher in my opinion. what do you think?
{callback: function (data){ | ||
if (!publishers[streamId] && !subscribers[streamId]){ | ||
{callback: function (data) { | ||
if (!isConnected(client)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same comments apply to this method too.
if (data === 'timeout'){ | ||
if (retries < MAX_ERIZOJS_RETRIES){ | ||
{callback: function (data) { | ||
if (!isConnected(client)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the main contribution of the PR, but it seems like it's not directly related to retries, am I right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather do:
if (data === 'timeout') {
// retry logic
} else if (!client.isConnected()) {
// client disconnected logic
} else {
// initializing
}
I know we might be retrying something that won't work, but I think that way you will surely remove things in case of client disconnection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking of something like:
client.isConnected = () {
return channel && channel.isConnected();
};
client should always exist at this point or we would have crashed ErizoController in line 222.
Also, please try to follow the PR template and make the description self contained, not referencing issues or other PRs to understand the problem. It's usually hard to follow all the context otherwise. |
updated PR description |
*** Description ***
This PR aims to fix some inconsistence in the data structure when the rpc of addPublisher and addSubscribe return AFTER the client disconnected.
In the actual state, if a client connects, try to publish, the agent respond but the JS goes in timeout state at the addPublisher rpc, the roomController's that.addPublisher function call itself recursively.
But IF the client close the tab during this process of retry, the ondisconnect method in Cient.js is called and removes all the streams (BUT the streams are added in the array only when the state === 'initializing'), and when the JS is ready, it sends back to the client,js the initializing message, the stream is added to the array AFTER the client is gone and remains here forever. This cause also the ErizoJS to think that there's always someone connected and never close itself.