-
Notifications
You must be signed in to change notification settings - Fork 866
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[core] Fixed proper reporting of sending blocked state. Fixed API value for connect-non-blocking #1698
base: master
Are you sure you want to change the base?
Conversation
…ue for connect-non-blocking
… exit checks in tsbpd. Suppressed IPE log when unsubscribing
srtcore/core.cpp
Outdated
} | ||
|
||
HLOGC(tslog.Debug, log << self->CONID() << "tsbpd: WAKE UP!!!"); | ||
} | ||
ExitLoops: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No new goto
statements, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to exit two loops, and additionally maintain the thread checker.
srtcore/core.cpp
Outdated
bool signaled = false; | ||
while (!signaled) | ||
{ | ||
// For safety reasons, do wakeup once per 1s and re-check the flag. | ||
THREAD_PAUSED(); | ||
signaled = tsbpd_cc.wait_for(seconds_from(1)); | ||
THREAD_RESUMED(); | ||
if (self->m_bClosing) | ||
{ | ||
HLOGC(tslog.Debug, log << "tsbpd: IPE? Closing flag set in the meantime of waiting. Exiting"); | ||
goto ExitLoops; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just let TSBPD to re-check the receiver buffer even in case of a spurious wake-up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, yes, might be a better idea. OTOH this 1s delay is too much if that deadlock would have happened when closing a socket in the group receiver.
…tsbpd_cc wait 125ms interrupt for sanity check
srtcore/group.cpp
Outdated
int swait_timeout = 0; | ||
|
||
// There's also a hidden condition here that is the upper if condition. | ||
is_pending_blocked = (nsuccessful == 0); | ||
|
||
// If this is the case when | ||
if (m_bSynSending && is_pending_blocked) | ||
{ | ||
HLOGC(gslog.Debug, log << "grp/sendBroadcast: will block for " << m_iSndTimeOut << " - waiting for any writable in blocking mode"); | ||
swait_timeout = m_iSndTimeOut; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code duplication: the same was used in send_broadcast... Consider some common funtion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will. There's a PR for it, but we need to be clear off with all potential problems to solve first. There are marked places that should make this process easy enough. Note however that the things happen differently in broadcast and backup, which means that even conditions for getting sockets blocked are not exactly the same.
srtcore/core.cpp
Outdated
void CUDT::updateBrokenConnection() | ||
{ | ||
HLOGC(smlog.Debug, log << "updateBrokenConnection: setting closing=true and taking out epoll events"); | ||
m_bClosing = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this fix require other changes introduced in the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is simply a completion for all cases when releaseSync
is called, which is connected to the fact that releaseSync
is about to make TSBPD thread exit, and it can only exit if m_bClosing == true
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does only this change make sense without the rest changes in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest of the changes are formal fixes for real problems, others - including this one - are "sanity fixes". This flag must be set true when the TSBPD thread is expected to exit otherwise it risks a deadlock. Relying on that a flag is set appropriately before exitting isn't the best approach, but it's simply derived from UDT and it's a consequence of how the TSBPD thread is expected to work. Changes here were due to an in-between lifted lock on m_RecvLock
, which was newly introduced with the groups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this fix to a separate PR, so it is not blocked by the review of other changes introduced here.
Co-authored-by: Maxim Sharabayko <maxlovic@gmail.com>
This comment has been minimized.
This comment has been minimized.
srtcore/core.cpp
Outdated
@@ -7819,15 +7854,20 @@ void CUDT::destroySynch() | |||
void CUDT::releaseSynch() | |||
{ | |||
SRT_ASSERT(m_bClosing); | |||
if (!m_bClosing) | |||
{ | |||
HLOGC(smlog.Debug, log << "releaseSynch: IPE: m_bClosing not set to false, TSBPD might hangup!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might rather be an error message: LOGC(smlog.Error, ...)
.
// Return value rules: | ||
// In non-blocking mode: | ||
// - return Socket ID, if: | ||
// - you requested only one connection in this call | ||
// In blocking mode: | ||
// - return Socket ID, if: | ||
// - you requested only one connection in this call | ||
// - you connect a group that was not connected yet | ||
// - otherwise return 0 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"if you requested only one connection in this call"
Not clear. I guess the meaning is that there is only one target passed to this group connect function (arraysize
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means that it's so if you do call srt_connect
or if you do srt_connect_group
with an array of size 1.
if (arraysize > 1) | ||
retval = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that arraysize
is 400 lines above from this line, it would make sense to make it const
, so that no one mistakenly changes it in between.
int CUDTUnited::groupConnect(CUDTGroup* pg, SRT_SOCKGROUPCONFIG* targets, const int arraysize
{ | ||
// For safety reasons, do wakeup once per 1/8s and re-check the flag. | ||
// This should be enough long time that during a normal transmission | ||
// the TSBPD thread would be woken up much earlier when required by | ||
// ACK per ACK timer (at most 10ms since the last check) and in case | ||
// when this might result in a deadlock, it would only hold up to 125ms, | ||
// which should be little harmful for the application. NOTE THAT THIS | ||
// IS A SANITY CHECK FOR A SITUATION THAT SHALL NEVER HAPPEN. | ||
THREAD_PAUSED(); | ||
signaled = tsbpd_cc.wait_for(milliseconds_from(125)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will hide possible synchronization issues, and it would much harder to track them.
Also might introduce a potential minor performance degradation.
CUDT::releaseSynch()
notifies m_RcvTsbPdCond
used in this tsbpd_cc
. If it happens to be blocked forever, then there is another issue to fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's only a sanity check, and I doubt the performance degradation here, although might be that 250ms would be a less likely watchdog to bite, while applications also usually have their own independent watchdogs. Initially I planned 1s here, but that would be way too much. I understand that it might make errors kinda hidden, but OTOH a deadlock here would make a scrap in video transmission (so it won't go "invisible", just it won't make a total disaster).
Note also that releaseSynch won't be called another time, that's true, but TSBPD should simply exit on seeing m_bClosing
flag anyway.
Additionally: the value returned in case of blocking-mode
srt_connect_group
it should return the socket ID of the first successful socket, while in case of non-blocking mode it always returns 0.Findings (update):
NOT updating, turning to draft.