Worker crash 'spool->msg_status == MSG_INVALID' #534
Comments
I get the same assertion error (spool->msg_status == MSG_INVALID) when I subscribe to a channel through EventSource under these conditions:
|
Hi, I have the same issue. 1.2.5 works, 1.2.6 crashes.
The first GET returns immediately (timeout). The 0.5s timeout does not matter; the same thing happens with 5s. |
Seems to have been introduced by 82b766c; after reverting that commit, it stops crashing. |
@stromnet: Thanks, I'll have this fixed shortly |
For the record, this happens with Redis enabled too. (The previous test was without Redis: same config as below, but with the Redis parts removed.) nginx config used:
|
I'm having the same issue. These are my current stats at the point where it breaks; it crashed right away, within just a few seconds:
I haven't been able to reproduce it in a local environment, just with a ton of messages in production. |
@blackjid can you share your Mine looks like so
and I guess |
We are seeing this issue now in the latest NCHAN/NGINX version with a very simple configuration.
Configuration:
We have relatively low traffic on this, so after 0-5 hours of good, error-free usage, this is our stub status:
Then, in our error logs (resulting in #321):
Causing interprocess alerts to pile up:
Happy to share additional info. Thanks! |
@andjosh you can at least mitigate this with
That way if worker dies it will be restarted by the master and I bet nobody will even notice. That works well until you hit the CPU/connection limit of 1 worker (it either uses 100% of 1 core or has connections opened equal to Also, thanks for sharing the config, it shows that this bug has nothing to do with
like I thought before. |
@ivanovv Unfortunately that does not seem to help. Just tried that now, and all that happens is that the one worker process continuously crashes every second:
|
@andjosh interesting, I was under the impression that a worker restart should clear things up. At least I can say that worker_count=1 guarantees that you won't find nchan stuck in As for this issue, I suggest using nchan 1.2.2. |
Any news on this issue? Right now
Configuration:
|
Why not revert to an older version of |
The older version does not support event-source ping, which we require; otherwise devices will time out constantly. |
Then the only thing I can suggest is taking the same route described in #534 (comment) |
I ran into this issue on v1.2.6 and v1.2.7. Has the root cause been identified? Will a fix be provided in the next release? Thanks so much! @slact |
As I understand it there will be no This issue is more than 1 year old, you see. |
still an error in nchan 1.2.8 |
I can confirm the issue exists on v1.2.8 and v1.2.10, with the same logs:
Using This config works for me; removing those headers fixed the issue:
@slact Can we hope to fix this issue before 2022? |
My previous attempt only covers the one situation described by @stromnet. We got another, unknown type of request that raises this error again! @slact A failed assertion on a single request/connection must not bring down the whole web server. |
Any update? |
Still facing this issue. I have compiled my nginx, 1.20.2, with the following command:
Error output in errors.log:
My server config:
|
Seems this project may be dead. I found that reverting to nchan version 1.2.5 was the only remedy for this and for another update problem, #578 |
@slact the current version is 1.2.15, but I still get this error when using it as a dynamic module with nginx 1.21.6. It was like this with 1.20.* too. On Ubuntu Server, built from source.
2022/03/12 16:51:29 [info] 46964#46964: Using 131072KiB of shared memory for nchan in /etc/nginx/nginx.conf:66 |
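For anyone triaging logs like the line quoted above, here is a minimal sketch (not from the maintainers) of splitting nginx error-log lines into fields and counting matching messages per worker PID. The names `parse_line` and `count_matching` are made up for illustration, and the only sample used is the log line quoted in this comment; the search text you pass in is up to you.

```python
import re
from collections import Counter

# nginx error_log lines have the shape:
#   YYYY/MM/DD HH:MM:SS [level] pid#tid: message
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>\w+)\] (?P<pid>\d+)#(?P<tid>\d+): (?P<message>.*)$"
)

# The log line quoted in the comment above (without the trailing separator).
SAMPLE = (
    "2022/03/12 16:51:29 [info] 46964#46964: Using 131072KiB of shared "
    "memory for nchan in /etc/nginx/nginx.conf:66"
)


def parse_line(line):
    """Return a dict of fields for one error-log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def count_matching(lines, needle):
    """Count, per worker PID, the log lines whose message contains `needle`."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record and needle in record["message"]:
            counts[record["pid"]] += 1
    return counts


if __name__ == "__main__":
    print(parse_line(SAMPLE))
    print(count_matching([SAMPLE], "nchan"))
```

Lines that do not follow the timestamped format (for example, raw output written straight to stderr) are skipped by `parse_line`, so they would need to be searched separately.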
nchan 1.2.5 works with nginx 1.21.6 as a dynamic module |
@slact I've encountered the same issue. And nchan is not sending any messages to subscribers. |
We're experiencing this issue with the version of NChan that shipped with Ubuntu 22.04 (Jammy). This has inadvertently caused us to lose stability when upgrading to the newest version of Ubuntu. Currently, the only workaround appears to be building an older version of the library from source. It would be much better if this were fixed in this repo. |
How is this still not fixed after 4 years... |
@Kurasami Because I fixed the issue I was seeing that led to this crash. If there is another, I have been unable to replicate it. If you want this issue fixed, I need the following information:
|
@slact I've found a way to reproduce (I think) this crash. I've documented it here: #676 (comment) Only commenting here to keep these issues linked (I think they are related) |
I switched from the version included in Ubuntu/Debian, and built from source, and the problem went away 🤔 |
I've updated nchan from
to latest 1.2.6.
While we had some issues with the nginx worker crashing in the past, it's been quite stable in the last 3-4 months, with no crashes at all.
Now, just about 30 minutes after the version change, the worker crashed.
Here are the details (though, as I used compiled deb packages, there are fewer details than usual):
nginx error log:
FYI: this is the second dump of the two, I still have this in
core_pattern
I'm about to revert my version change for now, let me know if you need anything else.