NullReferenceException in HeartbeatReadTimerCallback crashing application #257

Closed
MichaelLogutov opened this issue Sep 15, 2016 · 31 comments

@MichaelLogutov

We just had another application crash because of the RabbitMQ library:

Application: Scheduler.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.NullReferenceException
   at RabbitMQ.Client.Framing.Impl.Connection.HeartbeatReadTimerCallback(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.TimerQueueTimer.CallCallback()
   at System.Threading.TimerQueueTimer.Fire()
   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

I think exceptions raised on the RabbitMQ library's own threads should never propagate and take down the process. If the connection is down and automatic recovery has not completed or has failed, then the Publish and Get methods should throw exceptions instead.

Anyway, what should I do now to stop the RabbitMQ library from crashing my application?
We're using RabbitMQ 3.6.5.0
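
Note for readers: on .NET Framework 2.0 and later, an exception that escapes a timer callback on a thread pool thread terminates the process, which is why this trace ends in a crash rather than in a catchable error. The sketch below does not prevent the crash; it only records the exception before the process exits, which can help when reporting the problem. `CrashLogger` and its members are hypothetical names for illustration and are not part of RabbitMQ.Client.

```csharp
using System;

// Hypothetical helper, not part of RabbitMQ.Client.
public static class CrashLogger
{
    // Builds a log line from the object carried by UnhandledExceptionEventArgs.
    public static string Format(object exceptionObject, bool isTerminating)
    {
        Exception ex = exceptionObject as Exception;
        string detail = ex != null ? ex.ToString() : Convert.ToString(exceptionObject);
        return "Unhandled exception (terminating=" + isTerminating + "): " + detail;
    }

    // Call once at startup. The handler only observes the exception;
    // it cannot stop the runtime from terminating the process.
    public static void Install()
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
            Console.Error.WriteLine(Format(e.ExceptionObject, e.IsTerminating));
    }
}
```

Calling `CrashLogger.Install()` at the top of `Main` is enough; writing to a file or logging framework instead of `Console.Error` is a straightforward substitution.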

@kjnilsson
Contributor

Hi, which version of the dotnet client are you using?

@MichaelLogutov
Author

As I've said - we're using RabbitMQ 3.6.5.0
.NET 4.6.1

@kjnilsson
Contributor

Could you try the latest version of the client (currently 4.1.0)?

@MichaelLogutov
Author

Will do. Thanks.

@kjnilsson
Contributor

There may be a fix in this commit.

@michaelklishin
Member

First of all, there is no such RabbitMQ version as 3.6.5.0. What .NET client library version is used?

@MichaelLogutov
Author

We've updated to 4.1.0 and were hit by the same crash last night:

Application: Scheduler.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.NullReferenceException
   at RabbitMQ.Client.Framing.Impl.Connection.HeartbeatReadTimerCallback(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.TimerQueueTimer.CallCallback()
   at System.Threading.TimerQueueTimer.Fire()
   at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

@kjnilsson
Contributor

@MichaelLogutov OK, it looks like there is still an issue somewhere. Can you give a bit more detail on how the client is being used? Some kind of minimal repro would obviously be fantastic :)

@MichaelLogutov
Author

MichaelLogutov commented Sep 16, 2016

We create the connection when the application starts, using IConnectionFactory.CreateConnection:

var connection = this.connectionFactory.CreateConnection();
connection.AutoClose = false;

Then we create a channel with CreateModel:

            var channel = connection.CreateModel();

            if (this.configuration.PublishConfirmTimeout > TimeSpan.Zero)
                channel.ConfirmSelect();

            return channel;

After that we call Publish or BasicGet, for example: channel.BasicGet(queue: queueName, noAck: false)

The most curious thing is that this crash occurred in an application that should not publish or receive messages, and thus does not create any channels. I could be wrong (it's a really big solution and some part of it underneath may use a queue to publish something), but maybe the RabbitMQ.Client library spins up some thread in any case?

@kjnilsson
Contributor

After looking at the code I could see one potential null ref scenario so have covered that in the linked PR.
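
Note for readers: the general shape of a null reference race in a timer callback can be sketched as below. This is an illustration under assumptions, not the change made in the linked PR and not the client's actual code: `HeartbeatOwner`, its fields, and `HeartbeatTick` are hypothetical names. The idea shown is to read the shared field once into a local, return if it is already null, and tolerate a timer that was disposed while the callback was in flight.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for an object that owns a heartbeat timer.
// None of these names come from RabbitMQ.Client.
public sealed class HeartbeatOwner : IDisposable
{
    private Timer _timer;      // set to null by Dispose, possibly on another thread
    private int _periodMs;
    private int _ticks;

    public int Ticks { get { return Volatile.Read(ref _ticks); } }

    public void Start(int periodMs)
    {
        _periodMs = periodMs;
        // One-shot timer; the callback re-arms it after each tick.
        _timer = new Timer(HeartbeatTick, null, periodMs, Timeout.Infinite);
    }

    // Timer callback. Public only so the sketch can be exercised directly.
    public void HeartbeatTick(object state)
    {
        // Read the field once. Checking the field and then using the field
        // again would leave a window in which Dispose can null it.
        Timer timer = Volatile.Read(ref _timer);
        if (timer == null)
        {
            return; // already torn down; nothing to do
        }

        try
        {
            Interlocked.Increment(ref _ticks);
            timer.Change(_periodMs, Timeout.Infinite); // re-arm for the next tick
        }
        catch (ObjectDisposedException)
        {
            // Dispose ran between the null check and Change; safe to ignore.
        }
    }

    public void Dispose()
    {
        // Swap the field to null first so late callbacks see null.
        Timer timer = Interlocked.Exchange(ref _timer, null);
        if (timer != null)
        {
            timer.Dispose();
        }
    }
}
```

With this shape, a callback that fires after `Dispose` returns without touching any torn-down state, so nothing escapes onto the thread pool.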

@kjnilsson
Contributor

@MichaelLogutov if you want to test out the potential fix right away you could grab the latest package from AppVeyor (package feed: https://ci.appveyor.com/nuget/rabbitmq-dotnet-client-ci). The packages are a bit confusingly versioned 4.0.0.x but do represent the latest master commits.

@michaelklishin
Member

We will re-trigger a build so that the version is a more sensible 4.1.0.x.

@michaelklishin added this to the 4.1.1 milestone Sep 16, 2016

@MichaelLogutov
Author

Thanks guys. Unfortunately I'm unable to install it.
I've added the source:
nuget sources add -Name rabbitmq -Source https://ci.appveyor.com/nuget/rabbitmq-dotnet-client-ci

But I can't find the package:

C:\>nuget list -source rabbitmq
FAKE 4.25.5
NuGet.CommandLine 3.4.3

@kjnilsson
Contributor

You may need to enable listing of pre-releases.

@MichaelLogutov
Author

Yeah, I found it with the "pre" flag, but now I can't update to it because other packages reference a higher version than 4.0.

@michaelklishin
Member

@MichaelLogutov we've bumped the AppVeyor version to 4.1.0.

@kjnilsson
Contributor

Looks like AppVeyor needs another tweak somewhere before it will version the package correctly. Will take a look today.

@kjnilsson
Contributor

@MichaelLogutov you could try it now.

@michaelklishin
Member

4.1.1-rc1 is now up on nuget.org.

@MichaelLogutov
Author

Thanks guys. I'm a bit overwhelmed with work on Monday. Will try it soon.

@MichaelLogutov
Author

OK, we updated all our tasks yesterday and so far it seems to work without crashes. But it has only been one night.

@kjnilsson
Contributor

Thanks for testing. Let's leave it a couple more days and see if the error recurs.

@tr00Hlodvig

tr00Hlodvig commented Nov 6, 2016

I have the same issue with RabbitMQ client version 3.6.5. Unfortunately I can't use 4.1.0 since we use .NET Framework 4.5.

In addition, setting RequestedHeartbeat to 0 does not solve the problem, although I expected it to: with heartbeats turned off, the timers should not be created at all.

@michaelklishin
Member

Posted a comment here while thinking of a different thread. This change is small and can probably be backported to 3.6.6 easily.

@michaelklishin modified the milestones: 3.6.6, 4.1.1 Nov 6, 2016

@michaelklishin
Member

Backported to stable; it will be in 3.6.6.

@dove

dove commented Dec 14, 2016

@michaelklishin thanks for backporting to 3.6.6. I can't upgrade until EasyNetQ is ready for RabbitMQ 4.

@MatthewB2407

@michaelklishin

I was getting this error in 3.6.5.

If I'm not missing anything, the fix was to upgrade to 3.6.6?

I've done this but I'm still getting the same error.

Can anyone help?

Thanks in advance

@dove

dove commented Feb 14, 2017

@MatthewB2407 it did the trick for me. Maybe just double-check that you've got 3.6.6, and note any binding redirects that might need to be updated.
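
Note for readers: one way to confirm which client assembly is actually loaded at run time, as opposed to which package is referenced, is to print the version of the assembly that defines a client type. A minimal sketch follows; `LoadedVersion` is a hypothetical helper, and the demo uses `typeof(string)` only so that it runs without the RabbitMQ package installed. In a real application one would pass a type from the client instead, for example `typeof(RabbitMQ.Client.IConnection)`.

```csharp
using System;
using System.Reflection;

// Hypothetical helper for checking which assembly version was loaded.
public static class LoadedVersion
{
    // Returns "Name, Version=a.b.c.d" for the assembly that defines the given type.
    public static string Describe(Type type)
    {
        AssemblyName name = type.Assembly.GetName();
        return name.Name + ", Version=" + name.Version;
    }

    public static void Main()
    {
        // Stand-in type so the sketch is self-contained; replace with a
        // type from the client library when running inside the application.
        Console.WriteLine(Describe(typeof(string)));
    }
}
```

If the printed version is not the one expected, a stale binding redirect or an old copy of the DLL in the deployment folder is a likely place to look.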

@MatthewB2407

MatthewB2407 commented Feb 14, 2017

@dove thanks for the response.

I'm quite sure I've got 3.6.6, and I've just checked that the binding redirects are correct.

Every time I deploy the RabbitMQ client onto our live server it crashes with the following:

Exception: System.NullReferenceException

Message: Object reference not set to an instance of an object.

StackTrace: at RabbitMQ.Client.Framing.Impl.Connection.HeartbeatReadTimerCallback(Object state)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.TimerQueueTimer.CallCallback()
at System.Threading.TimerQueueTimer.Fire()
at System.Threading.TimerQueue.FireNextTimers()

@michaelklishin
Member

michaelklishin commented Feb 14, 2017

Team RabbitMQ does not use issues for questions, discussions, or investigations. We don't have any information about how you deploy the client, what your environment is like, and so on. All we know is that this issue is a natural race condition. Without producing a TLA+ model of much of this client we cannot be 100% sure it is no longer present, but it is no longer commonly reported by users of the 3.6.6 or 4.x clients.

Please take this to rabbitmq-users and provide more details. Consider upgrading to a 4.x release of this client, too, since 3.6.x will soon be approaching the end of its development.
