Catch exception on UPNP that cause socket problem when reconnecting after network fail #469

Merged
merged 8 commits into from
Dec 4, 2018

Conversation

vncoelho
Member

Commenting out the lines that cause socket problems on reconnecting and, consequently, block chain sync on some nodes.

@vncoelho vncoelho closed this Nov 15, 2018
@vncoelho
Member Author

Closed until further verification.

@vncoelho
Member Author

vncoelho commented Nov 15, 2018

Reopened because, even in normal operation, the Discover loop never gets past this line:

length = s.Receive(buffer);

The exception is always caught, even during normal initialization.

OnStart Peer port20334 ws_port10334
UNPNP Discover...
UNPNP Discover I...
UNPNP Discover II...
UNPNP Discover III...
UNPNP Discover INSIDE LOOP...
UNPNP Discover INSIDE LOOP TRY...
NEO-CLI Version: 2.9.1.0

[15:29:24.976] OnStart
[15:29:25.009] initialize: height=114 view=0 index=0 role=Backup
[15:29:27.033] timeout: height=114 view=0 state=Backup
[15:29:27.038] request change view: height=114 view=0 nv=0 state=Backup
[15:29:27.117] CheckExpectedView: view_number=1 context.ViewNumber=0 nv=1 state=Backup, ViewChanging
UNPNP Exeception..
UNPNP Discover END...
TCP MANAGER BIND
OK
WS MANAGER WebHostBuilder
OK
Inside Peers.cs ConnectedPeers.Count=0 
Inside Peers.cs ConnectedPeers.Count=0 
Connecting to this nice guy  remote.Address=172.22.0.5  
Connecting to this nice guy  remote.Address=172.22.0.3  
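
The thread does not say why Receive always throws. One plausible explanation, assuming the socket has a receive timeout set and no UPnP-capable router answers (for example inside a container network), is that the blocking Receive simply times out, which .NET reports as a SocketException. The sketch below reproduces that behaviour on a loopback socket; the class and method names are hypothetical and are not taken from the Neo source.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

static class ReceiveTimeoutSketch
{
    // Waits for one datagram on a loopback UDP socket that nobody writes to.
    // Returns null if data arrived, otherwise the SocketError raised by Receive.
    public static SocketError? ReceiveOnce(int timeoutMs)
    {
        using (Socket s = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp))
        {
            s.Bind(new IPEndPoint(IPAddress.Loopback, 0)); // port 0: let the OS pick a free port
            s.ReceiveTimeout = timeoutMs;                  // blocking Receive gives up after this long
            byte[] buffer = new byte[0x1000];
            try
            {
                s.Receive(buffer);
                return null;
            }
            catch (SocketException ex)
            {
                // No reply within the timeout surfaces as an exception, not as a zero-length read.
                return ex.SocketErrorCode;
            }
        }
    }
}
```

If this is what happens in Discover, the exception in the log above would be the expected "no router replied" path rather than a fault.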

@vncoelho vncoelho reopened this Nov 15, 2018
@vncoelho
Member Author

vncoelho commented Nov 30, 2018

The problem still persists in all my tests; it needs some refinement.
I do not know exactly what these commented-out lines are supposed to do, @erikzhang. This websocket part is something I do not understand yet.

@erikzhang
Member

They are UDP messages used to communicate with the router and request a port mapping; they should be independent of the problem you describe.
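
For context, UPnP discovery normally starts with an SSDP M-SEARCH datagram sent over UDP to the well-known multicast group 239.255.255.250:1900. The following is a minimal sketch of building such a request. It covers only the discovery datagram, not the later port-mapping call, and the class name, method name, and the choice of the `upnp:rootdevice` search target are illustrative assumptions; the exact headers Neo sends are not shown in this thread.

```csharp
using System;
using System.Net;
using System.Text;

static class SsdpSketch
{
    // Well-known SSDP multicast group and port.
    public static readonly IPEndPoint MulticastEndPoint =
        new IPEndPoint(IPAddress.Parse("239.255.255.250"), 1900);

    // Builds an SSDP M-SEARCH request; maxWaitSeconds becomes the MX header,
    // the upper bound on how long devices may delay their reply.
    public static byte[] BuildSearchRequest(int maxWaitSeconds)
    {
        string request =
            "M-SEARCH * HTTP/1.1\r\n" +
            "HOST: 239.255.255.250:1900\r\n" +
            "MAN: \"ssdp:discover\"\r\n" +
            "MX: " + maxWaitSeconds + "\r\n" +
            "ST: upnp:rootdevice\r\n" +   // search target (assumed for illustration)
            "\r\n";                        // blank line terminates the header block
        return Encoding.ASCII.GetBytes(request);
    }
}
```

A caller would pass the resulting bytes and `MulticastEndPoint` to `Socket.SendTo`, which is the call that appears in the stack trace later in this conversation.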

@vncoelho
Member Author

vncoelho commented Dec 1, 2018

I see.
The point is that this Discover method always hits an exception, @erikzhang.
Even without removing those lines, execution has been escaping from the function there (though without producing errors).
Every time it reaches length = s.Receive(buffer); it jumps to the exception handler.

In addition, when the node is killed and reconnected, we get an error that crashes the application.
After commenting out those 3 lines (s.SendTo(data,ipe)), this error goes away (tested by starting the client, killing it, and starting it again):

[ERROR][11/11/2018 22:00:09][Thread 0004][akka://NeoSystem/user/$b] Network is unreachable
Cause: System.Net.Sockets.SocketException (0x80004005): Network is unreachable
   at System.Net.Sockets.Socket.UpdateStatusAfterSocketErrorAndThrowException(SocketError error, String callerName)
   at System.Net.Sockets.Socket.SendTo(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, EndPoint remoteEP)
   at Neo.Network.UPnP.Discover()
   at Neo.Network.P2P.Peer.OnStart(Int32 port, Int32 ws_port)
   at Neo.Network.P2P.LocalNode.OnReceive(Object message)
   at Akka.Actor.UntypedActor.Receive(Object message)
   at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
   at Akka.Actor.ActorCell.ReceiveMessage(Object message)
   at Akka.Actor.ActorCell.Invoke(Envelope envelope)
[ERROR][11/11/2018 22:00:09][Thread 0003][akka://NeoSystem/user/$b] Error while creating actor instance of type Neo.Network.P2P.LocalNode with 1 args: (Neo.NeoSystem)
Cause: [akka://NeoSystem/user/$b#837781826]: Akka.Actor.PostRestartException: Exception post restart (System.Net.Sockets.SocketException) ---> System.TypeLoadException: Error while creating actor instance of type Neo.Network.P2P.LocalNode with 1 args: (Neo.NeoSystem) ---> System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.InvalidOperationException: Operation is not valid due to the current state of the object.
   at Neo.Network.P2P.LocalNode..ctor(NeoSystem system)
   --- End of inner exception stack trace ---
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
   at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at System.RuntimeType.CreateInstanceImpl(BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes, StackCrawlMark& stackMark)
   at System.Activator.CreateInstance(Type type, BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes)
   at System.Activator.CreateInstance(Type type, Object[] args)
   at Akka.Actor.Props.ActivatorProducer.Produce()
   at Akka.Actor.Props.NewActor()
   --- End of inner exception stack trace ---
   at Akka.Actor.Props.NewActor()
   at Akka.Actor.ActorCell.CreateNewActorInstance()
   at Akka.Actor.ActorCell.<>c__DisplayClass109_0.<NewActor>b__0()
   at Akka.Actor.ActorCell.UseThreadContext(Action action)
   at Akka.Actor.ActorCell.NewActor()
   at Akka.Actor.ActorCell.FinishRecreate(Exception cause, ActorBase failedActor)
   --- End of inner exception stack trace ---

@erikzhang
Member

@vncoelho Can you test it again?

@shargon
Member

shargon commented Dec 3, 2018

I agree with @erikzhang; these messages should not be able to produce these errors.

@vncoelho
Member Author

vncoelho commented Dec 3, 2018

Thanks, @erikzhang. I just tried again and the error disappears once the exception is caught.
Nice catch... aheuaheuaea

@shargon, I also do not understand it completely.
I think it is related to an internal queue in the operating system's socket layer.
As @erikzhang mentioned, this is used to communicate with the router, which probably lost its connection and reconnected with new parameters. I do not know exactly; I have never worked with this before.

By the way, when is WebSocket currently used?

            if (ws_port > 0)
            {
                ws_host = new WebHostBuilder().UseKestrel().UseUrls($"http://*:{ws_port}").Configure(app => app.UseWebSockets().Run(ProcessWebSocketAsync)).Build();
                ws_host.Start();
            }

@shargon
Member

shargon commented Dec 3, 2018

I prefer a generic catch, not just catch SocketException.

@vncoelho
Member Author

vncoelho commented Dec 3, 2018

@erikzhang, I changed it to a generic exception as @shargon suggested, but if you want a more precise catch, feel free to roll back to SocketException.
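
The trade-off between the two catch styles can be sketched as follows. This is an illustration only, with hypothetical helper names, and it is not the code merged in this pull request: the narrow handler swallows socket failures and lets every other exception propagate, while the broad handler swallows everything.

```csharp
using System;
using System.Net.Sockets;

static class CatchScopeSketch
{
    // Narrow handler: only socket failures are swallowed; anything else propagates.
    public static bool RunNarrow(Action step)
    {
        try { step(); return true; }
        catch (SocketException) { return false; }
    }

    // Broad handler: every exception type is swallowed, including ones
    // that may point to a programming error rather than a network problem.
    public static bool RunBroad(Action step)
    {
        try { step(); return true; }
        catch (Exception) { return false; }
    }
}
```

With the broad form, a best-effort step such as UPnP discovery can never take down its caller, at the cost of hiding unexpected failures; with the narrow form, a non-socket exception such as InvalidOperationException would still reach the caller.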

@vncoelho vncoelho changed the title from "Remove lines that cause socket problem reconnect" to "Catch exception on UPNP that cause socket problem when reconnecting after network fail" Dec 3, 2018
@erikzhang
Member

By the way, when is WebSocket currently used?

WebSocket is used for browser-type nodes, which allow nodes to communicate with each other through the WebSocket.

@erikzhang erikzhang merged commit d0f80cc into neo-project:master Dec 4, 2018
@vncoelho vncoelho deleted the patch-6 branch December 4, 2018 09:21
rodoufu pushed a commit to rodoufu/neo that referenced this pull request Mar 5, 2019
Thacryba pushed a commit to simplitech/neo that referenced this pull request Feb 17, 2020