panic on seed node after incoming connection #633

Closed
cipig opened this issue May 9, 2020 · 17 comments · Fixed by #631

@cipig (Member) commented May 9, 2020

Saw this panic on both seed nodes after the same IP 93.242.216.127 connected... one panicked right after the connect at 12:25:12, the other one needed a bit longer, 12:22:21 - 12:22:47.

😀 2020-05-09 12:22:21 +0000 [incoming_connection 93.242.216.127:62684] New connection...
09 12:22:47, common:433] panicked at 'assertion failed: `(left == right)`
  left: `267`,
 right: `0`', /rustc/b2e36e6c2d229126b59e892c9147fbb68115d292/src/libstd/macros.rs:16:9
09 12:22:47, common:434] backtrace
  panicking.rs:419] rust_begin_unwind 0x55e0f53d9f9a
  read_line.rs:69] <tokio::io::util::read_line::ReadLine<R> as core::future::future::Future>::poll::h587bbe252373249d 0x55e0f47a3b4f
  lp_network.rs:211] mm2::mm2::lp_network::seednode_loop::{{closure}}::{{closure}}::h8656ef996a052533 0x55e0f4735089
  mod.rs:66] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h6f6c84188467e3b0 0x55e0f472bd89
  fuse.rs:84] <futures_util::future::fuse::Fuse<Fut> as core::future::future::Future>::poll::hed3b43df29c8a8b3 0x55e0f482453a
  lp_network.rs:258] mm2::mm2::lp_network::seednode_loop::{{closure}}::{{closure}}::{{closure}}::{{closure}}::h918703cc6b8eaa6a 0x55e0f47a663a
  select_mod.rs:97] mm2::mm2::lp_network::seednode_loop::{{closure}}::{{closure}}::{{closure}}::h487ad45a26a3a135 0x55e0f47a673a
  poll_fn.rs:55] <futures_util::future::poll_fn::PollFn<F> as core::future::future::Future>::poll::h6f948e809ebc67dc 0x55e0f47a65a1
  lp_network.rs:258] mm2::mm2::lp_network::seednode_loop::{{closure}}::{{closure}}::hdb11b6d0e3b86231 0x55e0f4735b9c
  mod.rs:66] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h9ddc8f6a6839d900 0x55e0f472c329
  unit_error.rs:33] <futures_util::future::unit_error::UnitError<Fut> as core::future::future::Future>::poll::h788359137ed2cbeb 0x55e0f47a5953
  compat03as01.rs:122] <futures_util::compat::compat03as01::Compat<Fut> as futures::future::Future>::poll::{{closure}}::he072ba36228edd13 0x55e0f4db1c55
  compat03as01.rs:224] futures_util::compat::compat03as01::with_context::hda8d77447ec5baad 0x55e0f4db1b99
  mod.rs:145] tokio_threadpool::task::Task::run::{{closure}}::h3975ff1c97d730b9 0x55e0f52575aa
  ??:0] __rust_try.llvm.4839013338367208620 0x55e0f5249aa1
  mod.rs:130] tokio_threadpool::task::Task::run::h89c6cf80293f31d4 0x55e0f5245f46
  mod.rs:567] tokio_threadpool::worker::Worker::run_task2::hd5e44ab5c2c9c23e 0x55e0f523e292
  mod.rs:459] tokio_threadpool::worker::Worker::run_task::h9453901e2b36e552 0x55e0f523dfc1
  mod.rs:390] tokio_threadpool::worker::Worker::try_run_owned_task::h6d2a366300d91622 0x55e0f523dc39
  mod.rs:297] tokio_threadpool::worker::Worker::try_run_task::h3e5b2871f90ec8ed 0x55e0f523d93e
  mod.rs:241] tokio_threadpool::worker::Worker::run::h88455d19ee08f462 0x55e0f523d86d
  mod.rs:125] tokio_threadpool::worker::Worker::do_run::{{closure}}::h40336fc4ede26499 0x55e0f525782f
  mod.rs:116] tokio_threadpool::worker::Worker::do_run::h59229c355aa94349 0x55e0f523d786
  mod.rs:344] tokio_threadpool::pool::Pool::spawn_thread::{{closure}}::hb79092efa83eeacf 0x55e0f5249793
  ??:0] __rust_try.llvm.4839013338367208620 0x55e0f5249aa1
  ??:0] start_thread 0x7f2489e96fa2
  ??:0] clone 0x7f2489dab4ce
  ??:0]  0xffffffffffffffff
Virtual memory addresses of /home/electrumx/atomicDEX-API/target/debug/mm2  PHDR:0x55e0f4522040  INTERP:0x55e0f45222e0  LOAD:0x55e0f4522000  LOAD:0x55e0f468d000  LOAD:0x55e0f540a000  LOAD:0x55e0f5cc3220  DYNAMIC:0x55e0f5d56348  NOTE:0x55e0f45222fc  TLS:0x55e0f5cc3220  GNU_EH_FRAME:0x55e0f566062c  GNU_STACK:0x55e0f4522000  GNU_RELRO:0x55e0f5cc3220
😀 2020-05-09 12:25:12 +0000 [incoming_connection 93.242.216.127:62883] New connection...
09 12:25:12, common:433] panicked at 'assertion failed: `(left == right)`
  left: `18`,
 right: `0`', /rustc/b2e36e6c2d229126b59e892c9147fbb68115d292/src/libstd/macros.rs:16:9
09 12:25:12, common:434] backtrace
  panicking.rs:419] rust_begin_unwind 0x56408fff3f9a
  read_line.rs:69] <tokio::io::util::read_line::ReadLine<R> as core::future::future::Future>::poll::h587bbe252373249d 0x56408f3bdb4f
  lp_network.rs:211] mm2::mm2::lp_network::seednode_loop::{{closure}}::{{closure}}::h8656ef996a052533 0x56408f34f089
  mod.rs:66] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h6f6c84188467e3b0 0x56408f345d89
  fuse.rs:84] <futures_util::future::fuse::Fuse<Fut> as core::future::future::Future>::poll::hed3b43df29c8a8b3 0x56408f43e53a
  lp_network.rs:258] mm2::mm2::lp_network::seednode_loop::{{closure}}::{{closure}}::{{closure}}::{{closure}}::h918703cc6b8eaa6a 0x56408f3c063a
  select_mod.rs:97] mm2::mm2::lp_network::seednode_loop::{{closure}}::{{closure}}::{{closure}}::h487ad45a26a3a135 0x56408f3c073a
  poll_fn.rs:55] <futures_util::future::poll_fn::PollFn<F> as core::future::future::Future>::poll::h6f948e809ebc67dc 0x56408f3c05a1
  lp_network.rs:258] mm2::mm2::lp_network::seednode_loop::{{closure}}::{{closure}}::hdb11b6d0e3b86231 0x56408f34fb9c
  mod.rs:66] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h9ddc8f6a6839d900 0x56408f346329
  unit_error.rs:33] <futures_util::future::unit_error::UnitError<Fut> as core::future::future::Future>::poll::h788359137ed2cbeb 0x56408f3bf953
  compat03as01.rs:122] <futures_util::compat::compat03as01::Compat<Fut> as futures::future::Future>::poll::{{closure}}::he072ba36228edd13 0x56408f9cbc55
  compat03as01.rs:224] futures_util::compat::compat03as01::with_context::hda8d77447ec5baad 0x56408f9cbb99
  mod.rs:145] tokio_threadpool::task::Task::run::{{closure}}::h3975ff1c97d730b9 0x56408fe715aa
  ??:0] __rust_try.llvm.4839013338367208620 0x56408fe63aa1
  mod.rs:130] tokio_threadpool::task::Task::run::h89c6cf80293f31d4 0x56408fe5ff46
  mod.rs:567] tokio_threadpool::worker::Worker::run_task2::hd5e44ab5c2c9c23e 0x56408fe58292
  mod.rs:459] tokio_threadpool::worker::Worker::run_task::h9453901e2b36e552 0x56408fe57fc1
  mod.rs:390] tokio_threadpool::worker::Worker::try_run_owned_task::h6d2a366300d91622 0x56408fe57c39
  mod.rs:297] tokio_threadpool::worker::Worker::try_run_task::h3e5b2871f90ec8ed 0x56408fe5793e
  mod.rs:241] tokio_threadpool::worker::Worker::run::h88455d19ee08f462 0x56408fe5786d
  mod.rs:125] tokio_threadpool::worker::Worker::do_run::{{closure}}::h40336fc4ede26499 0x56408fe7182f
  mod.rs:116] tokio_threadpool::worker::Worker::do_run::h59229c355aa94349 0x56408fe57786
  mod.rs:344] tokio_threadpool::pool::Pool::spawn_thread::{{closure}}::hb79092efa83eeacf 0x56408fe63793
  ??:0] __rust_try.llvm.4839013338367208620 0x56408fe63aa1
  ??:0] start_thread 0x7f33a68a0fa2
  ??:0] clone 0x7f33a67b54ce
  ??:0]  0xffffffffffffffff
Virtual memory addresses of /home/electrumx/atomicDEX-API/target/debug/mm2  PHDR:0x56408f13c040  INTERP:0x56408f13c2e0  LOAD:0x56408f13c000  LOAD:0x56408f2a7000  LOAD:0x564090024000  LOAD:0x5640908dd220  DYNAMIC:0x564090970348  NOTE:0x56408f13c2fc  TLS:0x5640908dd220  GNU_EH_FRAME:0x56409027a62c  GNU_STACK:0x56408f13c000  GNU_RELRO:0x5640908dd220
@artemii235 (Member)

Thanks for the report! Checking right now.

@artemii235 artemii235 self-assigned this May 11, 2020
@artemii235 (Member)

The panic is caused by a debug assertion in tokio's read_line_internal function. I suspect the future is being polled after it has finished, which might be the result of mixing different futures and executor versions with conversions between them (compat03as01). I will try to run the TcpStream processing loops on the tokio 0.2 executor and see if it helps.
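
For context, a minimal sketch (assumed, not the actual lp_network.rs code) of the kind of read_line-based TcpStream processing loop involved here, written against tokio 0.2's AsyncBufReadExt. The debug assertion from the backtrace fires inside tokio's ReadLine future, i.e. inside the `read_line(...).await` call below.

```rust
use tokio::io::{AsyncBufReadExt, BufReader};
use tokio::net::TcpStream;

// Hypothetical connection handler; the real seednode_loop differs.
async fn process_connection(stream: TcpStream) {
    let mut reader = BufReader::new(stream);
    let mut line = String::new();
    loop {
        line.clear();
        match reader.read_line(&mut line).await {
            Ok(0) => break, // EOF: peer closed the connection, drop it
            Ok(_) => {
                // handle the received line here
            }
            Err(_) => break, // socket error: drop the connection
        }
    }
}
```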

@artemii235 (Member)

Pushed an update and updated my seednode.
@cipig Could you please update yours too and monitor them for a couple of days?

@cipig (Member) commented May 11, 2020

😟 2020-05-11 12:54:04 +0000 [incoming_connection 91.10.1.21:64394] Error Connection reset by peer (os error 104) writing to socket, dropping connection
😀 2020-05-11 12:54:04 +0000 [incoming_connection 91.10.1.21:64662] New connection...
11 12:54:05, common:433] panicked at 'assertion failed: `(left == right)`
  left: `529`,
 right: `0`', /rustc/b2e36e6c2d229126b59e892c9147fbb68115d292/src/libstd/macros.rs:16:9
11 12:54:05, common:434] backtrace
  panicking.rs:419] rust_begin_unwind 0x5591215e08da
  read_line.rs:69] <tokio::io::util::read_line::ReadLine<R> as core::future::future::Future>::poll::h587bbe252373249d 0x5591209a540f
  lp_network.rs:211] mm2::mm2::lp_network::seednode_loop::{{closure}}::{{closure}}::h8656ef996a052533 0x559120936199
  mod.rs:66] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h6f6c84188467e3b0 0x55912092ce99
  fuse.rs:84] <futures_util::future::fuse::Fuse<Fut> as core::future::future::Future>::poll::hed3b43df29c8a8b3 0x559120a2685a
  lp_network.rs:258] mm2::mm2::lp_network::seednode_loop::{{closure}}::{{closure}}::{{closure}}::{{closure}}::h918703cc6b8eaa6a 0x5591209a7e6a
  select_mod.rs:97] mm2::mm2::lp_network::seednode_loop::{{closure}}::{{closure}}::{{closure}}::h487ad45a26a3a135 0x5591209a7f6a
  poll_fn.rs:55] <futures_util::future::poll_fn::PollFn<F> as core::future::future::Future>::poll::h6f948e809ebc67dc 0x5591209a7dd1
  lp_network.rs:258] mm2::mm2::lp_network::seednode_loop::{{closure}}::{{closure}}::hdb11b6d0e3b86231 0x559120936cac
  mod.rs:66] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h9ddc8f6a6839d900 0x55912092d439
  ??:0] __rust_try.llvm.10287638557743538876 0x5591209eb8c1
  task.rs:38] <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll::ha241a3daa69100ab 0x559120dab986
  ??:0] __rust_try.llvm.13240658710947542558 0x559120db7551
  ??:0] __rust_try.llvm.13240658710947542558 0x559120db7551
  ??:0] start_thread 0x7f39d4faafa2
  ??:0] clone 0x7f39d4ebf4ce
  ??:0]  0xffffffffffffffff
Virtual memory addresses of /home/electrumx/atomicDEX-API/target/debug/mm2  PHDR:0x559120723040  INTERP:0x5591207232e0  LOAD:0x559120723000  LOAD:0x55912088e000  LOAD:0x559121611000  LOAD:0x559121ecef60  DYNAMIC:0x559121f622a8  NOTE:0x5591207232fc  TLS:0x559121ecef60  GNU_EH_FRAME:0x55912186762c  GNU_STACK:0x559120723000  GNU_RELRO:0x559121ecef60
😀 2020-05-11 12:54:34 +0000 [incoming_connection 91.10.1.21:64679] New connection...

Saw this one on both seed nodes (same IP address and time).

😟 2020-05-11 13:06:31 +0000 [incoming_connection 91.10.1.21:64671] Error Connection reset by peer (os error 104) reading from socket, dropping connection
😀 2020-05-11 13:06:32 +0000 [incoming_connection 91.10.1.21:65529] New connection...
😟 2020-05-11 13:06:40 +0000 [incoming_connection 91.10.1.21:65529] Reached EOF, dropping connection

Later... so it's not related to the specific node 91.10.1.21.

@artemii235 (Member)

I see, so my initial assumption was wrong. The assertion is active only in debug builds, so please rebuild with the --release cargo flag and restart the seednodes; that will suppress the panic. I will keep the debug build on my seednode and also try to reproduce the problem outside of the MM2 context so I can submit an issue to the tokio repo; it might be a bug in tokio.
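
For context on why a release build suppresses it: the failing assertion comes from a `debug_assert_eq!`, which is compiled out when debug assertions are disabled (the default for Cargo's release profile). A minimal standalone illustration:

```rust
fn main() {
    let unread = 267usize; // stand-in for the counter seen in the log
    // Panics with "assertion failed: `(left == right)`" when debug assertions
    // are enabled (default for `cargo build`), but is compiled out entirely
    // when they are disabled (default for `cargo build --release`).
    debug_assert_eq!(unread, 0);
    println!("no panic without debug assertions");
}
```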

@artemii235 (Member)

Please also make sure to rebuild from the latest commit; it contains important modifications not related to the panic.

@cipig (Member) commented May 12, 2020

Rebuilt from the latest mm2-seednodes-optimisation, but the last panic showed up on both seed nodes shortly after start.
However, I found the issue... it is the kernel setting net.ipv4.tcp_tw_reuse = 1. After turning it off, the error did not happen again.
Btw, that explains why the error happened only once, shortly after start... I guess it was trying to reuse an old TIME_WAIT connection, likely from the previous run.
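
For reference, a hedged example of checking and turning that setting off (how to persist it depends on the distribution):

```sh
# Inspect the current value, then disable TIME_WAIT socket reuse at runtime.
sysctl net.ipv4.tcp_tw_reuse
sysctl -w net.ipv4.tcp_tw_reuse=0
# To persist across reboots, set "net.ipv4.tcp_tw_reuse = 0" in /etc/sysctl.conf.
```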

@artemii235 (Member)

I will recheck whether tcp_tw_reuse has an effect on this; I saw the same panic occur on my seednode a long time after start.

@cipig (Member) commented May 12, 2020

Indeed, I looked again and I also have one panic without tcp_tw_reuse, but only on one of the seeds.

@artemii235 (Member)

I was able to reproduce the panic in my local environment and will submit a report to the tokio repo soon.
Also, it looks like debug assertions remained enabled because we keep debug info for release builds, so I've explicitly disabled them for the release profile. Could you please rebuild your seednode binaries from the latest commit?
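
A sketch of the kind of Cargo.toml release-profile override being described (the repository's actual profile settings may differ):

```toml
# Hypothetical [profile.release] override: keep debug info for readable
# backtraces while explicitly turning debug assertions off.
[profile.release]
debug = true
debug-assertions = false
```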

@artemii235 (Member)

Submitted the issue to the tokio repo: tokio-rs/tokio#2532

@tonymorony

tokio-rs/tokio#2532: looks like this issue was just resolved in the tokio codebase.

@artemii235 (Member)

Thanks for mentioning this! The fix has not yet been published on crates.io, so I prefer to wait for it before closing the issue.

@cipig (Member) commented Jul 22, 2020

@artemii235 (Member)

@cipig Thanks for the info! I updated the tokio version in #631. Could you please update your seednodes, check that everything works well, and then approve the PR? I will merge it afterwards.

@artemii235 artemii235 linked a pull request Jul 22, 2020 that will close this issue
@artemii235 artemii235 added the bug label Jul 22, 2020
@cipig (Member) commented Jul 22, 2020

Thanks. Looks very good; both seed nodes have been running this version without problems for many hours.

@artemii235 (Member)

Thanks for testing!
