fdb_stop_network MUST be called before the program exits. #170

Closed
Speedy37 opened this issue Jan 23, 2020 · 6 comments
Labels: breaking-change (Implies a breaking change), bug (Something isn't working)

@Speedy37
Contributor

Sometimes the tokio test fails:

error: test failed, to rerun pass '--test tokio'
##[error]test failed, to rerun pass '--test tokio'
Caused by:
  process didn't exit successfully: `/home/runner/work/foundationdb-rs/foundationdb-rs/target/debug/deps/tokio-c51e2bb533a18041` (signal: 6, SIGABRT: process abort signal)
##[error]The process '/usr/share/rust/.cargo/bin/cargo' failed with exit code 101
##[error]Node run failed with exit code 1
@Speedy37 Speedy37 added this to the 0.4.2 milestone Jan 23, 2020
@Speedy37 Speedy37 changed the title Spurious tokio test failure on ubuntu Tokio test might fail on ubuntu Jan 23, 2020
@Speedy37
Contributor Author

Speedy37 commented Feb 1, 2020

This was with v0.2.10.
Seems like recent builds (v0.2.11) aren't affected.

@Speedy37
Contributor Author

Speedy37 commented Feb 3, 2020

@Speedy37 Speedy37 added the bug Something isn't working label Feb 26, 2020
@Speedy37
Contributor Author

This is probably caused by this requirement in the FoundationDB C API documentation:

fdb_stop_network:
Signals the event loop invoked by fdb_run_network() to terminate. You must call this function and wait for fdb_run_network() to return before allowing your program to exit, or else the behavior is undefined.
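
The ordering this requirement imposes (signal the loop, then wait for it to return, and only then exit) can be sketched with the standard library alone. This is a simulation under stated assumptions, not the bindings' code: `run_network` and `stop_network` below are hypothetical stand-ins for `fdb_run_network()` and `fdb_stop_network()`, and the "event loop" is just a thread polling a flag.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for `fdb_run_network()`: blocks until the stop flag is set.
fn run_network(stop: &AtomicBool) {
    while !stop.load(Ordering::Acquire) {
        thread::sleep(Duration::from_millis(1));
    }
}

/// Hypothetical stand-in for `fdb_stop_network()`: only signals the loop,
/// it does not wait for the loop to finish.
fn stop_network(stop: &AtomicBool) {
    stop.store(true, Ordering::Release);
}

/// Runs the start / stop / join sequence and returns `true` if the
/// network thread returned cleanly before this function did.
fn shutdown_in_order() -> bool {
    let stop = Arc::new(AtomicBool::new(false));
    let stop_for_thread = Arc::clone(&stop);
    let network = thread::spawn(move || run_network(&stop_for_thread));

    // ... application work would happen here ...

    // Step 1: ask the event loop to terminate.
    stop_network(&stop);
    // Step 2: wait for the loop to actually return before going on to exit.
    network.join().is_ok()
}

fn main() {
    assert!(shutdown_in_order());
}
```

Skipping the `join` step is what leaves the network thread running while the process is already tearing itself down.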

@Speedy37
Contributor Author

Yep, this is the cause: fdb_stop_network MUST be called before the program exits.

@Speedy37 Speedy37 changed the title Tokio test might fail on ubuntu fdb_stop_network MUST be called before the program exits. Feb 26, 2020
@Speedy37 Speedy37 added the help wanted Extra attention is needed label Feb 26, 2020
@Speedy37
Contributor Author

Internally, I traced the segfault to a data race between the network thread, which is still running, and the __do_global_dtors_aux call in libfdb_c.so.

main thread:
    frame #22: 0x00007ffff7cb8ddf libfdb_c.so`TraceLog::~TraceLog(this=0x00007ffff7faa980) at Trace.cpp:496:14
    frame #23: 0x00007ffff74432de libc.so.6`__cxa_finalize + 206
    frame #24: 0x00007ffff77f5f47 libfdb_c.so`__do_global_dtors_aux + 39
    frame #25: 0x00007ffff7fe13fb ld-2.30.so`___lldb_unnamed_symbol59$$ld-2.30.so + 523
    frame #26: 0x00007ffff7442ba7 libc.so.6`___lldb_unnamed_symbol193$$libc.so.6 + 247
    frame #27: 0x00007ffff7442d60 libc.so.6`exit + 32
    frame #28: 0x00007ffff74201ea libc.so.6`__libc_start_main + 250
    frame #29: 0x000055555556c17e tokio-8ff57d870675dfc8`_start + 46

@Speedy37
Contributor Author

Speedy37 commented Feb 26, 2020

Fixing this undefined behavior will require a breaking change :(

  • Only one #[test] will be possible per test file.
  • NetworkBuilder::boot<T>(self, f: impl FnOnce() -> T) -> T
  • unsafe NetworkBuilder::build
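
To illustrate why a closure-taking `boot` can guarantee the teardown order, here is a rough sketch of that shape. Everything in it is an assumption made for illustration: `MockNetworkBuilder` and `StopOnDrop` are hypothetical names, the "network" is a dummy polling thread, and this is not the crate's actual implementation. Only the `boot<T>(self, f: impl FnOnce() -> T) -> T` signature is taken from the list above.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for the crate's `NetworkBuilder`.
/// It starts a dummy thread, not the FoundationDB network.
struct MockNetworkBuilder;

/// Signals the mock network thread and joins it when dropped,
/// which also happens while unwinding from a panic.
struct StopOnDrop {
    stop: Arc<AtomicBool>,
    handle: Option<JoinHandle<()>>,
}

impl Drop for StopOnDrop {
    fn drop(&mut self) {
        // Signal first, then wait for the thread to return.
        self.stop.store(true, Ordering::Release);
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}

impl MockNetworkBuilder {
    /// Starts the mock network, runs `f`, then stops and joins the
    /// network thread before handing the result of `f` to the caller.
    fn boot<T>(self, f: impl FnOnce() -> T) -> T {
        let stop = Arc::new(AtomicBool::new(false));
        let stop_for_thread = Arc::clone(&stop);
        let handle = thread::spawn(move || {
            while !stop_for_thread.load(Ordering::Acquire) {
                thread::sleep(Duration::from_millis(1));
            }
        });
        // The guard is dropped after `f()` has been evaluated.
        let _guard = StopOnDrop { stop, handle: Some(handle) };
        f()
    }
}

fn main() {
    let answer = MockNetworkBuilder.boot(|| 40 + 2);
    assert_eq!(answer, 42);
}
```

Because the caller never holds the network handle, there is no way to return from `boot` with the thread still running. By contrast, a `build` that hands the pieces back to the caller leaves the stop-and-join obligation with the caller, which would explain why it is listed as `unsafe`.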

@Speedy37 Speedy37 added breaking-change Implies a breaking change and removed help wanted Extra attention is needed labels Feb 26, 2020
@Speedy37 Speedy37 modified the milestones: 0.4.2, 0.5.0 Feb 26, 2020
Speedy37 pushed a commit that referenced this issue Feb 28, 2020
[breaking-change] fix #170: `fdb_stop_network` **MUST** be called before the program exits.