-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Threadpool tsan #238
Threadpool tsan #238
Conversation
…ool has been destroyed by locking the dataMutex around that queue, and having the destructor pass waiting tasks onto the queue when threads are stuck waiting for on said queue
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## develop #238 +/- ##
===========================================
- Coverage 80.42% 80.40% -0.02%
===========================================
Files 52 52
Lines 10762 10810 +48
===========================================
+ Hits 8655 8692 +37
- Misses 2107 2118 +11 ☔ View full report in Codecov by Sentry. |
… memory leak actually exists. If it's just waiting for threads to clean up, then it will keep waiting.
Can we open a PR in WebRTC to confirm the CI there passes too? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
* Fix definition of LP64 for gcc * Use INT64 instead of LL so the type is will consistent across platforms * Modify threadpool teardown (#218) * Modify threadpool teardown to avoid unlocking mutexes from a different thread than locked it. * Remove an extra comma that made the format incorrect (#216) * Use const PCHAR so C++ compilers won't complain about passing strings constants (#215) * Add stream state machine iterator (#220) * Define iterator function, add a Stream member var * Replace recursive stepState calls with iterator * Replace all Stream stepState calls with iterator * Fix existing comment typos * CLANG * Rename 'keepIterating' member to be more clear * Add null pointer checks to iterator * CLANG * Stylistic change based on precedent * Remove redundant pStateMachine null check * Remove another redundant null pointer check * Fix null checks * Add timerqueuekick and checkstatemachinetransition (#221) * Add timerqueuekick and checkstatemachinetransition * Increase sleep at end of threadpool test to give spawned threads time to free their memory * Utils - Remove duplicate error code * m1 build * update cmake * fix incompatible pointer error * fix incompatible pointer error * run tests * cmake once * no typecasting * Revert "no typecasting" This reverts commit f2096b8. * Revert "Revert "no typecasting"" This reverts commit 7d53af0. * everything working * Add gcc-M1 CI test (#225) * Add gcc M1 test * testing * more * Update gcc path * Try using LLP64 for Apple * Add is null or empty * Update CommonDefs.h (#229) * Fix 32 bit crash issue (#230) * fix 32-bit crash in threadpool and possible seg fault if get item was null * clang format * remove null item and keep iterating * properly free threadpool if failed to create (#231) * address compiler warnings, cleanup ci (#232) * remove unnecessary double compiles, separate out test ci run into a separate section, address compile warning * address multiple compile warnings * address more compiler warnings for linux, separate out few more tests from build in ci * move stuff around * use consistent flags in pic library and test * clang format * only run test if flag FIXUP_ANNEX_B_TRAILING_NALU_ZERO is set, previously we re-built pic with this flag for the tests but the test itself segfaults if it's not set * put tests behind ifdefs which require flags set * tweak ci yml * more ifdefing tests * more ifdefs for tests * more changes * correct path for tsan/ubsan suppressions files * address warning for linux implicit decl of pthread_getname_np * define only once * more ifdefs * more ifdefs for tests * ifdef more tests * more ifdefs * more ifdefs * more ifdefs * more ifdefs * more ifdefs * if thread count is going to overflow, return 0 * another if/def * add build and testing with ALIGNED_MEMORY_MODEL and FIXUP_ANNEX_B_TRAILING_NALU_ZERO to test both paths in CI * more ifdefs for unalign access * more ifdefs for ubsan * ifdef more tests for aligned memory model * more memory aligned ifdefs for tests * more if defs in tests * make sure correct number of bytes for snprintf * State machine name for logs (#233) * Add state machine tag * Add tag for client state machine * Use char with fixed size * Use default tags * Random string * Allow tag setting for state machines * Add unit tests * Free state machine * Use createStateMachineWithTag instead * Rename * Address comments * Silent (#234) * Remove const in the api definition for getStateMachineName (#235) * char* * Remove const * Enable tsan (#236) * enable tsan * lock access to global count to address tsan violation * remove globals, use struct to thread * some changes * lock to protect data race in case of canceled thread still * lock around terminate in tests for tsan data race * free mutex in tests * Release build (#237) * Switch to ASSERT_EQ from assert to prevent release build optimization * Fix assert * Cleanup log statements * Threadpool tsan (#238) * Protect accessing threadpool with threaddata lock * Fix potential seg fault from attempting to access queue after threadpool has been destroyed by locking the dataMutex around that queue, and having the destructor pass waiting tasks onto the queue when threads are stuck waiting for on said queue * Fix typo from merge * Reduce sleeps in threadpool * Lock test utils mutex, and check for termination in actors before assigning pQueue * need allocators set check * wait on teardown of test threads * removing accessing gTerminateCount in the for loop * longer sleep to wait for threads to free their heap * dramatically increase sleep * Add sleep in the heap check in case it's still waiting for a thread to exit * Infinite loop if memory isn't clear. Tests will timeout on GHA if the memory leak actually exists. If it's just waiting for threads to clean up, then it will keep waiting. * Remove sleep from termination task * Assert handle (#239) * Use abort instead * custom assert * Fix definition * Fix fprintf * Simple assert * Clang formatted * Fix build issue * safe gmtime, guard with mutex since not thread safe on non windows pl… (#240) * safe gmtime, guard with mutex since not thread safe on non windows platforms * clang format * add cygwin to keep original gmtime since pthread isn't used there in our thread impl * address pr comments, fix compile warnings in threadpool * missed the extern variable name change * add EXPECT_EQ around thread create and thread join calls in the unit tests, possibly address tsan complain where it it looks like multiple tests executing at the same time * unlock stream mutex before acquiring the fom mutex to avoid deadlock with putframe (#242) * Update ci.yml using xcode 14.3.1 results in a linker issue with gcc, 15.1 fixes the issue, trying 15.2 since this appears to be the latest available. * Update ci.yml ci.yml formatting * Reset current index in case it overflows out of range (#245) * Add PR description lint (#248) * Merge pull request #249 from awslabs/fix-read-windows Fix readFile on Windows --------- Co-authored-by: Dave Johansen <davejohansen@gmail.com> Co-authored-by: jdelapla <delaplan@amazon.com> Co-authored-by: Stefan Kieszkowski <85728496+stefankiesz@users.noreply.github.com> Co-authored-by: Jeremy Gunawan <sirknightj@gmail.com> Co-authored-by: Niyati Maheshwari <niyatim@amazon.com> Co-authored-by: Niyati Maheshwari <niyatim23@gmail.com> Co-authored-by: Jeremy Gunawan <jggunawa@amazon.com> Co-authored-by: Divya Sampath Kumar <disa6302@colorado.edu>
Issue #, if available:
N/A
Description of changes:
ThreadData->mutex now wraps all interactions with any threadpool objects, this includes the queue.
As a consequence of having the data mutex locked while waiting on the queue, the threadpool couldn't teardown if an actor was waiting on the queue, so the teardown now pushes wait tasks to the queue in the event of that use case.
Additionally all utils tests were changed so that if a memory leak is detected, it will infinitely wait for the memory leak to end -- resulting in a GHA timeout or a loose actor cleaning up the memory (common race condition specific to tests only).
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.