System.Threading.Tasks.Dataflow tests failing on a checked runtime #1847

Closed
safern opened this issue Jan 17, 2020 · 2 comments · Fixed by #1907
Labels
area-VM-coreclr untriaged New issue has not been triaged by the area owner

Comments

safern commented Jan 17, 2020

/private/tmp/helix/working/B45A096B/w/ABA708E6/e /private/tmp/helix/working/B45A096B/w/ABA708E6/e
  Discovering: System.Threading.Tasks.Dataflow.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Threading.Tasks.Dataflow.Tests (found 290 of 297 test cases)
  Starting:    System.Threading.Tasks.Dataflow.Tests (parallel test collections = on, max threads = 4)

Assert failure(PID 15886 [0x00003e0e], Thread: 1442342 [0x160226]): oldValue == NULL || (oldValue == value && oldFlags == flags)
    File: /Users/runner/runners/2.163.1/work/1/s/src/coreclr/src/vm/ceeload.inl Line: 243
    Image: /private/tmp/helix/working/B45A096B/p/dotnet

./RunTests.sh: line 161: 15886 Abort trap: 6           (core dumped) "$RUNTIME_PATH/dotnet" exec --runtimeconfig System.Threading.Tasks.Dataflow.Tests.runtimeconfig.json --depsfile System.Threading.Tasks.Dataflow.Tests.deps.json xunit.console.dll System.Threading.Tasks.Dataflow.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing -notrait category=nonnetcoreapptests -notrait category=nonosxtests $RSP_FILE
/private/tmp/helix/working/B45A096B/w/ABA708E6/e
----- end Thu Jan 16 17:29:24 PST 2020 ----- exit code 134 ----------------------------------------------------------

It failed on job: https://helix.dot.net/api/2019-06-17/jobs/c473c8c3-0b1b-4f8f-bcb4-64bd70150357/workitems/System.Threading.Tasks.Dataflow.Tests/console

Here is the dump:
https://helix.dot.net/api/2019-06-17/jobs/c473c8c3-0b1b-4f8f-bcb4-64bd70150357/workitems/System.Threading.Tasks.Dataflow.Tests/files/core.15886

cc: @jkotas @stephentoub

jkotas commented Jan 18, 2020

Stacktrace of the failure:

    frame #7: 0x000000010642b3d6 libcoreclr.dylib`DbgAssertDialog + 102
    frame #8: 0x000000010673e86e libcoreclr.dylib`MethodTableBuilder::GatherGenericsInfo(Module*, unsigned int, Instantiation, MethodTableBuilder::bmtGenericsInfo*, StackingAllocator*) + 702
    frame #9: 0x000000010673eff2 libcoreclr.dylib`ClassLoader::CreateTypeHandleForTypeDefThrowing(Module*, unsigned int, Instantiation, AllocMemTracker*) + 578
    frame #10: 0x00000001064e42be libcoreclr.dylib`ClassLoader::CreateTypeHandleForTypeKey(TypeKey*, AllocMemTracker*) + 1118
    frame #11: 0x00000001064e3d0a libcoreclr.dylib`ClassLoader::DoIncrementalLoad(TypeKey*, TypeHandle, ClassLoadLevel) + 458
    frame #12: 0x00000001064e52f2 libcoreclr.dylib`ClassLoader::LoadTypeHandleForTypeKey_Body(TypeKey*, TypeHandle, ClassLoadLevel) + 1106
    frame #13: 0x00000001064e0351 libcoreclr.dylib`ClassLoader::LoadTypeHandleForTypeKey(TypeKey*, TypeHandle, ClassLoadLevel, InstantiationContext const*) + 305
    frame #14: 0x00000001064e1bb2 libcoreclr.dylib`ClassLoader::LoadTypeDefThrowing(Module*, unsigned int, ClassLoader::NotFoundAction, ClassLoader::PermitUninstantiatedFlag, unsigned int, ClassLoadLevel, Instantiation*) + 594
    frame #15: 0x00000001064dd8a3 libcoreclr.dylib`ClassLoader::LoadTypeHandleThrowing(NameHandle*, ClassLoadLevel, Module*) + 931
    frame #16: 0x00000001064dd3eb libcoreclr.dylib`ClassLoader::LoadTypeHandleThrowIfFailed(NameHandle*, ClassLoadLevel, Module*) + 43
    frame #17: 0x00000001064dd3b2 libcoreclr.dylib`ClassLoader::LoadTypeByNameThrowing(Assembly*, char const*, char const*, ClassLoader::NotFoundAction, ClassLoader::LoadTypesFlag, ClassLoadLevel) + 98
    frame #18: 0x00000001064b4eb0 libcoreclr.dylib`MscorlibBinder::LookupClassLocal(BinderClassID) + 96
    frame #19: 0x0000000106746929 libcoreclr.dylib`MarshalInfo::MarshalInfo(Module*, SigPointer, SigTypeContext const*, unsigned int, MarshalInfo::MarshalScenario, CorNativeLinkType, CorNativeLinkFlags, int, unsigned int, unsigned int, int, int, int, int, MethodDesc*, int, int, char const*, char const*, unsigned int) + 9897
    frame #20: 0x0000000106748192 libcoreclr.dylib`MarshalInfo::MarshalInfo(Module*, SigPointer, SigTypeContext const*, unsigned int, MarshalInfo::MarshalScenario, CorNativeLinkType, CorNativeLinkFlags, int, unsigned int, unsigned int, int, int, int, int, MethodDesc*, int, int, char const*, char const*, unsigned int) + 146
    frame #21: 0x00000001066d333c libcoreclr.dylib`ParseNativeType(Module*, SigPointer, unsigned int, ParseNativeTypeFlags, NativeFieldDescriptor*, SigTypeContext const*, char const*, char const*, char const*) + 460
    frame #22: 0x00000001064daa93 libcoreclr.dylib`EEClassLayoutInfo::ParseFieldNativeTypes(IMDInternalImport*, unsigned int, HENUMInternal*, unsigned int, Module*, ParseNativeTypeFlags, SigTypeContext const*, int*, LayoutRawFieldInfo*, EEClassLayoutInfo*, unsigned int*, char const*, char const*) + 899
    frame #23: 0x00000001064d9b06 libcoreclr.dylib`EEClassLayoutInfo::CollectLayoutFieldMetadataThrowing(unsigned int, unsigned char, unsigned char, int, MethodTable*, unsigned int, HENUMInternal*, Module*, SigTypeContext const*, EEClassLayoutInfo*, LayoutRawFieldInfo*, LoaderAllocator*, AllocMemTracker*) + 1526
    frame #24: 0x000000010673fa53 libcoreclr.dylib`ClassLoader::CreateTypeHandleForTypeDefThrowing(Module*, unsigned int, Instantiation, AllocMemTracker*) + 3235
    frame #25: 0x00000001064e4411 libcoreclr.dylib`ClassLoader::CreateTypeHandleForTypeKey(TypeKey*, AllocMemTracker*) + 1457
    frame #26: 0x00000001064e3d0a libcoreclr.dylib`ClassLoader::DoIncrementalLoad(TypeKey*, TypeHandle, ClassLoadLevel) + 458
    frame #27: 0x00000001064e55d8 libcoreclr.dylib`ClassLoader::LoadTypeHandleForTypeKey_Body(TypeKey*, TypeHandle, ClassLoadLevel) + 1848
    frame #28: 0x00000001064e0351 libcoreclr.dylib`ClassLoader::LoadTypeHandleForTypeKey(TypeKey*, TypeHandle, ClassLoadLevel, InstantiationContext const*) + 305
    frame #29: 0x00000001064e004d libcoreclr.dylib`ClassLoader::LoadConstructedTypeThrowing(TypeKey*, ClassLoader::LoadTypesFlag, ClassLoadLevel, InstantiationContext const*) + 733
    frame #30: 0x00000001064e24ef libcoreclr.dylib`ClassLoader::LoadGenericInstantiationThrowing(Module*, unsigned int, Instantiation, ClassLoader::LoadTypesFlag, ClassLoadLevel, InstantiationContext const*, int) + 527

jkotas commented Jan 18, 2020

This comment in MethodTableBuilder::GatherGenericsInfo does not seem to hold:

    // No race here - the row in GenericParam table is owned exclusively by this type and we
    // are holding a lock preventing other threads from concurrently loading it.

The field layout does recursive type loading, which can produce multi-threaded deadlocks. We break these deadlocks by allowing the same type to be loaded on multiple threads as necessary (look for TSNC_LoadsTypeViolation). This can lead to multiple threads updating the GenericParam table in a non-thread-safe way.

This bug was likely exposed by #103. Marshaling of generic types was unconditionally disabled before that change, so GatherGenericsInfo, where the problem is, would not have been reached.
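
To make the race concrete, here is a minimal, standalone sketch of a map slot that tolerates several loaders publishing the same entry. It is not the CoreCLR code and not the change made in the fix: `MapSlot` and `PublishOnce` are hypothetical names, and packing the flags into the low bits of the value is an assumption made only for illustration. The condition it accepts mirrors the failing assert above (the slot was empty, or it already held the same value and flags).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one lookup-map entry (not the real CoreCLR type).
// Assumption: `value` is aligned so its low bits are free to carry `flags`.
class MapSlot {
public:
    // Tries to publish (value, flags). Returns true if the slot ends up holding
    // exactly this pair, whether this caller or an earlier one stored it.
    // Returns false if a different pair was already published.
    bool PublishOnce(std::uintptr_t value, std::uintptr_t flags) {
        const std::uintptr_t desired = value | flags;
        std::uintptr_t expected = 0;  // 0 means "empty slot"
        if (slot_.compare_exchange_strong(expected, desired,
                                          std::memory_order_acq_rel,
                                          std::memory_order_acquire)) {
            return true;  // we were the first writer
        }
        // Another writer got there first; on failure `expected` holds what it stored.
        return expected == desired;
    }

    // Returns the packed value currently stored (0 if nothing was published).
    std::uintptr_t Load() const {
        return slot_.load(std::memory_order_acquire);
    }

private:
    std::atomic<std::uintptr_t> slot_{0};
};
```

With a slot like this, two threads loading the same type can both call `PublishOnce` with identical arguments and both succeed, while a conflicting write is reported to the caller instead of silently overwriting the entry.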

jkotas added a commit that referenced this issue Jan 22, 2020
Same type can be loaded by multiple threads in parallel in rare situations

Fixes #1847
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020