ARM64 CG2 compilation of System.Text.Json crashes on JIT assert 'block->bbWeight > BB_ZERO_WEIGHT' #52785
assume this is a recent regression? |
FWIW, on OSX ARM64 there's another non-deterministic issue, the CG2 compiler sometimes crashes on a nullref when internally sorting dependency nodes; that is demonstrated in the OSX arm64 leg of the same run as quoted in the title:

    Unhandled exception. System.AggregateException: One or more errors occurred. (Object reference not set to an instance of an object.)
     ---> System.NullReferenceException: Object reference not set to an instance of an object.
       at ILCompiler.Sorting.Implementation.MergeSortCore`5.ParallelSort(TDataStructure arrayToSort, Int32 index, Int32 length, TComparer comparer)
       --- End of inner exception stack trace ---
       at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
       at ILCompiler.Sorting.Implementation.MergeSortCore`5.ParallelSortApi(TDataStructure arrayToSort, TComparer comparer)
       at ILCompiler.MetadataManager.GetCompiledMethods(EcmaModule moduleToEnumerate, CompiledMethodCategory methodCategory)
       at ILCompiler.DependencyAnalysis.NodeFactory.EnumerateCompiledMethods(EcmaModule moduleToEnumerate, CompiledMethodCategory methodCategory)+MoveNext()
       at ILCompiler.DependencyAnalysis.ReadyToRun.ExceptionInfoLookupTableNode.LayoutMethodsWithEHInfo()
       at ILCompiler.DependencyAnalysis.ReadyToRun.ExceptionInfoLookupTableNode.ShouldSkipEmittingObjectNode(NodeFactory factory)
       at ILCompiler.DependencyAnalysis.ReadyToRun.HeaderNode.GetData(NodeFactory factory, Boolean relocsOnly)
       at ILCompiler.DependencyAnalysis.ReadyToRunObjectWriter.EmitPortableExecutable()
       at ILCompiler.ReadyToRunCodegenCompilation.Compile(String outputFile)
       at ILCompiler.Program.RunSingleCompilation(Dictionary`2 inFilePaths, InstructionSetSupport instructionSetSupport, String compositeRootPath, Dictionary`2 unrootedInputFilePaths, HashSet`1 versionBubbleModulesHash, CompilerTypeSystemContext typeSystemContext)
       at ILCompiler.Program.Run(String[] args)
       at ILCompiler.Program.Main(String[] args)
|
I think the osx arm64 |
I see the first repro of the OSX arm64 bug in Steve MacLean's run from May 5 (last Wednesday): I'm looking for the first occurrence of the Windows bug. |
OK, so I see the first occurrence of the Windows bug in @BruceForstall's run from May 6: https://dev.azure.com/dnceng/public/_build/results?buildId=1127214&view=results |
@AndyAyersMS is the expert for issues with basic block weight. |
I'll take a look at the windows arm64 failure. |
We end up with a negative block count and that leads to the assert. Suspect the issue is in
|
Root cause seems to be upstream: when we compute edge weights, we're willing to set a min edge weight to something below zero. The root cause of that, in turn, is inconsistent profile data, which we have to tolerate. |
If the solver wants to set the edge weight below zero, set it to zero if within slop, or disallow if not. Addresses assert seen in #52785.
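For readers following along, the rule in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the JIT's actual code: the function name `tryClampMinEdgeWeight`, the use of `double` for weights, and the out-parameter shape are all assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical helper (not the JIT's real API): decide what to do when the
// edge-weight solver proposes a new minimum weight for an edge.
//   - non-negative proposal: accept it unchanged
//   - negative but within `slop` of zero: accept it, clamped to zero
//   - negative beyond `slop`: reject the update, leave *result untouched
// Returns true if the update is accepted.
bool tryClampMinEdgeWeight(double proposedMin, double slop, double* result)
{
    assert(slop >= 0.0);      // slop is a tolerance, assumed non-negative
    assert(result != nullptr);

    if (proposedMin >= 0.0)
    {
        *result = proposedMin;
        return true;
    }

    if (-proposedMin <= slop)
    {
        // Small negative value attributed to inconsistent profile data.
        *result = 0.0;
        return true;
    }

    // Too far below zero to explain by slop: disallow the update.
    return false;
}
```

With a slop of 1.0, a proposal of -0.5 would be accepted as 0.0, while a proposal of -2.0 would be rejected, so no negative weight reaches later block-weight computations.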
@trylek can you verify this is now fixed? |
Sorry for the late response. Thank you for fixing the failure in System.Text.Json, I confirm we're no longer hitting it, e.g. in this recent CG2 run:

Instead of the original failure, all arm64 builds and Windows x64 builds are now failing in System.IO.Compression with another JIT assert:

    58 / 257 (24%, 1 failed): failed in 4806 msecs, exit code -2147483645 = 0x80000003, expected 0: dotnet.exe D:\workspace\_work\1\s\artifacts\bin\coreclr\windows.x64.Checked\crossgen2\crossgen2.dll @D:\workspace\_work\1\s\artifacts\tests\coreclr\obj\windows.x64.Checked\crossgen.out\System.IO.Compression.dll.rsp
    67 / 257 (24%, 1 failed): launching: D:\workspace\_work\1\s\.dotnet\dotnet.exe D:\workspace\_work\1\s\artifacts\bin\coreclr\windows.x64.Checked\crossgen2\crossgen2.dll @D:\workspace\_work\1\s\artifacts\tests\coreclr\obj\windows.x64.Checked\crossgen.out\System.Linq.Queryable.dll.rsp
    D:\workspace\_work\1\s\src\coreclr\jit\morph.cpp:12306
    Assertion failed '(effOp1->gtOper == GT_CNS_INT) && (effOp1->IsIntegralConst(0) || effOp1->IsIntegralConst(1))' in 'System.IO.Compression.ZipArchive:.ctor(System.IO.Stream,int,bool,System.Text.Encoding):this' during 'Optimize Valnum CSEs' (IL size 489)

This seems to be the tracking bug for the remaining issue: #33091
|
Let me see what's up with that new failure... I have a hunch it may be related to #52524. |
In particular we need to set `GTF_DONT_CSE` so that CSE doesn't introduce commas under `GT_JTRUE` nodes. Fixes dotnet#52785.
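As a rough illustration of the idea in that commit message, here is a toy model of a "don't CSE" flag. It is only a sketch: `ToyNode`, `TOY_DONT_CSE`, and both functions are hypothetical names invented for this example, the flag value is arbitrary, and the assumption that the flag goes on the operand directly under the jump node is mine, not something the thread states.

```cpp
#include <cassert>
#include <cstdint>

// Toy flag; the real GTF_DONT_CSE constant and its value live in the JIT
// sources and are not reproduced here.
enum ToyNodeFlags : std::uint32_t
{
    TOY_DONT_CSE = 0x1
};

// Toy tree node with a flag word and a single operand.
struct ToyNode
{
    std::uint32_t flags;
    ToyNode*      op1;
};

// Mark the condition under a conditional-jump node so that a later CSE pass
// leaves it alone and the jump keeps a plain condition as its operand.
void markConditionDontCse(ToyNode& jtrue)
{
    assert(jtrue.op1 != nullptr); // a conditional jump needs a condition
    jtrue.op1->flags |= TOY_DONT_CSE;
}

// What a CSE pass in this toy model would ask before picking a candidate.
bool isCseCandidate(const ToyNode& node)
{
    return (node.flags & TOY_DONT_CSE) == 0;
}
```

In this model, a CSE pass that checks `isCseCandidate` before rewriting a node would skip the marked condition, so it never replaces it with a comma expression under the jump.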
Awesome, thanks for the quick investigation! |
OS: Windows
Architecture: arm64
Example run: https://dev.azure.com/dnceng/public/_build/results?buildId=1136542&view=logs&j=438f2a33-0bac-577f-c1e5-b7956f9ac284&t=437a67a3-60d6-56bd-4640-6cece310571b
Diagnostic info:
/cc @dotnet/crossgen-contrib @dotnet/jit-contrib