[Perf] ARM64 regression in System.Diagnostics.Perf_Activity.ActivityAllocations #68624

performanceautofiler · 2022-04-21T06:13:47Z

Run Information

Architecture	arm64
OS	ubuntu 18.04
Baseline	b58d8013dbc4f3a65e5e2242fe09a43f455a329f
Compare	84b664a9875067a97fe32b654ff32f23f4001a60
Diff	Diff

Regressions in System.Diagnostics.Perf_Activity

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio	Baseline ETL	Compare ETL
ActivityAllocations - Duration of single invocation	758.56 ns	813.42 ns	1.07	0.20	False
ActivityAllocations - Duration of single invocation	623.66 ns	715.80 ns	1.15	0.11	False

Test Report

Repro

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net6.0 --filter 'System.Diagnostics.Perf_Activity*'

Payloads

Baseline
Compare

Histogram

System.Diagnostics.Perf_Activity.ActivityAllocations(idFormat: W3C)

Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 813.4234915083607 > 802.4856996133994.
IsChangePoint: Marked as a change because one of 4/13/2022 10:02:29 PM, 4/19/2022 4:53:57 AM falls between 4/10/2022 3:30:35 PM and 4/19/2022 4:53:57 AM.
IsRegressionStdDev: Marked as regression because -6.738201722219294 (T) = (0 -836.9530328425158) / Math.Sqrt((284.7705405371702 / (23)) + (1754.8085293566023 / (18))) is less than -2.0226909200346674 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (23) + (18) - 2, .025) and -0.09216640816227123 = (766.323727398658 - 836.9530328425158) / 766.323727398658 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
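
The IsRegressionStdDev line can be checked by hand. Below is a minimal Python sketch, assuming the logged quantities are the sample means, variances, and counts of the baseline and compare windows (the log does not label them, so the function and parameter names are my own). The logged t value is reproduced only when the numerator is the baseline mean minus the compare mean, even though the log prints it as `(0 -836.95...)`. The critical value is copied from the log rather than recomputed.

```python
import math

def t_statistic(base_mean, base_var, base_n, cmp_mean, cmp_var, cmp_n):
    """Two-sample t statistic with unpooled variances (baseline minus compare)."""
    std_err = math.sqrt(base_var / base_n + cmp_var / cmp_n)
    return (base_mean - cmp_mean) / std_err

def relative_change(base_mean, cmp_mean):
    """Signed change relative to the baseline; negative means the compare took longer."""
    return (base_mean - cmp_mean) / base_mean

# Values copied from the W3C IsRegressionStdDev line above.
w3c_t = t_statistic(766.323727398658, 284.7705405371702, 23,
                    836.9530328425158, 1754.8085293566023, 18)
w3c_rel = relative_change(766.323727398658, 836.9530328425158)

# Critical value copied from the log (StudentT.InvCDF at 0.025 with 23 + 18 - 2 degrees of freedom).
W3C_CRITICAL = -2.0226909200346674

# Both conditions from the log line must hold for the point to count as a regression.
is_regression = w3c_t < W3C_CRITICAL and w3c_rel < -0.05
print(f"t = {w3c_t:.4f}, relative change = {w3c_rel:.4f}, regression = {is_regression}")
```

Substituting the Hierarchical values (means 633.47 and 708.15, variances 243.41 and 499.07, counts 20 and 18) into the same functions should give the second logged t value in the same way.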

System.Diagnostics.Perf_Activity.ActivityAllocations(idFormat: Hierarchical)

Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 715.7983124923339 > 655.6405660025653.
IsChangePoint: Marked as a change because one of 4/13/2022 4:27:35 PM, 4/19/2022 4:53:57 AM falls between 4/10/2022 3:30:35 PM and 4/19/2022 4:53:57 AM.
IsRegressionStdDev: Marked as regression because -11.823299540885158 (T) = (0 -708.1518629941236) / Math.Sqrt((243.4064317595605 / (20)) + (499.0656483304033 / (18))) is less than -2.028094000977961 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (20) + (18) - 2, .025) and -0.11789003363534732 = (633.4718457872224 - 708.1518629941236) / 633.4718457872224 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository


AndyAyersMS · 2022-04-22T18:01:24Z

Possibly from #67920.

AndyAyersMS · 2022-04-27T21:33:33Z

@tarekgh thinks #67920 is not the likely suspect.

Another possibility given that this is arm64 is #66407. SPMI Diffs show there was an impact to System.Diagnostics.Tracing.EventListener:AddEventSource though it was a code size reduction.

The regression has persisted since then, so it does not look like noise.

dotnet-issue-labeler · 2022-04-27T21:34:00Z

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

ghost · 2022-04-27T21:35:15Z

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	performanceautofiler[bot]
Assignees:	-
Labels:	`area-CodeGen-coreclr`, `untriaged`, `refs/heads/main`, `ubuntu 18.04`, `RunKind=micro`, `Regression`, `CoreClr`, `Look Again`, `arm64`
Milestone:	-

AndyAyersMS · 2022-04-27T21:36:35Z

cc @TIHan

TIHan · 2022-06-06T23:33:38Z

@kunalspathak and I looked at this a bit ago. We believe that the regression wasn't caused by the mod operation.

kunalspathak · 2022-06-07T00:10:11Z

So I think there are 2 separate regressions going on...

For https://pvscmdupload.blob.core.windows.net/reports/allTestHistory/refs/heads/main_arm64_ubuntu%2018.04/System.Diagnostics.Perf_Activity.ActivityAllocations(idFormat%3a%20W3C).html, the regression appears to persist, and the commit range is 5c57f2c...7bac4e8.

For https://pvscmdupload.blob.core.windows.net/reports/allTestHistory/refs/heads/main_arm64_ubuntu%2018.04/System.Diagnostics.Perf_Activity.ActivityAllocations(idFormat%3a%20Hierarchical).html

The commit range where this benchmark is affected is 3635e0f...f7d9d6d.

Just to be sure, you might want to see if it repros and if yes, we can do asmdiff to see what is causing it.

kunalspathak · 2022-06-08T18:32:22Z

> @tarekgh thinks #67920 is not the likely suspect.

@tarekgh - @TIHan spent time investigating whether #66407 is the cause, and he was not able to repro the regression. In fact, he saw some improvements.

I see that #67920 added a layer of Enumerator(), and I am not sure if that is causing it, but the regression is consistent on windows/ubuntu. Did you verify by running the benchmarks locally that it was not caused by the Activity-related change?

Windows: https://pvscmdupload.blob.core.windows.net/reports/allTestHistory%2frefs%2fheads%2fmain_arm64_Windows%2010.0.19041%2fSystem.Diagnostics.Perf_Activity.ActivityAllocations(idFormat%3a%20W3C).html

https://pvscmdupload.blob.core.windows.net/reports/allTestHistory%2frefs%2fheads%2fmain_arm64_Windows%2010.0.19041%2fSystem.Diagnostics.Perf_Activity.ActivityAllocations(idFormat%3a%20Hierarchical).html

There was also #67938 (which I suspect might be related).

tarekgh · 2022-06-08T18:43:05Z

> I see that #67920 added a layer of Enumerator() and not sure if that is causing it, but the regression is consistent on windows/ubuntu. Did you verify by locally running the benchmarks if it was not with the Activity related change?

#67920 is code that is not exercised at all in the perf test scenario, so I am positive it is not related to the regression we are seeing.

> There was also #67938 (which I suspect might be related).

This could be related; the only part that can affect this test is the change:

CurrentChanged is usually null, so the overhead added there is only the assignment and the null check for CurrentChanged. I am not sure such a small change is enough to cause this regression, but I agree it could be related. I'll try to measure that.

kunalspathak · 2022-06-08T20:03:55Z

Thanks! I will assign it to you then.

TIHan · 2022-06-08T22:58:11Z

@tarekgh I think we found the cause: this case of x % 2 uses the csneg operation, whereas before it had a more optimal form. This applies specifically when the divisor is the constant 2.

tarekgh · 2022-06-08T23:04:59Z

I'll assign the issue back to @TIHan then. Thanks for finding the root cause.

TIHan · 2022-06-09T19:13:28Z

Will resolve this regression in #68885

TIHan · 2022-06-16T23:25:23Z

@tarekgh Apologies for circling around. I spent some time and did a much deeper investigation into this. It turns out it is not caused by the x % 2 changes we've made. In fact, any mod changes wouldn't impact this benchmark because the expression, x % 2 == 0, does not benefit from the recent mod optimizations.
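
To illustrate why a parity check need not go through a general remainder at all, here is a minimal Python sketch. It assumes C#-style truncated remainder for signed integers (emulated by the hypothetical helper `csharp_rem`, since Python's `%` floors instead) and checks that `x % 2 == 0` agrees with a single-bit test. It only demonstrates the arithmetic equivalence; it says nothing about the code the JIT actually emits for this benchmark.

```python
def csharp_rem(x: int, d: int) -> int:
    """Emulate C#'s truncated remainder: the result takes the sign of x."""
    r = abs(x) % abs(d)
    return r if x >= 0 else -r

def is_even_via_rem(x: int) -> bool:
    # Parity test written the way the benchmark's expression is described: x % 2 == 0.
    return csharp_rem(x, 2) == 0

def is_even_via_bit(x: int) -> bool:
    # Python ints behave as two's complement under &, so this also holds for negatives.
    return (x & 1) == 0

# Compare both forms over a range that includes negative values.
mismatches = [x for x in range(-1000, 1001)
              if is_even_via_rem(x) != is_even_via_bit(x)]
print(len(mismatches))
```

Because only the comparison against zero is needed, the sign handling that makes a full signed remainder more expensive is irrelevant here.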

I'm assigning back to you.

tarekgh · 2022-06-17T01:47:57Z

I did some measurements locally on my machine to see the cost of the change in #67938 and got the following:

With #67938 change

[2022/06/16 18:19:17][INFO] |              Method |     idFormat |     Mean |   Error |  StdDev |   Median |      Min |      Max |  Gen 0 | Allocated |
[2022/06/16 18:19:17][INFO] |-------------------- |------------- |---------:|--------:|--------:|---------:|---------:|---------:|-------:|----------:|
[2022/06/16 18:19:17][INFO] | ActivityAllocations | Hierarchical | 157.8 ns | 6.46 ns | 7.44 ns | 156.6 ns | 147.6 ns | 175.6 ns | 0.0420 |     352 B |
[2022/06/16 18:19:17][INFO] | ActivityAllocations |          W3C | 229.9 ns | 8.18 ns | 9.42 ns | 234.5 ns | 214.1 ns | 239.6 ns | 0.0489 |     416 B |

Without #67938 change

[2022/06/16 18:22:44][INFO] |              Method |     idFormat |     Mean |   Error |   StdDev |   Median |      Min |      Max |  Gen 0 | Allocated |
[2022/06/16 18:22:44][INFO] |-------------------- |------------- |---------:|--------:|---------:|---------:|---------:|---------:|-------:|----------:|
[2022/06/16 18:22:44][INFO] | ActivityAllocations | Hierarchical | 146.0 ns | 4.41 ns |  4.90 ns | 147.7 ns | 136.5 ns | 154.4 ns | 0.0419 |     352 B |
[2022/06/16 18:22:44][INFO] | ActivityAllocations |          W3C | 224.3 ns | 9.19 ns | 10.58 ns | 221.9 ns | 211.3 ns | 242.0 ns | 0.0492 |     416 B |

The issue reports the Hierarchical scenario as regressed by 15%, while my measurements show around 8%. Likewise, the issue reports a 7% regression for the W3C scenario, while my measurements show around 2%.
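
The percentages can be recomputed from the numbers already in this thread. A minimal Python sketch, assuming the Mean column of the two local tables and the Baseline/Test columns of the autofiler report are the right values to compare (the helper name `percent_slower` is my own):

```python
def percent_slower(baseline_ns: float, compare_ns: float) -> float:
    """Percentage by which compare_ns exceeds baseline_ns."""
    return (compare_ns - baseline_ns) / baseline_ns * 100.0

# Local means from the two BenchmarkDotNet tables above (without vs. with #67938).
local = {
    "Hierarchical": percent_slower(146.0, 157.8),
    "W3C": percent_slower(224.3, 229.9),
}

# Baseline and Test columns from the autofiler report at the top of the issue.
reported = {
    "Hierarchical": percent_slower(623.66, 715.80),
    "W3C": percent_slower(758.56, 813.42),
}

for fmt in local:
    print(f"{fmt}: local {local[fmt]:.1f}%, reported {reported[fmt]:.1f}%")
```

The local W3C difference (5.6 ns) is smaller than the Error column reported for either run, so on these numbers alone it may be within measurement noise.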

Given that, my change could be contributing here, but I don't think it is the only culprit in this regression. Also, considering we need the features we have added here, it is acceptable to have this regression, especially since the test is not really testing the real scenario but only the part that initializes the empty Activity object. So this regression wouldn't be noticeable in real-world use of the Activity class.

I'll leave it to anyone else who wants to dig more to see if there is anything contributing to this, or if we are ok to resolve this issue.

tarekgh · 2022-06-17T01:52:50Z

One last thing I just noticed: this issue is marked with Ubuntu 18.04. I did the measurement on my machine, which runs Windows, but if my change is the cause, it should show up on all OSes and architectures, since the change is not OS- or architecture-dependent.

kunalspathak · 2022-06-17T03:09:15Z

> this issue marked with Ubuntu 18.04.

And this is on arm64; did you check on windows/arm64?

> Also, considering we need the features we have added here, it is acceptable to have this regression

Sounds good, and I am sure you verified that we didn't introduce any regression in the normal path.

> especially since the test is not really testing the real scenario but only the part that initializes the empty Activity object.

Yes, we have seen that in the past in other benchmarks and fixed it by adding a relevant test in dotnet/performance, e.g. dotnet/performance#2479. If you don't mind fixing the benchmark, that would be great.

tarekgh · 2022-06-17T19:19:03Z

> And this is on arm64; did you check on windows/arm64?

No, I didn't try arm. But I am wondering now, wouldn't this be more related to the JIT then? I cannot think of anything in our changes that could cause such a regression on a specific architecture.

> Yes, we have seen that in the past in other benchmarks and fixed it by adding a relevant test in dotnet/performance, e.g. dotnet/performance#2479. If you don't mind fixing the benchmark, that would be great.

I would keep the test as it is for now, as it looks sensitive to JIT changes, so I see value in keeping it. We can consider adding more tests later for real usage scenarios with Activity.

JulieLeeMSFT · 2022-07-01T04:01:54Z

@TIHan PTAL.

TIHan · 2022-07-05T19:39:33Z

@JulieLeeMSFT This regression is occurring in x64 as well, so it's not related to changes in the JIT for ARM64. The regression was also said to be acceptable:

> Also, considering we need the features we have added here, it is acceptable to have this regression

Closing. Anyone is free to re-open this if they deem it critical.

performanceautofiler bot added arm64 untriaged New issue has not been triaged by the area owner labels Apr 21, 2022

AndyAyersMS mentioned this issue Apr 22, 2022

System.Diagnostics.Activity: Implement Enumerate* API #67920

Merged

AndyAyersMS added Look Again and removed untriaged New issue has not been triaged by the area owner labels Apr 22, 2022

dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Apr 27, 2022

AndyAyersMS transferred this issue from dotnet/perf-autofiling-issues Apr 27, 2022

AndyAyersMS changed the title ~~[Perf] Changes at 4/14/2022 3:36:10 AM~~ [Perf] ARM64 regression in System.Diagnostics.Perf_Activity.ActivityAllocations Apr 27, 2022

AndyAyersMS added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 27, 2022

AndyAyersMS mentioned this issue Apr 27, 2022

ARM64 - Optimizing a % b operations part 2 #66407

Merged

2 tasks

TIHan mentioned this issue May 5, 2022

ARM64 - Always morph GT_MOD #68885

Merged

2 tasks

JulieLeeMSFT assigned TIHan May 5, 2022

JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label May 5, 2022

JulieLeeMSFT added this to the 7.0.0 milestone May 5, 2022

TIHan removed their assignment Jun 6, 2022

kunalspathak assigned tarekgh Jun 8, 2022

tarekgh assigned TIHan and unassigned tarekgh Jun 8, 2022

TIHan mentioned this issue Jun 10, 2022

ARM64 - Optimize i % 2 #70599

Merged

1 task

TIHan assigned tarekgh and unassigned TIHan Jun 16, 2022

tarekgh removed their assignment Jun 17, 2022

JulieLeeMSFT assigned TIHan Jul 1, 2022

TIHan closed this as completed Jul 5, 2022

ghost locked as resolved and limited conversation to collaborators Aug 5, 2022

jeffhandley added arch-arm64 runtime-coreclr specific to the CoreCLR runtime os-linux Linux OS (any supported distro) and removed arm64 labels Dec 28, 2022
