Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support of JITDUMP_USE_ARCH_TIMESTAMP to perfjitdump. #111359

Merged
merged 1 commit into from
Jan 17, 2025

Conversation

Kuinox
Copy link
Contributor

@Kuinox Kuinox commented Jan 13, 2025

Fixes #110809.

They way I read the env variable is probably wrong but I have no idea how to do it.
I did find the files where lots of dotnet env variable were declared, but in this case the variable name should not follow the dotnet convention since it's a standardized env variable to configure this setting.

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jan 13, 2025
@tommcdon
Copy link
Member

@mdh1418 @noahfalk @brianrob

@brianrob
Copy link
Member

Thanks for adding me @tommcdon.

The idea of this change seems very reasonable to me. I like the idea that it is an opt-in change as well.

In terms of configuration - Do we want to control this via a separate environment variable, or do we want configuration to occur via something like the PerfMapType which would allow for this to be configured via the IPC channel? I get worried when we use new environment variables for this because this will continue to put is in a position where an environment variable is required to enable profiling, and this is something that we should be aiming away from. What do you think @mdh1418, @noahfalk?

@Kuinox
Copy link
Contributor Author

Kuinox commented Jan 13, 2025

I think this environment variable is automatically set by perf when profiling with intel pt:
https://github.com/torvalds/linux/blob/master/tools/perf/arch/x86/util/intel-pt.c#L1189

Without that, perf inject refuse to work when using intel pt, because it expect the jit dump to use arch timestamp.

@brianrob
Copy link
Member

Thanks @Kuinox. Am I understanding correctly that the scenario here is that perf is the process that is starts the process to be profiled? From my read of the code, this would have to be the case if perf is responsible for setting the environment variable. Does that match your expectation?

@Kuinox
Copy link
Contributor Author

Kuinox commented Jan 13, 2025

the scenario here is that perf is the process that is starts the process to be profiled?

Yes.

From my read of the code, this would have to be the case if perf is responsible for setting the environment variable.

I tested just to be sure:

~/testarchenvvar$ corerun TestArchEnvVar.dll
JITDUMP_USE_ARCH_TIMESTAMP is not set
~/testarchenvvar$ perf record --event=intel_pt//u -- corerun TestArchEnvVar.dll
JITDUMP_USE_ARCH_TIMESTAMP is set to 1
[ perf record: Woken up 15123 times to write data ]
[ perf record: Captured and wrote 441.230 MB perf.data ]

@brianrob
Copy link
Member

Gotcha - in that case, I think it's fine to have the environment variable control this. At the same time, we probably want to internally configure this using PerfMapType so that the behavior also works for machine-wide collection, especially if it's controlled by a non-perf tool profiler that uses the IPC channel to enable/disable jitdump.

@Kuinox
Copy link
Contributor Author

Kuinox commented Jan 13, 2025

If I understand correctly, you want that this config be set via PerfMapType, where should I read the env variable & set PerfMapType then?

@brianrob
Copy link
Member

If I understand correctly, you want that this config be set via PerfMapType, where should I read the env variable & set PerfMapType then?

I think that is the right thing, but I want to wait for @mdh1418 and/or @noahfalk to weigh in.

@mdh1418
Copy link
Member

mdh1418 commented Jan 14, 2025

Are intel_pt and __rdtsc() only available for HOST_AMD64?

I agree that it would be good to extend recording timestamps as TSC to any interested profiler, rather than requiring all profilers to know to set the specific JITDUMP_FLAGS_ARCH_TIMESTAMP env variable.

@Kuinox
Copy link
Contributor Author

Kuinox commented Jan 14, 2025

Are intel_pt and __rdtsc() only available for HOST_AMD64?

intel pt will only work on Intel CPUs, not AMD ones.
The magic trace(the tool i'm trying to make work with .NET, or something similar) wiki indicates:

magic-trace relies on Intel Processor Trace, which only really starts to be functional on relatively recent Intel CPUs. That means it doesn't support ARM, and it doesn't support AMD. There are timer resolution limitations on Broadwell, which was the first generation to support Intel PT.

rather than requiring all profilers to know to set the specific JITDUMP_FLAGS_ARCH_TIMESTAMP env variable.

Here for my changes it's specifically for the perf jit dump, is it used by other profilers ?

if (!initialized)
{
#if defined(HOST_AMD64)
const char* archTimestamp = getenv("JITDUMP_USE_ARCH_TIMESTAMP");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using getenv() directly you can add this to the other config settings the runtime supports:

  1. Declare a config value in configvalues.h. Here is an example that uses the 'DontPrependPrefix' flag to declare an env var without the DOTNET_ prefix:
    https://github.com/dotnet/runtime/blob/main/src/coreclr/inc/clrconfigvalues.h#L426

  2. Use CLRConfig::GetConfigValue(CLRConfig::) to retrieve the value. Example:

    fProfEnabled = CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_CORECLR_ENABLE_PROFILING);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would have to be CLRConfigNoCache::Get("JITDUMP_USE_ARCH_TIMESTAMP", /*noprefix*/ true, &getenv) to avoid layering issues with PAL. When this method is called with noprefix=true, it just calls getenv without anything extra.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jkotas! - I missed the pal layering aspect in there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean i dont have to update the configvalue.h and simply change the getenv method with what @jkotas said.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think calling getenv directly looks better.

@jkotas
Copy link
Member

jkotas commented Jan 14, 2025

Here for my changes it's specifically for the perf jit dump, is it used by other profilers ?

+1

JITDUMP_USE_ARCH_TIMESTAMP env variable is the ecosystem wide mechanism for this. This env variable is recognized by number of tools and runtimes that care about this. For example, Java: https://github.com/torvalds/linux/blob/master/tools/perf/jvmti/jvmti_agent.c#L238
hhvm: https://github.com/facebook/hhvm/blob/03d0ff9eee2b03b12e5a45aa5cf0b233b753cff4/hphp/runtime/vm/debug/perf-jitdump.cpp#L129 . There is no other ecosystem wide mechanism for this. We can certainly add a dotnet specific way to configure this in addition to the standard one, but it feels like overengineering.

@noahfalk
Copy link
Member

noahfalk commented Jan 14, 2025

I think that is the right thing, but I want to wait for @mdh1418 and/or @noahfalk to weigh in.

As I understand it, this new bit is orthogonal to the settings currently on PerfMapType. @brianrob were you suggesting that this new bit would show up on the PerfMapEnabled env var so now instead of 0-3 we'd have 0-7?

I have been thinking about it a little differently so far:

  1. For enablement through env var I would leave PerfMapEnabled as-is and allow this separate JITDUMP_USE_ARCH_TIMESTAMP env var to control the new bit as implemented in the current PR.
  2. For conveying the configuration internally we could define a new type such as:
    struct PerfMapConfig
    {
    BOOL enablePerfMap
    BOOL enableJitDump
    BOOL useJitDumpArchTimestamp
    }
    I don't care too much about the internal representation but since this isn't perf-critical I'd lean towards being a bit verbose and explicit for clarity.
  3. For configuration not via env variables we could redefine the old IPC event with an additional bit and rely on older runtime returning an error or define a new IPC event. Redefining it is probably less work but a little hackier.

For now I'd be content if we only implemented (1) because it sounds like the scenario which needs the new timestamp is relatively coupled with launching a single process via perf. I'd delay implementing (2) and (3) until it was clearer what is the E2E scenario is that would use it.

@brianrob
Copy link
Member

As I understand it, this new bit is orthogonal to the settings currently on PerfMapType. @brianrob were you suggesting that this new bit would show up on the PerfMapEnabled env var so now instead of 0-3 we'd have 0-7?

Agreed that this is an orthogonal configuration bit. I was just thinking about adding one additional value (0-4) to add a new type "jitdump with rdtsc". The idea being that if other profilers end up needing to use this capability, they may not want to force end-to-end process capture invoked by the profiler itself, and may want to support attach, including using IPC.

I'm ok with an incremental approach - I just want to make sure that we have the option to expose this via IPC in the future should this be necessary.

@noahfalk
Copy link
Member

I'm ok with an incremental approach - I just want to make sure that we have the option to expose this via IPC in the future should this be necessary.

Yep, I don't see anything here that precludes us supporting it for IPC in the future if the need arises.

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@noahfalk noahfalk merged commit a2ef22e into dotnet:main Jan 17, 2025
87 of 91 checks passed
@noahfalk
Copy link
Member

Thanks @Kuinox!

grendello added a commit to grendello/runtime that referenced this pull request Jan 20, 2025
* main: (89 commits)
  Add Dispose for X509Chain instance (dotnet#110740)
  Fix XML comment on regex split enumerator (dotnet#111572)
  JIT: tolerate missing InitClass map in SPMI (dotnet#111555)
  Build ilasm/ildasm packages for the host machine (dotnet#111512)
  Unicode 16.0 Support (dotnet#111469)
  Improve performance of interface method resolution in ILC (dotnet#103066)
  Fix building the host-targeting components and packing ILC (dotnet#111552)
  Improve JSON validation perf (dotnet#111332)
  Update github-merge-flow.jsonc to autoflow 9.0 to 9.0-staging (dotnet#111549)
  Include GPL-3 licence text in the notice (dotnet#111528)
  Remove explicit __compact_unwind entries from x64 assembler (dotnet#111530)
  Add MemoryExtensions overloads with comparer (dotnet#110197)
  Avoid capturing the ExecutionContext for the whole HTTP connection lifetime (dotnet#111475)
  Forward DefaultArtifactVisibility down from the VMR orchestrator (dotnet#111513)
  Fix relocs errors on riscv64 (dotnet#111317)
  Added JITDUMP_USE_ARCH_TIMESTAMP support. (dotnet#111359)
  add rcl/rcr tp and latency info (dotnet#111442)
  Fix stack overflow in compiler-generated state (dotnet#109207)
  Produce a package with the host-running ILC for repos in the VMR (dotnet#111443)
  Delete dead code in ilasm PE writer (dotnet#111218)
  ...
@github-actions github-actions bot locked and limited conversation to collaborators Feb 16, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-Diagnostics-coreclr community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add an option so perf jit dump can optionally use TSC for timestamps.
6 participants