Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for specifying DOTNET_DiagnosticPortDefaultPrefix #110473

Closed
xoofx opened this issue Dec 6, 2024 · 7 comments
Closed

Add support for specifying DOTNET_DiagnosticPortDefaultPrefix #110473

xoofx opened this issue Dec 6, 2024 · 7 comments
Labels
area-Diagnostics-coreclr feature-request in-pr There is an active PR which will close this issue when it is merged
Milestone

Comments

@xoofx
Copy link
Member

xoofx commented Dec 6, 2024

Hello, 🙂

I have a case where I'm compiling .NET 8 NativeAOT dyld with EventPipe support that I can load in an existing .NET process. I would like to connect to the diagnostic port of this specific dyld by being able to configure my own prefix for the socket configuration. Currently, the prefix is hardcoded with dotnet-diagnostic at (for MacOS/Linux):

static
inline
bool
ipc_transport_get_default_name (
ep_char8_t *name,
int32_t name_len)
{
#ifdef DS_IPC_PAL_AF_UNIX
#ifndef EP_NO_RT_DEPENDENCY
return ds_rt_transport_get_default_name (
name,
name_len,
"dotnet-diagnostic",
ep_rt_current_process_get_id (),
NULL,
"socket");
#elif defined (EP_NO_RT_DEPENDENCY) && defined (FEATURE_CORECLR)
// generate the default socket name in TMP Path
const ProcessDescriptor pd = ProcessDescriptor::FromCurrentProcess();
PAL_GetTransportName(
name_len,
name,
"dotnet-diagnostic",
pd.m_Pid,
pd.m_ApplicationGroupId,
"socket");
return true;
#else
return false;
#endif
#else
return false;
#endif
}

I'm proposing to add support for an environment variable DOTNET_DiagnosticPortDefaultPrefix that would allow to configure this prefix name.

If not set, it would default to dotnet-diagnostic.

In general, this prefix could be used to disambiguate more easily these diagnostic ports from the pipe/tmp folders when there are several applications/services running on the same machine.

If accepted, I can make a PR.

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Dec 6, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Dec 6, 2024
@xoofx xoofx added area-Diagnostics-coreclr and removed untriaged New issue has not been triaged by the area owner needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Dec 6, 2024
Copy link
Contributor

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

@xoofx
Copy link
Member Author

xoofx commented Dec 6, 2024

I would like to connect to the diagnostic port of this specific dyld by being able to configure my own prefix for the socket configuration.

Hm, I thought I had tested this but testing it right now, I don't see my NativeAOT dyld creating a dotnet-diagnostic port file in TMPDIR. 🤔

@jkotas do you happen to know if EventPipe in NativeAOT .NET 8 works for shared libraries? A quick glance at the code tells me that it should, but I might be missing something...

@xoofx
Copy link
Member Author

xoofx commented Dec 6, 2024

Oh, ok, I missed the warning on the command line and it refers to #91762 and #88087 so going to follow the other issues to see what is working / not working.

@xoofx xoofx closed this as completed Dec 6, 2024
@xoofx xoofx reopened this Dec 7, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Dec 7, 2024
@xoofx
Copy link
Member Author

xoofx commented Dec 7, 2024

After checking in #88087 (comment), it seems to work. So I'm quite interested in creating a PR for supporting DOTNET_DiagnosticPortDefaultPrefix.

@xoofx
Copy link
Member Author

xoofx commented Dec 7, 2024

Prototype PR #110503

@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Dec 7, 2024
@tommcdon tommcdon added this to the 10.0.0 milestone Dec 7, 2024
@hoyosjs
Copy link
Member

hoyosjs commented Dec 11, 2024

There's a lot of cases that I feel are ill defined for these kind of cases:

  • Diag port is great for startup scenarios, but has many issues for processes that spawn processes.
    • inheritance of env vars means every process will try to use the same port.
    • Even with a scheme to disambiguate, they inherit the default pause behavior.
  • The name of the default port is fixed. There's already a few tools that assume this. changing this is painful
  • This case - where you need ps information to disambiguate the process.
  • The case for multiple runtimes, where you point out a few issues that we've see.

Generally - there's lots of UX to define here... I do think it's worth looking at these and I've wanted to for a while. However, probably makes sense to look at the scenarios here and try to think how well these fit with all these points. That being said, this sounds pretty usable as is for the cases it can try to solve.

@xoofx
Copy link
Member Author

xoofx commented Dec 12, 2024

Generally - there's lots of UX to define here...

Completely agree, it's far from ideal and the UX is definitely not great.

  • Diag port is great for startup scenarios, but has many issues for processes that spawn processes.
    • inheritance of env vars means every process will try to use the same port.
    • Even with a scheme to disambiguate, they inherit the default pause behavior.

The behavior above doesn't change the default non pause behavior (the pause only happens if DiagnosticsPort is set, but that's not the case here).
The inheritance is annoying, but it could be workaround: for instance, the .NET runtime could decide to unset the variable as soon as it is picked up. Though, it's definitely hacky.

But another problem with env variables is the global state within a process that makes it unsuitable if it needs to load several NativeAOT DLLs into the same process (and it could happen concurrently). It would require a different way to disambiguate (e.g. hashing the full path of the exe/DLL/dylib/so where the .NET runtime is).

This case - where you need ps information to disambiguate the process.
The case for multiple runtimes, where you point out a few issues that we've see.

Yes, also I discovered that the limit of the path of the Unix Socket Domain (108 chars) becomes quickly a problem, so disambiguating the path with a user defined name would be problematic.

But even if we can avoid collisions, a user would still want to disambiguate logically these different EventPipe instances.

So considering that the problem space doesn't have a simple solution for now and that proposal is not a good fit, I'm closing it and will think about it more. If I have time, will come back with another proposal.

For now, I'm able to workaround the problem by changing the TMPDIR variable, as I'm loading the dylib very early before the other .NET runtime is loaded, so it's not blocking.

@xoofx xoofx closed this as completed Dec 12, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jan 11, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-Diagnostics-coreclr feature-request in-pr There is an active PR which will close this issue when it is merged
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants