Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement 32-bit widestrings for Linux/WSL #1874

Closed
MarijnS95 opened this issue Jul 6, 2022 · 17 comments
Closed

Implement 32-bit widestrings for Linux/WSL #1874

MarijnS95 opened this issue Jul 6, 2022 · 17 comments
Labels
question Further information is requested

Comments

@MarijnS95
Copy link
Contributor

With recent work we've finally enabled windows-rs to run on Linux, many thanks for opening up 🥳!

However, for my specific use-case of wrapping DirectXShaderCompiler in Rust there's one more thing I need. Widestring types are prevalent in the DirectXShaderCompiler API, but don't currently work on Linux.

Wide chars and strings are typically 32-bit, with Windows being the exception rather than the rule at 16-bit. That is also how they're implemented in DXC when compiled for non-Windows, making them incompatible (read: segfaults) with the current UTF-16 "assumption" inside windows.

To support this format we can "simply" apply some cfg() changes that turn the anyway-custom BSTR / P*WSTR into *mut u32-carrying structures, with an API to match. In my experiment I have simply written a:

#[cfg(windows)]
type WCHAR = u16;
#[cfg(not(windows))]
type WCHAR = u32;

Problem 1: converting to and from str/String

In a previous (unaccepted) contribution I opted to use widestring, since it implements everything we need. That was (understandably) frowned upon as the windows crate should be the base for many other things, not the other way around (and not have many dependencies in any case).

The widestring implementation simply uses .chars() to acquire an iterator over char which is a "Unicode code point" but exactly fits UTF-32 chars too. It works and makes sense, but I'm no Unicode expert and don't know if this is the right way to implement it.

Problem 2: Allocations and length

Current implementations use HeapAlloc + GetProcessHeap + HeapFree to store owned strings. As per #1842 we can leave it up to the users on non-Windows machines to provide fallbacks for these. However, it's probably more efficient and friendly to provide direct fallbacks behind a cfg() in windows (i.e. inside heap_alloc and heap_free, or inside the string implementations even).

Regardless, this API is annoying/impossible to use correctly since most (all) Rust allocation APIs need to know the size of the allocation they're freeing. This includes a theoretical Box::<[WCHAR]> (where the slice has a dynamic size), and std::alloc::dealloc requires the same. Setting it to 0 seems to work (the underlying allocator used on my machine doesn't care about the size) but I don't think that's a fair assumption/workaround to apply.

Perhaps Param::Boxed can and should store this length? The string types can also have a .len() function to compute it at runtime based on a null-character (or BSTRs length-prefix).

@riverar
Copy link
Collaborator

riverar commented Jul 6, 2022

I think we should also create an issue at https://github.com/microsoft/DirectXShaderCompiler to understand the intended API shape on Linux targets. This could very well just be a bug on their part, resulting in possibly no work to be done here.

@kennykerr
Copy link
Collaborator

kennykerr commented Jul 6, 2022

And just to set expectations again - grumpy Windows guy here 😉 - while I have no interest in putting up roadblocks for developers, the windows-rs project is squarely about targeting Windows and specifically not about targeting non-Windows platforms. So I don't want to get in your way, but it's going to be a hard sell to add code specifically to support non-Windows targets, such as the alternative allocators you alluded to. On the other hand, I don't think anything prevents you from linking in your own implementations of functions like HeapAlloc.

So if this is something that you need to support the DirectX Shader Compiler, I suggest you reach out to that team to see how they want to support this further. Happy to work with them as needed.

@kennykerr kennykerr added the question Further information is requested label Jul 7, 2022
@MarijnS95
Copy link
Contributor Author

And just to set expectations again - grumpy Windows guy here 😉 - while I have no interest in putting up roadblocks for developers, the windows-rs project is squarely about targeting Windows and specifically not about targeting non-Windows platforms. So I don't want to get in your way, but it's going to be a hard sell to add code specifically to support non-Windows targets

* yawn *. Do we need to settle this grudge over a beer or something? It is getting old by now. Bring back com-rs (on the same amazing quality/support level as windows-rs, I can give you that) and I'll be out of your way.

As noted previously this project is still fresh and has trouble getting accustomed to outside contributions. As you also know I'm just a lone contributor, trying to leave the open-source space in a better position than I found it in. This means that I can tackle small things such as this miniscule addition to windows-rs. I can add tests for it, and commit to maintaining it long term (or risk it getting removed again if not) if that's what you need to accept this approach.

such as the alternative allocators you alluded to. On the other hand, I don't think anything prevents you from linking in your own implementations of functions like HeapAlloc.

Allocators are no must, but as explained HeapFree is hard to implement without having a size of the region to be freed. That's why it is perhaps easier to:

  1. Implement this directly in windows-rs;
  2. Piggyback enum Param to store the length.

Note again that this is just as simple as a Box::into_raw() + Box::from_raw(), nothing fancy/complex that carries the weight of "the alternative allocators".

So if this is something that you need to support the DirectX Shader Compiler, I suggest you reach out to that team to see how they want to support this further. Happy to work with them as needed.

I can reach out and create an issue, that's no problem. There's also microsoft/DirectXShaderCompiler#3210 requesting to get rid of widestrings entirely which would solve my issue, but it hasn't gained any traction in ±1.5 years.

But again, I'm just a lone contributor and it'd be very inefficient to serve as a bridge between Microsoft Rust and Microsoft DXC folks.

@kennykerr
Copy link
Collaborator

Do we need to settle this grudge over a beer or something? It is getting old by now.

I hope you're joking about a grudge. We've come a long way in a short amount of time with cross-compilation and cross-platform support thanks to everyone's help... for a crate that's all about Windows. 😁

HeapFree is hard to implement without having a size of the region to be freed

Typically, that's done by prepending the size in a header. You could even implement it using malloc and free from the CRT and your implementation would be a one-liner. I try to avoid platform dependencies and prefer portable library code as much as possible, but some Windows APIs require specific allocators, which is why HeapAlloc is used here.

@MarijnS95
Copy link
Contributor Author

MarijnS95 commented Jul 8, 2022

I'm not. As you say we were gaining traction in i.e. #1863 only to fall flat on our faces again here by denying the "Windows"-on-Linux case again. It's extremely tiring as I have no idea whether my contributions are useless (not to you / this project, but to me).

Granted, that's exactly how I typically contribute to open-source projects: make crates better despite not actively using them (anymore). But I'd be extremely disappointed to have contributed here to make this crate even better than it already is only to never be able to use it because three-or-so *mut u16s should be a *mut u32 on my platform (and some minimal conversions).

Typically, that's done by prepending the size in a header. You could even implement it using malloc and free from the CRT and your implementation would be a one-liner. I try to avoid platform dependencies and prefer portable library code as much as possible, but some Windows APIs require specific allocators, which is why HeapAlloc is used here.

Good suggestion, I hadn't thought of doing it that way despite implementing this for BSTR a long while ago in DXC! Regardless, the "allocator" I'd use here isn't any special dependency so it shouldn't be too big of a deal.

In other words, here's how I feel we can approach this:

  1. Discuss with DXC whether their choice for wchar_t is intentional (I presume yes). Since I haven't received any confirmation I'll ask again: who should file this issue?
  2. Presuming it is intentional, I propose a change that implements u32 WCHARs and minimal conversions behind #[cfg(not(windows))];
  3. Pending that implementation, we decide whether to have a minimal Box/Vec allocation implementation in windows-rs, or force it on the user through HeapAlloc/HeapFree?

@riverar
Copy link
Collaborator

riverar commented Jul 8, 2022

Adding a 32-bit wchar_t type to support a non-Windows implementation in windows-rs creates/continues down a slippery slope of increasing crate complexity to support non Windows targets. I don't believe this proposal will be accepted as-is.

Some suggestions to increase the chance of success:

  • Create an issue on the DXSC repository and provide a link/that context
    • Have a discussion with the DXSC team around your cross-platform needs
    • Identify the pain-point(s) of having non-interoperable types (e.g. wchar_t)
    • Demonstrate the actual problem with a sample (e.g. create a fault by feeding their APIs wtf16)
  • Provide a value proposition beyond I need this
  • Identify workarounds or blockers to creating workarounds

@kennykerr
Copy link
Collaborator

Unfortunately, I don't see this as a trivial accommodation. Cross-platform support tends to be a rabbit hole of problems. From experience, it's never as simple as merely conditionally defining WCHAR as either 16-bit or 32-bit. Feel free to keep the conversation going, but I'll close this issue for now as there's no actionable work.

@MarijnS95
Copy link
Contributor Author

This is not my project so you're free to choose the easy way out.

@riverar:

Create an issue on the DXSC repository and provide a link/that context

Saying that you are open to working with DXSC folks if they confirm utf-32 + wchar_t is indeed intended?

Provide a value proposition beyond I need this

That's how any (software) project exists, isn't it? Either someone, or some company, needs a (software) project, for personal/internal use or to sell?

Alas, this is literally explained in the first paragraph in the top post. What gives?

Identify workarounds or blockers to creating workarounds

Again, I redirect you to the top post. Can you be more specific about what isn't clear?

@riverar
Copy link
Collaborator

riverar commented Jul 13, 2022

[I'm an unpaid volunteer so please don't shoot the messenger.]

The windows crate is for developers targeting Windows. Targeting the DXSC implementation on Linux is out of scope for this crate. Being out of scope doesn't necessarily mean you're out of luck. But it does increase the difficulty greatly for your proposed changes to be accepted.

My initial suggestions were an attempt to help you build up a better argument for why Microsoft should spend time/money on this and take on the change risk, support burden, complexity, etc. The current argument is not very compelling. I think it's missing key data, such as but not limited to:

  • Who uses DXSC on Linux? Who wants to use DXSC on Linux? Is there measurable demand?
  • Is this even a real issue? Is there a test case that demonstrates the failure?
  • Is DXSC team aware of this string type discrepancy? Is it per design or a bug?
  • Is DXSC for Linux supported by Microsoft?
  • Why does windows-rs have to change? Have you explored non-windows-rs workarounds? Perhaps a thin wrapper crate that ingests the windows crate and provides string corrections?
  • Are there other APIs (e.g. dxcore) that need the alternate wchar representation? Are there any APIs that are expecting UTF-16 on Linux and will now break?

My suggestion for your next step is to open a DXSC issue that points out this cross-platform pain point, with a C++/Rust reproducible example. It's very possible the DXSC team will want to smooth out this developer experience and fix the APIs.

I hope you stick to your guns and keep working on this! Happy to help fight for you where I can.

@MarijnS95
Copy link
Contributor Author

Randomly stumbled upon microsoft/DirectXShaderCompiler#4242, indicating that 32-bit wchar support on Linux is entirely intended. As for the rest of the points brought up by @riverar, I answered and repeated most of them over the past few years, and the rest is only sidetracking the original topic.

As pointed out above I don't want to be the bridge between Microsoft's DXC and Rust team beyond sparking an issue over at DXSC, but haven't had any meaningful response from @kennykerr detailing how to proceed.

@MarijnS95
Copy link
Contributor Author

@riverar @kennykerr: created an issue over at DXC to discuss this, including possible solutions: microsoft/DirectXShaderCompiler#4742

I very much appreciate it if both of you could actively participate in the discussion, so that we can find a resolution that suits everyone in the end 🤩 👍

@jenatali
Copy link
Member

jenatali commented Nov 3, 2022

I've responded to the DXC issue, but figured I'd at least comment here to ping the relevant folks: microsoft/DirectXShaderCompiler#4742 (comment). Chiming in as a second Linux-ported COM component, we explicitly chose to use the platform native wide character (wchar_t) due to the ergonomics of it. While these days char16_t does exist, support for it is practically nonexistent outside of converting it to wide or narrow strings. Furthermore, in C++ there's no way to define a string literal in a cross-platform manner without resorting to per-platform macros, such that on Windows it expands to L"" and on Linux would expand to u16"". You can't implicitly convert a char16_t* to a wchar_t* on Windows.

So, to address these:

  • Who uses DXSC on Linux? Who wants to use DXSC on Linux? Is there measurable demand?
    • There's not a heavy demand, but yes it is measurable. Folks want to run shader compilation on build farms or in the cloud.
  • Is this even a real issue? Is there a test case that demonstrates the failure?
    • I don't have a test off-hand, but yes, passing a 16-bit char string to function which expects 32-bit char strings does not work.
  • Is DXSC team aware of this string type discrepancy? Is it per design or a bug?
    • Yes. I think for DXC's case it was unintentional, but for D3D's it was very much by design, and a design which happens to line up with DXC, so therefore I can't call it a bug.
  • Is DXSC for Linux supported by Microsoft?
    • Yes.
  • Why does windows-rs have to change? Have you explored non-windows-rs workarounds? Perhaps a thin wrapper crate that ingests the windows crate and provides string corrections?
    • (Not for me to answer)
  • Are there other APIs (e.g. dxcore) that need the alternate wchar representation? Are there any APIs that are expecting UTF-16 on Linux and will now break?
    • Yes, D3D12 expects native wide characters. I'm not aware of any APIs that are in the opposite situation - though that doesn't mean that there are none.

@MarijnS95
Copy link
Contributor Author

@jenatali Thank you for your insights, and sharing the fact that there are at least two Windows APIs that intentionally use 32-bit wide chars on Linux!

Allow me to expand on your points a bit:

So, to address these:

  • Who uses DXSC on Linux? Who wants to use DXSC on Linux? Is there measurable demand?

    • There's not a heavy demand, but yes it is measurable. Folks want to run shader compilation on build farms or in the cloud.

Not only that, DXC is used to compile HLSL to SPIR-V for use with Vulkan on Linux.

  • Is this even a real issue? Is there a test case that demonstrates the failure?

    • I don't have a test off-hand, but yes, passing a 16-bit char string to function which expects 32-bit char strings does not work.

Exactly that. I don't understand why this question even has to be asked after I've submitted multiple issues and PRs to resolve the issue. Is my explanation, or code to solve it, somehow a fairy tail or magic smoke, and do people insist on seeing a branch that fails until a patch to windows-rs is applied that brings in the right string type?

If so, here is an older minimal example I have been working on: https://github.com/MarijnS95/windows-rs/commits/dxc2

On Linux, drop the last patch that gets in early WCHAR support (only for PCWSTR which is used in that DXC example, and disallows any other to-be-converted code via todo!()), and be greeted with:

Compilation failed with HRESULT(0x80070057): `unable to parse shader model.`

Because the "cs_6_5" Rust string literal gets converted to a 16-bit wide char array by windows-rs, which DXC fails to understand when interpreting it as a 32-bit wide string. With the patch a compiled binary is successfully emitted and printed to the terminal.

Note that this branch also demonstrates #1842 by shimming necessary functions that are not available outside of Windows. Works a treat 🥳!

  • Is DXSC team aware of this string type discrepancy? Is it per design or a bug?

    • Yes. I think for DXC's case it was unintentional, but for D3D's it was very much by design, and a design which happens to line up with DXC, so therefore I can't call it a bug.

According to the main DXC developer that brought Linux support, this is very much intended: microsoft/DirectXShaderCompiler#4742 (comment)

  • Is DXSC for Linux supported by Microsoft?

    • Yes.

Yes indeed, they provide Linux testing and artifacts on every commit.

  • Why does windows-rs have to change? Have you explored non-windows-rs workarounds? Perhaps a thin wrapper crate that ingests the windows crate and provides string corrections?

    • (Not for me to answer)

None of the thin wrapper crates are actively maintained. They don't build off of win32metadata. They don't provide the safe, high-level helpers windows-rs provides. This is effectively a compliment for windows-rs, as long as I/we can actually put it to use for a DXSC wrapper 😉

Unless someone suggests to fork all of windows-rs just to make the necessary modifications, and keep that up to date somehow (please, no).

  • Are there other APIs (e.g. dxcore) that need the alternate wchar representation? Are there any APIs that are expecting UTF-16 on Linux and will now break?

    • Yes, D3D12 expects native wide characters. I'm not aware of any APIs that are in the opposite situation - though that doesn't mean that there are none.

Yes, thanks for backing me up on this. I wouldn't know of any other API either.

@riverar
Copy link
Collaborator

riverar commented Nov 4, 2022

[...] here is an older minimal example I have been working on: https://github.com/MarijnS95/windows-rs/commits/dxc2

On Linux, drop the last patch that gets in early WCHAR support (only for PCWSTR which is used in that DXC example, and disallows any other to-be-converted code via todo!()), and be greeted with:

Compilation failed with HRESULT(0x80070057): `unable to parse shader model.`

I cloned the example you provided and rolled back 1 commit (to 1e01916) per instructions, and am getting (on WSLv2):

...
thread 'main' panicked at 'attempt to subtract with overflow', crates/samples/dxc/src/main.rs:106:86
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

@MarijnS95
Copy link
Contributor Author

MarijnS95 commented Nov 4, 2022

@riverar That may be a mistake on my side: DXC's error-return API isn't known to be exactly clear, and I presume the UTF-8 error string isn't containing a friendly error message for you (this codepath is only called when fn Compile() returns an error). Its size is 0, and when I "strip" the trailing 0 character off of it with blob.GetBufferSize() - 1, it underflows.

@MarijnS95
Copy link
Contributor Author

MarijnS95 commented Nov 4, 2022

@riverar I force-pushed a fix (more like workaround, though) to MarijnS95@4c130136 - thanks in advance for trying!

MarijnS95@4c130136#diff-b9b4f35dbc7dd25c832aa5a60cecb0c7d754caa74704f911d55a140940aac4a5R100-R102

@MarijnS95
Copy link
Contributor Author

MarijnS95 commented Nov 6, 2022

For completeness the branch pointed out above has just been rebased on latest master to showcase that this is still an issue, and that the solution is still to change all the types and conversion functions. For simplicity I've disabled most of them, and the existing w! macro works just fine when plainly copying over the 32-bit codepoints.

EDIT: FWIW it was also brought up in microsoft/DirectXShaderCompiler#4742 (comment) that windows-rs could have multiple "wide" string types, and encode in metadata what "kind" it should use for each API when used outside of target_os="windows".

In addition windows-rs now uses the platform-native wcslen() function which uses wchar_t = 32-bit on Linux!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants