wsl.exe outputting unicode to stdout #4607
Comments
@therealkenc I agree with you, a BOM wouldn't change anything. I mentioned it just to be more precise as to the current output of the command. |
@0xbadfca11 You might have misunderstood the meaning of my bug report. I will try to say it again in a different way: this is not about removing Unicode support and not about supporting Asian languages. What this is about is interaction with other command-line tools. As The problem is also present when piping the output to grep.exe, but I did not specify it because some observers might argue that it is a third-party tool that might not support Unicode properly. Whereas findstr is a built-in command that we all know works well. |
findstr doesn't support multi-byte Unicode codepoints, which is what WSL outputs. Agree, it's strange that WSL is not respecting the terminal's codepage, breaking these common scenarios. |
I think part of the issue is that wsl's entire purpose is to provide interoperability with Linux applications, and pretty much everything in the *nix sphere these days expects UTF8. WSL could easily find its output being piped either into a Windows program expecting the system codepage or a Linux program expecting UTF8, and outputting probably breaks the fewest things on either side. |
I'm running into what I believe is a similar issue. I'm creating a PowerShell script that reads the output of the WSL command. I want to manipulate the resulting text using the -replace operator, but the -replace operator isn't working properly with the output of the WSL command. The example below compares the output of the first line of the WSL --list command ($a[0]) with the output of a standard string using the same characters. You can see how the replace operator is unable to replace the string "Windows" with "Linux" with the WSL command output. However, the replace operator is able to replace "Windows" with "Linux" within the standard string. Furthermore, you can see that the WSL string length is twice as long as the standard string length. I show evidence of this in the last two lines: |
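The doubled length and the failed match described above are consistent with UTF-16LE bytes being read as if they were a single-byte encoding. A minimal Python sketch of that effect, assuming the sample text below (illustrative, not captured wsl.exe output) and using Latin-1 as a stand-in for whichever single-byte codepage the consumer applies:

```python
# Illustrative text only; this is not captured wsl.exe output.
text = "Windows Subsystem for Linux Distributions:"

# This thread reports that wsl.exe emits UTF-16LE without a BOM.
raw = text.encode("utf-16-le")

# Assumption: the consumer decodes the bytes with a single-byte codepage.
# Latin-1 is a stand-in because it maps every byte to exactly one character.
misread = raw.decode("latin-1")

# The misread string is twice as long as the original.
print(len(text), len(misread))

# A NUL character sits between every letter, so a plain match fails.
print("Windows" in misread)

# Decoding with the correct encoding makes replacement work again.
fixed = raw.decode("utf-16-le")
print(fixed.replace("Windows", "Linux"))
```

Stripping the NUL characters also recovers ASCII-only text, but it would corrupt any character outside the ASCII range, so decoding as UTF-16LE is the safer repair.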
I had a similar requirement to turn the output on I consider this a bug, or at least an unexpected output format, but for now changing the console encoding seems to work as a workaround:
$console = ([console]::OutputEncoding)
[console]::OutputEncoding = New-Object System.Text.UnicodeEncoding
$distroArray = (wsl -l -v | Select-String -SimpleMatch 'Ubuntu-20.04') -split '\s+'
[console]::OutputEncoding = $console |
Hey Phil,
Good stuff. This worked and I learned something new! Thanks a bunch for
the quick response :)
|
This issue has just wasted a couple of hours of my time as well. I'm trying to process the list of distros using the standard CMD FOR /F command. A simple repro for this is: Piping the output of 'wsl -l' to the standard 'more' CMD also fails eg: |
I have the same problem.. |
@francogp check out falloutphil's answer. After changing the encoding to Unicode, things shall fall into place. |
Also, if anyone needs to parse the output of
Would love to see this fixed. It really makes parsing/automating very difficult with |
In my application, I don't have a console so I can't use the helpful workaround mentioned above. How can I decode the output properly for all cases? That is, suppose I have a function to run a wsl command given as a string by the caller. When it is invoked with one command, e.g. One solution would be for the calling context to signal which decoding to use, since it "knows" what kind of command it's running. Ok, but the error case is different. Sometimes, wsl will return errors in one encoding and sometimes in the other, depending on where that error originated. And, in this case, there is no context available to help determine the encoding. For example, the other day an odd looking error popped up in my log -
With some editing, I found that it reads |
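For callers without a console, one possible approach is to guess the encoding from the bytes themselves. The sketch below is a heuristic under stated assumptions, not documented wsl.exe behaviour: it assumes the output is either UTF-16LE or UTF-8, and that the text is mostly ASCII so UTF-16LE shows a NUL in nearly every odd byte position. The function name `decode_wsl_output` and the 0.5 threshold are hypothetical choices for illustration.

```python
def decode_wsl_output(raw: bytes) -> str:
    """Guess whether raw is UTF-16LE or UTF-8 and decode it accordingly.

    Heuristic (an assumption): mostly-ASCII text encoded as UTF-16LE has a
    NUL byte in nearly every odd position, while UTF-8 text has almost none.
    """
    if not raw:
        return ""
    # An explicit UTF-16LE byte order mark settles the question.
    if raw.startswith(b"\xff\xfe"):
        return raw[2:].decode("utf-16-le", errors="replace")
    # Look at the high byte of each would-be UTF-16LE code unit.
    odd = raw[1::2]
    if len(raw) % 2 == 0 and odd and odd.count(0) / len(odd) > 0.5:
        return raw.decode("utf-16-le", errors="replace")
    return raw.decode("utf-8", errors="replace")


# Usage with synthetic byte strings in each encoding.
print(decode_wsl_output("Ubuntu-20.04\r\n".encode("utf-16-le")))
print(decode_wsl_output("Ubuntu-20.04\n".encode("utf-8")))
```

This guess can fail for UTF-16LE text that is mostly non-ASCII, and it does not handle a single stream that mixes both encodings, which is the error case described above.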
I experience the same problem. After wasting a couple of hours on it, I found @falloutphil's suggestion works. Putting the following function into my
function wsl {
begin { $pipe_in = "" }
process { if ($pipe_in -ne "") { $pipe_in += "`n" } $pipe_in += "$_" }
end {
$console = ([console]::OutputEncoding)
[console]::OutputEncoding = New-Object System.Text.UnicodeEncoding
$wsl_cmd = Get-Command -CommandType Application wsl | Select-Object -First 1 | Select-Object -ExpandProperty Source
if ( $pipe_in -ne "") {
Invoke-Expression "`"$pipe_in`" | $wsl_cmd $Args"
} else {
Invoke-Expression "$wsl_cmd $Args"
}
[console]::OutputEncoding = $console
}
} |
Thank you! Based on your solution and the recipe from Make a Bash alias that takes a parameter? so now |
Using nothing more than a
I cannot use Bash or PowerShell for this, and |
Wow, I am amazed that after nearly 3 years, this seemingly simple bug is still present. Any chance you can open-source the command-line tools (wsl.exe), so we can submit PR to fix it ourselves? |
It's ugly, but using a RegEx
Curious why PowerShell isn't an option for this since it's available pretty much anywhere WSL runs. I'm sure you have your reason; just curious ;-). |
This now appears to be fixed based on an opt-in environment variable in the latest Preview release 0.64.0. Simply adding the environment variable
PowerShell:
$env:WSL_UTF8=1
wsl --list | findstr Ubuntu
Note that if you are running
export WSL_UTF8=1
WSLENV="$WSLENV":WSL_UTF8
wsl.exe -l -v | grep -i Ubuntu |
As @NotTheDr01ds said the 0.64.0 release introduces the Unfortunately, we cannot make it the default behavior since other programs depend on it, but users can opt-in to the new behavior by setting WSL_UTF8=1. |
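The same opt-in can be applied from a program that launches wsl.exe without a console. This is a minimal Python sketch, assuming a Windows host with a WSL release that honours `WSL_UTF8` (0.64.0 or later according to this thread). The helper names `wsl_utf8_env` and `list_distros` are hypothetical and only serve the illustration.

```python
import os
import subprocess


def wsl_utf8_env(base=None):
    """Return a copy of the environment with WSL_UTF8=1 set."""
    env = dict(os.environ if base is None else base)
    env["WSL_UTF8"] = "1"
    return env


def list_distros():
    """Run `wsl.exe --list` with WSL_UTF8=1 and return its output lines.

    Assumption: wsl.exe is on PATH and honours WSL_UTF8, so stdout is UTF-8.
    """
    result = subprocess.run(
        ["wsl.exe", "--list"],
        env=wsl_utf8_env(),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace").splitlines()
```

Setting the variable only in the child's environment keeps the opt-in local to this call, so other programs on the machine that still depend on the default output are unaffected.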
Windows is and has always been so fucked up in the age of the internet. Some very bad decisions back there at Redmond in the early '90s. |
Windows is and has always been so fucked up in the age of the internet. Some very bad decisions back there at Redmond in the early '90s.
Do bear in mind that UTF-8 was first publicly proposed less than 6 months before the release of NT 3.1, and the first Unicode Standard to include it was Unicode 2.0, published in 1996. By the time UTF-8 was available, all the design decisions for NT were in the past and MS was committed, whether they wanted to be or not, to UCS-2 (and later to UTF-16, which is back-compatible with UCS-2).
|
What the fuck: to get a bat file to export all my wsl instances I needed to run wsl to parse wsl crappy output: |
I can understand why changing the default encoding to always be UTF-8 might break existing programs, but why doesn't |
Your Windows build number: Microsoft Windows [Version 10.0.18362.418]
What you're doing and what's happening:
and when trying to filter using "findstr" :
Nothing.
What's wrong / what should be happening instead:
The output of wsl.exe --list seems to be UTF-16 without a BOM, so the output does not respect the system's configured codepage, and findstr therefore cannot understand the input.
The output of wsl.exe --list should be in the system codepage.