case sensitivity for NTFS handled incorrectly in bash.exe #2081

Closed
raymod2 opened this issue May 9, 2017 · 44 comments
Comments

@raymod2

raymod2 commented May 9, 2017

NTFS accepts both uppercase and lowercase characters (and preserves them) for file system entries. However, this is only a cosmetic feature. You cannot have two file system entries in the same directory that differ only in case (i.e., 'Foo' and 'foo'). Therefore, when specifying an existing filename or directory, the case is ignored. For example, the following works fine in cmd.exe:

mkdir Foo
cd foo

In bash.exe, however, the following fails:

mkdir /mnt/c/Foo
cd /mnt/c/foo

Also TAB completion fails if you don't use the correct case for each character in a partial filename. This is expected behavior when working with native Linux filesystems. However, it is not expected behavior when using an NTFS mount. This significantly impacts usability for developers working at the command line.

@therealkenc
Collaborator

However, this is only a cosmetic feature. You cannot have two file system entries in the same directory that differ only in case (i.e., 'Foo' and 'foo').


[image: cmd-foo]

@raymod2
Author

raymod2 commented May 9, 2017

@therealkenc:

I'm not sure what you are showing us here. You shouldn't be able to create two different files or directories with the same name.

C:\Users\Dan>mkdir foo && mkdir Foo
A subdirectory or file Foo already exists.

@aseering
Contributor

@raymod2 -- what he's showing you is that NTFS's expected behavior is more complicated than you think it is :-)

For what it's worth, I appreciate that Windows is adhering to correct Linux semantics here. Having paths that are valid but that don't show up in ls would confuse my scripts, and would increase the odds that I would (for example) commit a symlink or path to a git repository that would not work properly on our production Linux machines.

@0xbadfca11

bash supports case sensitivity on DrvFs, except in the Windows root directory.
https://msdn.microsoft.com/en-us/commandline/wsl/faq
https://msdn.microsoft.com/en-us/commandline/wsl/release_notes#build-14361
Therefore, I don't think case-insensitive handling should be added to the WSL kernel.
Also, tab completion is not the kernel's job.

@therealkenc
Collaborator

Also, tab completion is not the kernel's job.

Precisely. This is more or less a rerun of #2003 (message). To paraphrase: ...it isn't really a WSL Thing. If on Real Linux™ you think tab completion should be case insensitive, that's basically the same ask with the same solutions.

@raymod2
Author

raymod2 commented May 10, 2017

LOL, I am seeing a pattern here. Whenever someone suggests a feature for WSL that deviates from a pure Linux environment there are 3 or 4 people who jump on it and claim it will cause a rift in the space/time continuum. Do you 3 or 4 people realize that this is a Microsoft WINDOWS product??? I am truly astonished that you guys don't just run Linux.

@aseering
Contributor

@raymod2 -- my apologies. It's a fair criticism. I don't mean to silence differing opinions; I think it's important that the WSL team hear that there's interest in case-insensitivity and other similar features. I hope they have heard your commentary.

That said: The top of the main WSL documentation page says that one of the core objectives of WSL is to run, to quote that page, "UNMODIFIED" Linux binaries, including the bash shell itself. That ain't me; no one on this thread put that text there, it's the official Microsoft line as far as I know. If you're going to advocate (as here) that a project should do something that goes against the big bold text at the top of its main page ... I mean, honestly that's kind of awesome; WSL's Windows interop isn't going to get any better without someone being bold and breaking traditional Linux semantics somewhere. But it's not going to be an easy argument to make. And us in the peanut gallery don't have much to do with that :-)

@therealkenc
Collaborator

therealkenc commented May 10, 2017

I mean, honestly that's kind of awesome ... Windows interop isn't going to get any better without someone being bold ... And us in the peanut gallery don't have much to do with that

People in the peanut gallery are both empowered and encouraged to do just that. The peanut gallery has everything to do with it. If someone (your words) wants to add (for example) case insensitive filename completion to bash (a Free Software Foundation product not a Microsoft product), by several means available, just do it. Neither Microsoft nor Linus Torvalds will stop you, because case insensitive filename completion is not a feature of lxcore.sys/lxss.sys (WSL) nor vmlinuz (Linux). It is not a feature of NTFS or Ext4. It is a feature of bash.

@aseering
Contributor

Ah, references... "that" -> "not going to be an easy argument"

@fpqc

fpqc commented May 10, 2017

Didn't read the whole thread, but Microsoft disabled this feature in the root of /mnt/c for security reasons. You can have files differing by a capital letter in any other directory.

@fpqc

fpqc commented May 10, 2017

@therealkenc it's also an option in zsh's settings fwiw.

@therealkenc
Collaborator

No doubt zsh has such a feature (because zsh). Regarding C:\ and case sensitivity rules I say... Badges? We don't need no stinkin' badges.

[image: wsl-c-cmd-foo]

@fpqc

fpqc commented May 10, 2017

What is this witchcraft?!

Guessing you didn't create this through WSL haha

@raymod2
Author

raymod2 commented May 10, 2017

@fpqc: You can't have two file system entries with the same name in any directory. Have you tried it?

@raymod2
Author

raymod2 commented May 10, 2017

Here is an article that clarifies a few things regarding case sensitivity on Windows:

https://support.microsoft.com/en-us/help/100625/filenames-are-case-sensitive-on-ntfs-volumes

What I am requesting in this ticket is for Bash on Windows to use the second mode of operation for NTFS mounts (to be case preserving but case insensitive).

@fpqc

fpqc commented May 10, 2017

@raymod2 I think I am understanding what you mean now: You're asking for DrvFS ntfs mounts to actually enforce case preserving + insensitivity like fat32.

The reason we don't have that right now, I think, is that direct drvfs filesystem access is the only interop method we have right now. I'm hoping that in the future, there will be a Windows-side driver and a Linux-side server that mounts lxfs as like nfs on Windows and serves it as nfs from WSL, or something like that.

@therealkenc
Collaborator

therealkenc commented May 10, 2017

You're asking for DrvFS ntfs mounts to actually enforce case preserving + insensitivity like fat32.

So... User Voice. The feature request reads something like: "The WSL open(pathname, O_CREAT) syscall on DrvFS mounts should return EPERM if pathname matches any file in the directory with the same case-insensitive name". I'll even upvote, for the comedic value.

The reason we don't have that right now, I think...

...is that there would be no use case for such behavior. In particular it would not give you case insensitive tab completion, which is the ask. Because whether bash does a case sensitive or case insensitive search on the dirents returned by getdents() is up to bash not "WSL".

And, unsurprisingly, is an option in bash.

cd "$HOME"
echo "set completion-ignore-case on" >> .inputrc   # append; a single > would overwrite an existing .inputrc
# restart bash

This significantly impacts usability for developers working at the command line.

FTFY
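
The point that matching directory entries is a userland decision can be sketched in a few lines of bash. The helper below lists a directory and compares each entry with the requested name after lowercasing both. `ci_match` is a hypothetical name used only for illustration (it is not part of bash or WSL), and the sketch assumes bash 4 or later for the `${var,,}` expansion.

```shell
#!/usr/bin/env bash
# ci_match DIR NAME: print every entry in DIR whose name equals NAME
# when case is ignored. Hypothetical helper, for illustration only.
ci_match() {
  local dir=$1 want=${2,,} path entry
  for path in "$dir"/* "$dir"/.[!.]*; do
    # A glob that matched nothing is left as a literal pattern; skip it.
    [ -e "$path" ] || [ -L "$path" ] || continue
    entry=${path##*/}                # strip the directory prefix
    if [ "${entry,,}" = "$want" ]; then
      printf '%s\n' "$entry"
    fi
  done
}
```

With the directory from the original report, `ci_match /mnt/c foo` would print `Foo`. The file system returns the names exactly as stored; the comparison rule is chosen by the program doing the lookup.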

@raymod2
Author

raymod2 commented May 11, 2017

@therealkenc: Are you trolling? What is this nonsense about open() returning EPERM? Your "set completion-ignore-case on" doesn't work in Bash on Windows. Have you tried it? Also, it won't fix the example I gave in my first post:

mkdir /mnt/c/Foo && cd /mnt/c/foo

@benhillis
Member

benhillis commented May 11, 2017

@raymod2 - @therealkenc's suggestion works for me (I pressed tab after inputting "ex"):

benhill@BENHILL-DELL ~> echo "set completion-ignore-case on" > .inputrc
benhill@BENHILL-DELL ~> bash
benhill@BENHILL-DELL:~$ touch EXAMPLE
benhill@BENHILL-DELL:~$ vi ex
benhill@BENHILL-DELL:~$ vi EXAMPLE
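
Note that `>` replaces any existing `.inputrc`. A minimal sketch of a non-destructive variant follows; `enable_ci_completion` is a hypothetical helper name, and the optional file argument exists only so it can be tried against a scratch file first.

```shell
#!/usr/bin/env bash
# Append the readline setting to an inputrc file unless it is already there.
# The function name and the optional FILE argument are illustrative assumptions.
enable_ci_completion() {
  local file=${1:-$HOME/.inputrc}
  local line='set completion-ignore-case on'
  touch "$file"
  # -x matches whole lines only, -F treats the pattern as a fixed string.
  if ! grep -qxF "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}
```

Typing the line directly at the prompt has no effect on completion, because bash's own `set` builtin treats the words as positional parameters. In an already running interactive shell, `bind 'set completion-ignore-case on'` passes the same line to readline without a restart.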

@fpqc

fpqc commented May 11, 2017

@therealkenc I think he means use win32 semantics in the drvfs driver for NTFS (another option would be to add mount options). It does make sense, even if it might be undesirable. There are drivers that do this or can do it with options, in particular vfat (by necessity) and ntfs-3g (as an option)

ntfs-3g has an option windows_names, which enforces Win32 restrictions on new files and another option ignore_case, which enforces case insensitivity when using things like cd or rm (although it does happen to screw up ls by hiding case entirely).

@raymod2
Author

raymod2 commented May 11, 2017

@benhillis: Ah, I was trying to run the command directly in the bash shell (or in my .bashrc) and it wasn't doing anything. Putting it in ~/.inputrc works. This is an improvement for usability but doesn't resolve the other issues.

  1. mkdir /mnt/c/Foo && cd /mnt/c/foo

    This fails when it should succeed.

  2. mkdir /mnt/c/foo && echo lowercase > /mnt/c/foo/bar && echo uppercase > /mnt/c/foo/BAR

    This succeeds when it should fail. It puts the NTFS filesystem into a bad state as demonstrated below.

[image]
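
A directory in the state described in example 2 can be detected from the Linux side. The sketch below reads one file name per line and prints, in lowercase, each name that occurs more than once when case is ignored. `case_collisions` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# case_collisions: read file names on stdin (one per line) and print the
# lowercased form of every name that appears more than once ignoring case.
# Hypothetical helper; names containing newlines are not handled.
case_collisions() {
  tr '[:upper:]' '[:lower:]' | LC_ALL=C sort | uniq -d
}
```

For the directory created in example 2, `ls -A /mnt/c/foo | case_collisions` would print `bar`. Empty output means no two entries differ only in case.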

@therealkenc
Collaborator

therealkenc commented May 11, 2017

Also, it won't fix the example I gave in my first post:
mkdir /mnt/c/foo && echo lowercase > /mnt/c/foo/bar && echo uppercase > /mnt/c/foo/BAR

Yes please put up a User Voice for that as soon as possible. The quote I gave you is the technical change; you want open(2) to fail with EACCES (sorry not EPERM) in order for that second echo to fail at the command line, as requested. The more eyeballs on this request the better.

@raymod2
Author

raymod2 commented May 11, 2017

In my second example it shouldn't outright fail. It should overwrite the existing file with the new contents. That is what happens in cmd.exe:

[image]

@billziss-gh

billziss-gh commented May 11, 2017

I am somewhat ambivalent on this, but I believe I lean "against".

ARGUING FOR

So an argument could be made that since DrvFs is different anyway, it might as well be case-insensitive to play nice with Win32.

ARGUING AGAINST

  • As mentioned earlier the intent with WSL is to run unmodified Linux binaries. Linux programs are written with a general understanding that file systems are case-sensitive. Changing this in WSL could spell trouble for some programs.

  • NTFS is a case-sensitive file system that just happens to support case-insensitive queries. [I usually refer to NTFS as a "mixed" sensitivity file system.]

The first reason above (compatibility) is what makes me lean "against" currently.


@therealkenc

The quote I gave you is the technical change; you want open(2) to fail with EACCES (sorry not EPERM) in order for that second echo to fail at the command line, as requested.

If I have followed the discussion correctly the correct error return should be EEXIST IMO.

@redbaron

As mentioned earlier the intent with WSL is to run unmodified Linux binaries. Linux programs are written with a general understanding that file systems are case-sensitive.

/thread

@therealkenc
Collaborator

/thread

I am still hoping for a User Voice entry. Because of the pure awesome.

@billziss-gh - Microsoft recently added FileDispositionInformationEx and FILE_DISPOSITION_POSIX_SEMANTICS. I haven't had a chance to look at it closely yet though.

@raymod2
Author

raymod2 commented May 11, 2017

@therealkenc: I think it's time you stepped away from this topic. Your arrogance and sarcasm are not contributing to the discussion.

@billziss-gh

@therealkenc wrote:

Microsoft recently added FileDispositionInformationEx and FILE_DISPOSITION_POSIX_SEMANTICS. I haven't had a chance to look at it closely yet though.

Wow, thanks. I missed this originally, but now found a link to the NT Insider which has a good discussion.

http://www.osronline.com/2017/ntinsider_2017_01.pdf

@therealkenc
Collaborator

Wow, thanks. I missed this originally

Yeah that is where I read about it. There is also this github repo, which you probably saw if you did the same google search as me. I have still been looking at a client SMB solution on and off, FWIW. I might show up over at winfsp if it ever goes anywhere.

@fpqc

fpqc commented May 12, 2017

@therealkenc neat! Do you think that filedisposition stuff is related to the work that the WSL team is doing with the ntfs team for better performance and support?

@therealkenc
Collaborator

therealkenc commented May 12, 2017

Better "support", yes that's the gist. Better performance is a slightly different dimension. Russ did a really good post back in August here explaining the challenges. It's an analogous (but unrelated) situation with assumptions Linux-first apps make about the memory manager. As you know the devs are working furiously on all of this, but they are also in the unenviable position of having to weigh "git works but could be faster" against the 1363 users who want CUDA and the 906 users who want systemd. Only so many hours in the day. The good news is stuff improves one way or another every Insider roll.

@billziss-gh

billziss-gh commented May 12, 2017

@therealkenc wrote:

I have still been looking at a client SMB solution on and off, FWIW. I might show up over at winfsp if it ever goes anywhere.

Feel free to drop by any time.

From the NT Insider:

FILE_DISPOSITION_POSIX_SEMANTICS

With this new behavior, the directory entry is deleted as soon as the handle where the file was deleted is closed.

[Disclaimer: what I say below may be based on a complete misunderstanding of how FILE_DISPOSITION_POSIX_SEMANTICS works. I only have the information in the NT Insider article.]

I am not convinced that this behavior would do anything magical to get things to work better re: compatibility with unlink.

The normal delete protocol on Windows involves sending the IRP messages CREATE, SET_INFORMATION, CLEANUP, CLOSE. Under the normal rules the directory entry goes away when all handles are closed (the last CLEANUP is received). The directory entry effectively becomes a "tombstone" for the file until all handles are closed; the file cannot be opened again (STATUS_DELETE_PENDING / ERROR_ACCESS_DENIED), but it can be seen with QUERY_DIRECTORY (FindFirstFileW / FindNextFileW on Win32).

Under the FILE_DISPOSITION_POSIX_SEMANTICS rules the directory entry goes away when the specific handle is closed. What if a Win32 process has already a file opened that a Linux process decides to unlink? Ignoring FILE_SHARE_* problems, perhaps the Win32 process relies on the fact that the tombstone exists in the directory and that it can be queried using QUERY_DIRECTORY. Also what if the Win32 Process that had the file open queries for the file name?

[I am also not sure how this is better than the old trick of renaming the file to a new directory / file and then deleting that file.]

The other problem with CLEANUP is that it does not allow an error return. Because files on Windows are really deleted during CLEANUP, you can never be 100% certain whether the delete succeeded or not. [The file system promised to delete the file during the SET_INFORMATION call, but who knows what happens during CLEANUP.]


In any case this is largely off topic. I am happy to continue this discussion elsewhere.
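
For comparison, the behavior that Linux callers of unlink expect can be observed from a shell. This is a sketch that assumes a native Linux filesystem (not DrvFs); `posix_unlink_demo` is a hypothetical name. The directory entry disappears as soon as `rm` returns, while the data stays readable through a descriptor that was opened earlier.

```shell
#!/usr/bin/env bash
# Demonstrate POSIX unlink semantics on a native Linux filesystem.
# The body runs in a subshell so descriptor 3 does not leak to the caller.
posix_unlink_demo() (
  dir=$(mktemp -d) || exit 1
  file=$dir/data.txt
  printf 'still readable\n' > "$file"
  exec 3< "$file"          # keep the file open on descriptor 3
  rm "$file"               # unlink: the name is removed immediately
  if [ -e "$file" ]; then echo 'entry: present'; else echo 'entry: gone'; fi
  cat <&3                  # the contents are still reachable via the descriptor
  exec 3<&-                # closing the last descriptor releases the data
  rmdir "$dir"
)
```

On a native Linux filesystem this prints `entry: gone` followed by `still readable`. There is no tombstone stage, which is the difference from the normal Windows delete protocol described above.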

@sunilmut
Member

@raymod2 - Thanks for your post and starting a discussion for a requirement. We appreciate it. I haven't read the full post, but it appears that you are asking to have some kind of Windows semantics applied to DrvFs files. I can understand the ask. If that is something you care about, I would really recommend that you open a user voice ticket and see how others also feel about it.

As others have noted, and as you understand, this ask goes beyond compatibility. To give you some idea, right now our focus is compatibility. There is a lot of surface area still left in that space, so we are not able to focus on many things beyond that. Hopefully that provides a glimpse of the view from this side of the camp.

We do appreciate any and all feedback. So, please keep it coming.

@fpqc

fpqc commented May 12, 2017

@sunilmut Just in case you didn't see my post up the thread, given how many posts there are, this does have precedent as a pair of mount options in the ntfs-3g driver for Linux, namely windows_names and ignore_case (taken together, these seem to fully implement win32 semantics for ntfs). Given that drvfs now supports mount options with the mount command, that's probably how it should be implemented if you do ever get around to it.

I do agree however that this isn't the most important ask at this time.

@benhillis
Member

@fpqc - agreed, a mount option would be a good way to add this.

@raymod2
Author

raymod2 commented May 12, 2017

@sunilmut: As far as compatibility is concerned, we don't want users to corrupt their NTFS file systems because of something they do in Bash on Windows.

@sunilmut
Member

@fpqc - Thanks for the suggestion. I think that's a neat way of going about this. Adding @SvenGroot as FYI.

@raymod2 - Yes, I think we understand the ramifications. That's one of the reasons it is disabled on system folders such as C:\ etc. We haven't had a lot of people complain about corruption yet. So, at this point, it's not very high on the list. We understand the ask though.

@fpqc

fpqc commented May 12, 2017

@sunilmut By the way, off topic, but supporting some of the other simple ntfs-3g mount options like uid, gid, umask, fmask, and dmask would be useful. For example, if you wanted to restrict interop functionality to only the root user, you could set umask=0077. (N.B.: I was just wondering what kind of escape beyond the WSL 'sandbox' you could get by dropping a copy of cmd.exe into a Linux user directory and executing it, even if you did block drvfs access to System32.)
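
As a worked example of what umask=0077 would mean, assuming DrvFs adopted the ntfs-3g convention of clearing the masked bits from the permissions it reports, the arithmetic can be checked in bash. `apply_mask` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# apply_mask MODE MASK: print MODE with the bits in MASK cleared, in octal.
# Both arguments must be written with a leading 0 so bash reads them as octal.
apply_mask() {
  printf '%04o\n' "$(( $1 & ~$2 ))"
}
```

`apply_mask 0777 0077` prints `0700` and `apply_mask 0666 0077` prints `0600`: only the owning user keeps any access, so with root as the owner nobody else could read or execute files on the mount.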

@benhillis
Member

Marking as a feature; the idea for DrvFs mount options to disable case sensitivity or set file owner is a good suggestion.

@erkinalp

DrvFs should honor "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\kernel\ dword:ObCaseInsensitive".

@TSlivede

TSlivede commented Nov 14, 2017

DrvFs should honor "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\kernel\ dword:ObCaseInsensitive"

I agree, especially as the corresponding GPO documentation talks about non-win32-subsystems. (By corresponding I mean that the linked GPO sets the mentioned registry key.) It explicitly states: "Set this policy to Enabled. All subsystems will be forced to observe case insensitivity." That is clearly not true here, so either the documentation is wrong (or at least misleading) or this is a bug.

@benhillis add "bug" label?

Note: I still think "case sensitive" (=current behaviour) is the best default option for DrvFs, but that doesn't change the fact that WSL ignores a (security-)GPO.
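
To see what the registry value is currently set to from inside WSL, the output of `reg.exe query` can be parsed. This is a minimal sketch, assuming Windows interop is available so that `reg.exe` can be called from bash, and assuming the usual `name  type  data` layout of its output. `ob_case_insensitive_value` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read `reg.exe query` output on stdin and print the data column of the
# ObCaseInsensitive value (for example 0x1). Hypothetical helper.
ob_case_insensitive_value() {
  # reg.exe emits CRLF line endings, so drop the carriage returns first.
  tr -d '\r' | awk '$1 == "ObCaseInsensitive" { print $3 }'
}

# Intended use inside WSL (not executed here):
#   reg.exe query 'HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\kernel' \
#     /v ObCaseInsensitive | ob_case_insensitive_value
```

The helper only reports the value. Whether DrvFs takes it into account is the open question raised above.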

@philwalk

I have run into this issue in a context that I believe illustrates that the correct behavior is not a concession to Windows, but a requirement for any mounted case-ignoring filesystem (e.g., NTFS).

I have a linux process that passes a request to a windows process to create a file. The windows process creates the requested file, but in all UPPERCASE letters, so when the linux process tests to see if the requested file was actually created, the linux libraries incorrectly report that the file does not exist, when (in fact) it does.

@SvenGroot
Member

SvenGroot commented Apr 27, 2018

In the current insider builds (and the soon-to-be-released spring update), you can mount DrvFs with the "case=off" option to get case-insensitive behavior. See here for details and here for information on how to set that as the default.
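
A sketch of what this could look like in practice, assuming the option is passed as `-o case=off` to `mount -t drvfs` and that the default is set through an `[automount]` `options` entry in `/etc/wsl.conf`. The helper names, the drive letter, and the mount point are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Remount a Windows drive with case-insensitive DrvFs behavior (WSL only).
# Assumed syntax; the drive letter and mount point are examples.
remount_case_off() {
  local drive=${1:-C:} target=${2:-/mnt/c}
  sudo umount "$target" && sudo mount -t drvfs "$drive" "$target" -o case=off
}

# Print a wsl.conf fragment that would make case=off the default for
# automatically mounted drives. Review it before adding it to /etc/wsl.conf.
wsl_conf_case_off() {
  printf '[automount]\noptions = "case=off"\n'
}
```

Only the second function is safe to run outside WSL; it just prints the fragment. With such a mount, the `mkdir /mnt/c/Foo && cd /mnt/c/foo` sequence from the original report would be expected to succeed.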

@philwalk

Thanks for the info!
