-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of ntpath.isdir/isfile/exists/islink #102765
Comments
They're probably actually best to fall back to the existing code. It will be slightly more efficient than a new |
I started with the assumption that using I implemented one version of I also implemented Footnotes
|
The rest of your analysis is fantastic, as usual. But here's where we disagree:
The main scenario this is likely to be impactful is enumerating large directories (many files/subdirectories), which are unlikely to be predominantly reparse points on a Windows machine. So I weight it more heavily, and come out about equal on the perf impacts (as in, both approaches seem fine). However, I think So on balance, I prefer |
About 80% of desktop PCs use Windows 10, which wouldn't have the improved I expected this issue to be created, but I expected it would be to use Given my experiment with using |
That's applying just as much "in theory" as I am, and we're talking about unreleased software that can still change, so all theories could be invalidated.
About the only alternative I would entertain is to go back to calling So for now, I want to stick with the more-efficient-than-before option everywhere, especially since it reduces our code repetition. Meanwhile, I'll try and get |
Do you think that
If you mean going all he way back to |
Doubtful. But then your argument must be to leave the functions alone (at least for now), because GetFileInformationByName will also not appear in Windows 10. If we're just going to prepare for future OS updates, we can hold off on any change here until it's clearer what the faster path will be. However, if we're going to assume that reparse points are common enough to need a fast path, |
When the by-name check is very fast, as
Implementing a three-tier query that starts with What I was worried about with using I just built and ran another comparison, this time of two implementations of In fairness, the symlink case can also be faster when using |
I'm loving all this analysis. Two questions about your tests:
I can't speak to the speed of the final released implementation of (Also, this is coming from somebody that is not an expert, let alone owner, of the file system APIs. Just another dev interested in performance playing with these new shiny toys). |
That 10% overhead seems like a lot, but I guess nobody is doing much work in this stack. In any case, the main question isn't really whether to use a lower-level function or not, but how to balance the cost of handling reparse points vs. optimising for non-reparse points. After that, the question is whether the code complexity is worth optimising for non-reparse points with |
Today I expanded my last test of
Using In Windows 11 22H2, I have neither My implementation of #include <winternl.h>
typedef struct _RTLP_CURDIR_REF
{
LONG RefCount;
HANDLE Handle;
} RTLP_CURDIR_REF, *PRTLP_CURDIR_REF;
typedef struct _RTL_RELATIVE_NAME_U
{
UNICODE_STRING RelativeName;
HANDLE ContainingDirectory;
PRTLP_CURDIR_REF CurDirRef;
} RTL_RELATIVE_NAME_U, *PRTL_RELATIVE_NAME_U;
FILE_INFORMATION_CLASS FileStatInformation = 68;
typedef struct _FILE_STAT_INFORMATION {
LARGE_INTEGER FileId;
LARGE_INTEGER CreationTime;
LARGE_INTEGER LastAccessTime;
LARGE_INTEGER LastWriteTime;
LARGE_INTEGER ChangeTime;
LARGE_INTEGER AllocationSize;
LARGE_INTEGER EndOfFile;
ULONG FileAttributes;
ULONG ReparseTag;
ULONG NumberOfLinks;
ACCESS_MASK EffectiveAccess;
} FILE_STAT_INFORMATION, *PFILE_STAT_INFORMATION;
NTSTATUS (NTAPI *RtlDosPathNameToRelativeNtPathName_U_WithStatus)(
PCWSTR DosFileName,
PUNICODE_STRING NtFileName,
PCWSTR *FilePart,
PRTL_RELATIVE_NAME_U RelativeName) = NULL;
VOID (NTAPI *RtlSetLastWin32ErrorAndNtStatusFromNtStatus)(
NTSTATUS Status) = NULL;
NTSTATUS (NTAPI *NtQueryInformationByName)(
POBJECT_ATTRIBUTES ObjectAttributes,
PIO_STATUS_BLOCK IoStatusBlock,
PVOID FileInformation,
ULONG Length,
FILE_INFORMATION_CLASS FileInformationClass) = NULL;
static BOOL
My_GetFileInformationByName(
PCWSTR fileName,
FILE_INFO_BY_NAME_CLASS fileInfoClass,
PVOID fileInfoBuffer,
ULONG fileInfoBufferSize)
{
if (!NtQueryInformationByName) {
HMODULE ntdll = GetModuleHandleW(L"ntdll");
*(FARPROC *)&NtQueryInformationByName =
GetProcAddress(ntdll, "NtQueryInformationByName");
*(FARPROC *)&RtlDosPathNameToRelativeNtPathName_U_WithStatus =
GetProcAddress(ntdll,
"RtlDosPathNameToRelativeNtPathName_U_WithStatus");
*(FARPROC *)&RtlSetLastWin32ErrorAndNtStatusFromNtStatus =
GetProcAddress(ntdll,
"RtlSetLastWin32ErrorAndNtStatusFromNtStatus");
}
if (!NtQueryInformationByName || fileInfoClass != FileStatByNameInfo) {
SetLastError(ERROR_NOT_SUPPORTED);
return FALSE;
}
NTSTATUS status;
UNICODE_STRING objectName;
RTL_RELATIVE_NAME_U relativeName;
OBJECT_ATTRIBUTES obja;
IO_STATUS_BLOCK iosb;
status = RtlDosPathNameToRelativeNtPathName_U_WithStatus(
fileName, &objectName, NULL, &relativeName);
if (!NT_SUCCESS(status)) {
RtlSetLastWin32ErrorAndNtStatusFromNtStatus(status);
return FALSE;
}
if (relativeName.RelativeName.Length) {
InitializeObjectAttributes(
&obja, &relativeName.RelativeName, OBJ_CASE_INSENSITIVE,
relativeName.ContainingDirectory, NULL);
}
else {
InitializeObjectAttributes(
&obja, &objectName, OBJ_CASE_INSENSITIVE, NULL, NULL);
}
status = NtQueryInformationByName(
&obja, &iosb, fileInfoBuffer, fileInfoBufferSize,
FileStatInformation);
HeapFree(GetProcessHeap(), 0, objectName.Buffer);
if (!NT_SUCCESS(status)) {
RtlSetLastWin32ErrorAndNtStatusFromNtStatus(status);
return FALSE;
}
return TRUE;
} |
This second thought is the crux of the matter. We don't currently allow ourselves to call into Do you still have raw numbers for your table? Normalising each row separately means we can't figure out the ratio of regular files to links that would break even between any of these options. |
Here's another set of 10 trials, still based on a set of 50 files and a set 50 symlinks to those files, checked 1000 times per trial. This time it includes the results for Best of 10
Average of 10
The current implementation of the It states that the new Windows 10 will still have this performance improvement if we add an initial step that tries
I only shared I just noticed that we need to fix a bug in the builtin functions. When passed a file descriptor, we get the handle via |
Thanks. So what this says to me is that But even in the >10K case, it only seems like an incremental improvement compared to what's coming in the future.
Yes, I'm coming around to this POV now. Win11 gets Okay, let's go with that then. Same logic as is used in https://github.com/python/cpython/blob/main/Modules/posixmodule.c#L2036-L2042
Did you file a new issue already or shall I? Footnotes
|
…d slow_path logic
For implementers, note that a reparse point that's set on a directory is invalid if it resolves to a file, and a reparse point that's set on a file is invalid if it resolves to a directory. Thus if |
I implemented the benchmark myself1 and tried it in a number of scenarios. Most of the time, Interestingly, I also noticed that every operation that goes via a mount point rather than a drive name is ridiculously slow. That it's slower isn't a surprise, but that it's over 100% slower was. Not a case we can do anything about, but felt it was worth mentioning. Given the range of performance characteristics we may be dealing with here, I'm inclined to stick with the current implementation plan. I didn't actually test against the new API (don't have a machine handy right now), but since we wouldn't get any Footnotes
|
To clarify, your results are that |
Yep. It varied in significance, and usually was so close as to be irrelevant, but was no slower in either files/symlinks case. |
I can confirm Steve's result. Here's the result I get for testing
I expected traversing the mountpoint "C:\Mount\volume_e" to take more time since the system has to parse a path on volume "C:" (slower than other volumes, for whatever reason), and then restart parsing on volume "E:". |
…ormationByName when available (GH-103485)
Can this be resolved now that #103485 has landed? |
Let's assume "yes" and close this, we can always re-open it if necessary. |
These functions should use
GetFileAttributesW
for their fast paths, and the ones that traverse links can fall back tostat
if a reparse point is found.Linked PRs
The text was updated successfully, but these errors were encountered: