Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

path: fix win32.relative() for some Unicode paths #27662

Closed
wants to merge 1 commit into from

Conversation

mscdex
Copy link
Contributor

@mscdex mscdex commented May 12, 2019

Fixes: #27534

This is an alternative to #27644 that avoids any performance regressions in cases where lowercased versions of paths do not differ in length to the original.

The only minor issue (in practice it may not really matter) is that return values are always returned normalized using canonical composition if the lowercased version differed in length to the original, which may be unexpected if you passed in a path that was normalized using canonical decomposition for example.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines

@mscdex mscdex added the path Issues and PRs related to the path subsystem. label May 12, 2019
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@mscdex
Copy link
Contributor Author

mscdex commented May 12, 2019

Perhaps for builds without intl we could utilize a polyfill, such as unorm.

@mscdex
Copy link
Contributor Author

mscdex commented May 12, 2019

Alternatively we could ignore additional code points and only compare base symbols, but I'm not sure if that's how Windows/NTFS/etc. compares characters in paths.

Copy link
Member

@jdalton jdalton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather see more work on the original PR as it's a first time contributor instead of jumping to this one. Let's hold off for the time being.

Copy link
Member

@BridgeAR BridgeAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not fix all edge cases. There could theoretically be a combination of the Russian I and the German ß. Those together could return the identical length after lowercasing the input and then still producing wrong results.
Update: I've mistaken.

@mscdex
Copy link
Contributor Author

mscdex commented May 22, 2019

@BridgeAR do you have a specific test case in mind?

@BridgeAR
Copy link
Member

@mscdex sorry I made a mistake while looking at this earlier. It should indeed work properly in all cases (if intl is available).
About the polyfill: we could also change the algorithm itself to better reflect what we want in case the length is not identical. Using that would also prevent the edge case mentioned above.

One of these implementations could e.g. look like: "loop over the original first argument and slice out parts, lowercase them and compare them with the other side in case you find a slash." That should be almost as fast as the generic implementation.

@lundibundi
Copy link
Member

ping @mscdex @BridgeAR @jdalton is this ready basically but needs reviews?

@mscdex
Copy link
Contributor Author

mscdex commented Aug 25, 2020

@lundibundi I guess?

@lundibundi
Copy link
Member

@jdalton I think your request changes can now be dismissed?

ping @nodejs/path

@lundibundi lundibundi added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 2, 2020
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 2, 2020
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@whyboris
Copy link

Seems like it would fix a crash in my app for users with İ in the file path:
whyboris/Video-Hub-App#533 (comment)

Copy link
Member

@BridgeAR BridgeAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@BridgeAR BridgeAR dismissed jdalton’s stale review October 13, 2020 18:14

Please take another look. The alternative PR is long closed.

Copy link

@whyboris whyboris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have enough background to confirm the code is 💯 but as far as I can tell it's good 👍

@aduh95 aduh95 removed the stalled Issues and PRs that are stalled. label Oct 19, 2020
@aduh95 aduh95 added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. request-ci Add this label to start a Jenkins CI on a PR. labels Oct 19, 2020
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Oct 19, 2020
@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@aduh95
Copy link
Contributor

aduh95 commented Oct 20, 2020

@mscdex The CI is failing consistently on very specific jobs, I'm not sure if this needs action or is just flaky CI.

@nodejs-github-bot
Copy link
Collaborator

CI: https://ci.nodejs.org/job/node-test-pull-request/33803/

@mscdex
Copy link
Contributor Author

mscdex commented Oct 22, 2020

@aduh95 The error in the intl-less build environment makes sense. I don't think we came to a consensus about how to handle that situation. I suggested using a pure JS polyfill for a consistent experience whether intl is available or not. @BridgeAR had a separate suggestion. Maybe someone else has additional input into this or is ok with one of these solutions?

@jasnell jasnell added stalled Issues and PRs that are stalled. and removed author ready PRs that have at least one approval, no pending requests for changes, and a CI started. labels Dec 15, 2020
@github-actions
Copy link
Contributor

This issue/PR was marked as stalled, it will be automatically closed in 30 days. If it should remain open, please leave a comment explaining why it should remain open.

@whyboris
Copy link

@mscdex -- could you resolve the conflicting files (currently only lib/path.js)

@jdalton -- is there any reason to delay merging after the merge conflicts are resolved? 🙏

@monoblaine
Copy link

@whyboris I'll be happy to cherry-pick @mscdex's commit, resolve the conflicts and create another PR if they're not available at the moment.

I've been using a custom-built (using #27644) node for almost two years and I really want this officially fixed. 😐

@mscdex
Copy link
Contributor Author

mscdex commented Jan 30, 2021

@monoblaine the issue about what to do about intl-less node builds hasn't been resolved yet, so resolving merge conflicts isn't worthwhile at this point

@monoblaine
Copy link

Oh, NOW I understand the problem here. (It took only two years)

A note to my future self and other fast learners like me:

  • Node uses ICU to implement internationalization support.
  • This significantly increases the file size.
  • Therefore node has the option to make a build wihout intl support.
  • The changes introduced in this PR is dependent on intl support's existence and that's a problem. (An alternative solution is proposed but not yet discussed.)

@bnb
Copy link
Contributor

bnb commented Jan 11, 2022

@mscdex is this PR still something that can/should land? Is there anything that's currently blocking?

@mscdex
Copy link
Contributor Author

mscdex commented Jan 11, 2022

@bnb

is this PR still something that can/should land?

I'm not really the one to ask, I was just providing an alternative solution to an issue.

Is there anything that's currently blocking?

#27662 (comment)

fromOrig = fromOrig.normalize('NFD');
from = fromOrig.toLowerCase();
}
let toNormalized;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚙️ (optional) Let's use false rather than undefined since it's boolean type:

Suggested change
let toNormalized;
let toNormalized = false;


if (fromOrig === toOrig)
return '';

from = fromOrig.toLowerCase();
to = toOrig.toLowerCase();

if (fromOrig.length !== from.length) {
fromOrig = fromOrig.normalize('NFD');
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't be NFC?

@BridgeAR
Copy link
Member

BridgeAR commented Aug 6, 2024

Superseded by another PR

@BridgeAR BridgeAR closed this Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
path Issues and PRs related to the path subsystem. stalled Issues and PRs that are stalled.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UTF Characters for relative path