Tracking Issue for APC #316: In-place case change methods for String #135885

Open
3 tasks
krtab opened this issue Jan 22, 2025 · 4 comments
Assignees
Labels
C-tracking-issue (Category: an issue tracking the progress of something, such as the implementation of an RFC)
T-libs-api (Relevant to the library API team, which will review and decide on the PR/issue)

Comments

@krtab
Contributor

krtab commented Jan 22, 2025

Feature gate: #![feature(string_make_uplowercase)]

This is a tracking issue for APC #316

This APC proposes adding a new API to String that changes case efficiently by consuming self and reusing the buffer, so that most cases do not allocate.

The exact implementation remains to be discussed, but the idea is that the case change is done in place wherever possible. Where that isn't possible, an auxiliary deque (double-ended queue) can be used to store bytes temporarily.
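To make the idea concrete, here is a rough sketch of the deque-based approach, written against stable Rust. It is only an illustration under my own assumptions: the names `make_uppercase_with_queue` and `utf8_width` are hypothetical, and the sketch round-trips through `into_bytes` / `String::from_utf8` to stay in safe code, which a real implementation inside std would probably not do.

```rust
use std::collections::VecDeque;

/// Width in bytes of the UTF-8 sequence starting with `first`.
/// Assumes `first` is a leading byte of valid UTF-8.
fn utf8_width(first: u8) -> usize {
    match first {
        0x00..=0x7F => 1,
        0xC0..=0xDF => 2,
        0xE0..=0xEF => 3,
        _ => 4,
    }
}

/// Hypothetical helper (not the proposed std API): uppercases `s`,
/// reusing its buffer and spilling into a deque only when needed.
fn make_uppercase_with_queue(s: &mut String) {
    // Take the buffer so it can be edited as raw bytes.
    let mut buf = std::mem::take(s).into_bytes();
    let len = buf.len();
    // Output bytes that do not fit into the consumed region yet.
    let mut pending: VecDeque<u8> = VecDeque::new();
    let (mut read, mut write) = (0usize, 0usize);
    let mut tmp = [0u8; 4];

    while read < len {
        // Bytes at and after `read` are still untouched source text.
        let width = utf8_width(buf[read]);
        let ch = std::str::from_utf8(&buf[read..read + width])
            .expect("source bytes are valid UTF-8")
            .chars()
            .next()
            .expect("slice holds exactly one char");
        read += width;

        // Queue the uppercase mapping (it may be several chars).
        for up in ch.to_uppercase() {
            pending.extend(up.encode_utf8(&mut tmp).bytes());
        }
        // Flush as much as fits into the already-consumed bytes.
        while write < read {
            match pending.pop_front() {
                Some(byte) => {
                    buf[write] = byte;
                    write += 1;
                }
                None => break,
            }
        }
    }

    // Either the output shrank (truncate) or bytes are left over
    // (append, which is the only step that may reallocate).
    buf.truncate(write);
    buf.extend(pending);
    *s = String::from_utf8(buf).expect("case mapping yields valid UTF-8");
}
```

When every character maps to an output of the same or smaller byte length, the deque is emptied again on every iteration and the buffer never grows.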

Public API

This would add the following methods (names to be determined):

impl String {
	fn into_uppercase(&mut self);
	fn into_lowercase(&mut self);
}
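For context, a small sketch of what stable Rust offers today, which is the gap these methods are meant to fill: `make_ascii_uppercase` works in place but only touches ASCII, while `to_uppercase` handles full Unicode but returns a newly allocated String. The function name `compare_existing_apis` is made up for illustration.

```rust
/// Hypothetical helper contrasting the two existing options.
/// Returns (in-place ASCII-only result, freshly allocated Unicode result).
fn compare_existing_apis(input: &str) -> (String, String) {
    // In place, but non-ASCII characters such as 'ß' are left alone.
    let mut ascii_only = String::from(input);
    ascii_only.make_ascii_uppercase();

    // Full Unicode case mapping, but always builds a new String.
    let unicode = input.to_uppercase();

    (ascii_only, unicode)
}
```

For the input "straße", the first result keeps the 'ß' untouched, while the second maps it to "SS".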

Steps / History

Unresolved Questions

  • None yet.


krtab added the C-tracking-issue and T-libs-api labels Jan 22, 2025
@krtab
Contributor Author

krtab commented Jan 22, 2025

@rustbot claim

@clarfonthey
Contributor

I'm commenting here specifically about the implementation, since I know that the initial PR isn't focused on performance. This should always be doable without an extra queue if you compute the necessary length in advance, and I feel like it should be possible to do this while still preserving a fast path for the case where the length doesn't change.

Essentially, keep a signed running difference between the output length and the input length, and skip any character that would push that difference above zero. (If the length shrinks at some point and later grows back up to zero, that's okay.) Also keep track of the index of the first character where this happens.

Then, if the difference is still above zero at the end, write backwards from the end of the string to the first skipped character, now that you know its absolute position.

My assumption is that this would be at least as fast as the queue version, since you would still need to copy all the characters from the queue back into the string, but in this version, you're instead copying them from the string to itself.

I can probably write up some code for this later so it can actually be benchmarked and compared to other options. Also because my explanation may not be clear.
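In the meantime, here is one possible reading of the approach above as a rough, unbenchmarked sketch. It is not the code shared on Zulip. All names (`make_uppercase_no_queue`, `decode_at`, and so on) are hypothetical, and it works through `into_bytes` / `String::from_utf8` to stay in safe code. One detail is my own assumption: instead of a single backward pass from the end of the string, each skipped stretch ends as soon as the output fits behind the read position again, and only that stretch is written backwards.

```rust
/// Width in bytes of the UTF-8 sequence starting with `first`.
fn utf8_width(first: u8) -> usize {
    match first {
        0x00..=0x7F => 1,
        0xC0..=0xDF => 2,
        0xE0..=0xEF => 3,
        _ => 4,
    }
}

/// Decodes the char starting at byte `at`; returns it with its width.
fn decode_at(buf: &[u8], at: usize) -> (char, usize) {
    let width = utf8_width(buf[at]);
    let text = std::str::from_utf8(&buf[at..at + width]).expect("valid UTF-8");
    (text.chars().next().expect("one char"), width)
}

/// Byte length of the uppercase mapping of `ch`.
fn upper_len(ch: char) -> usize {
    ch.to_uppercase().map(char::len_utf8).sum()
}

/// Writes the uppercase mapping of `ch` at `at`; returns the end offset.
fn put_upper(buf: &mut [u8], mut at: usize, ch: char) -> usize {
    for up in ch.to_uppercase() {
        at += up.encode_utf8(&mut buf[at..]).len();
    }
    at
}

/// Start offset of the char that ends at byte `end`.
fn char_start_before(buf: &[u8], end: usize) -> usize {
    let mut start = end - 1;
    while (buf[start] & 0xC0) == 0x80 {
        start -= 1; // skip continuation bytes
    }
    start
}

/// Hypothetical helper (not the proposed std API).
fn make_uppercase_no_queue(s: &mut String) {
    let mut buf = std::mem::take(s).into_bytes();
    let len = buf.len(); // source length; the buffer may grow later
    let (mut read, mut write) = (0usize, 0usize);

    while read < len {
        let (ch, width) = decode_at(&buf, read);
        if write + upper_len(ch) <= read + width {
            // Fast path: the output fits behind the read position.
            write = put_upper(&mut buf, write, ch);
            read += width;
            continue;
        }
        // Skipped stretch: measure ahead until the output fits again
        // or the source ends.
        let (mut src_end, mut dst_end) = (read, write);
        loop {
            let (c, w) = decode_at(&buf, src_end);
            src_end += w;
            dst_end += upper_len(c);
            if dst_end <= src_end || src_end == len {
                break;
            }
        }
        if dst_end > buf.len() {
            buf.resize(dst_end, 0); // only possible on the last stretch
        }
        // Write the stretch backwards so unread source is never clobbered.
        let (mut r, mut d) = (src_end, dst_end);
        while r > read {
            let start = char_start_before(&buf, r);
            let (c, _) = decode_at(&buf, start);
            d -= upper_len(c);
            put_upper(&mut buf, d, c);
            r = start;
        }
        read = src_end;
        write = dst_end;
    }
    buf.truncate(write);
    *s = String::from_utf8(buf).expect("case mapping yields valid UTF-8");
}
```

If no character ever grows past the read position, only the fast path runs and this degenerates into a single forward pass over the buffer.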

@krtab
Contributor Author

krtab commented Jan 24, 2025

Thanks for your interest.

I am not sure I have understood your explanation well. Feel free to contact me on the Rust Zulip if you want to discuss this further, either by DM or in this topic: https://rust-lang.zulipchat.com/#narrow/channel/219381-t-libs/topic/In.20place.20String.20case.20change.

@clarfonthey
Contributor

My explanation could definitely use more work; I've shared the code for an implementation on Zulip, where you can read it.

We should continue the discussion on Zulip, but I figured I'd mention this here to tie the two threads together for anyone reading.
