Perform Schlinkert pruning both forwards and reverse, picking whichever saves more words #43

Merged May 1, 2023 (9 commits)


sts10 commented Apr 24, 2023

Performs a Schlinkert prune on the given word list twice: once with the words as provided and once with every word reversed. Why reversed? Because (a) that is still a valid way to make a list uniquely decodable and (b) in some cases it saves more words from the original list (the BIPS39 English word list is one such example).

BIPS 39 word list as an example

If you perform a Schlinkert prune on the BIPS39 list in its given order, 1911 words are saved. If you reverse all words on the BIPS39 list and run a Schlinkert prune, 2011 words are saved.

With this PR, when a user tells Tidy to do a Schlinkert prune (-K), it now runs the prune in both directions and keeps the result of whichever one saves more words. Cool!
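
The "try both directions, keep the better one" step can be sketched roughly as follows. This is a minimal illustration, not Tidy's actual code: `prune_both_directions`, `reverse_words`, and `remove_prefix_words` are hypothetical names, and `remove_prefix_words` is a simple prefix-removal stand-in, not the Schlinkert prune itself. The sketch also assumes the surviving reversed words are reversed back before being returned, and it reverses by `char` only to stay within the standard library (fine for the ASCII words used here, not for accented words in general).

```rust
/// Reverse each word by `char`. Std-only simplification for this sketch;
/// it is only safe for words without combining characters.
fn reverse_words(list: &[String]) -> Vec<String> {
    list.iter().map(|w| w.chars().rev().collect()).collect()
}

/// Hypothetical stand-in for the real prune (NOT the Schlinkert algorithm):
/// drops every word that is a prefix of some other word on the list.
fn remove_prefix_words(list: &[String]) -> Vec<String> {
    list.iter()
        .filter(|w| {
            !list
                .iter()
                .any(|other| other != *w && other.starts_with(w.as_str()))
        })
        .cloned()
        .collect()
}

/// Run `prune` on the list as given and on the list with every word
/// reversed, then return whichever result keeps more words.
fn prune_both_directions<F>(list: &[String], prune: F) -> Vec<String>
where
    F: Fn(&[String]) -> Vec<String>,
{
    let forwards = prune(list);
    // Assumption: prune the reversed words, then reverse the survivors back
    // so the caller gets the original spellings.
    let backwards = reverse_words(&prune(&reverse_words(list)));
    // Ties go to the forwards result.
    if backwards.len() > forwards.len() {
        backwards
    } else {
        forwards
    }
}
```

With the list `cat`, `catnip`, `cattle`, `dog`, the stand-in prune drops `cat` going forwards (3 words kept) but drops nothing on the reversed list, so `prune_both_directions(&list, remove_prefix_words)` returns all 4 words.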

Carefully reversing words with accented characters

Use graphemes!

use unicode_segmentation::UnicodeSegmentation;

/// Reverse all words on given list. For example,
/// `["hotdog", "hamburger", "alligator"]` becomes
/// `["godtoh", "regrubmah", "rotagilla"]`
/// Uses graphemes to ensure it handles accented characters correctly.
pub fn reverse_all_words(list: &[String]) -> Vec<String> {
    list.iter()
        // Reverse grapheme by grapheme so combining accents stay attached
        .map(|word| word.graphemes(true).rev().collect::<String>())
        .collect()
}
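
To illustrate why graphemes matter here, the sketch below shows what goes wrong with a naive `char`-based reversal from the standard library. `reverse_by_chars` is a hypothetical helper written only for this comparison; it is not part of Tidy.

```rust
/// Naive reversal by Unicode scalar value (`char`), standard library only.
/// This is the approach that grapheme-based reversal is meant to avoid.
fn reverse_by_chars(word: &str) -> String {
    word.chars().rev().collect()
}

// Precomposed "é" (U+00E9) is a single `char`, so the naive reversal is fine:
//   reverse_by_chars("caf\u{e9}")   == "\u{e9}fac"
//
// Decomposed "é" ('e' followed by combining acute accent U+0301) is two
// `char`s, so the accent is torn away from its base letter and ends up
// at the front of the string:
//   reverse_by_chars("cafe\u{301}") == "\u{301}efac"
//
// Reversing by extended grapheme clusters instead keeps 'e' and U+0301
// together as one unit, giving "e\u{301}fac".
```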

Downsides

It does roughly double the time it takes to run a Schlinkert prune. But I think it's worth it if it saves more words. Tidy isn't built for speed; it's built for thoroughness.


sts10 commented Apr 24, 2023

| List | Length | Prefix-free | Suffix-free | Schlinkert-pruned | Reverse Schlinkert-pruned |
|---|---|---|---|---|---|
| BIPS39 | 2048 | 1999 | 1881 | 1914 | 2011 |

sts10 merged commit 8bff8d5 into main on May 1, 2023
sts10 deleted the both-schlinkert branch on May 1, 2023 13:45

sts10 commented May 1, 2023

This felt good enough to merge. Will keep an eye out for bugs!
