-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
StringsToAutomaton#build to take List as parameter instead of Collection #12427
Conversation
Thanks @shubhamvishu for working on this! As I look at the PR, I'm wondering if accepting a I wonder if the ctor for Since Apologies for leading us in the wrong direction in the issue I opened. I think a |
@gsmiller I totally agree its not helpful if the input/data |
d820f8c
to
27b08c4
Compare
Yeah, good points/questions! I'd be curious how much overhead it would actually add to sort the input when it's already sorted? But to take a step back for a moment, we also have automata building methods that accept a If we want to make these functions a bit more user-friendly, we could look at changing the |
I was looking at this problem too and the change in test classes to meld arbitrary collections to +1 to throw an On a related note, since we only traverse the items in their given order, maybe we can relax |
Agreed @gsmiller! Yes it does seem maybe its better leave this as is.
I like the idea to throw IAE if wrong input is provided. I think this would only affect the cases with assert disabled? otherwise with asserts enabled we anyways always keep track of previous term.
Good point @gautamworah96. |
27b08c4
to
a27a4d4
Compare
On the same note, since both the methods expects It would have been much better if |
That's right. I imagine a lot of users disable asserts when running in a production environment for performance reasons, so the only concern here is that we're adding some new work for them to do this validation. I still think the explicit validation is better in this situation, but the counter-argument would be a user that knows for sure they're always properly providing sorted input and don't want the overhead of the validation. As for
I think it's so that |
Agreed! As we discussed above one way could be to sort based on user input that if the collection is sorted or not but since that will be for only one flavour it might be confusing. I'll try to benchmark this change to see if its causing any regression or not.
Ahh I see. That makes sense. Thanks @gsmiller ! |
@gsmiller I finally got a chance to run the benchmarks for this change. Below are the results(looks all good to me). Let me know what do you think? Thanks!
|
+1 -- looks like just noise to me. |
+1 looks good to me as well. I like that this small change, 1) makes the API a little more general, allowing users to provide any Iterable instead of Collection, and 2) adds an explicit check for expected ordering of the iterable using IAE instead of relying on asserts being enabled. Thanks for following up! |
…rted (#12427) Also update the API to accept more general Iterable<BytesRef> instead of Collection<BytesRef>
Something weird is happening. This commit is causing the following failure:
I am starting to investigate, but folks involved with this change might find the issue quicker.
|
@benwtrent is this an issue of the code only being partially rebuilt? The signature for UPDATE: It didn't repro for me locally. Have you tried a |
@gsmiller you are 100% correct 🤦 |
@benwtrent glad it was a simple fix. Sorry it created churn! |
See #12742 - it's a problem with our javac task settings. Running a clean is a workaround. I will provide a fix for this when I get a chance, it's not a critical issue (but it is a bug in how we currently set up task dependencies). |
Nice!....I like it how we discovered a build issue bug(which was hidden for sometime now I think) and fixed it. Thanks for solving @dweiss! |
…rted (apache#12427) Also update the API to accept more general Iterable<BytesRef> instead of Collection<BytesRef>
Description
StringsToAutomaton
previously calledDaciukMihovAutomatonBuilder
(renamed in this PR) should take List as input instead of Collection as it expects sorted input.Closes #12319