Understand performance of string comparisons #196

Open
nsmith- opened this issue Jul 20, 2023 · 0 comments
Labels
evaluator (Issues related to the evaluator), help wanted (Extra attention is needed)

Comments

nsmith- (Collaborator) commented Jul 20, 2023

For corrections that use string input fields extensively, string comparison could have a significant performance impact. The strings may well be relatively static (e.g., constant for an entire dataset or a subset of a dataset), in which case one would expect branch prediction to do a good job of amortizing the expense. Nevertheless, some profiling to understand the extent of the issue would be useful.
There are a few improvements we could make to reduce the amount of string comparison:

  • Add an API that lets the caller pre-create an integer token representing the string and pass that token as an argument to the Correction::evaluate call, which would then do a faster lookup internally
  • Project out the string dimension and return a reduced correction, as discussed in "Partially evaluated correction object" (#38)
  • Provide a context manager in which certain nodes of the correction's evaluation tree are frozen to pre-defined values (cf. @arizzi)
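
To make the first two options concrete, here is a minimal Python sketch. It is a toy model only: `ToyCategoryCorrection`, `token`, `evaluate_token`, and `project` are hypothetical names invented for illustration and are not part of the existing API. It shows a string-keyed lookup, an integer token resolved once up front, and a reduced callable with the string input fixed.

```python
from typing import Callable, Dict, List


class ToyCategoryCorrection:
    """Toy stand-in for a correction with one string and one float input.

    Hypothetical illustration only; this is not the library's API.
    """

    def __init__(self, table: Dict[str, Callable[[float], float]]):
        self._keys: List[str] = list(table)
        self._by_name = dict(table)
        # The position of a key in this list serves as its integer token.
        self._by_token = [table[key] for key in self._keys]

    def evaluate(self, name: str, x: float) -> float:
        # Baseline: the string is hashed and compared on every call.
        return self._by_name[name](x)

    def token(self, name: str) -> int:
        # One-time string lookup; raises ValueError for an unknown key.
        return self._keys.index(name)

    def evaluate_token(self, token: int, x: float) -> float:
        # Token path: plain integer indexing, no string comparison.
        return self._by_token[token](x)

    def project(self, name: str) -> Callable[[float], float]:
        # Partial evaluation: fix the string input and return the
        # reduced correction that depends on x only.
        return self._by_name[name]


corr = ToyCategoryCorrection({
    "nominal": lambda x: 1.0,
    "up": lambda x: 1.0 + 0.1 * x,
    "down": lambda x: 1.0 - 0.1 * x,
})

values = [0.0, 1.0, 2.0]

# String comparison on every evaluation.
baseline = [corr.evaluate("up", x) for x in values]

# Resolve the string once, then reuse the integer token.
up_token = corr.token("up")
via_token = [corr.evaluate_token(up_token, x) for x in values]

# Project out the string dimension once, then call the reduced object.
up_only = corr.project("up")
via_projection = [up_only(x) for x in values]
```

In both alternatives the string is resolved once per dataset rather than once per evaluation. Whether that matters in practice is exactly what the profiling would need to establish.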
nsmith- added the "help wanted" and "evaluator" labels on Jul 20, 2023