More regex functions - find, substring, replace #1598

Merged: 7 commits, Nov 4, 2021

@sc1f sc1f commented Oct 28, 2021

This PR is built on top of #1596, but it can be merged first to show all the changes; it is a separate PR to make code review easier.

This PR adds more regex functions:

  • find(string, pattern, output_vector): writes the 0-indexed start and end indices of the first match of the first capturing group of pattern in string to output_vector. Returns true if a match was found and the indices were written to the output vector, and false otherwise. The values can then be read from the vector: var vec[2]; find("x", '(.*)', vec) ? vec[0] + vec[1] : null.
  • substring(string, start_idx, length?): returns the substring of string that starts at start_idx (0-indexed) and has the specified length. If length is not provided, returns the substring from start_idx to the end of the string. Returns null if the string or the indices are invalid.
  • replace(string, pattern, replacer): replaces the first match of pattern in string with replacer, and returns the resulting string, or the original string if no replacement was made.
  • replace_all(string, pattern, replacer): replaces all non-overlapping matches of pattern in string with replacer, and returns the resulting string, or the original string if no replacements were made.
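
To make the semantics listed above concrete, here is a minimal reference model in Python using the standard `re` module. It is only a sketch of the described behavior, not the C++ implementation in this PR. Several details are assumptions because the description does not specify them: the end index written by find is treated as exclusive (Python's convention), "invalid indices" is taken to mean out of bounds, and replacer is treated as literal text with no backreferences.

```python
import re
from typing import List, Optional


def find(string: str, pattern: str, output_vector: List[int]) -> bool:
    """Write start/end of the first capturing group's first match; return success."""
    m = re.search(pattern, string)
    # No match, no capturing group in the pattern, or the group did not participate.
    if m is None or m.re.groups < 1 or m.start(1) < 0:
        return False
    output_vector[0] = m.start(1)
    # Assumption: exclusive end index; the PR text does not say which it is.
    output_vector[1] = m.end(1)
    return True


def substring(string: Optional[str], start_idx: int,
              length: Optional[int] = None) -> Optional[str]:
    """Return the slice starting at start_idx, or None for invalid input."""
    # Assumption: "invalid" means a missing string or out-of-bounds indices.
    if string is None or start_idx < 0 or start_idx > len(string):
        return None
    if length is None:
        return string[start_idx:]
    if length < 0 or start_idx + length > len(string):
        return None
    return string[start_idx:start_idx + length]


def replace(string: str, pattern: str, replacer: str) -> str:
    """Replace only the first match; unchanged input if nothing matches."""
    # The lambda keeps replacer literal (an assumption about the real API).
    return re.sub(pattern, lambda _: replacer, string, count=1)


def replace_all(string: str, pattern: str, replacer: str) -> str:
    """Replace every non-overlapping match; unchanged input if nothing matches."""
    return re.sub(pattern, lambda _: replacer, string)


vec = [0, 0]
if find("order-1234", r"(\d+)", vec):
    print(vec)                                  # [6, 10]
    print(substring("order-1234", vec[0], 4))   # 1234
print(replace("a-b-c", "-", "+"))               # a+b-c
print(replace_all("a-b-c", "-", "+"))           # a+b+c
```

The output vector mirrors the var vec[2] pattern from the expression example: the function reports success through its return value and the caller reads the indices afterwards.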

This PR also renames fullmatch from #1596 to match_all to remain consistent with the replace/replace_all API.

With this suite of regex functions, data can be cleaned and converted from within Perspective's expression API: poorly formatted strings, dates that could not be parsed, and monetary values that cannot be converted to a float because of other string tokens can all be cleaned and properly converted.
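
As a rough illustration of the monetary-value case, the sketch below models the cleaning step in Python rather than in Perspective's expression language. The helper name money_to_float and the sample inputs are hypothetical. It strips every character that cannot be part of a number (the replace_all step) and then attempts the float conversion.

```python
import re
from typing import Optional


def money_to_float(raw: str) -> Optional[float]:
    """Strip currency symbols, separators and labels, then parse as float."""
    # Equivalent in spirit to replace_all(raw, '[^0-9.-]', '').
    cleaned = re.sub(r"[^0-9.\-]", "", raw)
    try:
        return float(cleaned)
    except ValueError:
        # Nothing numeric was left (for example "n/a").
        return None


print(money_to_float("$1,234.50 USD"))  # 1234.5
print(money_to_float("n/a"))            # None
```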

@sc1f sc1f force-pushed the more_regex_functions branch from 7ed7ffd to f4caa66 Compare October 28, 2021 23:08
@sc1f sc1f marked this pull request as ready for review October 29, 2021 15:33
@sc1f sc1f added C++ enhancement Feature requests or improvements labels Oct 29, 2021
@sc1f sc1f force-pushed the more_regex_functions branch from 7a5299d to 567f478 Compare October 29, 2021 17:53
@sc1f sc1f force-pushed the more_regex_functions branch from 3b35f4d to 55f5db7 Compare November 1, 2021 16:26
@texodus texodus left a comment

Looks good! Thanks for the PR!

@texodus texodus merged commit cf3a194 into master Nov 4, 2021
@texodus texodus deleted the more_regex_functions branch November 4, 2021 05:47