LUCENE-10312: Make stemming configurable on PersianAnalyzer #906
…mer-configurability
* @param stemExclusionSet a set of terms not to be stemmed
*/
public PersianAnalyzer(CharArraySet stopwords, CharArraySet stemExclusionSet) {
public PersianAnalyzer(
    CharArraySet stopwords, boolean useStemming, CharArraySet stemExclusionSet) {
How about making this three-args constructor private and keeping the two-args constructor public PersianAnalyzer(CharArraySet stopwords, CharArraySet stemExclusionSet), so that we keep the API changes to a minimum in the next major release?
I'd suppose that when stemExclusionSet is set, the useStemming flag is always set to true, so I think the three-args constructor can be internal-use only.
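The suggestion above (keep the two-args constructor public, make the three-args one internal) can be sketched roughly as follows. This is only an illustration of the delegation pattern, not the PR's code: SketchAnalyzer, usesStemming, isStopword and shouldStem are hypothetical names, and plain Set<String> stands in for Lucene's CharArraySet so the example compiles without Lucene on the classpath.

```java
import java.util.Set;

/** Hypothetical stand-in for an analyzer; this is NOT Lucene's PersianAnalyzer. */
final class SketchAnalyzer {
  private final Set<String> stopwords;
  private final boolean useStemming;
  private final Set<String> stemExclusionSet;

  /** Public two-args constructor: signature stays as before, stemming is always on. */
  public SketchAnalyzer(Set<String> stopwords, Set<String> stemExclusionSet) {
    this(stopwords, true, stemExclusionSet);
  }

  /** Internal three-args constructor: the flag is not part of the public API. */
  private SketchAnalyzer(
      Set<String> stopwords, boolean useStemming, Set<String> stemExclusionSet) {
    this.stopwords = Set.copyOf(stopwords);
    this.useStemming = useStemming;
    this.stemExclusionSet = Set.copyOf(stemExclusionSet);
  }

  /** True if stemming is enabled at all (hypothetical accessor). */
  boolean usesStemming() {
    return useStemming;
  }

  /** True if the term is in the stopword set (hypothetical helper). */
  boolean isStopword(String term) {
    return stopwords.contains(term);
  }

  /** A term is stemmed only if stemming is on and the term is not excluded. */
  boolean shouldStem(String term) {
    return useStemming && !stemExclusionSet.contains(term);
  }
}
```

With this shape, callers that pass a stem exclusion set keep compiling against the same public signature, and the boolean stays an implementation detail.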
I don't think we should make this configurable. Please let's not do this to the analyzers. Just add the stemmer in 10.x, done.
That also works for me. The TokenFilter will be available in 9.2, but you have to create an analyzer yourself to use it, and we delegate backward-compatibility (BWC) handling to the factories. I'll update this PR to just add a MIGRATE entry explaining how the analyzer has changed, and open a separate one against 9x to remove the stem filter from PersianAnalyzer.
I wonder if my first PR #904 would work?
Closing in favour of #904