[python] Reapply the trained categorical columns when predicting #5246
This appears not to work when loading a saved model via [...]. @jameslamb pointed out that the params issue is likely #2613 (#4802).
Hi @johnpaulett. We've merged a PR that loads the parameters from the model file, so now you can access them:

    bst = lgb.Booster(model_file='model.txt')
    bst.params['categorical_feature']

Please let us know if you want to continue with this.
@jmoralez Wonderful -- let me look at rebasing and testing. I do think this would be valuable, as I currently maintain a fork of kserve's lgbserver docker image that side loads these features in.
Thanks! Please use merge commits instead of rebasing, though, for the reasons described in #5252 (comment).
Hi! I was wondering what the progress is on this PR and whether it's still on the roadmap. I'm running into the exact problem @johnpaulett described in the first post, and I'm not sure what a different workaround would look like if I want to keep the category dtypes and avoid a category-to-integer mapping.
It's been over 2 years since the last commit here and well over a year since the last comment. @johnpaulett I guess you are no longer interested in pursuing this. I'm closing this so others know that they can work on #5244 instead of waiting for this PR. Thanks very much for your interest in LightGBM. We'd be happy to have you come back and contribute in the future when you have time to work with us.
Fixes #5244. During prediction, force any columns that were categorical during training to dtype `category` again. Useful when hosted via kserve and the user is sending an HTTP JSON POST that will not natively get translated to a categorical column in the DataFrame.

Initially tried coding this change in `_data_from_pandas`, but elected to pull it into a separate method that is only called by `predict()`. I'm open to any feedback or suggestion on how to better implement this change.
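
For readers who need a workaround in the meantime, the idea described here can be sketched with plain pandas. This is not the code from this PR. The helper name `reapply_categorical_columns` and the `trained_categories` mapping are hypothetical, and the sketch assumes the category values seen during training were recorded somewhere alongside the model.

```python
import pandas as pd


def reapply_categorical_columns(df, categories_by_column):
    """Return a copy of df with the listed columns cast back to category dtype.

    categories_by_column is a hypothetical mapping from column name to the
    ordered list of category values seen during training.
    """
    out = df.copy()
    for column, categories in categories_by_column.items():
        if column in out.columns:
            # Pinning the category list keeps the integer codes consistent
            # with training; values not in the list become NaN.
            out[column] = pd.Categorical(out[column], categories=categories)
    return out


# A JSON payload arrives as plain strings, so pandas does not infer
# a categorical dtype for the "color" column.
payload = [{"color": "red", "size": 1.0}, {"color": "blue", "size": 2.5}]
request_df = pd.DataFrame(payload)

# Assumed to have been saved at training time (hypothetical values).
trained_categories = {"color": ["blue", "green", "red"]}

fixed_df = reapply_categorical_columns(request_df, trained_categories)
print(fixed_df.dtypes)
```

The cast is done on a copy so the caller's DataFrame is left untouched. `fixed_df` would then be passed to `predict()` in place of the raw request frame.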