[AIR] Maintain dtype info in LightGBMPredictor #28673

Yard1 · 2022-09-21T17:52:31Z

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>

Why are these changes needed?

LightGBMPredictor currently always converts the input to NumPy and then back to a DataFrame, trying to infer dtypes in between. This inference is imprecise: it allows for an edge case where a Categorical column composed of integers is classified as an int column. The round trip also decreases performance. This PR keeps dtype information where possible by not converting to NumPy unnecessarily. The inference logic is still present for the tensor column case; I am not familiar enough with it to fix it here (if it needs fixing in the first place).
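The edge case can be reproduced with pandas alone. The following is a minimal sketch, not the code from this PR: the helper `round_trip_through_numpy` and the column names are hypothetical, and the snippet only illustrates that a DataFrame rebuilt from a NumPy array no longer carries the Categorical dtype.

```python
import pandas as pd


def round_trip_through_numpy(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: mimic a DataFrame -> NumPy -> DataFrame round trip."""
    # to_numpy() returns a plain ndarray, which cannot carry per-column
    # pandas dtypes such as Categorical.
    return pd.DataFrame(df.to_numpy(), columns=df.columns)


# Illustrative data: a Categorical column whose categories are integers.
df = pd.DataFrame(
    {
        "cat_feature": pd.Categorical([1, 2, 3, 1]),
        "num_feature": [10, 20, 30, 40],
    }
)

round_tripped = round_trip_through_numpy(df)

# The original frame still knows that cat_feature is categorical ...
print(df.dtypes)
# ... while the round-tripped frame has lost that information.
print(round_tripped.dtypes)
```

Passing the original DataFrame along unchanged avoids both the dtype loss and the extra copy, which is the approach the description above takes when the input is already a DataFrame.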

Related issue number

Closes #28619

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

krfricke

LGTM

[AIR] Maintain dtype info in GBDT Predictors

ac3a233

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>

Yard1 requested a review from krfricke September 21, 2022 17:52

Yard1 assigned krfricke Sep 21, 2022

Fix

3e4040c

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>

Yard1 changed the title ~~[AIR] Maintain dtype info in GBDT Predictors~~ [AIR] Maintain dtype info in LightGBMPredictor Sep 21, 2022

krfricke approved these changes Sep 22, 2022

krfricke merged commit b7f0346 into ray-project:master Sep 22, 2022

Yard1 deleted the fix_gbdt_categorical_prediction branch September 22, 2022 16:17
