speed for a single record #12
Comments
Any conclusion here?
The benchmark results posted in the README.md are quite misleading: they claim the current JVM version is a few orders of magnitude faster than xgboost4j, and if you run the benchmark you will get similar numbers. However, if you dig deeper you will find that most of xgboost4j's time is spent creating the DMatrix object, which is not in sparse format (by default) and is huge: 100x100000. I believe that using a sparse matrix format would boost performance. I checked the benchmark with a DMatrix of size 80x100 (more suitable for my case), and xgboost4j performed better (30-40% faster).
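On the sparse-format point: a dense 100x100000 DMatrix stores every zero explicitly, while a CSR representation stores only the non-zero entries. A quick illustration with SciPy (a stand-in for the Java API used in the benchmark; the shape and density here are made up for the example):

```python
import numpy as np
from scipy import sparse

rows, cols = 100, 100_000
dense = np.zeros((rows, cols), dtype=np.float32)

# Sprinkle 10 non-zero features per row, mimicking a typical sparse dataset.
rng = np.random.default_rng(0)
for r in range(rows):
    dense[r, rng.choice(cols, size=10, replace=False)] = 1.0

csr = sparse.csr_matrix(dense)
dense_bytes = dense.nbytes  # 100 * 100_000 * 4 bytes = 40 MB
sparse_bytes = csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes
print(dense_bytes, sparse_bytes)
```

With only 1,000 non-zeros, the CSR form is a few kilobytes versus 40 MB dense, which is why DMatrix construction cost dominates a benchmark at this shape.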
I have made a benchmark of several of the available libraries, among them XGBoost4j and XGBoost-Predictor; you can take a look here if you are interested.
Did you know about dmlc/xgboost#1849 (comment)?
Apparently the current version of xgboost4j is quicker than this library for batch predictions.
Do you have a test that compares predicting a single new value rather than 200k values? As described in the linked xgboost issue, xgboost4j's API only supports batch mode. What about your library?
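For single-record scoring, the relevant difference is that a predictor-style library walks the trees directly for one feature vector instead of first packing it into a DMatrix. A minimal, hypothetical sketch of that idea in Python (the dict-based tree layout and the `score` helper are illustrative, not the real xgboost-predictor API):

```python
# Sketch of single-record scoring for a gradient-boosted tree ensemble:
# each tree is traversed once for one feature vector, no batch matrix needed.

def score(trees, x, base=0.5):
    """Sum the leaf values reached by feature vector `x` in each tree."""
    out = base
    for tree in trees:
        node = tree
        while "leaf" not in node:  # descend until a leaf node is reached
            branch = "yes" if x.get(node["split"], 0.0) < node["threshold"] else "no"
            node = node[branch]
        out += node["leaf"]
    return out

# Two toy stumps standing in for a trained ensemble.
trees = [
    {"split": 0, "threshold": 0.5, "yes": {"leaf": 0.1}, "no": {"leaf": -0.1}},
    {"split": 1, "threshold": 1.0, "yes": {"leaf": 0.2}, "no": {"leaf": 0.0}},
]
single_prediction = score(trees, {0: 0.3, 1: 0.5})  # one record, no batching
```

Per-record traversal like this has no fixed setup cost, which is why it can win for single predictions even if a batch API is faster at 200k rows.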