Revert back to Spark 2.3 #399
… made to decision tree pruning in Spark 2.4. If nodes are split, but both child nodes lead to the same prediction then the split is pruned away. This updates the test so this doesn't happen for feature 'b'
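The pruning rule described above can be illustrated with a minimal, self-contained sketch. This is not Spark's implementation: the `Node` class and `prune` function are hypothetical names chosen for illustration, and the sketch assumes a binary tree where every internal node has both children. It only shows the idea that a split is collapsed when both of its children are leaves with the same prediction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary decision tree node; a leaf has no children."""
    prediction: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def prune(node: Node) -> Node:
    """Collapse any split whose two children are leaves with equal predictions.

    Children are pruned first, so a collapse lower in the tree can make
    the parent split redundant as well.
    """
    if node.is_leaf:
        return node
    left = prune(node.left)
    right = prune(node.right)
    if left.is_leaf and right.is_leaf and left.prediction == right.prediction:
        # The split no longer changes the outcome: replace it with one leaf
        # carrying the prediction shared by both children.
        return Node(prediction=left.prediction)
    return Node(node.prediction, left, right)
```

Under this rule, a test that expects a split on a given feature only keeps that split if its two branches predict different values, which is presumably why the test data for feature 'b' had to change.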
Codecov Report
Coverage Diff

| | master | #399 | +/- |
|---|---|---|---|
| Coverage | 86.89% | 74.87% | -12.03% |
| Files | 337 | 337 | |
| Lines | 11076 | 11054 | -22 |
| Branches | 351 | 590 | +239 |
| Hits | 9625 | 8277 | -1348 |
| Misses | 1451 | 2777 | +1326 |
LGTM
Bug fixes:
- Ensure correct metrics despite model failures on some CV folds [#404](#404)
- Fix flaky `ModelInsight` tests [#395](#395)
- Avoid creating `SparseVector`s for LOCO [#377](#377)

New features / updates:
- Model combiner [#385](#399)
- Added new sample for HousingPrices [#365](#365)
- Test to verify that custom metrics appear in model insight metrics [#387](#387)
- Add `FeatureDistribution` to `SerializationFormat`s [#383](#383)
- Add metadata to `OpStandadrdScaler` to allow for descaling [#378](#378)
- Improve json serde error in `evalMetFromJson` [#380](#380)
- Track mean & standard deviation as metrics for numeric features and for text length of text features [#354](#354)
- Making model selectors robust to failing models [#372](#372)
- Use compact and compressed model json by default [#375](#375)
- Descale feature contribution for Linear Regression & Logistic Regression [#345](#345)

Dependency updates:
- Update tika version [#382](#382)
Curious to know why we are not ready for Spark 2.4? I didn't observe any issues.
The main suite of products that use TransmogrifAI at Salesforce requires Spark 2.3. Once they are ready to upgrade, we will move to 2.4.
* Revert "Revert back to Spark 2.3 (#399)" This reverts commit 95a77b1.
* Update to Spark 2.4.3 and XGBoost 0.90
* special double serializer fix
* fix serialization
* fix serialization
* docs
* fixed missng value for test
* meta fix
* Updated DecisionTreeNumericMapBucketizer test to deal with the change made to decision tree pruning in Spark 2.4. If nodes are split, but both child nodes lead to the same prediction then the split is pruned away. This updates the test so this doesn't happen for feature 'b'
* fix params meta test
* FIxed failing xgboost test
* ident
* cleanup
* added dataframe reader and writer extensions
* added const
* cherrypick fixes
* added xgboost params + update models to use public predict method
* blarg
* double ser test
* update mleap and spark testing base
* Update README.md
* type fix
* bump minor version
* Update Spark version in the README
* bump version
* Update build.gradle
* Update pom.xml
* set correct json4s version
* upgrade helloworld deps
* upgrade notebook deps on TMog and Spark
* bump to version 0.7.0 for Spark update
* align helloworld dependencies
* align helloworld dependencies
* get -> getOrElse with exception
* fix helloworld compilation
* Spark 2.4.5
* Spark 2.4.5
* Spark 2.4.5
* Update OpTitanicSimple.ipynb
* Update OpIris.ipynb
* Revert "Spark 2.4.5" This reverts commit b3c0a74.
* Revert "Spark 2.4.5" This reverts commit f4ab3fd.
* Revert "Spark 2.4.5" This reverts commit 50d9dfb.
* Revert "Update OpTitanicSimple.ipynb" This reverts commit 3417972.
* Revert "Update OpIris.ipynb" This reverts commit df38bcc.

Co-authored-by: Christopher Suchanek <cris.suchanek@gmail.com>
Co-authored-by: Kevin Moore <jauntbox@gmail.com>
Co-authored-by: Nico de Vos <njdevos@gmail.com>
* Revert "Revert back to Spark 2.3 (#399)" This reverts commit 95a77b1.
* Update to Spark 2.4.3 and XGBoost 0.90
* special double serializer fix
* fix serialization
* fix serialization
* docs
* fixed missng value for test
* meta fix
* Updated DecisionTreeNumericMapBucketizer test to deal with the change made to decision tree pruning in Spark 2.4. If nodes are split, but both child nodes lead to the same prediction then the split is pruned away. This updates the test so this doesn't happen for feature 'b'
* fix params meta test
* FIxed failing xgboost test
* ident
* cleanup
* added dataframe reader and writer extensions
* added const
* cherrypick fixes
* added xgboost params + update models to use public predict method
* blarg
* double ser test
* update mleap and spark testing base
* Update README.md
* type fix
* bump minor version
* Update Spark version in the README
* bump version
* Update build.gradle
* Update pom.xml
* set correct json4s version
* upgrade helloworld deps
* upgrade notebook deps on TMog and Spark
* bump to version 0.7.0 for Spark update
* align helloworld dependencies
* align helloworld dependencies
* get -> getOrElse with exception
* fix helloworld compilation
* style
* WIP release notes
* TMog version bump
* update release notes
* update release notes
* updates to changelog
* updates to changelog
* updates to changelog
* updates to changelog
* updates to changelog
* updates to changelog
* fix changelog
* fix changelog
* keep helloworld on 0.6.1 until release

Co-authored-by: Matthew Tovbin <tovbinm@users.noreply.github.com>
Co-authored-by: Matthew Tovbin <mtovbin@salesforce.com>
Co-authored-by: Christopher Suchanek <cris.suchanek@gmail.com>
Co-authored-by: Kevin Moore <kevinmoore@salesforce.com>
Co-authored-by: Matthew Tovbin <tovbinm@gmail.com>
Thanks for the contribution! Before we can merge this, we need @wsuchy to sign the Salesforce.com Contributor License Agreement.
Thanks for the contribution! It looks like @Jauntbox is an internal user so signing the CLA is not required. However, we need to confirm this.
Related issues
We are not ready for Spark 2.4 (#327)
Describe the proposed solution
Reverting to Spark 2.3 for now.
I will raise another PR with the Spark 2.4 changes so we can have it ready to go once needed.
Describe alternatives you've considered
N/A