Revert back since tag for traits and fix docs.
yanboliang committed Nov 30, 2016
1 parent 019e5af commit 27b07ef
Showing 3 changed files with 14 additions and 4 deletions.
4 changes: 2 additions & 2 deletions docs/ml-features.md
@@ -1189,8 +1189,8 @@ that the number of buckets used will be smaller than this value, for example, if
distinct values of the input to create enough distinct quantiles.

NaN values:
- NaN values will be removed from the column when `QuantileDiscretizer` fitting. This will produce
- a `Bucketizer` model for making prediction and transformation. During the transformation, `Bucketizer`
+ NaN values will be removed from the column during `QuantileDiscretizer` fitting. This will produce
+ a `Bucketizer` model for making predictions. During the transformation, `Bucketizer`
will raise an error when it finds NaN values in the dataset, but the user can also choose to either
keep or remove NaN values within the dataset by setting `handleInvalid`. If the user chooses to keep
NaN values, they will be handled specially and placed into their own bucket, for example, if 4 buckets
@@ -49,13 +49,15 @@ private[feature] trait ChiSqSelectorParams extends Params
*
* @group param
*/
+ @Since("1.6.0")
final val numTopFeatures = new IntParam(this, "numTopFeatures",
"Number of features that selector will select, ordered by ascending p-value. If the" +
" number of features is < numTopFeatures, then this will select all features.",
ParamValidators.gtEq(1))
setDefault(numTopFeatures -> 50)

/** @group getParam */
+ @Since("1.6.0")
def getNumTopFeatures: Int = $(numTopFeatures)

/**
@@ -64,12 +66,14 @@ private[feature] trait ChiSqSelectorParams extends Params
* Default value is 0.1.
* @group param
*/
+ @Since("2.1.0")
final val percentile = new DoubleParam(this, "percentile",
"Percentile of features that selector will select, ordered by ascending p-value.",
ParamValidators.inRange(0, 1))
setDefault(percentile -> 0.1)

/** @group getParam */
+ @Since("2.1.0")
def getPercentile: Double = $(percentile)

/**
@@ -78,25 +82,29 @@ private[feature] trait ChiSqSelectorParams extends Params
* Default value is 0.05.
* @group param
*/
+ @Since("2.1.0")
final val fpr = new DoubleParam(this, "fpr", "The highest p-value for features to be kept.",
ParamValidators.inRange(0, 1))
setDefault(fpr -> 0.05)

/** @group getParam */
+ @Since("2.1.0")
def getFpr: Double = $(fpr)

/**
* The selector type of the ChisqSelector.
* Supported options: "numTopFeatures" (default), "percentile", "fpr".
* @group param
*/
+ @Since("2.1.0")
final val selectorType = new Param[String](this, "selectorType",
"The selector type of the ChisqSelector. " +
"Supported options: " + OldChiSqSelector.supportedSelectorTypes.mkString(", "),
ParamValidators.inArray[String](OldChiSqSelector.supportedSelectorTypes))
setDefault(selectorType -> OldChiSqSelector.NumTopFeatures)

/** @group getParam */
+ @Since("2.1.0")
def getSelectorType: String = $(selectorType)
}
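
For readers of this hunk, the three selector types documented in the trait can be modeled in a few lines of plain Scala. This is a minimal sketch, not Spark's implementation: `SelectorSketch` and `select` are hypothetical names, the input is a bare sequence of per-feature p-values, and both the strict `<` comparison for `fpr` and the truncating count for `percentile` are assumptions. Only the option names and the defaults (50, 0.1, 0.05) come from the diff.

```scala
// Simplified model of the selector types named in ChiSqSelectorParams.
// Hypothetical helper for illustration only; not part of Spark.
object SelectorSketch {

  /** Returns the indices of the kept features, in ascending index order. */
  def select(
      pValues: Seq[Double],
      selectorType: String = "numTopFeatures",
      numTopFeatures: Int = 50,
      percentile: Double = 0.1,
      fpr: Double = 0.05): Seq[Int] = {
    // Pair each p-value with its feature index, then order by ascending p-value.
    val ranked = pValues.zipWithIndex.sortWith(_._1 < _._1)

    val kept = selectorType match {
      // Keep the N best features; if there are fewer than N, keep them all.
      case "numTopFeatures" => ranked.take(numTopFeatures)
      // Keep a fraction of the features (assumption: the count is truncated).
      case "percentile" => ranked.take((pValues.size * percentile).toInt)
      // Keep features whose p-value is below the threshold (assumption: strict <).
      case "fpr" => ranked.filter(_._1 < fpr)
      case other =>
        throw new IllegalArgumentException(s"Unknown selector type: $other")
    }
    kept.map(_._2).sorted
  }
}
```

For example, `SelectorSketch.select(Seq(0.20, 0.01, 0.04, 0.50), numTopFeatures = 2)` keeps features 1 and 2, the two with the smallest p-values.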

@@ -73,13 +73,15 @@ private[feature] trait QuantileDiscretizerBase extends Params
* @group param
*/
// TODO: SPARK-18619 Make QuantileDiscretizer inherit from HasHandleInvalid.
+ @Since("2.1.0")
val handleInvalid: Param[String] = new Param[String](this, "handleInvalid", "how to handle " +
"invalid entries. Options are skip (filter out rows with invalid values), " +
"error (throw an error), or keep (keep invalid values in a special additional bucket).",
ParamValidators.inArray(Bucketizer.supportedHandleInvalids))
setDefault(handleInvalid, Bucketizer.ERROR_INVALID)

/** @group getParam */
+ @Since("2.1.0")
def getHandleInvalid: String = $(handleInvalid)

}
@@ -91,8 +93,8 @@ private[feature] trait QuantileDiscretizerBase extends Params
* are too few distinct values of the input to create enough distinct quantiles.
*
* NaN handling:
- * NaN values will be removed from the column when `QuantileDiscretizer` fitting. This will produce
- * a `Bucketizer` model for making prediction and transformation. During the transformation,
+ * NaN values will be removed from the column during `QuantileDiscretizer` fitting. This will
+ * produce a `Bucketizer` model for making predictions. During the transformation,
* `Bucketizer` will raise an error when it finds NaN values in the dataset, but the user can
* also choose to either keep or remove NaN values within the dataset by setting `handleInvalid`.
* If the user chooses to keep NaN values, they will be handled specially and placed into their own
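
The three `handleInvalid` options described in this hunk (`skip`, `error`, `keep`) can be illustrated with a small stand-alone model of the transformation step. This is a sketch under stated assumptions, not Spark's `Bucketizer`: `NaNBucketSketch` and `bucketize` are hypothetical names, `IllegalArgumentException` stands in for whatever error Spark raises, the lower-inclusive bucket rule is assumed, and placing NaN in a bucket whose index equals the number of regular buckets is also an assumption.

```scala
// Simplified model of NaN handling during bucketing.
// Hypothetical helper for illustration only; not part of Spark.
object NaNBucketSketch {

  /**
   * Maps each value to a bucket index, given ascending split points.
   * Rows dropped by "skip" are simply absent from the result.
   */
  def bucketize(
      values: Seq[Double],
      splits: Seq[Double],
      handleInvalid: String = "error"): Seq[Double] = {
    val numBuckets = splits.length - 1

    values.flatMap { v =>
      val bucket: Option[Double] =
        if (v.isNaN) {
          handleInvalid match {
            // Assumption: NaN goes to one extra bucket after the regular ones.
            case "keep" => Some(numBuckets.toDouble)
            // Filter the row out.
            case "skip" => None
            // Default behaviour: fail on the first NaN.
            case "error" =>
              throw new IllegalArgumentException("NaN value found in input")
            case other =>
              throw new IllegalArgumentException(s"Unknown option: $other")
          }
        } else {
          require(v >= splits.head && v <= splits.last,
            s"Value $v is outside the split range")
          // Bucket i covers [splits(i), splits(i + 1)); the last bucket
          // also includes its upper bound.
          Some(math.min(splits.lastIndexWhere(_ <= v), numBuckets - 1).toDouble)
        }
      bucket
    }
  }
}
```

With splits `Seq(Double.NegativeInfinity, 0.0, 10.0, Double.PositiveInfinity)` there are three regular buckets (0 to 2). Under `"keep"`, a NaN input lands in bucket 3 in this sketch, while `"skip"` drops the row and `"error"` throws.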