[SPARK-6657] [PYSPARK] Fix doc warnings
Fixed the following warnings in `make clean html` under `python/docs`:

~~~
/Users/meng/src/spark/python/pyspark/mllib/evaluation.py:docstring of pyspark.mllib.evaluation.RankingMetrics.ndcgAt:3: ERROR: Unexpected indentation.
/Users/meng/src/spark/python/pyspark/mllib/evaluation.py:docstring of pyspark.mllib.evaluation.RankingMetrics.ndcgAt:4: WARNING: Block quote ends without a blank line; unexpected unindent.
/Users/meng/src/spark/python/pyspark/mllib/fpm.py:docstring of pyspark.mllib.fpm.FPGrowth.train:3: ERROR: Unexpected indentation.
/Users/meng/src/spark/python/pyspark/mllib/fpm.py:docstring of pyspark.mllib.fpm.FPGrowth.train:4: WARNING: Block quote ends without a blank line; unexpected unindent.
/Users/meng/src/spark/python/pyspark/sql/__init__.py:docstring of pyspark.sql.DataFrame.replace:16: WARNING: Field list ends without a blank line; unexpected unindent.
/Users/meng/src/spark/python/pyspark/streaming/kafka.py:docstring of pyspark.streaming.kafka.KafkaUtils.createRDD:8: ERROR: Unexpected indentation.
/Users/meng/src/spark/python/pyspark/streaming/kafka.py:docstring of pyspark.streaming.kafka.KafkaUtils.createRDD:9: WARNING: Block quote ends without a blank line; unexpected unindent.
~~~
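
All of these boil down to the same reST issue: a docstring line that is indented unexpectedly, or a block/field list that runs into the surrounding prose without a blank line. A minimal before/after sketch (a hypothetical function, not part of this patch) of the pattern and its fix:

~~~
# Triggers "ERROR: Unexpected indentation" and "WARNING: Block quote ends
# without a blank line" when Sphinx builds the docstring:
def score_bad(xs):
    """Compute the mean score. The value is defined as:
        sum(xs) / len(xs),
    assuming xs is non-empty.
    """
    return sum(xs) / float(len(xs))

# Same content with the formula line kept flush with the surrounding prose,
# which builds cleanly:
def score_good(xs):
    """Compute the mean score. The value is defined as:
    sum(xs) / len(xs),
    assuming xs is non-empty.
    """
    return sum(xs) / float(len(xs))
~~~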

davies

Author: Xiangrui Meng <meng@databricks.com>

Closes apache#6221 from mengxr/SPARK-6657 and squashes the following commits:

e3f83fe [Xiangrui Meng] fix sql and streaming doc warnings
2b4371e [Xiangrui Meng] fix mllib python doc warnings
mengxr authored and jeanlyn committed May 28, 2015
1 parent 992d1c6 commit a3e3df1
Showing 4 changed files with 11 additions and 10 deletions.
5 changes: 2 additions & 3 deletions python/pyspark/mllib/evaluation.py
@@ -334,11 +334,10 @@ def ndcgAt(self, k):
         """
         Compute the average NDCG value of all the queries, truncated at ranking position k.
         The discounted cumulative gain at position k is computed as:
-            sum,,i=1,,^k^ (2^{relevance of ''i''th item}^ - 1) / log(i + 1),
+        sum,,i=1,,^k^ (2^{relevance of ''i''th item}^ - 1) / log(i + 1),
         and the NDCG is obtained by dividing the DCG value on the ground truth set.
         In the current implementation, the relevance value is binary.
-
-        If a query has an empty ground truth set, zero will be used as ndcg together with
+        If a query has an empty ground truth set, zero will be used as NDCG together with
         a log warning.
         """
         return self.call("ndcgAt", int(k))
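
For readers skimming the docstring above: with binary relevance, the quoted formula corresponds roughly to the following plain-Python sketch of NDCG@k (the names, helper structure, and log base are illustrative assumptions, not the MLlib implementation):

~~~
import math

def ndcg_at(predicted, ground_truth, k):
    """Binary-relevance NDCG@k, mirroring the docstring's formula."""
    relevant = set(ground_truth)
    if not relevant:
        return 0.0  # per the docstring: empty ground truth gives zero plus a warning
    # DCG of the predicted ranking, truncated at position k:
    # sum over i of (2^rel_i - 1) / log(i + 1), which is 1 / log(i + 1) for hits.
    dcg = sum(1.0 / math.log(i + 1)
              for i, item in enumerate(predicted[:k], start=1)
              if item in relevant)
    # Ideal DCG: the top min(k, |ground truth|) positions are all relevant.
    idcg = sum(1.0 / math.log(i + 1) for i in range(1, min(k, len(relevant)) + 1))
    return dcg / idcg

print(ndcg_at(["a", "b", "d"], ["a", "c", "d"], k=3))
~~~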
12 changes: 6 additions & 6 deletions python/pyspark/mllib/fpm.py
@@ -61,12 +61,12 @@ class FPGrowth(object):
     def train(cls, data, minSupport=0.3, numPartitions=-1):
         """
         Computes an FP-Growth model that contains frequent itemsets.
-        :param data: The input data set, each element
-          contains a transaction.
-        :param minSupport: The minimal support level
-          (default: `0.3`).
-        :param numPartitions: The number of partitions used by parallel
-          FP-growth (default: same as input data).
+
+        :param data: The input data set, each element contains a
+            transaction.
+        :param minSupport: The minimal support level (default: `0.3`).
+        :param numPartitions: The number of partitions used by
+            parallel FP-growth (default: same as input data).
         """
         model = callMLlibFunc("trainFPGrowthModel", data, float(minSupport), int(numPartitions))
         return FPGrowthModel(model)
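
For context, the reworded parameters are exercised roughly like this (the SparkContext setup and the toy transactions are assumptions for illustration, not part of the patch):

~~~
from pyspark import SparkContext
from pyspark.mllib.fpm import FPGrowth

sc = SparkContext(appName="fpgrowth-docstring-example")
# Each element of the input RDD is one transaction, i.e. a list of items.
transactions = sc.parallelize([
    ["a", "b", "c"],
    ["a", "b"],
    ["b", "c"],
])
model = FPGrowth.train(transactions, minSupport=0.6, numPartitions=2)
for itemset in model.freqItemsets().collect():
    print(itemset)
sc.stop()
~~~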
1 change: 1 addition & 0 deletions python/pyspark/sql/dataframe.py
@@ -943,6 +943,7 @@ def replace(self, to_replace, value, subset=None):
         Columns specified in subset that do not have matching data type are ignored.
         For example, if `value` is a string, and subset contains a non-string column,
         then the non-string column is simply ignored.
+
         >>> df4.replace(10, 20).show()
         +----+------+-----+
         | age|height| name|
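
The paragraph above about mismatched types in `subset` can be seen in a small, self-contained sketch (the SQLContext setup and sample rows are assumptions, not the df4 doctest fixture):

~~~
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="replace-docstring-example")
sqlContext = SQLContext(sc)
df = sqlContext.createDataFrame(
    [(10, 80, 'Alice'), (5, 75, 'Bob')], ['age', 'height', 'name'])
# `value` is a string, so the numeric `age` column listed in `subset`
# is simply ignored; only the string `name` column is replaced.
df.replace('Alice', 'Carol', subset=['age', 'name']).show()
sc.stop()
~~~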
3 changes: 2 additions & 1 deletion python/pyspark/streaming/kafka.py
@@ -132,11 +132,12 @@ def createRDD(sc, kafkaParams, offsetRanges, leaders={},
         .. note:: Experimental

         Create a RDD from Kafka using offset ranges for each topic and partition.
+
         :param sc: SparkContext object
         :param kafkaParams: Additional params for Kafka
         :param offsetRanges: list of offsetRange to specify topic:partition:[start, end) to consume
         :param leaders: Kafka brokers for each TopicAndPartition in offsetRanges. May be an empty
-            map, in which case leaders will be looked up on the driver.
+          map, in which case leaders will be looked up on the driver.
         :param keyDecoder: A function used to decode key (default is utf8_decoder)
         :param valueDecoder: A function used to decode value (default is utf8_decoder)
         :return: A RDD object
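
As a usage sketch of the parameters documented above (the broker address, topic, and offsets are assumptions, and running it also needs the spark-streaming-kafka artifact on the classpath):

~~~
from pyspark import SparkContext
from pyspark.streaming.kafka import KafkaUtils, OffsetRange

sc = SparkContext(appName="kafka-createRDD-example")
kafkaParams = {"metadata.broker.list": "localhost:9092"}
# One OffsetRange per topic/partition: consume offsets [0, 100) of partition 0.
offsetRanges = [OffsetRange("test-topic", 0, 0, 100)]
# `leaders` is omitted, so brokers are looked up on the driver,
# as the docstring describes.
rdd = KafkaUtils.createRDD(sc, kafkaParams, offsetRanges)
print(rdd.count())
sc.stop()
~~~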
