Commit

Fixed stat.py doc test to work for Python versions printing nan or NaN.
jkbradley committed Aug 18, 2014
1 parent b20d90a commit 60c72d9
Showing 1 changed file with 12 additions and 10 deletions.
22 changes: 12 additions & 10 deletions python/pyspark/mllib/stat.py
@@ -118,16 +118,18 @@ def corr(x, y=None, method=None):
 >>> from linalg import Vectors
 >>> rdd = sc.parallelize([Vectors.dense([1, 0, 0, -2]), Vectors.dense([4, 5, 0, 3]),
 ... Vectors.dense([6, 7, 0, 8]), Vectors.dense([9, 0, 0, 1])])
->>> Statistics.corr(rdd)
-array([[ 1. , 0.05564149, NaN, 0.40047142],
-[ 0.05564149, 1. , NaN, 0.91359586],
-[ NaN, NaN, 1. , NaN],
-[ 0.40047142, 0.91359586, NaN, 1. ]])
->>> Statistics.corr(rdd, method="spearman")
-array([[ 1. , 0.10540926, NaN, 0.4 ],
-[ 0.10540926, 1. , NaN, 0.9486833 ],
-[ NaN, NaN, 1. , NaN],
-[ 0.4 , 0.9486833 , NaN, 1. ]])
+>>> pearsonCorr = Statistics.corr(rdd)
+>>> print str(pearsonCorr).replace('nan', 'NaN')
+[[ 1. 0.05564149 NaN 0.40047142]
+[ 0.05564149 1. NaN 0.91359586]
+[ NaN NaN 1. NaN]
+[ 0.40047142 0.91359586 NaN 1. ]]
+>>> spearmanCorr = Statistics.corr(rdd, method="spearman")
+>>> print str(spearmanCorr).replace('nan', 'NaN')
+[[ 1. 0.10540926 NaN 0.4 ]
+[ 0.10540926 1. NaN 0.9486833 ]
+[ NaN NaN 1. NaN]
+[ 0.4 0.9486833 NaN 1. ]]
 >>> try:
 ... Statistics.corr(rdd, "spearman")
 ... print "Method name as second argument without 'method=' shouldn't be allowed."
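
The doctest change above works by converting the result to a string and rewriting every `nan` as `NaN` before printing, so the expected output matches whichever spelling the interpreter uses. A minimal sketch of that idea, written for Python 3 with only the standard library (the diff itself uses Python 2 `print` syntax); the helper name `normalize_nan` is hypothetical and not part of pyspark:

```python
def normalize_nan(text):
    # Hypothetical helper: rewrite lowercase 'nan' as 'NaN'.
    # str.replace is case-sensitive, so text that already says 'NaN' is left alone.
    return text.replace('nan', 'NaN')

# A plain list stands in for one row of the correlation matrix (assumption:
# the real doctest applies the same replace to str() of a NumPy array).
row = [1.0, float('nan'), 0.5]

print(normalize_nan(str(row)))  # prints [1.0, NaN, 0.5]
```

Note that a blanket `replace` would also rewrite `nan` inside any other word, which is harmless for purely numeric output such as these matrices.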
