Commit
use java dataset to wrap rdd api
goldmedal committed Sep 26, 2017
1 parent 350a93d commit 4040103
Showing 1 changed file: python/pyspark/sql/readwriter.py (4 additions, 1 deletion)
@@ -438,7 +438,10 @@ def func(iterator):
             keyed = path.mapPartitions(func)
             keyed._bypass_serializer = True
             jrdd = keyed._jrdd.map(self._spark._jvm.BytesToString())
-            return self._df(self._jreader.csv(jrdd))
+            jdataset = self._spark._ssql_ctx.createDataset(
+                jrdd.rdd(),
+                self._spark._sc._jvm.Encoders.STRING())
+            return self._df(self._jreader.csv(jdataset))
         else:
             raise TypeError("path can be only string, list or RDD")
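The change above stops passing a raw JavaRDD of strings to the JVM reader and instead wraps it in a typed `Dataset[String]` via `createDataset(..., Encoders.STRING())`, so the Java `DataFrameReader.csv(Dataset[String])` overload is used. Conceptually, the reader then parses a collection of CSV-formatted lines, one record per line. As a minimal sketch of that parsing step, with no Spark involved and purely illustrative data, Python's standard `csv` module does the same thing to an in-memory list of lines:

```python
import csv
import io

# Illustrative stand-in for a Dataset[String] of CSV lines
# (hypothetical sample data, not from the commit itself).
lines = ["name,age", "alice,30", "bob,25"]

# Parse the lines the way a CSV reader consumes a dataset of strings:
# the first record is the header, the rest are data rows.
reader = csv.reader(io.StringIO("\n".join(lines)))
header, *rows = list(reader)

print(header)  # ['name', 'age']
print(rows)    # [['alice', '30'], ['bob', '25']]
```

In PySpark terms, this is the path taken when `spark.read.csv(...)` is handed an RDD of strings rather than a file path, as the surrounding `isinstance` branches in `readwriter.py` suggest.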
