[SPARK-19900][CORE] Remove driver when relaunching. #17888

Closed · wants to merge 142 commits

Commits (142, changes from all commits)
2992b00
Remove driver when relaunching.
liyichao May 7, 2017
44baeb3
Add some.
liyichao May 7, 2017
49fa480
[SPARK-20518][CORE] Supplement the new blockidsuite unit tests
heary-cao May 7, 2017
b810cf0
[SPARK-20484][MLLIB] Add documentation to ALS code
danielyli May 7, 2017
01cf5b9
[SPARK-7481][BUILD] Add spark-hadoop-cloud module to pull in object s…
steveloughran May 7, 2017
b254568
[SPARK-20543][SPARKR][FOLLOWUP] Don't skip tests on AppVeyor
felixcheung May 7, 2017
128bcad
[MINOR][SQL][DOCS] Improve unix_timestamp's scaladoc (and typo hunting)
jaceklaskowski May 7, 2017
753d497
[SPARK-20550][SPARKR] R wrapper for Dataset.alias
zero323 May 7, 2017
11e4e90
[SPARK-16931][PYTHON][SQL] Add Python wrapper for bucketBy
zero323 May 8, 2017
43d7363
[SPARK-12297][SQL] Hive compatibility for Parquet Timestamps
squito May 8, 2017
c5389f6
[SPARK-20626][SPARKR] address date test warning with timezone on windows
felixcheung May 8, 2017
94ecc28
[SPARK-20380][SQL] Unable to set/unset table comment property using A…
sujith71955 May 8, 2017
1055955
[SPARKR][DOC] fix typo in vignettes
May 8, 2017
17fd9b1
[SPARK-20519][SQL][CORE] Modify to prevent some possible runtime exce…
10110346 May 8, 2017
abcdfb7
[SPARK-19956][CORE] Optimize a location order of blocks with topology…
ConeyLiu May 8, 2017
843aef1
[SPARK-20596][ML][TEST] Consolidate and improve ALS recommendAll test…
May 8, 2017
72ad486
[SPARK-20621][DEPLOY] Delete deprecated config parameter in 'spark-en…
ConeyLiu May 8, 2017
2a52bab
[SPARK-20605][CORE][YARN][MESOS] Deprecate not used AM and executor p…
jerryshao May 8, 2017
3723332
[SPARK-20661][SPARKR][TEST] SparkR tableNames() test fails
falaki May 8, 2017
80a8521
[SPARK-20661][SPARKR][TEST][FOLLOWUP] SparkR tableNames() test fails
felixcheung May 9, 2017
f9997d4
[SPARK-11968][MLLIB] Optimize MLLIB ALS recommendForAll
May 9, 2017
6212edb
[SPARK-20587][ML] Improve performance of ML ALS recommendForAll
May 9, 2017
577c20e
[SPARK-20615][ML][TEST] SparseVector.argmax throws IndexOutOfBoundsEx…
May 9, 2017
f1d527e
[SPARK-20606][ML] ML 2.2 QA: Remove deprecated methods for ML
yanboliang May 9, 2017
97cb700
[SPARK-20667][SQL][TESTS] Cleanup the cataloged metadata after comple…
gatorsmile May 9, 2017
ed89ab6
[SPARK-20311][SQL] Support aliases for table value functions
maropu May 9, 2017
e9be43f
[SPARK-20355] Add per application spark version on the history server…
May 9, 2017
6c1236a
[SPARK-20548][FLAKY-TEST] share one REPL instance among REPL test cases
cloud-fan May 9, 2017
37a4a61
[SPARK-20674][SQL] Support registering UserDefinedFunction as named UDF
rxin May 9, 2017
1d08a08
[SPARK-19876][BUILD] Move Trigger.java to java source hierarchy
srowen May 9, 2017
9d786d9
[SPARK-20627][PYSPARK] Drop the hadoop distribution name from the Pyt…
holdenk May 9, 2017
8083062
Revert "[SPARK-12297][SQL] Hive compatibility for Parquet Timestamps"
rxin May 9, 2017
dc8126e
Revert "[SPARK-20311][SQL] Support aliases for table value functions"
yhuai May 9, 2017
5f12faa
[SPARK-20373][SQL][SS] Batch queries with 'Dataset/DataFrame.withWate…
uncleGen May 9, 2017
510f150
[SPARK-17685][SQL] Make SortMergeJoinExec's currentVars is null when …
wangyum May 10, 2017
6f5e107
[SPARK-20590][SQL] Use Spark internal datasource if multiples are fou…
HyukjinKwon May 10, 2017
bbb810d
[SPARK-20686][SQL] PropagateEmptyRelation incorrectly handles aggrega…
JoshRosen May 10, 2017
1ebbc22
[SPARK-20670][ML] Simplify FPGrowth transform
YY-OnCall May 10, 2017
68f6527
[SPARK-20668][SQL] Modify ScalaUDF to handle nullability.
ueshin May 10, 2017
723574d
[SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsisten…
zero323 May 10, 2017
829d540
[SPARK-20630][WEB UI] Fixed column visibility in Executor Tab
ajbozarth May 10, 2017
be9d425
[SPARK-20637][CORE] Remove mention of old RDD classes from comments
michaelmior May 10, 2017
68b6184
[SPARK-20393][WEB UI] Strengthen Spark to prevent XSS vulnerabilities
n-marion May 10, 2017
b34705e
[SPARK-20688][SQL] correctly check analysis for scalar sub-queries
cloud-fan May 10, 2017
29f16b2
[SPARK-20678][SQL] Ndv for columns not in filter condition should als…
wzhfy May 10, 2017
3f1f81a
[MINOR][BUILD] Fix lint-java breaks.
ConeyLiu May 10, 2017
65ae665
[SPARK-19447] Remove remaining references to generated rows metric
ala May 10, 2017
cb107b4
[SPARK-20689][PYSPARK] python doctest leaking bucketed table
felixcheung May 10, 2017
73f4726
[SPARK-20685] Fix BatchPythonEvaluation bug in case of single UDF w/ …
JoshRosen May 10, 2017
8e806b7
Revert "[SPARK-20606][ML] ML 2.2 QA: Remove deprecated methods for…
yanboliang May 11, 2017
7ab5144
[SPARK-17029] make toJSON not go through rdd form but operate on data…
May 11, 2017
501fc78
[SPARK-20569][SQL] RuntimeReplaceable functions should not take extra…
cloud-fan May 11, 2017
10e528a
[SPARK-20311][SQL] Support aliases for table value functions
maropu May 11, 2017
bfc2745
[SPARK-20416][SQL] Print UDF names in EXPLAIN
maropu May 11, 2017
df7b47b
[SPARK-20600][SS] KafkaRelation should be pretty printed in web UI
jaceklaskowski May 11, 2017
2d26b95
[SPARK-20431][SQL] Specify a schema by using a DDL-formatted string
maropu May 11, 2017
e2a5fb9
[SPARK-20399][SQL] Add a config to fallback string literal parsing co…
viirya May 12, 2017
4518530
[SPARK-20665][SQL] "Bround" and "Round" function return NULL
10110346 May 12, 2017
2d4c69b
[SPARK-20718][SQL] FileSourceScanExec with different filter orders sh…
wzhfy May 12, 2017
6ab97e8
[SPARK-20704][SPARKR] change CRAN test to run single thread
felixcheung May 12, 2017
20e2ce4
[SPARK-20619][ML] StringIndexer supports multiple ways to order label
May 12, 2017
e28ba06
[SPARK-20639][SQL] Add single argument support for to_timestamp in SQ…
HyukjinKwon May 12, 2017
3ed6852
[SPARK-20554][BUILD] Remove usage of scala.language.reflectiveCalls
srowen May 12, 2017
ab7e19c
[SPARK-17424] Fix unsound substitution bug in ScalaReflection.
rdblue May 12, 2017
1f81a2a
[SPARK-20718][SQL][FOLLOWUP] Fix canonicalization for HiveTableScanExec
wzhfy May 12, 2017
c03f0b3
[SPARK-20710][SQL] Support aliases in CUBE/ROLLUP/GROUPING SETS
maropu May 12, 2017
35fe3c1
[SPARK-19951][SQL] Add string concatenate operator || to Spark SQL
maropu May 12, 2017
b07d424
[SPARK-20702][CORE] TaskContextImpl.markTaskCompleted should not hide…
zsxwing May 12, 2017
6b8fc88
[SPARK-20714][SS] Fix match error when watermark is set with timeout …
tdas May 12, 2017
130cc78
[SPARK-20594][SQL] The staging directory should be a child directory …
May 12, 2017
dce2aaf
[SPARK-20719][SQL] Support LIMIT ALL
gatorsmile May 12, 2017
6729b75
[SPARK-18772][SQL] Avoid unnecessary conversion try for special float…
HyukjinKwon May 13, 2017
826ba2f
[SPARK-20725][SQL] partial aggregate should behave correctly for same…
cloud-fan May 13, 2017
ec5741c
[DOCS][SPARKR] Use verbose names for family annotations in functions.R
zero323 May 14, 2017
2d735ea
[SPARK-20726][SPARKR] wrapper for SQL broadcast
zero323 May 14, 2017
b2b9f88
[SPARK-20705][WEB-UI] The sort function can not be used in the master…
May 15, 2017
f1f0c3a
[SPARK-20720][WEB-UI] 'Executor Summary' should show the exact number,…
May 15, 2017
d0b14cf
[SPARK-20730][SQL] Add an optimizer rule to combine nested Concat
maropu May 15, 2017
b05691d
[SPARK-20669][ML] LoR.family and LDA.optimizer should be case insensi…
zhengruifeng May 15, 2017
97b10f6
[SPARK-12837][SPARK-20666][CORE][FOLLOWUP] getting name should not fa…
cloud-fan May 15, 2017
c38654a
[SPARK-20716][SS] StateStore.abort() should not throw exceptions
tdas May 15, 2017
baac25b
[SPARK-17729][SQL] Enable creating hive bucketed tables
tejasapatil May 15, 2017
0df8e31
[SPARK-20717][SS] Minor tweaks to the MapGroupsWithState behavior
tdas May 15, 2017
eebadc2
[SPARK-20735][SQL][TEST] Enable cross join in TPCDSQueryBenchmark
dongjoon-hyun May 15, 2017
4474326
[SPARK-20588][SQL] Cache TimeZone instances.
ueshin May 15, 2017
703a9b4
[SPARK-20707][ML] ML deprecated APIs should be removed in major release.
yanboliang May 16, 2017
d649dfe
[SPARK-20501][ML] ML 2.2 QA: New Scala APIs, docs
yanboliang May 16, 2017
092dfcb
[SPARK-20553][ML][PYSPARK] Update ALS examples with recommend-all met…
May 16, 2017
7716bcc
[SPARK-20677][MLLIB][ML] Follow-up to ALS recommend-all performance PRs
May 16, 2017
a61e607
[SPARK-20529][CORE] Allow worker and master work with a proxy server
zsxwing May 16, 2017
0d028ce
[SPARK-19372][SQL] Fix throwing a Java exception at df.filter() due t…
kiszk May 16, 2017
b9b7c34
[SPARK-20140][DSTREAM] Remove hardcoded kinesis retry wait and max re…
yashs360 May 16, 2017
2cb8c3d
[SQL][TRIVIAL] Lower parser log level to debug
hvanhovell May 16, 2017
d2f4887
[SPARK-20690][SQL] Subqueries in FROM should have alias names
viirya May 17, 2017
f0862f5
[SPARK-20776] Fix perf. problems in JobProgressListener caused by Tas…
JoshRosen May 17, 2017
6b8cf0b
[SPARK-20769][DOC] Incorrect documentation for using Jupyter notebook
aray May 17, 2017
94b7ff5
[SPARK-20788][CORE] Fix the Executor task reaper's false alarm warnin…
zsxwing May 17, 2017
bd86d40
[SPARK-13747][CORE] Add ThreadUtils.awaitReady and disallow Await.ready
zsxwing May 18, 2017
9a8b9b5
[SPARK-20505][ML] Add docs and examples for ml.stat.Correlation and m…
yanboliang May 18, 2017
d0d4ed4
[SPARK-20700][SQL] InferFiltersFromConstraints stackoverflows for que…
jiangxb1987 May 18, 2017
d7ec85c
[SPARK-20779][EXAMPLES] The ASF header placed in an incorrect locatio…
May 18, 2017
5a68da4
[SPARK-20796] the location of start-master.sh in spark-standalone.md …
liu-zhaokun May 18, 2017
5f8612e
[SPARK-20364][SQL] Disable Parquet predicate pushdown for fields havi…
HyukjinKwon May 18, 2017
91fb054
[DSTREAM][DOC] Add documentation for kinesis retry configurations
yashs360 May 18, 2017
53a85d9
[SPARK-20798] GenerateUnsafeProjection should check if a value is nul…
ala May 19, 2017
4edcfe9
[SPARK-20773][SQL] ParquetWriteSupport.writeFields is quadratic in nu…
tpoterba May 19, 2017
251a3b8
[SPARK-20607][CORE] Add new unit tests to ShuffleSuite
heary-cao May 19, 2017
325bc75
[SPARK-20759] SCALA_VERSION in _config.yml should be consistent with …
liu-zhaokun May 19, 2017
0ed0d06
[SPARK-20751][SQL] Add built-in SQL Function - COT
wangyum May 19, 2017
e48aec2
[SPARK-20763][SQL] The function of `month` and `day` return the value…
10110346 May 19, 2017
5ff5a06
[SPARKR][DOCS][MINOR] Use consistent names in rollup and cube examples
zero323 May 19, 2017
73f6524
[SPARKR] Fix bad examples in DataFrame methods and style issues
May 19, 2017
83e0e96
[SPARK-20506][DOCS] 2.2 migration guide
May 19, 2017
4ae1608
[SPARK-20781] the location of Dockerfile in docker.properties.templat…
liu-zhaokun May 19, 2017
d139e84
[SPARK-20806][DEPLOY] Launcher: redundant check for Spark lib dir
srowen May 20, 2017
12e46c7
[SPARK-20792][SS] Support same timeout operations in mapGroupsWithSta…
tdas May 21, 2017
a4e0b15
[SPARK-20736][PYTHON] PySpark StringIndexer supports StringOrderType
May 21, 2017
2d8d293
[SPARK-20786][SQL] Improve ceil and floor handle the value which is n…
heary-cao May 22, 2017
168b00a
[SPARK-20770][SQL] Improve ColumnStats
kiszk May 22, 2017
ada1717
[SPARK-19089][SQL] Add support for nested sequences
michalsenkyr May 22, 2017
4d15095
[SPARK-20687][MLLIB] mllib.Matrices.fromBreeze may crash when convert…
ghoto May 22, 2017
7ca6de5
[SPARK-20506][DOCS] Add HTML links to highlight list in MLlib guide f…
May 22, 2017
b87ef70
[SPARK-20591][WEB UI] Succeeded tasks num not equal in all jobs page …
fjh100456 May 22, 2017
c8ea14f
[SPARK-20609][CORE] Run the SortShuffleSuite unit tests have residual…
heary-cao May 22, 2017
fb718ec
[SPARK-20813][WEB UI] Fixed Web UI executor page tab search by status…
May 22, 2017
506195a
[SPARK-20801] Record accurate size of blocks in MapStatus when it's a…
May 22, 2017
f03d8ed
[SPARK-20831][SQL] Fix INSERT OVERWRITE data source tables with IF NO…
gatorsmile May 22, 2017
a50b59c
[SPARK-20764][ML][PYSPARK] Fix visibility discrepancy with numInstanc…
May 22, 2017
321fd5b
[SPARK-20756][YARN] yarn-shuffle jar references unshaded guava
markgrover May 22, 2017
cc76ed6
[SPARK-15767][ML][SPARKR] Decision Tree wrapper in SparkR
zhengruifeng May 22, 2017
7b5a638
[SPARK-20814][MESOS] Restore support for spark.executor.extraClassPath.
May 22, 2017
02c0df1
[SPARK-20751][SQL][FOLLOWUP] Add cot test in MathExpressionsSuite
wangyum May 22, 2017
e3fd39c
[SPARK-17410][SPARK-17284] Move Hive-generated Stats Info to HiveClie…
gatorsmile May 23, 2017
605f301
[SPARK-20815][SPARKR] NullPointerException in RPackageUtils#checkMani…
jrshust May 23, 2017
144dcac
[SPARK-20727] Skip tests that use Hadoop utils on CRAN Windows
shivaram May 23, 2017
c7fc0ad
[SPARK-20399][SQL][FOLLOW-UP] Add a config to fallback string literal…
viirya May 23, 2017
2eec959
[MINOR][SPARKR][ML] Joint coefficients with intercept for SparkR line…
yanboliang May 23, 2017
11fce17
[SPARK-20857][SQL] Generic resolved hint node
rxin May 23, 2017
903ef7e
[SPARK-15648][SQL] Add teradataDialect for JDBC connection to Teradata
May 23, 2017
ce4d7a9
Remove driver when relaunching.
liyichao May 7, 2017
6b65755
Add some.
liyichao May 7, 2017
fab6856
Add a test case.
liyichao May 24, 2017
10 changes: 5 additions & 5 deletions LICENSE
@@ -249,11 +249,11 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(Interpreter classes (all .scala files in repl/src/main/scala
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala),
and for SerializableMapWrapper in JavaUtils.scala)
- (BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.7 - http://www.scala-lang.org/)
- (BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.7 - http://www.scala-lang.org/)
- (BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.7 - http://www.scala-lang.org/)
- (BSD-like) Scala Library (org.scala-lang:scala-library:2.11.7 - http://www.scala-lang.org/)
- (BSD-like) Scalap (org.scala-lang:scalap:2.11.7 - http://www.scala-lang.org/)
+ (BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.8 - http://www.scala-lang.org/)
+ (BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.8 - http://www.scala-lang.org/)
+ (BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.8 - http://www.scala-lang.org/)
+ (BSD-like) Scala Library (org.scala-lang:scala-library:2.11.8 - http://www.scala-lang.org/)
+ (BSD-like) Scalap (org.scala-lang:scalap:2.11.8 - http://www.scala-lang.org/)
(BSD-style) scalacheck (org.scalacheck:scalacheck_2.11:1.10.0 - http://www.scalacheck.org)
(BSD-style) spire (org.spire-math:spire_2.11:0.7.1 - http://spire-math.org)
(BSD-style) spire-macros (org.spire-math:spire-macros_2.11:0.7.1 - http://spire-math.org)
6 changes: 6 additions & 0 deletions R/pkg/NAMESPACE
@@ -63,6 +63,7 @@ exportMethods("glm",
"spark.als",
"spark.kstest",
"spark.logit",
"spark.decisionTree",
"spark.randomForest",
"spark.gbt",
"spark.bisectingKmeans",
@@ -84,6 +85,7 @@ exportClasses("SparkDataFrame")
exportMethods("arrange",
"as.data.frame",
"attach",
"broadcast",
"cache",
"checkpoint",
"coalesce",
@@ -413,6 +415,8 @@ export("as.DataFrame",
"print.summary.GeneralizedLinearRegressionModel",
"read.ml",
"print.summary.KSTest",
"print.summary.DecisionTreeRegressionModel",
"print.summary.DecisionTreeClassificationModel",
"print.summary.RandomForestRegressionModel",
"print.summary.RandomForestClassificationModel",
"print.summary.GBTRegressionModel",
@@ -451,6 +455,8 @@ S3method(print, structField)
S3method(print, structType)
S3method(print, summary.GeneralizedLinearRegressionModel)
S3method(print, summary.KSTest)
+ S3method(print, summary.DecisionTreeRegressionModel)
+ S3method(print, summary.DecisionTreeClassificationModel)
S3method(print, summary.RandomForestRegressionModel)
S3method(print, summary.RandomForestClassificationModel)
S3method(print, summary.GBTRegressionModel)
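For orientation, here is a minimal sketch of how the newly exported spark.decisionTree wrapper and its S3 print methods fit together. It assumes a running SparkR session; the argument names (type, maxDepth) follow the SPARK-15767 commit in the list above and should be treated as assumptions:

    # Hedged sketch: train and inspect a decision tree via the new wrapper
    df <- createDataFrame(iris)
    model <- spark.decisionTree(df, Species ~ Petal_Length + Petal_Width,
                                type = "classification", maxDepth = 5)
    summary(model)            # auto-printing dispatches print.summary.DecisionTreeClassificationModel
    head(predict(model, df))  # predictions come back as a SparkDataFrame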
75 changes: 65 additions & 10 deletions R/pkg/R/DataFrame.R
@@ -549,7 +549,7 @@ setMethod("registerTempTable",
#' sparkR.session()
#' df <- read.df(path, "parquet")
#' df2 <- read.df(path2, "parquet")
- #' createOrReplaceTempView(df, "table1")
+ #' saveAsTable(df, "table1")
#' insertInto(df2, "table1", overwrite = TRUE)
#'}
#' @note insertInto since 1.4.0
@@ -1125,7 +1125,8 @@ setMethod("dim",
#' path <- "path/to/file.json"
#' df <- read.json(path)
#' collected <- collect(df)
- #' firstName <- collected[[1]]$name
+ #' class(collected)
+ #' firstName <- names(collected)[1]
#' }
#' @note collect since 1.4.0
setMethod("collect",
@@ -2814,7 +2815,7 @@ setMethod("except",
#' path <- "path/to/file.json"
#' df <- read.json(path)
#' write.df(df, "myfile", "parquet", "overwrite")
- #' saveDF(df, parquetPath2, "parquet", mode = saveMode, mergeSchema = mergeSchema)
+ #' saveDF(df, parquetPath2, "parquet", mode = "append", mergeSchema = TRUE)
#' }
#' @note write.df since 1.4.0
setMethod("write.df",
@@ -3097,8 +3098,8 @@ setMethod("fillna",
#' @family SparkDataFrame functions
#' @aliases as.data.frame,SparkDataFrame-method
#' @rdname as.data.frame
- #' @examples \dontrun{
- #'
+ #' @examples
+ #' \dontrun{
#' irisDF <- createDataFrame(iris)
#' df <- as.data.frame(irisDF[irisDF$Species == "setosa", ])
#' }
@@ -3175,7 +3176,8 @@ setMethod("with",
#' @aliases str,SparkDataFrame-method
#' @family SparkDataFrame functions
#' @param object a SparkDataFrame
- #' @examples \dontrun{
+ #' @examples
+ #' \dontrun{
#' # Create a SparkDataFrame from the Iris dataset
#' irisDF <- createDataFrame(iris)
#'
@@ -3667,8 +3669,8 @@ setMethod("checkpoint",
#' mean(cube(df, "cyl", "gear", "am"), "mpg")
#'
#' # Following calls are equivalent
- #' agg(cube(carsDF), mean(carsDF$mpg))
- #' agg(carsDF, mean(carsDF$mpg))
+ #' agg(cube(df), mean(df$mpg))
+ #' agg(df, mean(df$mpg))
#' }
#' @note cube since 2.3.0
#' @seealso \link{agg}, \link{groupBy}, \link{rollup}
@@ -3702,8 +3704,8 @@ setMethod("cube",
#' mean(rollup(df, "cyl", "gear", "am"), "mpg")
#'
#' # Following calls are equivalent
- #' agg(rollup(carsDF), mean(carsDF$mpg))
- #' agg(carsDF, mean(carsDF$mpg))
+ #' agg(rollup(df), mean(df$mpg))
+ #' agg(df, mean(df$mpg))
#' }
#' @note rollup since 2.3.0
#' @seealso \link{agg}, \link{cube}, \link{groupBy}
@@ -3745,3 +3747,56 @@ setMethod("hint",
            jdf <- callJMethod(x@sdf, "hint", name, parameters)
            dataFrame(jdf)
          })
+
+ #' alias
+ #'
+ #' @aliases alias,SparkDataFrame-method
+ #' @family SparkDataFrame functions
+ #' @rdname alias
+ #' @name alias
+ #' @export
+ #' @examples
+ #' \dontrun{
+ #' df <- alias(createDataFrame(mtcars), "mtcars")
+ #' avg_mpg <- alias(agg(groupBy(df, df$cyl), avg(df$mpg)), "avg_mpg")
+ #'
+ #' head(select(df, column("mtcars.mpg")))
+ #' head(join(df, avg_mpg, column("mtcars.cyl") == column("avg_mpg.cyl")))
+ #' }
+ #' @note alias(SparkDataFrame) since 2.3.0
+ setMethod("alias",
+           signature(object = "SparkDataFrame"),
+           function(object, data) {
+             stopifnot(is.character(data))
+             sdf <- callJMethod(object@sdf, "alias", data)
+             dataFrame(sdf)
+           })
+
+ #' broadcast
+ #'
+ #' Return a new SparkDataFrame marked as small enough for use in broadcast joins.
+ #'
+ #' Equivalent to \code{hint(x, "broadcast")}.
+ #'
+ #' @param x a SparkDataFrame.
+ #' @return a SparkDataFrame.
+ #'
+ #' @aliases broadcast,SparkDataFrame-method
+ #' @family SparkDataFrame functions
+ #' @rdname broadcast
+ #' @name broadcast
+ #' @export
+ #' @examples
+ #' \dontrun{
+ #' df <- createDataFrame(mtcars)
+ #' avg_mpg <- mean(groupBy(createDataFrame(mtcars), "cyl"), "mpg")
+ #'
+ #' head(join(df, broadcast(avg_mpg), df$cyl == avg_mpg$cyl))
+ #' }
+ #' @note broadcast since 2.3.0
+ setMethod("broadcast",
+           signature(x = "SparkDataFrame"),
+           function(x) {
+             sdf <- callJStatic("org.apache.spark.sql.functions", "broadcast", x@sdf)
+             dataFrame(sdf)
+           })
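As a usage note, a minimal sketch exercising both methods added above, assuming a running SparkR session. The claim that both join forms carry the same broadcast hint follows from the "Equivalent to hint(x, \"broadcast\")" line in the docs, not from output shown here:

    # Self-join disambiguation via the new alias() for SparkDataFrame
    df <- createDataFrame(mtcars)
    l <- alias(df, "l")
    r <- alias(df, "r")
    head(join(l, r, column("l.cyl") == column("r.cyl")))

    # broadcast(x) should behave like hint(x, "broadcast")
    small <- agg(groupBy(df, "cyl"), avg_mpg = avg(df$mpg))
    j1 <- join(df, broadcast(small), df$cyl == small$cyl)
    j2 <- join(df, hint(small, "broadcast"), df$cyl == small$cyl)
    explain(j1)  # expect a broadcast exchange in both plans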
3 changes: 2 additions & 1 deletion R/pkg/R/WindowSpec.R
@@ -203,7 +203,8 @@ setMethod("rangeBetween",
#' @aliases over,Column,WindowSpec-method
#' @family colum_func
#' @export
- #' @examples \dontrun{
+ #' @examples
+ #' \dontrun{
#' df <- createDataFrame(mtcars)
#'
#' # Partition by am (transmission) and order by hp (horsepower)
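The over() example above is cut off by the collapsed diff; the original text stays truncated, but a hedged completion in the same spirit might look like this (windowPartitionBy, orderBy, and rank per the public SparkR window API):

    # Hypothetical completion, not the original example:
    df <- createDataFrame(mtcars)
    ws <- orderBy(windowPartitionBy("am"), "hp")
    head(select(df, over(rank(), ws), df$hp, df$am))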
20 changes: 11 additions & 9 deletions R/pkg/R/column.R
@@ -130,19 +130,20 @@ createMethods <- function() {

createMethods()

- #' alias
- #'
- #' Set a new name for a column
- #'
- #' @param object Column to rename
- #' @param data new name to use
- #'
#' @rdname alias
#' @name alias
#' @aliases alias,Column-method
#' @family colum_func
#' @export
- #' @note alias since 1.4.0
+ #' @examples
+ #' \dontrun{
+ #' df <- createDataFrame(iris)
+ #'
+ #' head(select(
+ #'   df, alias(df$Sepal_Length, "slength"), alias(df$Petal_Length, "plength")
+ #' ))
+ #' }
+ #' @note alias(Column) since 1.4.0
setMethod("alias",
signature(object = "Column"),
function(object, data) {
@@ -244,7 +245,8 @@ setMethod("between", signature(x = "Column"),
#' @family colum_func
#' @aliases cast,Column-method
#'
- #' @examples \dontrun{
+ #' @examples
+ #' \dontrun{
#' cast(df$age, "string")
#' }
#' @note cast since 1.4.0
4 changes: 2 additions & 2 deletions R/pkg/R/context.R
@@ -258,15 +258,15 @@ includePackage <- function(sc, pkg) {
#'
#' # Large Matrix object that we want to broadcast
#' randomMat <- matrix(nrow=100, ncol=10, data=rnorm(1000))
- #' randomMatBr <- broadcast(sc, randomMat)
+ #' randomMatBr <- broadcastRDD(sc, randomMat)
#'
#' # Use the broadcast variable inside the function
#' useBroadcast <- function(x) {
#' sum(value(randomMatBr) * x)
#' }
#' sumRDD <- lapply(rdd, useBroadcast)
#'}
- broadcast <- function(sc, object) {
+ broadcastRDD <- function(sc, object) {
objName <- as.character(substitute(object))
serializedObj <- serialize(object, connection = NULL)

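A note on the rename: the NAMESPACE diff above now exports broadcast as a SparkDataFrame method, so the RDD-level helper here is presumably renamed to broadcastRDD to free the name. It does not appear in NAMESPACE, so reaching it from user code would go through the internal-namespace operator; a hedged sketch:

    # Assumption: broadcastRDD stays internal, so it is only reachable via :::
    randomMat <- matrix(rnorm(1000), nrow = 100, ncol = 10)
    randomMatBr <- SparkR:::broadcastRDD(sc, randomMat)  # `sc` as in the docs above
    sum(value(randomMatBr))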