Branch 2.2: Spark MLlib's output of many algorithms is not clear #19347

Closed. Wants to merge 455 commits.

Commits (455):
e936a96
[SPARK-20764][ML][PYSPARK][FOLLOWUP] Fix visibility discrepancy with …
May 24, 2017
1d10724
[SPARK-20631][FOLLOW-UP] Fix incorrect tests.
zero323 May 24, 2017
83aeac9
[SPARK-20862][MLLIB][PYTHON] Avoid passing float to ndarray.reshape i…
MrBago May 24, 2017
c59ad42
[SPARK-20848][SQL] Shutdown the pool after reading parquet files
viirya May 24, 2017
b7a2a16
[SPARK-20867][SQL] Move hints from Statistics into HintInfo class
rxin May 24, 2017
2405afc
[SPARK-20872][SQL] ShuffleExchange.nodeName should handle null coordi…
rednaxelafx May 25, 2017
ae65d30
[SPARK-16202][SQL][DOC] Follow-up to Correct The Description of Creat…
jaceklaskowski May 25, 2017
3f82d65
[SPARK-20403][SQL] Modify the instructions of some functions
10110346 May 25, 2017
e0aa239
[SPARK-20848][SQL][FOLLOW-UP] Shutdown the pool after reading parquet…
viirya May 25, 2017
b52a06d
[SPARK-20250][CORE] Improper OOM error when a task been killed while …
ConeyLiu May 25, 2017
8896c4e
[SPARK-19659] Fetch big blocks to disk when shuffle-read.
May 25, 2017
9cbf39f
[SPARK-19281][FOLLOWUP][ML] Minor fix for PySpark FPGrowth.
yanboliang May 25, 2017
e01f1f2
[SPARK-20768][PYSPARK][ML] Expose numPartitions (expert) param of PyS…
facaiy May 25, 2017
022a495
[SPARK-20741][SPARK SUBMIT] Added cleanup of JARs archive generated b…
liorregev May 25, 2017
5ae1c65
[SPARK-19707][SPARK-18922][TESTS][SQL][CORE] Fix test failures/the in…
HyukjinKwon May 25, 2017
7a21de9
[SPARK-20874][EXAMPLES] Add Structured Streaming Kafka Source to exam…
zsxwing May 25, 2017
289dd17
[SPARK-20888][SQL][DOCS] Document change of default setting of spark.…
May 26, 2017
fafe283
[SPARK-20868][CORE] UnsafeShuffleWriter should verify the position af…
cloud-fan May 26, 2017
f99456b
[SPARK-20393][WEBU UI] Strengthen Spark to prevent XSS vulnerabilities
n-marion May 10, 2017
92837ae
[SPARK-19372][SQL] Fix throwing a Java exception at df.fliter() due t…
kiszk May 16, 2017
2b59ed4
[SPARK-20844] Remove experimental from Structured Streaming APIs
marmbrus May 26, 2017
30922de
[SPARK-20694][DOCS][SQL] Document DataFrameWriter partitionBy, bucket…
zero323 May 26, 2017
fc799d7
[SPARK-10643][CORE] Make spark-submit download remote files to local …
loneknightpy May 26, 2017
39f7665
[SPARK-19659][CORE][FOLLOW-UP] Fetch big blocks to disk when shuffle-…
cloud-fan May 27, 2017
f2408bd
[SPARK-20843][CORE] Add a config to set driver terminate timeout
zsxwing May 27, 2017
25e87d8
[SPARK-20897][SQL] cached self-join should not fail
cloud-fan May 27, 2017
dc51be1
[SPARK-20908][SQL] Cache Manager: Hint should be ignored in plan matc…
gatorsmile May 28, 2017
26640a2
[SPARK-20907][TEST] Use testQuietly for test suites that generate lon…
kiszk May 29, 2017
3b79e4c
[SPARK-8184][SQL] Add additional function description for weekofyear
wangyum May 29, 2017
f6730a7
[SPARK-19968][SS] Use a cached instance of `KafkaProducer` instead of…
ScrapCodes May 30, 2017
5fdc7d8
[SPARK-20924][SQL] Unable to call the function registered in the not-…
gatorsmile May 30, 2017
287440d
[SPARK-20275][UI] Do not display "Completed" column for in-progress a…
jerryshao May 31, 2017
3cad66e
[SPARK-20877][SPARKR][WIP] add timestamps to test runs
felixcheung May 31, 2017
3686c2e
[SPARK-20790][MLLIB] Correctly handle negative values for implicit fe…
May 31, 2017
f59f9a3
[SPARK-20876][SQL][BACKPORT-2.2] If the input parameter is float type…
10110346 May 31, 2017
a607a26
[SPARK-20940][CORE] Replace IllegalAccessError with IllegalStateExcep…
zsxwing Jun 1, 2017
14fda6f
[SPARK-20244][CORE] Handle incorrect bytesRead metrics when using PyS…
jerryshao Jun 1, 2017
4ab7b82
[MINOR][SQL] Fix a few function description error.
wangyum Jun 1, 2017
6a4e023
[SPARK-20941][SQL] Fix SubqueryExec Reuse
gatorsmile Jun 1, 2017
b81a702
[SPARK-20365][YARN] Remove local scheme when add path to ClassPath.
liyichao Jun 1, 2017
4cba3b5
[SPARK-20922][CORE] Add whitelist of classes that can be deserialized…
Jun 1, 2017
bb3d900
[SPARK-20854][SQL] Extend hint syntax to support expressions
bogdanrdc Jun 1, 2017
25cc800
[SPARK-20942][WEB-UI] The title style about field is error in the his…
Jun 2, 2017
ae00d49
[SPARK-20967][SQL] SharedState.externalCatalog is not really lazy
cloud-fan Jun 2, 2017
f36c3ee
[SPARK-20946][SQL] simplify the config setting logic in SparkSession.…
cloud-fan Jun 2, 2017
7f35f5b
[SPARK-20955][CORE] Intern "executorId" to reduce the memory usage
zsxwing Jun 2, 2017
9a4a8e1
[SPARK-19236][SQL][BACKPORT-2.2] Added createOrReplaceGlobalTempView …
gatorsmile Jun 2, 2017
cc5dbd5
Preparing Spark release v2.2.0-rc3
pwendell Jun 2, 2017
0c42279
Preparing development version 2.2.0-SNAPSHOT
pwendell Jun 2, 2017
6c628e7
[MINOR][SQL] Update the description of spark.sql.files.ignoreCorruptF…
gatorsmile Jun 2, 2017
b560c97
Revert "[SPARK-20946][SQL] simplify the config setting logic in Spark…
yhuai Jun 2, 2017
377cfa8
Preparing Spark release v2.2.0-rc4
pwendell Jun 3, 2017
478874e
Preparing development version 2.2.1-SNAPSHOT
pwendell Jun 3, 2017
c8bbab6
[SPARK-20974][BUILD] we should run REPL tests if SQL module has code …
cloud-fan Jun 3, 2017
acd4481
[SPARK-20790][MLLIB] Remove extraneous logging in test
Jun 3, 2017
1388fdd
[SPARK-20926][SQL] Removing exposures to guava library caused by dire…
Jun 5, 2017
421d8ec
[SPARK-20957][SS][TESTS] Fix o.a.s.sql.streaming.StreamingQueryManage…
zsxwing Jun 5, 2017
3f93d07
[SPARK-20854][TESTS] Removing duplicate test case
bogdanrdc Jun 7, 2017
9a4341b
[MINOR][DOC] Update deprecation notes on Python/Hadoop/Scala.
dongjoon-hyun Jun 7, 2017
2f5eaa9
[SPARK-20914][DOCS] Javadoc contains code that is invalid
srowen Jun 8, 2017
02cf178
[SPARK-19185][DSTREAM] Make Kafka consumer cache configurable
markgrover Jun 8, 2017
3f6812c
[SPARK-20954][SQL][BRANCH-2.2][EXTENDED] DESCRIBE ` result should be …
dongjoon-hyun Jun 9, 2017
714153c
Fixed broken link
coreywoodfield Jun 9, 2017
869af5b
Fix bug in JavaRegressionMetricsExample.
masterwugui Jun 9, 2017
815a082
[SPARK-21042][SQL] Document Dataset.union is resolution by position
rxin Jun 10, 2017
0b0be47
[SPARK-20877][SPARKR] refactor tests to basic tests only for CRAN
felixcheung Jun 11, 2017
26003de
[SPARK-20877][SPARKR][FOLLOWUP] clean up after test move
felixcheung Jun 11, 2017
a4d78e4
[DOCS] Fix error: ambiguous reference to overloaded definition
ZiyueHuang Jun 12, 2017
e677394
[SPARK-21041][SQL] SparkSession.range should be consistent with Spark…
dongjoon-hyun Jun 12, 2017
92f7c8f
[SPARK-17914][SQL] Fix parsing of timestamp strings with nanoseconds
Jun 12, 2017
a6b7875
[SPARK-20345][SQL] Fix STS error handling logic on HiveSQLException
dongjoon-hyun Jun 12, 2017
580ecfd
[SPARK-21059][SQL] LikeSimplification can NPE on null pattern
rxin Jun 12, 2017
48a843b
[SPARK-21050][ML] Word2vec persistence overflow bug fix
jkbradley Jun 12, 2017
dae1a98
[TEST][SPARKR][CORE] Fix broken SparkSubmitSuite
felixcheung Jun 13, 2017
24836be
[SPARK-20920][SQL] ForkJoinPool pools are leaked when writing hive ta…
srowen Jun 13, 2017
039c465
[SPARK-21060][WEB-UI] Css style about paging function is error in the…
Jun 13, 2017
2bc2c15
[SPARK-21064][CORE][TEST] Fix the default value bug in NettyBlockTran…
Jun 13, 2017
220943d
[SPARK-20979][SS] Add RateSource to generate values for tests and ben…
zsxwing Jun 12, 2017
53212c3
[SPARK-12552][CORE] Correctly count the driver resource when recoveri…
jerryshao Jun 14, 2017
42cc830
[SPARK-20986][SQL] Reset table's statistics after PruneFileSourcePart…
lianhuiwang Jun 14, 2017
9bdc835
[SPARK-21085][SQL] Failed to read the partitioned table created by Sp…
gatorsmile Jun 14, 2017
6265119
[SPARK-20211][SQL][BACKPORT-2.2] Fix the Precision and Scale of Decim…
gatorsmile Jun 14, 2017
3dda682
[SPARK-21089][SQL] Fix DESC EXTENDED/FORMATTED to Show Table Properties
gatorsmile Jun 14, 2017
e02e063
Revert "[SPARK-20941][SQL] Fix SubqueryExec Reuse"
gatorsmile Jun 14, 2017
af4f89c
[SPARK-20980][SQL] Rename `wholeFile` to `multiLine` for both CSV and…
gatorsmile Jun 15, 2017
b5504f6
[SPARK-20980][DOCS] update doc to reflect multiLine change
felixcheung Jun 15, 2017
76ee41f
[SPARK-16251][SPARK-20200][CORE][TEST] Flaky test: org.apache.spark.r…
jiangxb1987 Jun 15, 2017
a585c87
[SPARK-21111][TEST][2.2] Fix the test failure of describe.sql
gatorsmile Jun 16, 2017
9909be3
[SPARK-21072][SQL] TreeNode.mapChildren should only apply to the chil…
ConeyLiu Jun 16, 2017
653e6f1
[SPARK-12552][FOLLOWUP] Fix flaky test for "o.a.s.deploy.master.Maste…
jerryshao Jun 16, 2017
d3deeb3
[MINOR][DOCS] Improve Running R Tests docs
wangyum Jun 16, 2017
8747f8e
[SPARK-21126] The configuration which named "spark.core.connection.au…
liu-zhaokun Jun 18, 2017
c0d4acc
[MINOR][R] Add knitr and rmarkdown packages/improve output for versio…
HyukjinKwon Jun 18, 2017
d3c79b7
[SPARK-21090][CORE] Optimize the unified memory manager code
10110346 Jun 19, 2017
fab070c
[SPARK-21132][SQL] DISTINCT modifier of function arguments should not…
gatorsmile Jun 19, 2017
f7fcdec
[SPARK-19688][STREAMING] Not to read `spark.yarn.credentials.file` fr…
Jun 19, 2017
7b50736
[SPARK-21123][DOCS][STRUCTURED STREAMING] Options for file stream sou…
Jun 19, 2017
e329bea
[MINOR][BUILD] Fix Java linter errors
dongjoon-hyun Jun 19, 2017
cf10fa8
[SPARK-21138][YARN] Cannot delete staging dir when the clusters of "s…
Jun 19, 2017
8bf7f1e
[SPARK-21133][CORE] Fix HighlyCompressedMapStatus#writeExternal throw…
wangyum Jun 20, 2017
514a7e6
[SPARK-20929][ML] LinearSVC should use its own threshold param
jkbradley Jun 20, 2017
b8b80f6
[SPARK-21150][SQL] Persistent view stored in Hive metastore should be…
cloud-fan Jun 20, 2017
62e442e
Preparing Spark release v2.2.0-rc5
pwendell Jun 20, 2017
e883498
Preparing development version 2.2.1-SNAPSHOT
pwendell Jun 20, 2017
529c04f
[MINOR][DOCS] Add lost <tr> tag for configuration.md
wangyum Jun 21, 2017
198e3a0
[SPARK-18016][SQL][CATALYST][BRANCH-2.2] Code Generation: Constant Po…
Jun 22, 2017
6ef7a5b
[SPARK-21167][SS] Decode the path generated by File sink to handle sp…
zsxwing Jun 22, 2017
d625734
[SQL][DOC] Fix documentation of lpad
actuaryzhang Jun 22, 2017
b99c0e9
Revert "[SPARK-18016][SQL][CATALYST][BRANCH-2.2] Code Generation: Con…
cloud-fan Jun 23, 2017
b6749ba
[SPARK-21165] [SQL] [2.2] Use executedPlan instead of analyzedPlan in…
gatorsmile Jun 23, 2017
9d29808
[SPARK-21144][SQL] Print a warning if the data schema and partition s…
maropu Jun 23, 2017
f160267
[SPARK-21181] Release byteBuffers to suppress netty error messages
dhruve Jun 23, 2017
3394b06
[MINOR][DOCS] Docs in DataFrameNaFunctions.scala use wrong method
ongmingyang Jun 23, 2017
a3088d2
[SPARK-20555][SQL] Fix mapping of Oracle DECIMAL types to Spark types…
Jun 24, 2017
96c04f1
[SPARK-21159][CORE] Don't try to connect to launcher in standalone cl…
Jun 24, 2017
ad44ab5
[SPARK-21203][SQL] Fix wrong results of insertion of Array of Struct
gatorsmile Jun 24, 2017
d8e3a4a
[SPARK-21079][SQL] Calculate total size of a partition table as a sum…
mbasmanova Jun 25, 2017
970f68c
[SPARK-19104][SQL] Lambda variables in ExternalMapToCatalyst should b…
viirya Jun 27, 2017
17a04b9
[SPARK-21210][DOC][ML] Javadoc 8 fixes for ML shared param traits
Jun 29, 2017
20cf511
[SPARK-21253][CORE] Fix a bug that StreamCallback may not be notified…
zsxwing Jun 30, 2017
8de67e3
[SPARK-21253][CORE] Disable spark.reducer.maxReqSizeShuffleToMem
zsxwing Jun 30, 2017
c6ba647
[SPARK-21176][WEB UI] Limit number of selector threads for admin ui p…
IngoSchuster Jun 30, 2017
d16e262
[SPARK-21253][CORE][HOTFIX] Fix Scala 2.10 build
zsxwing Jun 30, 2017
8b08fd0
[SPARK-21258][SQL] Fix WindowExec complex object aggregation with spi…
hvanhovell Jun 30, 2017
29a0be2
[SPARK-21129][SQL] Arguments of SQL function call should not be named…
gatorsmile Jun 30, 2017
a2c7b21
Preparing Spark release v2.2.0-rc6
pwendell Jun 30, 2017
85fddf4
Preparing development version 2.2.1-SNAPSHOT
pwendell Jun 30, 2017
6fd39ea
[SPARK-21170][CORE] Utils.tryWithSafeFinallyAndFailureCallbacks throw…
Jul 1, 2017
db21b67
[SPARK-20256][SQL] SessionState should be created more lazily
dongjoon-hyun Jul 4, 2017
770fd2a
[SPARK-21300][SQL] ExternalMapToCatalyst should null-check map key pr…
ueshin Jul 5, 2017
6e1081c
[SPARK-21312][SQL] correct offsetInBytes in UnsafeRow.writeToStream
Jul 6, 2017
4e53a4e
[SS][MINOR] Fix flaky test in DatastreamReaderWriterSuite. temp check…
tdas Jul 6, 2017
576fd4c
[SPARK-21267][SS][DOCS] Update Structured Streaming Documentation
tdas Jul 7, 2017
ab12848
[SPARK-21069][SS][DOCS] Add rate source to programming guide.
ScrapCodes Jul 8, 2017
7d0b1c9
[SPARK-21228][SQL][BRANCH-2.2] InSet incorrect handling of structs
bogdanrdc Jul 8, 2017
a64f108
[SPARK-21345][SQL][TEST][TEST-MAVEN] SparkSessionBuilderSuite should …
dongjoon-hyun Jul 8, 2017
c8d7855
[SPARK-20342][CORE] Update task accumulators before sending task end …
Jul 8, 2017
964332b
[SPARK-21343] Refine the document for spark.reducer.maxReqSizeShuffle…
Jul 8, 2017
3bfad9d
[SPARK-21083][SQL][BRANCH-2.2] Store zero size and row count when ana…
wzhfy Jul 9, 2017
40fd0ce
[SPARK-21342] Fix DownloadCallback to work well with RetryingBlockFet…
Jul 10, 2017
a05edf4
[SPARK-21272] SortMergeJoin LeftAnti does not update numOutputRows
juliuszsompolski Jul 10, 2017
edcd9fb
[SPARK-21369][CORE] Don't use Scala Tuple2 in common/network-*
zsxwing Jul 11, 2017
399aa01
[SPARK-21366][SQL][TEST] Add sql test for window functions
jiangxb1987 Jul 11, 2017
cb6fc89
[SPARK-21219][CORE] Task retry occurs on same executor due to race co…
Jul 12, 2017
39eba30
[SPARK-18646][REPL] Set parent classloader as null for ExecutorClassL…
taroplus Jul 13, 2017
cf0719b
Revert "[SPARK-18646][REPL] Set parent classloader as null for Execut…
cloud-fan Jul 13, 2017
bfe3ba8
[SPARK-21376][YARN] Fix yarn client token expire issue when cleaning …
jerryshao Jul 13, 2017
1cb4369
[SPARK-21344][SQL] BinaryType comparison does signed byte array compa…
kiszk Jul 15, 2017
8e85ce6
[SPARK-21267][DOCS][MINOR] Follow up to avoid referencing programming…
srowen Jul 15, 2017
0ef98fd
[SPARK-21321][SPARK CORE] Spark very verbose on shutdown
Jul 17, 2017
83bdb04
[SPARK-21332][SQL] Incorrect result type inferred for some decimal ex…
Jul 18, 2017
99ce551
[SPARK-21445] Make IntWrapper and LongWrapper in UTF8String Serializable
brkyvz Jul 18, 2017
df061fd
[SPARK-21457][SQL] ExternalCatalog.listPartitions should correctly ha…
cloud-fan Jul 18, 2017
5a0a76f
[SPARK-21414] Refine SlidingWindowFunctionFrame to avoid OOM.
Jul 19, 2017
4c212ee
[SPARK-21441][SQL] Incorrect Codegen in SortMergeJoinExec results fai…
DonnyZone Jul 19, 2017
86cd3c0
[SPARK-21464][SS] Minimize deprecation warnings caused by ProcessingT…
tdas Jul 19, 2017
308bce0
[SPARK-21446][SQL] Fix setAutoCommit never executed
DFFuture Jul 19, 2017
9949fed
[SPARK-21333][DOCS] Removed invalid joinTypes from javadoc of Dataset…
coreywoodfield Jul 19, 2017
88dccda
[SPARK-21243][CORE] Limit no. of map outputs in a shuffle fetch
dhruve Jul 21, 2017
da403b9
[SPARK-21434][PYTHON][DOCS] Add pyspark pip documentation.
holdenk Jul 21, 2017
62ca13d
[SPARK-20904][CORE] Don't report task failures to driver during shutd…
Jul 23, 2017
e5ec339
[SPARK-21383][YARN] Fix the YarnAllocator allocates more Resource
Jul 25, 2017
c91191b
[SPARK-21447][WEB UI] Spark history server fails to render compressed
Jul 25, 2017
1bfd1a8
[SPARK-21494][NETWORK] Use correct app id when authenticating to exte…
Jul 26, 2017
06b2ef0
[SPARK-21538][SQL] Attribute resolution inconsistency in the Dataset API
Jul 27, 2017
9379031
[SPARK-21306][ML] OneVsRest should support setWeightCol
facaiy Jul 28, 2017
df6cd35
[SPARK-21508][DOC] Fix example code provided in Spark Streaming Docum…
Jul 29, 2017
24a9bac
[SPARK-21555][SQL] RuntimeReplaceable should be compared semantically…
viirya Jul 29, 2017
66fa6bd
[SPARK-19451][SQL] rangeBetween method should accept Long value as bo…
jiangxb1987 Jul 29, 2017
e2062b9
Revert "[SPARK-19451][SQL] rangeBetween method should accept Long val…
gatorsmile Jul 30, 2017
1745434
[SPARK-21522][CORE] Fix flakiness in LauncherServerSuite.
Aug 1, 2017
79e5805
[SPARK-21593][DOCS] Fix 2 rendering errors on configuration page
srowen Aug 1, 2017
67c60d7
[SPARK-21339][CORE] spark-shell --packages option does not add jars t…
Aug 1, 2017
397f904
[SPARK-21597][SS] Fix a potential overflow issue in EventTimeStats
zsxwing Aug 2, 2017
467ee8d
[SPARK-21546][SS] dropDuplicates should ignore watermark when it's no…
zsxwing Aug 2, 2017
690f491
[SPARK-12717][PYTHON][BRANCH-2.2] Adding thread-safe broadcast pickle…
BryanCutler Aug 3, 2017
1bcfa2a
Fix Java SimpleApp spark application
christiam Aug 3, 2017
f9aae8e
[SPARK-21330][SQL] Bad partitioning does not allow to read a JDBC tab…
aray Aug 4, 2017
841bc2f
[SPARK-21580][SQL] Integers in aggregation expressions are wrongly ta…
10110346 Aug 5, 2017
098aaec
[SPARK-21588][SQL] SQLContext.getConf(key, null) should return null
vinodkc Aug 6, 2017
7a04def
[SPARK-21621][CORE] Reset numRecordsWritten after DiskBlockObjectWrit…
ConeyLiu Aug 7, 2017
4f0eb0c
[SPARK-21647][SQL] Fix SortMergeJoin when using CROSS
gatorsmile Aug 7, 2017
43f9c84
[SPARK-21374][CORE] Fix reading globbed paths from S3 into DF with di…
Aug 5, 2017
fa92a7b
[SPARK-21565][SS] Propagate metadata in attribute replacement.
Aug 7, 2017
a1c1199
[SPARK-21648][SQL] Fix confusing assert failure in JDBC source when p…
gatorsmile Aug 7, 2017
86609a9
[SPARK-21567][SQL] Dataset should work with type alias
viirya Aug 8, 2017
e87ffca
Revert "[SPARK-21567][SQL] Dataset should work with type alias"
cloud-fan Aug 8, 2017
d023314
[SPARK-21503][UI] Spark UI shows incorrect task status for a killed E…
Aug 9, 2017
7446be3
[SPARK-21523][ML] update breeze to 0.13.2 for an emergency bugfix in …
WeichenXu123 Aug 9, 2017
f6d56d2
[SPARK-21596][SS] Ensure places calling HDFSMetadataLog.get check the…
zsxwing Aug 9, 2017
3ca55ea
[SPARK-21663][TESTS] test("remote fetch below max RPC message size") …
wangjiaochun Aug 9, 2017
c909496
[SPARK-21699][SQL] Remove unused getTableOption in ExternalCatalog
rxin Aug 11, 2017
406eb1c
[SPARK-21595] Separate thresholds for buffering and spilling in Exter…
tejasapatil Aug 11, 2017
7b98077
[SPARK-21563][CORE] Fix race condition when serializing TaskDescripti…
ash211 Aug 14, 2017
48bacd3
[SPARK-21696][SS] Fix a potential issue that may generate partial sna…
zsxwing Aug 14, 2017
d9c8e62
[SPARK-21721][SQL] Clear FileSystem deleteOnExit cache when paths are…
viirya Aug 15, 2017
f1accc8
[SPARK-21723][ML] Fix writing LibSVM (key not found: numFeatures)
Aug 16, 2017
f5ede0d
[SPARK-21656][CORE] spark dynamic allocation should not idle timeout …
Aug 16, 2017
2a96975
[SPARK-18464][SQL][BACKPORT] support old table which doesn't store sc…
cloud-fan Aug 16, 2017
fdea642
[SPARK-21739][SQL] Cast expression should initialize timezoneId when …
DonnyZone Aug 18, 2017
6c2a38a
[MINOR] Correct validateAndTransformSchema in GaussianMixture and AFT…
sharp-pixel Aug 20, 2017
0f640e9
[SPARK-21721][SQL][FOLLOWUP] Clear FileSystem deleteOnExit cache when…
viirya Aug 20, 2017
526087f
[SPARK-21617][SQL] Store correct table metadata when altering schema …
Aug 21, 2017
236b2f4
[SPARK-21805][SPARKR] Disable R vignettes code on Windows
felixcheung Aug 24, 2017
a585367
[SPARK-21826][SQL] outer broadcast hash join should not throw NPE
cloud-fan Aug 24, 2017
2b4bd79
[SPARK-21681][ML] fix bug of MLOR do not work correctly when featureS…
WeichenXu123 Aug 24, 2017
0d4ef2f
[SPARK-21818][ML][MLLIB] Fix bug of MultivariateOnlineSummarizer.vari…
WeichenXu123 Aug 28, 2017
59bb7eb
[SPARK-21798] No config to replace deprecated SPARK_CLASSPATH config …
Aug 28, 2017
59529b2
[SPARK-21714][CORE][BACKPORT-2.2] Avoiding re-uploading remote resour…
jerryshao Aug 29, 2017
917fe66
Revert "[SPARK-21714][CORE][BACKPORT-2.2] Avoiding re-uploading remot…
Aug 29, 2017
a6a9944
[SPARK-21254][WEBUI] History UI performance fixes
2ooom Aug 30, 2017
d10c9dc
[SPARK-21714][CORE][BACKPORT-2.2] Avoiding re-uploading remote resour…
jerryshao Aug 30, 2017
14054ff
[SPARK-21834] Incorrect executor request in case of dynamic allocation
Aug 30, 2017
50f86e1
[SPARK-21884][SPARK-21477][BACKPORT-2.2][SQL] Mark LocalTableScanExec…
gatorsmile Sep 1, 2017
fb1b5f0
[SPARK-21418][SQL] NoSuchElementException: None.get in DataSourceScan…
srowen Sep 4, 2017
1f7c486
[SPARK-21925] Update trigger interval documentation in docs with beha…
brkyvz Sep 5, 2017
7da8fbf
[MINOR][DOC] Update `Partition Discovery` section to enumerate all av…
dongjoon-hyun Sep 5, 2017
9afab9a
[SPARK-21924][DOCS] Update structured streaming programming guide doc
Sep 6, 2017
342cc2a
[SPARK-21901][SS] Define toString for StateOperatorProgress
jaceklaskowski Sep 6, 2017
49968de
Fixed pandoc dependency issue in python/setup.py
Sep 7, 2017
0848df1
[SPARK-21890] Credentials not being passed to add the tokens
Sep 7, 2017
4304d0b
[SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.tests.SQLTests2 should s…
ueshin Sep 8, 2017
781a1f8
[SPARK-21915][ML][PYSPARK] Model 1 and Model 2 ParamMaps Missing
marktab Sep 8, 2017
08cb06a
[SPARK-21936][SQL][2.2] backward compatibility test framework for Hiv…
cloud-fan Sep 8, 2017
9ae7c96
[SPARK-21946][TEST] fix flaky test: "alter table: rename cached table…
kiszk Sep 8, 2017
9876821
[SPARK-21128][R][BACKPORT-2.2] Remove both "spark-warehouse" and "met…
HyukjinKwon Sep 8, 2017
182478e
[SPARK-21954][SQL] JacksonUtils should verify MapType's value type in…
viirya Sep 9, 2017
b1b5a7f
[SPARK-20098][PYSPARK] dataType's typeName fix
szalai1 Sep 10, 2017
10c6836
[SPARK-21976][DOC] Fix wrong documentation for Mean Absolute Error.
FavioVazquez Sep 12, 2017
63098dc
[DOCS] Fix unreachable links in the document
sarutak Sep 12, 2017
b606dc1
[SPARK-18608][ML] Fix double caching
zhengruifeng Sep 12, 2017
3a692e3
[SPARK-21980][SQL] References in grouping functions should be indexed…
DonnyZone Sep 13, 2017
51e5a82
[SPARK-18608][ML][FOLLOWUP] Fix double caching for PySpark OneVsRest.
yanboliang Sep 14, 2017
42852bb
[SPARK-21985][PYSPARK] PairDeserializer is broken for double-zipped RDDs
aray Sep 17, 2017
309c401
[SPARK-21953] Show both memory and disk bytes spilled if either is pr…
ash211 Sep 18, 2017
a86831d
[SPARK-22043][PYTHON] Improves error message for show_profiles and du…
HyukjinKwon Sep 18, 2017
48d6aef
[SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSuite
cloud-fan Sep 18, 2017
d0234eb
[SPARK-22047][FLAKY TEST] HiveExternalCatalogVersionsSuite
cloud-fan Sep 19, 2017
6764408
[SPARK-22052] Incorrect Metric assigned in MetricsReporter.scala
Taaffy Sep 19, 2017
5d10586
[SPARK-22076][SQL] Expand.projections should not be a Stream
cloud-fan Sep 20, 2017
401ac20
[SPARK-21384][YARN] Spark + YARN fails with LocalFileSystem as defaul…
Sep 20, 2017
765fd92
[SPARK-21928][CORE] Set classloader on SerializerManager's private kryo
squito Sep 21, 2017
090b987
[SPARK-22094][SS] processAllAvailable should check the query state
zsxwing Sep 22, 2017
de6274a
[SPARK-22072][SPARK-22071][BUILD] Improve release build scripts
holdenk Sep 22, 2017
c0a34a9
[SPARK-18136] Fix SPARK_JARS_DIR for Python pip install on Windows
jsnowacki Sep 23, 2017
1a829df
[SPARK-22092] Reallocation in OffHeapColumnVector.reserveInternal cor…
ala Sep 23, 2017
211d81b
[SPARK-22109][SQL][BRANCH-2.2] Resolves type conflicts between string…
HyukjinKwon Sep 23, 2017
8acce00
[SPARK-22107] Change as to alias in python quickstart
Sep 25, 2017
9836ea1
[SPARK-22083][CORE] Release locks in MemoryStore.evictBlocksToFreeSpace
squito Sep 25, 2017
b0f30b5
[SPARK-22120][SQL] TestHiveSparkSession.reset() should clean out Hive…
Sep 25, 2017
11 changes: 6 additions & 5 deletions LICENSE
@@ -249,11 +249,11 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(Interpreter classes (all .scala files in repl/src/main/scala
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala),
and for SerializableMapWrapper in JavaUtils.scala)
-(BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.7 - http://www.scala-lang.org/)
-(BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.7 - http://www.scala-lang.org/)
-(BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.7 - http://www.scala-lang.org/)
-(BSD-like) Scala Library (org.scala-lang:scala-library:2.11.7 - http://www.scala-lang.org/)
-(BSD-like) Scalap (org.scala-lang:scalap:2.11.7 - http://www.scala-lang.org/)
+(BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.8 - http://www.scala-lang.org/)
+(BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.8 - http://www.scala-lang.org/)
+(BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.8 - http://www.scala-lang.org/)
+(BSD-like) Scala Library (org.scala-lang:scala-library:2.11.8 - http://www.scala-lang.org/)
+(BSD-like) Scalap (org.scala-lang:scalap:2.11.8 - http://www.scala-lang.org/)
(BSD-style) scalacheck (org.scalacheck:scalacheck_2.11:1.10.0 - http://www.scalacheck.org)
(BSD-style) spire (org.spire-math:spire_2.11:0.7.1 - http://spire-math.org)
(BSD-style) spire-macros (org.spire-math:spire-macros_2.11:0.7.1 - http://spire-math.org)
@@ -297,3 +297,4 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(MIT License) RowsGroup (http://datatables.net/license/mit)
(MIT License) jsonFormatter (http://www.jqueryscript.net/other/jQuery-Plugin-For-Pretty-JSON-Formatting-jsonFormatter.html)
(MIT License) modernizr (https://github.com/Modernizr/Modernizr/blob/master/LICENSE)
+(MIT License) machinist (https://github.com/typelevel/machinist)
6 changes: 1 addition & 5 deletions R/README.md
@@ -66,11 +66,7 @@ To run one of them, use `./bin/spark-submit <filename> <args>`. For example:
```bash
./bin/spark-submit examples/src/main/r/dataframe.R
```
-You can also run the unit tests for SparkR by running. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:
-```bash
-R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
-./R/run-tests.sh
-```
+You can run R unit tests by following the instructions under [Running R Tests](http://spark.apache.org/docs/latest/building-spark.html#running-r-tests).

### Running on YARN

3 changes: 1 addition & 2 deletions R/WINDOWS.md
@@ -34,10 +34,9 @@ To run the SparkR unit tests on Windows, the following steps are required —ass

4. Set the environment variable `HADOOP_HOME` to the full path to the newly created `hadoop` directory.

-5. Run unit tests for SparkR by running the command below. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:
+5. Run unit tests for SparkR by running the command below. You need to install the needed packages following the instructions under [Running R Tests](http://spark.apache.org/docs/latest/building-spark.html#running-r-tests) first:

```
-R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
.\bin\spark-submit2.cmd --conf spark.hadoop.fs.defaultFS="file:///" R\pkg\tests\run-all.R
```

1 change: 1 addition & 0 deletions R/pkg/.Rbuildignore
@@ -6,3 +6,4 @@
^README\.Rmd$
^src-native$
^html$
+^tests/fulltests/*
4 changes: 2 additions & 2 deletions R/pkg/DESCRIPTION
@@ -1,8 +1,8 @@
Package: SparkR
Type: Package
-Version: 2.2.0
+Version: 2.2.1
Title: R Frontend for Apache Spark
-Description: The SparkR package provides an R Frontend for Apache Spark.
+Description: Provides an R Frontend for Apache Spark.
Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),
email = "shivaram@cs.berkeley.edu"),
person("Xiangrui", "Meng", role = "aut",
1 change: 1 addition & 0 deletions R/pkg/NAMESPACE
@@ -122,6 +122,7 @@ exportMethods("arrange",
"group_by",
"groupBy",
"head",
+"hint",
"insertInto",
"intersect",
"isLocal",
33 changes: 32 additions & 1 deletion R/pkg/R/DataFrame.R
@@ -591,7 +591,7 @@ setMethod("cache",
#'
#' Persist this SparkDataFrame with the specified storage level. For details of the
#' supported storage levels, refer to
-#' \url{http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence}.
+#' \url{http://spark.apache.org/docs/latest/rdd-programming-guide.html#rdd-persistence}.
#'
#' @param x the SparkDataFrame to persist.
#' @param newLevel storage level chosen for the persistance. See available options in
@@ -2642,6 +2642,7 @@ generateAliasesForIntersectedCols <- function (x, intersectedColNames, suffix) {
#' Input SparkDataFrames can have different schemas (names and data types).
#'
#' Note: This does not remove duplicate rows across the two SparkDataFrames.
+#' Also as standard in SQL, this function resolves columns by position (not by name).
#'
#' @param x A SparkDataFrame
#' @param y A SparkDataFrame
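The note added above deserves an illustration. A minimal sketch of position-based resolution, assuming a running SparkR session started with `sparkR.session()` and hypothetical column names:

```r
# union() resolves columns by position, not by name: these two frames carry
# the same column names in a different order, and no reordering happens.
df1 <- createDataFrame(data.frame(first = "a", second = "x"))
df2 <- createDataFrame(data.frame(second = "y", first = "b"))

# The result's column names come from df1; df2's "second" value lands under
# "first" because it occupies the first position.
head(union(df1, df2))
```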
@@ -3642,3 +3643,33 @@ setMethod("checkpoint",
df <- callJMethod(x@sdf, "checkpoint", as.logical(eager))
dataFrame(df)
})

+#' hint
+#'
+#' Specifies execution plan hint and return a new SparkDataFrame.
+#'
+#' @param x a SparkDataFrame.
+#' @param name a name of the hint.
+#' @param ... optional parameters for the hint.
+#' @return A SparkDataFrame.
+#' @family SparkDataFrame functions
+#' @aliases hint,SparkDataFrame,character-method
+#' @rdname hint
+#' @name hint
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- createDataFrame(mtcars)
+#' avg_mpg <- mean(groupBy(createDataFrame(mtcars), "cyl"), "mpg")
+#'
+#' head(join(df, hint(avg_mpg, "broadcast"), df$cyl == avg_mpg$cyl))
+#' }
+#' @note hint since 2.2.0
+setMethod("hint",
+          signature(x = "SparkDataFrame", name = "character"),
+          function(x, name, ...) {
+            parameters <- list(...)
+            stopifnot(all(sapply(parameters, is.character)))
+            jdf <- callJMethod(x@sdf, "hint", name, parameters)
+            dataFrame(jdf)
+          })
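A usage sketch for the new method, assuming a running SparkR session; it mirrors the roxygen example above, and the `stopifnot` guard means any extra hint parameters must be character vectors.

```r
df <- createDataFrame(mtcars)
avg_mpg <- mean(groupBy(df, "cyl"), "mpg")

# Tag the small aggregated side with a broadcast hint before joining; the
# hint name and parameters are forwarded to Dataset.hint() on the JVM side.
small <- hint(avg_mpg, "broadcast")
head(join(df, small, df$cyl == avg_mpg$cyl))

# hint(df, "broadcast", 42)  # non-character parameters fail the stopifnot()
```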
2 changes: 1 addition & 1 deletion R/pkg/R/RDD.R
@@ -227,7 +227,7 @@ setMethod("cacheRDD",
#'
#' Persist this RDD with the specified storage level. For details of the
#' supported storage levels, refer to
-#'\url{http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence}.
+#'\url{http://spark.apache.org/docs/latest/rdd-programming-guide.html#rdd-persistence}.
#'
#' @param x The RDD to persist
#' @param newLevel The new storage level to be assigned
6 changes: 3 additions & 3 deletions R/pkg/R/SQLContext.R
@@ -334,7 +334,7 @@ setMethod("toDF", signature(x = "RDD"),
#'
#' Loads a JSON file, returning the result as a SparkDataFrame
#' By default, (\href{http://jsonlines.org/}{JSON Lines text format or newline-delimited JSON}
-#' ) is supported. For JSON (one record per file), set a named property \code{wholeFile} to
+#' ) is supported. For JSON (one record per file), set a named property \code{multiLine} to
#' \code{TRUE}.
#' It goes through the entire dataset once to determine the schema.
#'
@@ -348,7 +348,7 @@ setMethod("toDF", signature(x = "RDD"),
#' sparkR.session()
#' path <- "path/to/file.json"
#' df <- read.json(path)
-#' df <- read.json(path, wholeFile = TRUE)
+#' df <- read.json(path, multiLine = TRUE)
#' df <- jsonFile(path)
#' }
#' @name read.json
@@ -598,7 +598,7 @@ tableToDF <- function(tableName) {
#' df1 <- read.df("path/to/file.json", source = "json")
#' schema <- structType(structField("name", "string"),
#' structField("info", "map<string,double>"))
-#' df2 <- read.df(mapTypeJsonPath, "json", schema, wholeFile = TRUE)
+#' df2 <- read.df(mapTypeJsonPath, "json", schema, multiLine = TRUE)
#' df3 <- loadDF("data/test_table", "parquet", mergeSchema = "true")
#' }
#' @name read.df
6 changes: 5 additions & 1 deletion R/pkg/R/generics.R
@@ -572,6 +572,10 @@ setGeneric("group_by", function(x, ...) { standardGeneric("group_by") })
#' @export
setGeneric("groupBy", function(x, ...) { standardGeneric("groupBy") })

+#' @rdname hint
+#' @export
+setGeneric("hint", function(x, name, ...) { standardGeneric("hint") })

#' @rdname insertInto
#' @export
setGeneric("insertInto", function(x, tableName, ...) { standardGeneric("insertInto") })
@@ -1469,7 +1473,7 @@ setGeneric("write.ml", function(object, path, ...) { standardGeneric("write.ml")

#' @rdname awaitTermination
#' @export
-setGeneric("awaitTermination", function(x, timeout) { standardGeneric("awaitTermination") })
+setGeneric("awaitTermination", function(x, timeout = NULL) { standardGeneric("awaitTermination") })

#' @rdname isActive
#' @export
8 changes: 6 additions & 2 deletions R/pkg/R/install.R
@@ -267,10 +267,14 @@ hadoopVersionName <- function(hadoopVersion) {
# The implementation refers to appdirs package: https://pypi.python.org/pypi/appdirs and
# adapt to Spark context
sparkCachePath <- function() {
-if (.Platform$OS.type == "windows") {
+if (is_windows()) {
winAppPath <- Sys.getenv("LOCALAPPDATA", unset = NA)
if (is.na(winAppPath)) {
-stop(paste("%LOCALAPPDATA% not found.",
+message("%LOCALAPPDATA% not found. Falling back to %USERPROFILE%.")
+winAppPath <- Sys.getenv("USERPROFILE", unset = NA)
+}
+if (is.na(winAppPath)) {
+stop(paste("%LOCALAPPDATA% and %USERPROFILE% not found.",
"Please define the environment variable",
"or restart and enter an installation path in localDir."))
} else {
42 changes: 18 additions & 24 deletions R/pkg/R/mllib_classification.R
@@ -46,22 +46,25 @@ setClass("MultilayerPerceptronClassificationModel", representation(jobj = "jobj"
#' @note NaiveBayesModel since 2.0.0
setClass("NaiveBayesModel", representation(jobj = "jobj"))

-#' linear SVM Model
+#' Linear SVM Model
#'
-#' Fits an linear SVM model against a SparkDataFrame. It is a binary classifier, similar to svm in glmnet package
+#' Fits a linear SVM model against a SparkDataFrame, similar to svm in e1071 package.
+#' Currently only supports binary classification model with linear kernel.
#' Users can print, make predictions on the produced model and save the model to the input path.
#'
#' @param data SparkDataFrame for training.
#' @param formula A symbolic description of the model to be fitted. Currently only a few formula
#' operators are supported, including '~', '.', ':', '+', and '-'.
-#' @param regParam The regularization parameter.
+#' @param regParam The regularization parameter. Only supports L2 regularization currently.
#' @param maxIter Maximum iteration number.
#' @param tol Convergence tolerance of iterations.
#' @param standardization Whether to standardize the training features before fitting the model. The coefficients
#' of models will be always returned on the original scale, so it will be transparent for
#' users. Note that with/without standardization, the models should be always converged
#' to the same solution when no regularization is applied.
-#' @param threshold The threshold in binary classification, in range [0, 1].
+#' @param threshold The threshold in binary classification applied to the linear model prediction.
+#' This threshold can be any real number, where Inf will make all predictions 0.0
+#' and -Inf will make all predictions 1.0.
#' @param weightCol The weight column name.
#' @param aggregationDepth The depth for treeAggregate (greater than or equal to 2). If the dimensions of features
#' or the number of partitions are large, this param could be adjusted to a larger size.
@@ -111,10 +114,10 @@ setMethod("spark.svmLinear", signature(data = "SparkDataFrame", formula = "formu
new("LinearSVCModel", jobj = jobj)
})

-# Predicted values based on an LinearSVCModel model
+# Predicted values based on a LinearSVCModel model

#' @param newData a SparkDataFrame for testing.
-#' @return \code{predict} returns the predicted values based on an LinearSVCModel.
+#' @return \code{predict} returns the predicted values based on a LinearSVCModel.
#' @rdname spark.svmLinear
#' @aliases predict,LinearSVCModel,SparkDataFrame-method
#' @export
@@ -124,36 +127,27 @@ setMethod("predict", signature(object = "LinearSVCModel"),
predict_internal(object, newData)
})

-# Get the summary of an LinearSVCModel
+# Get the summary of a LinearSVCModel

-#' @param object an LinearSVCModel fitted by \code{spark.svmLinear}.
+#' @param object a LinearSVCModel fitted by \code{spark.svmLinear}.
#' @return \code{summary} returns summary information of the fitted model, which is a list.
#' The list includes \code{coefficients} (coefficients of the fitted model),
-#' \code{intercept} (intercept of the fitted model), \code{numClasses} (number of classes),
-#' \code{numFeatures} (number of features).
+#' \code{numClasses} (number of classes), \code{numFeatures} (number of features).
#' @rdname spark.svmLinear
#' @aliases summary,LinearSVCModel-method
#' @export
#' @note summary(LinearSVCModel) since 2.2.0
setMethod("summary", signature(object = "LinearSVCModel"),
function(object) {
jobj <- object@jobj
-features <- callJMethod(jobj, "features")
-labels <- callJMethod(jobj, "labels")
-coefficients <- callJMethod(jobj, "coefficients")
-nCol <- length(coefficients) / length(features)
-coefficients <- matrix(unlist(coefficients), ncol = nCol)
-intercept <- callJMethod(jobj, "intercept")
+features <- callJMethod(jobj, "rFeatures")
+coefficients <- callJMethod(jobj, "rCoefficients")
+coefficients <- as.matrix(unlist(coefficients))
+colnames(coefficients) <- c("Estimate")
+rownames(coefficients) <- unlist(features)
numClasses <- callJMethod(jobj, "numClasses")
numFeatures <- callJMethod(jobj, "numFeatures")
-if (nCol == 1) {
-colnames(coefficients) <- c("Estimate")
-} else {
-colnames(coefficients) <- unlist(labels)
-}
-rownames(coefficients) <- unlist(features)
-list(coefficients = coefficients, intercept = intercept,
-numClasses = numClasses, numFeatures = numFeatures)
+list(coefficients = coefficients, numClasses = numClasses, numFeatures = numFeatures)
})

# Save fitted LinearSVCModel to the input path
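Because the shape of `summary()`'s return value changes here, a hedged sketch of the new output, assuming a running SparkR session and the sample LibSVM file shipped with Spark:

```r
df <- read.df("data/mllib/sample_libsvm_data.txt", source = "libsvm")
model <- spark.svmLinear(df, label ~ features, regParam = 0.01)

s <- summary(model)
s$coefficients  # one "Estimate" column; row names come from rFeatures
s$numClasses    # number of classes (2 for this binary classifier)
s$numFeatures   # number of features
# Per the rewrite above, the list no longer includes a separate $intercept.
```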
14 changes: 10 additions & 4 deletions R/pkg/R/streaming.R
@@ -169,8 +169,10 @@ setMethod("isActive",
#' immediately.
#'
#' @param x a StreamingQuery.
-#' @param timeout time to wait in milliseconds
-#' @return TRUE if query has terminated within the timeout period.
+#' @param timeout time to wait in milliseconds, if omitted, wait indefinitely until \code{stopQuery}
+#' is called or an error has occured.
+#' @return TRUE if query has terminated within the timeout period; nothing if timeout is not
+#' specified.
#' @rdname awaitTermination
#' @name awaitTermination
#' @aliases awaitTermination,StreamingQuery-method
@@ -182,8 +184,12 @@ setMethod("isActive",
#' @note experimental
setMethod("awaitTermination",
signature(x = "StreamingQuery"),
-function(x, timeout) {
-handledCallJMethod(x@ssq, "awaitTermination", as.integer(timeout))
+function(x, timeout = NULL) {
+if (is.null(timeout)) {
+invisible(handledCallJMethod(x@ssq, "awaitTermination"))
+} else {
+handledCallJMethod(x@ssq, "awaitTermination", as.integer(timeout))
+}
})

#' stopQuery
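A sketch of the two call forms enabled by this change, assuming `sdf` is a streaming SparkDataFrame obtained from `read.stream`:

```r
q <- write.stream(sdf, "memory", queryName = "hypothetical_query",
                  outputMode = "append")

# Bounded wait: returns TRUE if the query terminates within 10 seconds.
terminated <- awaitTermination(q, 10000)

# Unbounded wait: blocks until stopQuery(q) is called or the query fails;
# the result is returned invisibly in this form.
# awaitTermination(q)
```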
12 changes: 12 additions & 0 deletions R/pkg/R/utils.R
@@ -899,3 +899,15 @@ basenameSansExtFromUrl <- function(url) {
isAtomicLengthOne <- function(x) {
is.atomic(x) && length(x) == 1
}

+is_windows <- function() {
+  .Platform$OS.type == "windows"
+}
+
+hadoop_home_set <- function() {
+  !identical(Sys.getenv("HADOOP_HOME"), "")
+}
+
+windows_with_hadoop <- function() {
+  !is_windows() || hadoop_home_set()
+}
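These helpers read as a test-environment guard; a hypothetical use, in the spirit of skipping Hadoop-dependent tests on Windows hosts without `HADOOP_HOME`:

```r
# Run Hadoop-dependent work everywhere except bare Windows machines.
if (windows_with_hadoop()) {
  df <- createDataFrame(mtcars)
  stopifnot(count(df) == nrow(mtcars))
}
```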