SPARK-1429: Debian packaging #2

Closed
wants to merge 1,270 commits
1,270 commits
24836be
[SPARK-20920][SQL] ForkJoinPool pools are leaked when writing hive ta…
srowen Jun 13, 2017
58a8a37
[SPARK-20920][SQL] ForkJoinPool pools are leaked when writing hive ta…
srowen Jun 13, 2017
039c465
[SPARK-21060][WEB-UI] Css style about paging function is error in the…
Jun 13, 2017
2bc2c15
[SPARK-21064][CORE][TEST] Fix the default value bug in NettyBlockTran…
Jun 13, 2017
ee0e74e
[SPARK-21064][CORE][TEST] Fix the default value bug in NettyBlockTran…
Jun 13, 2017
220943d
[SPARK-20979][SS] Add RateSource to generate values for tests and ben…
zsxwing Jun 12, 2017
53212c3
[SPARK-12552][CORE] Correctly count the driver resource when recoveri…
jerryshao Jun 14, 2017
42cc830
[SPARK-20986][SQL] Reset table's statistics after PruneFileSourcePart…
lianhuiwang Jun 14, 2017
9bdc835
[SPARK-21085][SQL] Failed to read the partitioned table created by Sp…
gatorsmile Jun 14, 2017
6265119
[SPARK-20211][SQL][BACKPORT-2.2] Fix the Precision and Scale of Decim…
gatorsmile Jun 14, 2017
a890466
[SPARK-20211][SQL][BACKPORT-2.2] Fix the Precision and Scale of Decim…
gatorsmile Jun 14, 2017
3dda682
[SPARK-21089][SQL] Fix DESC EXTENDED/FORMATTED to Show Table Properties
gatorsmile Jun 14, 2017
e02e063
Revert "[SPARK-20941][SQL] Fix SubqueryExec Reuse"
gatorsmile Jun 14, 2017
af4f89c
[SPARK-20980][SQL] Rename `wholeFile` to `multiLine` for both CSV and…
gatorsmile Jun 15, 2017
b5504f6
[SPARK-20980][DOCS] update doc to reflect multiLine change
felixcheung Jun 15, 2017
76ee41f
[SPARK-16251][SPARK-20200][CORE][TEST] Flaky test: org.apache.spark.r…
jiangxb1987 Jun 15, 2017
62f2b80
[SPARK-16251][SPARK-20200][CORE][TEST] Flaky test: org.apache.spark.r…
jiangxb1987 Jun 15, 2017
a585c87
[SPARK-21111][TEST][2.2] Fix the test failure of describe.sql
gatorsmile Jun 16, 2017
9909be3
[SPARK-21072][SQL] TreeNode.mapChildren should only apply to the chil…
ConeyLiu Jun 16, 2017
915a201
[SPARK-21072][SQL] TreeNode.mapChildren should only apply to the chil…
ConeyLiu Jun 16, 2017
0ebb3b8
[SPARK-21114][TEST][2.1] Fix test failure in Spark 2.1/2.0 due to nam…
gatorsmile Jun 16, 2017
653e6f1
[SPARK-12552][FOLLOWUP] Fix flaky test for "o.a.s.deploy.master.Maste…
jerryshao Jun 16, 2017
d3deeb3
[MINOR][DOCS] Improve Running R Tests docs
wangyum Jun 16, 2017
8747f8e
[SPARK-21126] The configuration which named "spark.core.connection.au…
liu-zhaokun Jun 18, 2017
c0d4acc
[MINOR][R] Add knitr and rmarkdown packages/improve output for versio…
HyukjinKwon Jun 18, 2017
d3c79b7
[SPARK-21090][CORE] Optimize the unified memory manager code
10110346 Jun 19, 2017
fab070c
[SPARK-21132][SQL] DISTINCT modifier of function arguments should not…
gatorsmile Jun 19, 2017
f7fcdec
[SPARK-19688][STREAMING] Not to read `spark.yarn.credentials.file` fr…
Jun 19, 2017
a44c118
[SPARK-19688][STREAMING] Not to read `spark.yarn.credentials.file` fr…
Jun 19, 2017
7b50736
[SPARK-21123][DOCS][STRUCTURED STREAMING] Options for file stream sou…
Jun 19, 2017
32bd9a7
Merge branch 'branch-2.1' of github.com:apache/spark into csd-2.1
markhamstra Jun 19, 2017
e329bea
[MINOR][BUILD] Fix Java linter errors
dongjoon-hyun Jun 19, 2017
cf10fa8
[SPARK-21138][YARN] Cannot delete staging dir when the clusters of "s…
Jun 19, 2017
7799f35
[SPARK-21138][YARN] Cannot delete staging dir when the clusters of "s…
Jun 19, 2017
8bf7f1e
[SPARK-21133][CORE] Fix HighlyCompressedMapStatus#writeExternal throw…
wangyum Jun 20, 2017
514a7e6
[SPARK-20929][ML] LinearSVC should use its own threshold param
jkbradley Jun 20, 2017
b8b80f6
[SPARK-21150][SQL] Persistent view stored in Hive metastore should be…
cloud-fan Jun 20, 2017
62e442e
Preparing Spark release v2.2.0-rc5
pwendell Jun 20, 2017
e883498
Preparing development version 2.2.1-SNAPSHOT
pwendell Jun 20, 2017
8923bac
[SPARK-21123][DOCS][STRUCTURED STREAMING] Options for file stream sou…
Jun 20, 2017
529c04f
[MINOR][DOCS] Add lost <tr> tag for configuration.md
wangyum Jun 21, 2017
6b37c86
[SPARK-18016][SQL][CATALYST][BRANCH-2.1] Code Generation: Constant Po…
Jun 22, 2017
198e3a0
[SPARK-18016][SQL][CATALYST][BRANCH-2.2] Code Generation: Constant Po…
Jun 22, 2017
6ef7a5b
[SPARK-21167][SS] Decode the path generated by File sink to handle sp…
zsxwing Jun 22, 2017
1a98d5d
[SPARK-21167][SS] Decode the path generated by File sink to handle sp…
zsxwing Jun 22, 2017
d625734
[SQL][DOC] Fix documentation of lpad
actuaryzhang Jun 22, 2017
b99c0e9
Revert "[SPARK-18016][SQL][CATALYST][BRANCH-2.2] Code Generation: Con…
cloud-fan Jun 23, 2017
b6749ba
[SPARK-21165] [SQL] [2.2] Use executedPlan instead of analyzedPlan in…
gatorsmile Jun 23, 2017
7b87527
Merge branch 'branch-2.1' of github.com:apache/spark into csd-2.1
markhamstra Jun 23, 2017
9d29808
[SPARK-21144][SQL] Print a warning if the data schema and partition s…
maropu Jun 23, 2017
f160267
[SPARK-21181] Release byteBuffers to suppress netty error messages
dhruve Jun 23, 2017
f8fd3b4
[SPARK-21181] Release byteBuffers to suppress netty error messages
dhruve Jun 23, 2017
3394b06
[MINOR][DOCS] Docs in DataFrameNaFunctions.scala use wrong method
ongmingyang Jun 23, 2017
bcaf06c
[MINOR][DOCS] Docs in DataFrameNaFunctions.scala use wrong method
ongmingyang Jun 23, 2017
a3088d2
[SPARK-20555][SQL] Fix mapping of Oracle DECIMAL types to Spark types…
Jun 24, 2017
f12883e
[SPARK-20555][SQL] Fix mapping of Oracle DECIMAL types to Spark types…
Jun 24, 2017
96c04f1
[SPARK-21159][CORE] Don't try to connect to launcher in standalone cl…
Jun 24, 2017
6750db3
[SPARK-21159][CORE] Don't try to connect to launcher in standalone cl…
Jun 24, 2017
ad44ab5
[SPARK-21203][SQL] Fix wrong results of insertion of Array of Struct
gatorsmile Jun 24, 2017
0d6b701
[SPARK-21203][SQL] Fix wrong results of insertion of Array of Struct
gatorsmile Jun 24, 2017
d8e3a4a
[SPARK-21079][SQL] Calculate total size of a partition table as a sum…
mbasmanova Jun 25, 2017
26f4f34
Revert "[SPARK-18016][SQL][CATALYST][BRANCH-2.1] Code Generation: Con…
cloud-fan Jun 25, 2017
61af209
Merge branch 'branch-2.1' of github.com:apache/spark into csd-2.1
markhamstra Jun 26, 2017
970f68c
[SPARK-19104][SQL] Lambda variables in ExternalMapToCatalyst should b…
viirya Jun 27, 2017
8fdc51b
treating empty string as null for csd
ianlcsd Jun 27, 2017
f3c40d5
TableNamePreprocessor support
markhamstra Jun 27, 2017
17a04b9
[SPARK-21210][DOC][ML] Javadoc 8 fixes for ML shared param traits
Jun 29, 2017
20cf511
[SPARK-21253][CORE] Fix a bug that StreamCallback may not be notified…
zsxwing Jun 30, 2017
8de67e3
[SPARK-21253][CORE] Disable spark.reducer.maxReqSizeShuffleToMem
zsxwing Jun 30, 2017
c6ba647
[SPARK-21176][WEB UI] Limit number of selector threads for admin ui p…
IngoSchuster Jun 30, 2017
083adb0
[SPARK-21176][WEB UI] Limit number of selector threads for admin ui p…
IngoSchuster Jun 30, 2017
d16e262
[SPARK-21253][CORE][HOTFIX] Fix Scala 2.10 build
zsxwing Jun 30, 2017
8b08fd0
[SPARK-21258][SQL] Fix WindowExec complex object aggregation with spi…
hvanhovell Jun 30, 2017
d995dac
[SPARK-21258][SQL] Fix WindowExec complex object aggregation with spi…
hvanhovell Jun 30, 2017
3ecef24
Revert "[SPARK-21258][SQL] Fix WindowExec complex object aggregation …
cloud-fan Jun 30, 2017
29a0be2
[SPARK-21129][SQL] Arguments of SQL function call should not be named…
gatorsmile Jun 30, 2017
a2c7b21
Preparing Spark release v2.2.0-rc6
pwendell Jun 30, 2017
85fddf4
Preparing development version 2.2.1-SNAPSHOT
pwendell Jun 30, 2017
6fd39ea
[SPARK-21170][CORE] Utils.tryWithSafeFinallyAndFailureCallbacks throw…
Jul 1, 2017
db21b67
[SPARK-20256][SQL] SessionState should be created more lazily
dongjoon-hyun Jul 4, 2017
8f1ca69
[SPARK-20256][SQL][BRANCH-2.1] SessionState should be created more la…
dongjoon-hyun Jul 5, 2017
770fd2a
[SPARK-21300][SQL] ExternalMapToCatalyst should null-check map key pr…
ueshin Jul 5, 2017
4a4d148
Merge branch 'branch-2.1' of github.com:apache/spark into csd-2.1
markhamstra Jul 5, 2017
6e1081c
[SPARK-21312][SQL] correct offsetInBytes in UnsafeRow.writeToStream
Jul 6, 2017
7f7b63b
[SPARK-21312][SQL] correct offsetInBytes in UnsafeRow.writeToStream
Jul 6, 2017
4e53a4e
[SS][MINOR] Fix flaky test in DatastreamReaderWriterSuite. temp check…
tdas Jul 6, 2017
576fd4c
[SPARK-21267][SS][DOCS] Update Structured Streaming Documentation
tdas Jul 7, 2017
6e33965
corrected String/Path refactoring of ParquetLocationSelection
markhamstra Jul 7, 2017
3f914aa
Merge branch 'branch-2.1' of github.com:apache/spark into csd-2.1
markhamstra Jul 7, 2017
ab12848
[SPARK-21069][SS][DOCS] Add rate source to programming guide.
ScrapCodes Jul 8, 2017
7d0b1c9
[SPARK-21228][SQL][BRANCH-2.2] InSet incorrect handling of structs
bogdanrdc Jul 8, 2017
a64f108
[SPARK-21345][SQL][TEST][TEST-MAVEN] SparkSessionBuilderSuite should …
dongjoon-hyun Jul 8, 2017
c8d7855
[SPARK-20342][CORE] Update task accumulators before sending task end …
Jul 8, 2017
964332b
[SPARK-21343] Refine the document for spark.reducer.maxReqSizeShuffle…
Jul 8, 2017
5e2bfd5
[SPARK-21345][SQL][TEST][TEST-MAVEN][BRANCH-2.1] SparkSessionBuilderS…
dongjoon-hyun Jul 9, 2017
3bfad9d
[SPARK-21083][SQL][BRANCH-2.2] Store zero size and row count when ana…
wzhfy Jul 9, 2017
2c28462
[SPARK-21083][SQL][BRANCH-2.1] Store zero size and row count when ana…
wzhfy Jul 10, 2017
40fd0ce
[SPARK-21342] Fix DownloadCallback to work well with RetryingBlockFet…
Jul 10, 2017
a05edf4
[SPARK-21272] SortMergeJoin LeftAnti does not update numOutputRows
juliuszsompolski Jul 10, 2017
73df649
foo
markhamstra Jul 10, 2017
edcd9fb
[SPARK-21369][CORE] Don't use Scala Tuple2 in common/network-*
zsxwing Jul 11, 2017
399aa01
[SPARK-21366][SQL][TEST] Add sql test for window functions
jiangxb1987 Jul 11, 2017
cb6fc89
[SPARK-21219][CORE] Task retry occurs on same executor due to race co…
Jul 12, 2017
39eba30
[SPARK-18646][REPL] Set parent classloader as null for ExecutorClassL…
taroplus Jul 13, 2017
cf0719b
Revert "[SPARK-18646][REPL] Set parent classloader as null for Execut…
cloud-fan Jul 13, 2017
bfe3ba8
[SPARK-21376][YARN] Fix yarn client token expire issue when cleaning …
jerryshao Jul 13, 2017
1cb4369
[SPARK-21344][SQL] BinaryType comparison does signed byte array compa…
kiszk Jul 15, 2017
ca4d2aa
[SPARK-21344][SQL] BinaryType comparison does signed byte array compa…
kiszk Jul 15, 2017
8e85ce6
[SPARK-21267][DOCS][MINOR] Follow up to avoid referencing programming…
srowen Jul 15, 2017
0ef98fd
[SPARK-21321][SPARK CORE] Spark very verbose on shutdown
Jul 17, 2017
a9efce4
[SPARK-19104][BACKPORT-2.1][SQL] Lambda variables in ExternalMapToCat…
kiszk Jul 18, 2017
83bdb04
[SPARK-21332][SQL] Incorrect result type inferred for some decimal ex…
Jul 18, 2017
caf32b3
[SPARK-21332][SQL] Incorrect result type inferred for some decimal ex…
Jul 18, 2017
99ce551
[SPARK-21445] Make IntWrapper and LongWrapper in UTF8String Serializable
brkyvz Jul 18, 2017
49e2ada
[SPARK-18631][SQL] Changed ExchangeCoordinator re-partitioning to avo…
markhamstra Nov 29, 2016
df061fd
[SPARK-21457][SQL] ExternalCatalog.listPartitions should correctly ha…
cloud-fan Jul 18, 2017
5a0a76f
[SPARK-21414] Refine SlidingWindowFunctionFrame to avoid OOM.
Jul 19, 2017
4c212ee
[SPARK-21441][SQL] Incorrect Codegen in SortMergeJoinExec results fai…
DonnyZone Jul 19, 2017
ac20693
[SPARK-21441][SQL] Incorrect Codegen in SortMergeJoinExec results fai…
DonnyZone Jul 19, 2017
9c61833
wip
markhamstra Jul 19, 2017
2cddd1c
Merge branch 'branch-2.1' of github.com:apache/spark into csd-2.1
markhamstra Jul 19, 2017
86cd3c0
[SPARK-21464][SS] Minimize deprecation warnings caused by ProcessingT…
tdas Jul 19, 2017
308bce0
[SPARK-21446][SQL] Fix setAutoCommit never executed
DFFuture Jul 19, 2017
9949fed
[SPARK-21333][DOCS] Removed invalid joinTypes from javadoc of Dataset…
coreywoodfield Jul 19, 2017
88dccda
[SPARK-21243][CORE] Limit no. of map outputs in a shuffle fetch
dhruve Jul 21, 2017
da403b9
[SPARK-21434][PYTHON][DOCS] Add pyspark pip documentation.
holdenk Jul 21, 2017
62ca13d
[SPARK-20904][CORE] Don't report task failures to driver during shutd…
Jul 23, 2017
e5ec339
[SPARK-21383][YARN] Fix the YarnAllocator allocates more Resource
Jul 25, 2017
0af0672
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Jul 25, 2017
c91191b
[SPARK-21447][WEB UI] Spark history server fails to render compressed
Jul 25, 2017
ec50897
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Jul 25, 2017
f3df120
Change PoolSuite tests for default FAIR scheduling mode
markhamstra Jul 25, 2017
1bfd1a8
[SPARK-21494][NETWORK] Use correct app id when authenticating to exte…
Jul 26, 2017
420e6e9
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Jul 27, 2017
464a934
fix mismerge
markhamstra Jul 27, 2017
06b2ef0
[SPARK-21538][SQL] Attribute resolution inconsistency in the Dataset API
Jul 27, 2017
9379031
[SPARK-21306][ML] OneVsRest should support setWeightCol
facaiy Jul 28, 2017
df6cd35
[SPARK-21508][DOC] Fix example code provided in Spark Streaming Docum…
Jul 29, 2017
24a9bac
[SPARK-21555][SQL] RuntimeReplaceable should be compared semantically…
viirya Jul 29, 2017
66fa6bd
[SPARK-19451][SQL] rangeBetween method should accept Long value as bo…
jiangxb1987 Jul 29, 2017
e2062b9
Revert "[SPARK-19451][SQL] rangeBetween method should accept Long val…
gatorsmile Jul 30, 2017
1745434
[SPARK-21522][CORE] Fix flakiness in LauncherServerSuite.
Aug 1, 2017
79e5805
[SPARK-21593][DOCS] Fix 2 rendering errors on configuration page
srowen Aug 1, 2017
67c60d7
[SPARK-21339][CORE] spark-shell --packages option does not add jars t…
Aug 1, 2017
8d04581
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 1, 2017
397f904
[SPARK-21597][SS] Fix a potential overflow issue in EventTimeStats
zsxwing Aug 2, 2017
467ee8d
[SPARK-21546][SS] dropDuplicates should ignore watermark when it's no…
zsxwing Aug 2, 2017
8820569
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 2, 2017
690f491
[SPARK-12717][PYTHON][BRANCH-2.2] Adding thread-safe broadcast pickle…
BryanCutler Aug 3, 2017
b1a731c
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 3, 2017
1bcfa2a
Fix Java SimpleApp spark application
christiam Aug 3, 2017
f9aae8e
[SPARK-21330][SQL] Bad partitioning does not allow to read a JDBC tab…
aray Aug 4, 2017
8aa9405
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 4, 2017
841bc2f
[SPARK-21580][SQL] Integers in aggregation expressions are wrongly ta…
10110346 Aug 5, 2017
098aaec
[SPARK-21588][SQL] SQLContext.getConf(key, null) should return null
vinodkc Aug 6, 2017
7a04def
[SPARK-21621][CORE] Reset numRecordsWritten after DiskBlockObjectWrit…
ConeyLiu Aug 7, 2017
4f0eb0c
[SPARK-21647][SQL] Fix SortMergeJoin when using CROSS
gatorsmile Aug 7, 2017
43f9c84
[SPARK-21374][CORE] Fix reading globbed paths from S3 into DF with di…
Aug 5, 2017
0aacb6b
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 7, 2017
fa92a7b
[SPARK-21565][SS] Propagate metadata in attribute replacement.
Aug 7, 2017
a1c1199
[SPARK-21648][SQL] Fix confusing assert failure in JDBC source when p…
gatorsmile Aug 7, 2017
86609a9
[SPARK-21567][SQL] Dataset should work with type alias
viirya Aug 8, 2017
e87ffca
Revert "[SPARK-21567][SQL] Dataset should work with type alias"
cloud-fan Aug 8, 2017
d023314
[SPARK-21503][UI] Spark UI shows incorrect task status for a killed E…
Aug 9, 2017
7446be3
[SPARK-21523][ML] update breeze to 0.13.2 for an emergency bugfix in …
WeichenXu123 Aug 9, 2017
f6d56d2
[SPARK-21596][SS] Ensure places calling HDFSMetadataLog.get check the…
zsxwing Aug 9, 2017
3ca55ea
[SPARK-21663][TESTS] test("remote fetch below max RPC message size") …
wangjiaochun Aug 9, 2017
c909496
[SPARK-21699][SQL] Remove unused getTableOption in ExternalCatalog
rxin Aug 11, 2017
406eb1c
[SPARK-21595] Separate thresholds for buffering and spilling in Exter…
tejasapatil Aug 11, 2017
7b98077
[SPARK-21563][CORE] Fix race condition when serializing TaskDescripti…
ash211 Aug 14, 2017
dc3cdd5
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 14, 2017
48bacd3
[SPARK-21696][SS] Fix a potential issue that may generate partial sna…
zsxwing Aug 14, 2017
3a02a3c
ExchangeCoordinatorSuite cleanup
markhamstra Aug 14, 2017
d9c8e62
[SPARK-21721][SQL] Clear FileSystem deleteOnExit cache when paths are…
viirya Aug 15, 2017
f1accc8
[SPARK-21723][ML] Fix writing LibSVM (key not found: numFeatures)
Aug 16, 2017
f5ede0d
[SPARK-21656][CORE] spark dynamic allocation should not idle timeout …
Aug 16, 2017
2a96975
[SPARK-18464][SQL][BACKPORT] support old table which doesn't store sc…
cloud-fan Aug 16, 2017
851e162
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 16, 2017
fdea642
[SPARK-21739][SQL] Cast expression should initialize timezoneId when …
DonnyZone Aug 18, 2017
6c2a38a
[MINOR] Correct validateAndTransformSchema in GaussianMixture and AFT…
sharp-pixel Aug 20, 2017
0f640e9
[SPARK-21721][SQL][FOLLOWUP] Clear FileSystem deleteOnExit cache when…
viirya Aug 20, 2017
b8d83ee
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 21, 2017
526087f
[SPARK-21617][SQL] Store correct table metadata when altering schema …
Aug 21, 2017
4876824
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 23, 2017
236b2f4
[SPARK-21805][SPARKR] Disable R vignettes code on Windows
felixcheung Aug 24, 2017
a585367
[SPARK-21826][SQL] outer broadcast hash join should not throw NPE
cloud-fan Aug 24, 2017
2b4bd79
[SPARK-21681][ML] fix bug of MLOR do not work correctly when featureS…
WeichenXu123 Aug 24, 2017
4e7d45e
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 25, 2017
0d4ef2f
[SPARK-21818][ML][MLLIB] Fix bug of MultivariateOnlineSummarizer.vari…
WeichenXu123 Aug 28, 2017
59bb7eb
[SPARK-21798] No config to replace deprecated SPARK_CLASSPATH config …
Aug 28, 2017
24baf03
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 28, 2017
59529b2
[SPARK-21714][CORE][BACKPORT-2.2] Avoiding re-uploading remote resour…
jerryshao Aug 29, 2017
917fe66
Revert "[SPARK-21714][CORE][BACKPORT-2.2] Avoiding re-uploading remot…
Aug 29, 2017
a6a9944
[SPARK-21254][WEBUI] History UI performance fixes
2ooom Aug 30, 2017
952c577
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 30, 2017
d10c9dc
[SPARK-21714][CORE][BACKPORT-2.2] Avoiding re-uploading remote resour…
jerryshao Aug 30, 2017
14054ff
[SPARK-21834] Incorrect executor request in case of dynamic allocation
Aug 30, 2017
c412c77
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Aug 31, 2017
50f86e1
[SPARK-21884][SPARK-21477][BACKPORT-2.2][SQL] Mark LocalTableScanExec…
gatorsmile Sep 1, 2017
fb1b5f0
[SPARK-21418][SQL] NoSuchElementException: None.get in DataSourceScan…
srowen Sep 4, 2017
d0df025
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Sep 5, 2017
1f7c486
[SPARK-21925] Update trigger interval documentation in docs with beha…
brkyvz Sep 5, 2017
7da8fbf
[MINOR][DOC] Update `Partition Discovery` section to enumerate all av…
dongjoon-hyun Sep 5, 2017
9afab9a
[SPARK-21924][DOCS] Update structured streaming programming guide doc
Sep 6, 2017
a7d0b0a
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Sep 6, 2017
342cc2a
[SPARK-21901][SS] Define toString for StateOperatorProgress
jaceklaskowski Sep 6, 2017
49968de
Fixed pandoc dependency issue in python/setup.py
Sep 7, 2017
0848df1
[SPARK-21890] Credentials not being passed to add the tokens
Sep 7, 2017
4304d0b
[SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.tests.SQLTests2 should s…
ueshin Sep 8, 2017
781a1f8
[SPARK-21915][ML][PYSPARK] Model 1 and Model 2 ParamMaps Missing
marktab Sep 8, 2017
08cb06a
[SPARK-21936][SQL][2.2] backward compatibility test framework for Hiv…
cloud-fan Sep 8, 2017
9ae7c96
[SPARK-21946][TEST] fix flaky test: "alter table: rename cached table…
kiszk Sep 8, 2017
9876821
[SPARK-21128][R][BACKPORT-2.2] Remove both "spark-warehouse" and "met…
HyukjinKwon Sep 8, 2017
182478e
[SPARK-21954][SQL] JacksonUtils should verify MapType's value type in…
viirya Sep 9, 2017
b1b5a7f
[SPARK-20098][PYSPARK] dataType's typeName fix
szalai1 Sep 10, 2017
10c6836
[SPARK-21976][DOC] Fix wrong documentation for Mean Absolute Error.
FavioVazquez Sep 12, 2017
63098dc
[DOCS] Fix unreachable links in the document
sarutak Sep 12, 2017
c66ddce
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Sep 12, 2017
b606dc1
[SPARK-18608][ML] Fix double caching
zhengruifeng Sep 12, 2017
30e7298
parquet versioning
markhamstra Sep 12, 2017
7966c84
style fix
markhamstra Sep 12, 2017
3a692e3
[SPARK-21980][SQL] References in grouping functions should be indexed…
DonnyZone Sep 13, 2017
0e8f032
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Sep 13, 2017
51e5a82
[SPARK-18608][ML][FOLLOWUP] Fix double caching for PySpark OneVsRest.
yanboliang Sep 14, 2017
42852bb
[SPARK-21985][PYSPARK] PairDeserializer is broken for double-zipped RDDs
aray Sep 17, 2017
309c401
[SPARK-21953] Show both memory and disk bytes spilled if either is pr…
ash211 Sep 18, 2017
a86831d
[SPARK-22043][PYTHON] Improves error message for show_profiles and du…
HyukjinKwon Sep 18, 2017
48d6aef
[SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSuite
cloud-fan Sep 18, 2017
504732d
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Sep 18, 2017
dfbc6a5
Parquet versioning
markhamstra Sep 18, 2017
d0f83de
Merge branch 'csd-2.2' of github.com:markhamstra/spark into csd-2.2
markhamstra Sep 18, 2017
d0234eb
[SPARK-22047][FLAKY TEST] HiveExternalCatalogVersionsSuite
cloud-fan Sep 19, 2017
6764408
[SPARK-22052] Incorrect Metric assigned in MetricsReporter.scala
Taaffy Sep 19, 2017
5d10586
[SPARK-22076][SQL] Expand.projections should not be a Stream
cloud-fan Sep 20, 2017
401ac20
[SPARK-21384][YARN] Spark + YARN fails with LocalFileSystem as defaul…
Sep 20, 2017
765fd92
[SPARK-21928][CORE] Set classloader on SerializerManager's private kryo
squito Sep 21, 2017
090b987
[SPARK-22094][SS] processAllAvailable should check the query state
zsxwing Sep 22, 2017
de6274a
[SPARK-22072][SPARK-22071][BUILD] Improve release build scripts
holdenk Sep 22, 2017
c0a34a9
[SPARK-18136] Fix SPARK_JARS_DIR for Python pip install on Windows
jsnowacki Sep 23, 2017
1a829df
[SPARK-22092] Reallocation in OffHeapColumnVector.reserveInternal cor…
ala Sep 23, 2017
211d81b
[SPARK-22109][SQL][BRANCH-2.2] Resolves type conflicts between string…
HyukjinKwon Sep 23, 2017
8acce00
[SPARK-22107] Change as to alias in python quickstart
Sep 25, 2017
9836ea1
[SPARK-22083][CORE] Release locks in MemoryStore.evictBlocksToFreeSpace
squito Sep 25, 2017
d2b369a
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Sep 25, 2017
b0f30b5
[SPARK-22120][SQL] TestHiveSparkSession.reset() should clean out Hive…
Sep 25, 2017
8f39361
Merge branch 'csd-2.2' of github.com:markhamstra/spark into csd-2.2
markhamstra Sep 25, 2017
a406473
[SPARK-22141][BACKPORT][SQL] Propagate empty relation before checking…
gengliangwang Sep 27, 2017
6dbda6e
Merge branch 'branch-2.2' of github.com:apache/spark into csd-2.2
markhamstra Sep 27, 2017
28ae8fd
Merge branch 'csd-2.2' of github.com:markhamstra/spark into csd-2.2
markhamstra Sep 27, 2017
ef02a07
SPY-1429
ianlcsd Sep 26, 2017
11 changes: 6 additions & 5 deletions LICENSE
@@ -249,11 +249,11 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(Interpreter classes (all .scala files in repl/src/main/scala
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala),
and for SerializableMapWrapper in JavaUtils.scala)
(BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Library (org.scala-lang:scala-library:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scalap (org.scala-lang:scalap:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.8 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.8 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.8 - http://www.scala-lang.org/)
(BSD-like) Scala Library (org.scala-lang:scala-library:2.11.8 - http://www.scala-lang.org/)
(BSD-like) Scalap (org.scala-lang:scalap:2.11.8 - http://www.scala-lang.org/)
(BSD-style) scalacheck (org.scalacheck:scalacheck_2.11:1.10.0 - http://www.scalacheck.org)
(BSD-style) spire (org.spire-math:spire_2.11:0.7.1 - http://spire-math.org)
(BSD-style) spire-macros (org.spire-math:spire-macros_2.11:0.7.1 - http://spire-math.org)
@@ -297,3 +297,4 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(MIT License) RowsGroup (http://datatables.net/license/mit)
(MIT License) jsonFormatter (http://www.jqueryscript.net/other/jQuery-Plugin-For-Pretty-JSON-Formatting-jsonFormatter.html)
(MIT License) modernizr (https://github.com/Modernizr/Modernizr/blob/master/LICENSE)
(MIT License) machinist (https://github.com/typelevel/machinist)
6 changes: 1 addition & 5 deletions R/README.md
@@ -66,11 +66,7 @@ To run one of them, use `./bin/spark-submit <filename> <args>`. For example:
```bash
./bin/spark-submit examples/src/main/r/dataframe.R
```
You can also run the unit tests for SparkR by running. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:
```bash
R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
./R/run-tests.sh
```
You can run R unit tests by following the instructions under [Running R Tests](http://spark.apache.org/docs/latest/building-spark.html#running-r-tests).

### Running on YARN

3 changes: 1 addition & 2 deletions R/WINDOWS.md
@@ -34,10 +34,9 @@ To run the SparkR unit tests on Windows, the following steps are required —ass

4. Set the environment variable `HADOOP_HOME` to the full path to the newly created `hadoop` directory.

5. Run unit tests for SparkR by running the command below. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:
5. Run unit tests for SparkR by running the command below. You need to install the needed packages following the instructions under [Running R Tests](http://spark.apache.org/docs/latest/building-spark.html#running-r-tests) first:

```
R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
.\bin\spark-submit2.cmd --conf spark.hadoop.fs.defaultFS="file:///" R\pkg\tests\run-all.R
```

1 change: 1 addition & 0 deletions R/pkg/.Rbuildignore
@@ -6,3 +6,4 @@
^README\.Rmd$
^src-native$
^html$
^tests/fulltests/*
4 changes: 2 additions & 2 deletions R/pkg/DESCRIPTION
@@ -1,8 +1,8 @@
Package: SparkR
Type: Package
Version: 2.2.0
Version: 2.2.1
Title: R Frontend for Apache Spark
Description: The SparkR package provides an R Frontend for Apache Spark.
Description: Provides an R Frontend for Apache Spark.
Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),
email = "shivaram@cs.berkeley.edu"),
person("Xiangrui", "Meng", role = "aut",
1 change: 1 addition & 0 deletions R/pkg/NAMESPACE
@@ -122,6 +122,7 @@ exportMethods("arrange",
"group_by",
"groupBy",
"head",
"hint",
"insertInto",
"intersect",
"isLocal",
33 changes: 32 additions & 1 deletion R/pkg/R/DataFrame.R
@@ -591,7 +591,7 @@ setMethod("cache",
#'
#' Persist this SparkDataFrame with the specified storage level. For details of the
#' supported storage levels, refer to
#' \url{http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence}.
#' \url{http://spark.apache.org/docs/latest/rdd-programming-guide.html#rdd-persistence}.
#'
#' @param x the SparkDataFrame to persist.
#' @param newLevel storage level chosen for the persistence. See available options in
@@ -2642,6 +2642,7 @@ generateAliasesForIntersectedCols <- function (x, intersectedColNames, suffix) {
#' Input SparkDataFrames can have different schemas (names and data types).
#'
#' Note: This does not remove duplicate rows across the two SparkDataFrames.
#' Also as standard in SQL, this function resolves columns by position (not by name).
#'
#' @param x A SparkDataFrame
#' @param y A SparkDataFrame
@@ -3642,3 +3643,33 @@ setMethod("checkpoint",
df <- callJMethod(x@sdf, "checkpoint", as.logical(eager))
dataFrame(df)
})

#' hint
#'
#' Specifies an execution plan hint and returns a new SparkDataFrame.
#'
#' @param x a SparkDataFrame.
#' @param name a name of the hint.
#' @param ... optional parameters for the hint.
#' @return A SparkDataFrame.
#' @family SparkDataFrame functions
#' @aliases hint,SparkDataFrame,character-method
#' @rdname hint
#' @name hint
#' @export
#' @examples
#' \dontrun{
#' df <- createDataFrame(mtcars)
#' avg_mpg <- mean(groupBy(createDataFrame(mtcars), "cyl"), "mpg")
#'
#' head(join(df, hint(avg_mpg, "broadcast"), df$cyl == avg_mpg$cyl))
#' }
#' @note hint since 2.2.0
setMethod("hint",
signature(x = "SparkDataFrame", name = "character"),
function(x, name, ...) {
parameters <- list(...)
stopifnot(all(sapply(parameters, is.character)))
jdf <- callJMethod(x@sdf, "hint", name, parameters)
dataFrame(jdf)
})
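For reference, a minimal sketch of how the new `hint` method is used (it follows the roxygen example above; `avg_mpg` is rebuilt from `mtcars` so the two join sides carry independent column references):

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(mtcars)
avg_mpg <- mean(groupBy(createDataFrame(mtcars), "cyl"), "mpg")

# "broadcast" asks the optimizer to broadcast the small aggregated side of
# the join; any extra hint parameters must be character strings (see the
# stopifnot() check in the method body)
head(join(df, hint(avg_mpg, "broadcast"), df$cyl == avg_mpg$cyl))
```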
2 changes: 1 addition & 1 deletion R/pkg/R/RDD.R
@@ -227,7 +227,7 @@ setMethod("cacheRDD",
#'
#' Persist this RDD with the specified storage level. For details of the
#' supported storage levels, refer to
#'\url{http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence}.
#'\url{http://spark.apache.org/docs/latest/rdd-programming-guide.html#rdd-persistence}.
#'
#' @param x The RDD to persist
#' @param newLevel The new storage level to be assigned
6 changes: 3 additions & 3 deletions R/pkg/R/SQLContext.R
@@ -334,7 +334,7 @@ setMethod("toDF", signature(x = "RDD"),
#'
#' Loads a JSON file, returning the result as a SparkDataFrame
#' By default, (\href{http://jsonlines.org/}{JSON Lines text format or newline-delimited JSON}
#' ) is supported. For JSON (one record per file), set a named property \code{wholeFile} to
#' ) is supported. For JSON (one record per file), set a named property \code{multiLine} to
#' \code{TRUE}.
#' It goes through the entire dataset once to determine the schema.
#'
@@ -348,7 +348,7 @@ setMethod("toDF", signature(x = "RDD"),
#' sparkR.session()
#' path <- "path/to/file.json"
#' df <- read.json(path)
#' df <- read.json(path, wholeFile = TRUE)
#' df <- read.json(path, multiLine = TRUE)
#' df <- jsonFile(path)
#' }
#' @name read.json
@@ -598,7 +598,7 @@ tableToDF <- function(tableName) {
#' df1 <- read.df("path/to/file.json", source = "json")
#' schema <- structType(structField("name", "string"),
#' structField("info", "map<string,double>"))
#' df2 <- read.df(mapTypeJsonPath, "json", schema, wholeFile = TRUE)
#' df2 <- read.df(mapTypeJsonPath, "json", schema, multiLine = TRUE)
#' df3 <- loadDF("data/test_table", "parquet", mergeSchema = "true")
#' }
#' @name read.df
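For reference, a quick sketch of the renamed reader option (the path is a placeholder):

```r
# The option is now multiLine (previously wholeFile); TRUE lets a single JSON
# record span multiple lines instead of requiring JSON Lines input
df <- read.json("path/to/file.json", multiLine = TRUE)
```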
6 changes: 5 additions & 1 deletion R/pkg/R/generics.R
@@ -572,6 +572,10 @@ setGeneric("group_by", function(x, ...) { standardGeneric("group_by") })
#' @export
setGeneric("groupBy", function(x, ...) { standardGeneric("groupBy") })

#' @rdname hint
#' @export
setGeneric("hint", function(x, name, ...) { standardGeneric("hint") })

#' @rdname insertInto
#' @export
setGeneric("insertInto", function(x, tableName, ...) { standardGeneric("insertInto") })
@@ -1469,7 +1473,7 @@ setGeneric("write.ml", function(object, path, ...) { standardGeneric("write.ml")

#' @rdname awaitTermination
#' @export
setGeneric("awaitTermination", function(x, timeout) { standardGeneric("awaitTermination") })
setGeneric("awaitTermination", function(x, timeout = NULL) { standardGeneric("awaitTermination") })

#' @rdname isActive
#' @export
8 changes: 6 additions & 2 deletions R/pkg/R/install.R
@@ -267,10 +267,14 @@ hadoopVersionName <- function(hadoopVersion) {
# The implementation refers to appdirs package: https://pypi.python.org/pypi/appdirs and
# adapt to Spark context
sparkCachePath <- function() {
if (.Platform$OS.type == "windows") {
if (is_windows()) {
winAppPath <- Sys.getenv("LOCALAPPDATA", unset = NA)
if (is.na(winAppPath)) {
stop(paste("%LOCALAPPDATA% not found.",
message("%LOCALAPPDATA% not found. Falling back to %USERPROFILE%.")
winAppPath <- Sys.getenv("USERPROFILE", unset = NA)
}
if (is.na(winAppPath)) {
stop(paste("%LOCALAPPDATA% and %USERPROFILE% not found.",
"Please define the environment variable",
"or restart and enter an installation path in localDir."))
} else {
42 changes: 18 additions & 24 deletions R/pkg/R/mllib_classification.R
@@ -46,22 +46,25 @@ setClass("MultilayerPerceptronClassificationModel", representation(jobj = "jobj"
#' @note NaiveBayesModel since 2.0.0
setClass("NaiveBayesModel", representation(jobj = "jobj"))

#' linear SVM Model
#' Linear SVM Model
#'
#' Fits an linear SVM model against a SparkDataFrame. It is a binary classifier, similar to svm in glmnet package
#' Fits a linear SVM model against a SparkDataFrame, similar to svm in e1071 package.
#' Currently only supports binary classification model with linear kernel.
#' Users can print, make predictions on the produced model and save the model to the input path.
#'
#' @param data SparkDataFrame for training.
#' @param formula A symbolic description of the model to be fitted. Currently only a few formula
#' operators are supported, including '~', '.', ':', '+', and '-'.
#' @param regParam The regularization parameter.
#' @param regParam The regularization parameter. Only supports L2 regularization currently.
#' @param maxIter Maximum iteration number.
#' @param tol Convergence tolerance of iterations.
#' @param standardization Whether to standardize the training features before fitting the model. The coefficients
#' of models will be always returned on the original scale, so it will be transparent for
#' users. Note that with/without standardization, the models should be always converged
#' to the same solution when no regularization is applied.
#' @param threshold The threshold in binary classification, in range [0, 1].
#' @param threshold The threshold in binary classification applied to the linear model prediction.
#' This threshold can be any real number, where Inf will make all predictions 0.0
#' and -Inf will make all predictions 1.0.
#' @param weightCol The weight column name.
#' @param aggregationDepth The depth for treeAggregate (greater than or equal to 2). If the dimensions of features
#' or the number of partitions are large, this param could be adjusted to a larger size.
@@ -111,10 +114,10 @@ setMethod("spark.svmLinear", signature(data = "SparkDataFrame", formula = "formu
new("LinearSVCModel", jobj = jobj)
})

# Predicted values based on an LinearSVCModel model
# Predicted values based on a LinearSVCModel model

#' @param newData a SparkDataFrame for testing.
#' @return \code{predict} returns the predicted values based on an LinearSVCModel.
#' @return \code{predict} returns the predicted values based on a LinearSVCModel.
#' @rdname spark.svmLinear
#' @aliases predict,LinearSVCModel,SparkDataFrame-method
#' @export
@@ -124,36 +127,27 @@ setMethod("predict", signature(object = "LinearSVCModel"),
predict_internal(object, newData)
})

# Get the summary of an LinearSVCModel
# Get the summary of a LinearSVCModel

#' @param object an LinearSVCModel fitted by \code{spark.svmLinear}.
#' @param object a LinearSVCModel fitted by \code{spark.svmLinear}.
#' @return \code{summary} returns summary information of the fitted model, which is a list.
#' The list includes \code{coefficients} (coefficients of the fitted model),
#' \code{intercept} (intercept of the fitted model), \code{numClasses} (number of classes),
#' \code{numFeatures} (number of features).
#' \code{numClasses} (number of classes), \code{numFeatures} (number of features).
#' @rdname spark.svmLinear
#' @aliases summary,LinearSVCModel-method
#' @export
#' @note summary(LinearSVCModel) since 2.2.0
setMethod("summary", signature(object = "LinearSVCModel"),
function(object) {
jobj <- object@jobj
features <- callJMethod(jobj, "features")
labels <- callJMethod(jobj, "labels")
coefficients <- callJMethod(jobj, "coefficients")
nCol <- length(coefficients) / length(features)
coefficients <- matrix(unlist(coefficients), ncol = nCol)
intercept <- callJMethod(jobj, "intercept")
features <- callJMethod(jobj, "rFeatures")
coefficients <- callJMethod(jobj, "rCoefficients")
coefficients <- as.matrix(unlist(coefficients))
colnames(coefficients) <- c("Estimate")
rownames(coefficients) <- unlist(features)
numClasses <- callJMethod(jobj, "numClasses")
numFeatures <- callJMethod(jobj, "numFeatures")
if (nCol == 1) {
colnames(coefficients) <- c("Estimate")
} else {
colnames(coefficients) <- unlist(labels)
}
rownames(coefficients) <- unlist(features)
list(coefficients = coefficients, intercept = intercept,
numClasses = numClasses, numFeatures = numFeatures)
list(coefficients = coefficients, numClasses = numClasses, numFeatures = numFeatures)
})

# Save fitted LinearSVCModel to the input path
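For reference, a minimal usage sketch of `spark.svmLinear` against the revised `summary` output (the two-class subset of `iris` is illustrative, since only binary classification is supported):

```r
training <- createDataFrame(iris[iris$Species %in% c("setosa", "versicolor"), ])
model <- spark.svmLinear(training, Species ~ ., regParam = 0.01)

# After this change, summary() returns coefficients (a single "Estimate"
# column), numClasses, and numFeatures; the separate intercept entry is gone
summary(model)

head(predict(model, training))
```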
14 changes: 10 additions & 4 deletions R/pkg/R/streaming.R
@@ -169,8 +169,10 @@ setMethod("isActive",
#' immediately.
#'
#' @param x a StreamingQuery.
#' @param timeout time to wait in milliseconds
#' @return TRUE if query has terminated within the timeout period.
#' @param timeout time to wait in milliseconds; if omitted, waits indefinitely until \code{stopQuery}
#' is called or an error has occurred.
#' @return TRUE if query has terminated within the timeout period; nothing if timeout is not
#' specified.
#' @rdname awaitTermination
#' @name awaitTermination
#' @aliases awaitTermination,StreamingQuery-method
@@ -182,8 +184,12 @@ setMethod("isActive",
#' @note experimental
setMethod("awaitTermination",
signature(x = "StreamingQuery"),
function(x, timeout) {
handledCallJMethod(x@ssq, "awaitTermination", as.integer(timeout))
function(x, timeout = NULL) {
if (is.null(timeout)) {
invisible(handledCallJMethod(x@ssq, "awaitTermination"))
} else {
handledCallJMethod(x@ssq, "awaitTermination", as.integer(timeout))
}
})

#' stopQuery
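For reference, a sketch of both calling conventions after this change (`sdf` stands in for any streaming SparkDataFrame):

```r
# Hypothetical in-memory sink; any streaming query works the same way
q <- write.stream(sdf, "memory", queryName = "sketch", outputMode = "append")

awaitTermination(q, 5000)  # returns TRUE/FALSE within roughly 5 seconds
# awaitTermination(q)      # omitting timeout now blocks until stopQuery() or an error

stopQuery(q)
```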
12 changes: 12 additions & 0 deletions R/pkg/R/utils.R
@@ -899,3 +899,15 @@ basenameSansExtFromUrl <- function(url) {
isAtomicLengthOne <- function(x) {
is.atomic(x) && length(x) == 1
}

is_windows <- function() {
.Platform$OS.type == "windows"
}

hadoop_home_set <- function() {
!identical(Sys.getenv("HADOOP_HOME"), "")
}

windows_with_hadoop <- function() {
!is_windows() || hadoop_home_set()
}
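For reference, how the three new helpers compose (behavior as defined above):

```r
is_windows()            # TRUE only when running on Windows
hadoop_home_set()       # TRUE when HADOOP_HOME is set to a non-empty value
windows_with_hadoop()   # TRUE everywhere except Windows without HADOOP_HOME,
                        # i.e. the environments where Hadoop-backed tests can run
```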