
Branch 2.0 #18311

Closed
wants to merge 1,644 commits
Commits
380b099
[SPARK-17612][SQL][BRANCH-2.0] Support `DESCRIBE table PARTITION` SQL…
dongjoon-hyun Oct 7, 2016
3487b02
[SPARK-17805][PYSPARK] Fix in sqlContext.read.text when pass in list …
BryanCutler Oct 7, 2016
9f2eb27
[SPARK-17707][WEBUI] Web UI prevents spark-submit application to be f…
srowen Oct 7, 2016
f460a19
[SPARK-17346][SQL][TEST-MAVEN] Add Kafka source for Structured Stream…
zsxwing Oct 7, 2016
a84d8ef
[SPARK-17782][STREAMING][BUILD] Add Kafka 0.10 project to build modules
hvanhovell Oct 7, 2016
6d056c1
[SPARK-17806] [SQL] fix bug in join key rewritten in HashJoin
Oct 7, 2016
d27df35
[SPARK-17832][SQL] TableIdentifier.quotedString creates un-parseable …
jiangxb1987 Oct 10, 2016
d719e9a
[SPARK-17417][CORE] Fix # of partitions for Reliable RDD checkpointing
dhruve Oct 10, 2016
ff9f5bb
[SPARK-17738][TEST] Fix flaky test in ColumnTypeSuite
Oct 11, 2016
a6b5e1d
[SPARK-17346][SQL][TESTS] Fix the flaky topic deletion in KafkaSource…
zsxwing Oct 11, 2016
5ec3e66
[SPARK-17816][CORE][BRANCH-2.0] Fix ConcurrentModificationException i…
seyfe Oct 11, 2016
e68e95e
Fix hadoop.version in building-spark.md
apivovarov Oct 12, 2016
f3d82b5
[SPARK-17880][DOC] The url linking to `AccumulatorV2` in the document…
sarutak Oct 12, 2016
f12b74c
[SPARK-17853][STREAMING][KAFKA][DOC] make it clear that reusing group…
koeninger Oct 12, 2016
4dcbde4
[SPARK-17808][PYSPARK] Upgraded version of Pyrolite to 4.13
BryanCutler Oct 11, 2016
5451541
[SPARK-17884][SQL] To resolve Null pointer exception when casting fro…
priyankagar Oct 12, 2016
d55ba30
[SPARK-17790][SPARKR] Support for parallelizing R data.frame larger t…
falaki Oct 12, 2016
050b817
[SPARK-17782][STREAMING][KAFKA] alternative eliminate race condition …
koeninger Oct 12, 2016
5903dab
[SPARK-16827][BRANCH-2.0] Avoid reporting spill metrics as shuffle me…
bchocho Oct 13, 2016
ab00e41
[SPARK-17876] Write StructuredStreaming WAL to a stream instead of ma…
brkyvz Oct 13, 2016
d38f38a
minor doc fix for Row.scala
david-weiluo-ren Oct 13, 2016
d7fa3e3
[SPARK-17834][SQL] Fetch the earliest offsets manually in KafkaSource…
zsxwing Oct 13, 2016
c53b837
[SPARK-17863][SQL] should not add column into Distinct
Oct 14, 2016
2a1b10b
[SPARK-17953][DOCUMENTATION] Fix typo in SparkSession scaladoc
tae-jun Oct 15, 2016
3cc2fe5
[SPARK-17819][SQL][BRANCH-2.0] Support default database in connection…
dongjoon-hyun Oct 17, 2016
ca66f52
[MINOR][SQL] Add prettyName for current_database function
weiqingy Oct 17, 2016
d1a0211
[SPARK-17892][SQL][2.0] Do Not Optimize Query in CTAS More Than Once …
gatorsmile Oct 17, 2016
a0d9015
Fix example of tf_idf with minDocFreq
maximerihouey Oct 17, 2016
881e0eb
[SPARK-17731][SQL][STREAMING] Metrics for structured streaming for br…
tdas Oct 17, 2016
01520de
[SQL][STREAMING][TEST] Fix flaky tests in StreamingQueryListenerSuite
lw-lin Oct 18, 2016
9e806f2
[SQL][STREAMING][TEST] Follow up to remove Option.contains for Scala …
tdas Oct 18, 2016
2aa2583
[SPARK-17751][SQL][BACKPORT-2.0] Remove spark.sql.eagerAnalysis and O…
gatorsmile Oct 18, 2016
26e978a
[SPARK-17711] Compress rolled executor log
loneknightpy Oct 18, 2016
6ef9231
[MINOR][DOC] Add more built-in sources in sql-programming-guide.md
weiqingy Oct 18, 2016
f6b8793
[SPARK-17841][STREAMING][KAFKA] drain commitQueue
koeninger Oct 18, 2016
99943bf
[SPARK-17731][SQL][STREAMING][FOLLOWUP] Refactored StreamingQueryList…
tdas Oct 19, 2016
3796a98
[SPARK-17711][TEST-HADOOP2.2] Fix hadoop2.2 compilation error
loneknightpy Oct 19, 2016
cdd2570
[SPARK-18001][DOCUMENT] fix broke link to SparkDataFrame
Wenpei Oct 19, 2016
995f602
[SPARK-17989][SQL] Check ascendingOrder type in sort_array function r…
HyukjinKwon Oct 20, 2016
4131623
[SPARK-18003][SPARK CORE] Fix bug of RDD zipWithIndex & zipWithUnique…
WeichenXu123 Oct 20, 2016
e8923d2
[SPARK-17999][KAFKA][SQL] Add getPreferredLocations for KafkaSourceRDD
jerryshao Oct 20, 2016
6cc6cb2
[SPARKR] fix warnings
felixcheung Oct 21, 2016
a65d40a
[SPARK-18034] Upgrade to MiMa 0.1.11 to fix flakiness
JoshRosen Oct 21, 2016
78458a7
[SPARK-17811] SparkR cannot parallelize data.frame with NA or NULL in…
falaki Oct 21, 2016
af2e6e0
[SPARK-17926][SQL][STREAMING] Added json for statuses
tdas Oct 21, 2016
b113b5d
[SPARK-17929][CORE] Fix deadlock when CoarseGrainedSchedulerBackend r…
scwf Oct 21, 2016
3e9840f
[SPARK-17812][SQL][KAFKA] Assign and specific startingOffsets for str…
koeninger Oct 21, 2016
d3c78c4
[STREAMING][KAFKA][DOC] clarify kafka settings needed for larger batches
koeninger Oct 21, 2016
a0c03c9
[SPARK-16606][MINOR] Tiny follow-up to , to correct more instances of…
srowen Oct 22, 2016
b959dab
[SPARK-17986][ML] SQLTransformer should remove temporary tables
drewrobb Oct 22, 2016
3d58787
[SPARK-17698][SQL] Join predicates should not contain filter clauses
tejasapatil Oct 22, 2016
e21e9d4
[SPARK-17123][SQL][BRANCH-2.0] Use type-widened encoder for DataFrame…
HyukjinKwon Oct 23, 2016
0e0d83a
[SPARKR][BRANCH-2.0] R merge API doc and example fix
felixcheung Oct 23, 2016
00a2e01
[SPARK-18058][SQL] [BRANCH-2.0]Comparing column types ignoring Nullab…
CodingCat Oct 23, 2016
064db17
[SPARK-17810][SQL] Default spark.sql.warehouse.dir is relative to loc…
srowen Oct 24, 2016
aef65ac
[SPARK-17153][SQL] Should read partition data when reading new files …
viirya Sep 26, 2016
bad15bc
[SPARK-18044][STREAMING] FileStreamSource should not infer partitions…
cloud-fan Oct 21, 2016
1c1e847
[SPARK-17624][SQL][STREAMING][TEST] Fixed flaky StateStoreSuite.maint…
tdas Oct 25, 2016
7c8d9a5
[SPARK-18070][SQL] binary operator should not consider nullability wh…
cloud-fan Oct 25, 2016
912487e
[SPARK-16988][SPARK SHELL] spark history server log needs to be fixed…
Oct 25, 2016
c2cce2e
[SPARK-18022][SQL] java.lang.NullPointerException instead of real exc…
srowen Oct 26, 2016
192c1dd
[SPARK-17733][SQL] InferFiltersFromConstraints rule never terminates …
jiangxb1987 Oct 26, 2016
b4a7b65
[SPARK-18093][SQL] Fix default value test in SQLConfSuite to work rega…
markgrover Oct 26, 2016
773fbfe
[SPARK-16304] LinkageError should not crash Spark executor
petermaxlee Jul 6, 2016
5b81b01
[SPARK-18063][SQL] Failed to infer constraints over multiple aliases
jiangxb1987 Oct 26, 2016
b482b3d
[SPARK-18104][DOC] Don't build KafkaSource doc
zsxwing Oct 26, 2016
76b71ee
[SPARK-13747][SQL] Fix concurrent executions in ForkJoinPool for SQL …
zsxwing Oct 26, 2016
1c2908e
Preparing Spark release v2.0.2-rc1
pwendell Oct 26, 2016
72b3cff
Preparing development version 2.0.3-SNAPSHOT
pwendell Oct 26, 2016
ea205e3
[SPARK-16963][STREAMING][SQL] Changes to Source trait and related imp…
frreiss Oct 27, 2016
dcf2f09
[SPARK-18009][SQL] Fix ClassCastException while calling toLocalIterat…
dilipbiswal Oct 27, 2016
1a4be51
[SPARK-18132] Fix checkstyle
yhuai Oct 27, 2016
6fb1f73
[SPARK-17813][SQL][KAFKA] Maximum data per trigger
koeninger Oct 27, 2016
578e40e
[SPARK-16963][SQL] Fix test "StreamExecution metadata garbage collect…
zsxwing Oct 27, 2016
9ed8976
[SPARK-18164][SQL] ForeachSink should fail the Spark job if `process`…
zsxwing Oct 29, 2016
9f92474
[SPARK-16312][FOLLOW-UP][STREAMING][KAFKA][DOC] Add java code snippet…
lw-lin Oct 30, 2016
300d596
[SPARK-18143][SQL] Ignore Structured Streaming event logs to avoid br…
zsxwing Oct 31, 2016
e06f43e
[SPARK-18030][TESTS] Fix flaky FileStreamSourceSuite by not deleting …
zsxwing Oct 31, 2016
4d2672a
[SPARK-18114][MESOS] Fix mesos cluster scheduler generage command opt…
Nov 1, 2016
58655f5
[SPARK-18189][SQL] Fix serialization issue in KeyValueGroupedDataset
seyfe Nov 1, 2016
4176da8
[SPARK-18148][SQL] Misleading Error Message for Aggregation Without W…
jiangxb1987 Nov 1, 2016
a01b950
[SPARK-18114][HOTFIX] Fix line-too-long style error from backport of …
srowen Nov 1, 2016
a6abe1e
Preparing Spark release v2.0.2-rc2
pwendell Nov 1, 2016
d401a74
Preparing development version 2.0.3-SNAPSHOT
pwendell Nov 1, 2016
81f0804
[SPARK-18144][SQL] logging StreamingQueryListener$QueryStartedEvent
CodingCat Nov 2, 2016
09178b6
[SPARK-18133][BRANCH-2.0][EXAMPLES][ML] Python ML Pipeline Exampl…
jagadeesanas2 Nov 2, 2016
eb790c5
[SPARK-16796][WEB UI] Mask spark.authenticate.secret on Spark environ…
Devian-ua Aug 6, 2016
1696bcf
[SPARK-18160][CORE][YARN] spark.files & spark.jars should not be pass…
zjffdu Nov 2, 2016
3253ae7
[SPARK-18111][SQL] Wrong approximate quantile answer when multiple re…
wzhfy Nov 2, 2016
dae1581
[SPARK-18200][GRAPHX] Support zero as an initial capacity in OpenHashSet
dongjoon-hyun Nov 3, 2016
c864e8a
[SPARK-18200][GRAPHX][FOLLOW-UP] Support zero as an initial capacity …
dongjoon-hyun Nov 4, 2016
399597b
[SPARK-17337][SPARK-16804][SQL][BRANCH-2.0] Backport subquery related…
hvanhovell Nov 4, 2016
8b99e20
[SPARK-18189][SQL][FOLLOWUP] Move test from ReplSuite to prevent java…
seyfe Nov 5, 2016
d023c6c
[SPARK-17981][SPARK-17957][SQL][BACKPORT-2.0] Fix Incorrect Nullabili…
gatorsmile Nov 5, 2016
5b9eb42
[SPARK-17693][SQL][BACKPORT-2.0] Fixed Insert Failure To Data Source …
gatorsmile Nov 5, 2016
dd5cb0a
[SPARK-17849][SQL] Fix NPE problem when using grouping sets
Nov 5, 2016
b5d7217
[SPARK-18125][SQL][BRANCH-2.0] Fix a compilation error in codegen due…
viirya Nov 7, 2016
10525c2
[SPARK-18283][STRUCTURED STREAMING][KAFKA] Added test to check whethe…
tdas Nov 7, 2016
584354e
Preparing Spark release v2.0.2-rc3
pwendell Nov 7, 2016
a39f8c1
Preparing development version 2.0.3-SNAPSHOT
pwendell Nov 7, 2016
2a6850c
[SPARK-18137][SQL] Fix RewriteDistinctAggregates UnresolvedException …
Nov 8, 2016
f441b9a
[SPARK-17703][SQL][BACKPORT-2.0] Add unnamed version of addReferenceO…
ueshin Nov 8, 2016
8aa419b
[SPARK-18280][CORE] Fix potential deadlock in `StandaloneSchedulerBac…
zsxwing Nov 8, 2016
0cceb1b
[SPARK-18342] Make rename failures fatal in HDFSBackedStateStore
brkyvz Nov 8, 2016
bdddc66
[SPARK-18368] Fix regexp_replace with task serialization.
rdblue Nov 9, 2016
c8628e8
Revert "[SPARK-18368] Fix regexp_replace with task serialization."
yhuai Nov 9, 2016
6e73105
[SPARK-18368][SQL] Fix regexp replace when serialized
rdblue Nov 9, 2016
99575e8
[SPARK-18387][SQL] Add serialization to checkEvaluation.
rdblue Nov 11, 2016
80c1a1f
[SPARK-17982][SQL][BACKPORT-2.0] SQLBuilder should wrap the generated…
dongjoon-hyun Nov 12, 2016
a719c51
[SPARK-18426][STRUCTURED STREAMING] Python Documentation Fix for Stru…
Nov 14, 2016
646cc85
[SPARK-18382][WEBUI] "run at null:-1" in UI when no file/line info in…
srowen Nov 14, 2016
26ae5cf
[SPARK-18010][CORE] Reduce work performed for building up the applica…
vijoshi Nov 14, 2016
6663965
[SPARK-18432][DOC] Changed HDFS default block size from 64MB to 128MB
moomindani Nov 14, 2016
c40fcbc
[SPARK-18416][STRUCTURED STREAMING] Fixed temp file leak in state store
tdas Nov 14, 2016
9abff1b
[SPARK-17348][SQL] Incorrect results from subquery transformation
nsyca Nov 14, 2016
de545e7
[SPARK-16808][CORE] History Server main page does not honor APPLICATI…
vijoshi Nov 15, 2016
e2452c6
[SPARK-18337] Complete mode memory sinks should be able to recover fr…
brkyvz Nov 15, 2016
8d55886
[SPARK-18300][SQL] Do not apply foldable propagation with expand as a…
hvanhovell Nov 16, 2016
4f3f096
[SPARK-18400][STREAMING] NPE when resharding Kinesis Stream
srowen Nov 16, 2016
10b36d6
[SPARK-18430][SQL][BACKPORT-2.0] Fixed Exception Messages when Hittin…
gatorsmile Nov 16, 2016
37e6d99
[SPARK-18459][SPARK-18460][STRUCTUREDSTREAMING] Rename triggerId to b…
tdas Nov 16, 2016
da9d516
[SPARK-18462] Fix ClassCastException in SparkListenerDriverAccumUpdat…
JoshRosen Nov 18, 2016
9dad3a7
[SPARK-18477][SS] Enable interrupts for HDFS in HDFSMetadataLog
zsxwing Nov 19, 2016
a37238b
[SPARK-18444][SPARKR] SparkR running in yarn-cluster mode should not …
yanboliang Nov 22, 2016
072f4c5
[SPARK-18504][SQL] Scalar subquery with extra group by columns return…
nsyca Nov 22, 2016
aefeaa7
[SPARK-18053][SQL] compare unsafe and safe complex-type values correctly
cloud-fan Nov 23, 2016
f8ce884
[SPARK-18519][SQL][BRANCH-2.0] map type can not be used in EqualTo
cloud-fan Nov 23, 2016
0caab3e
[SPARK-18436][SQL] isin causing SQL syntax error with JDBC
jiangxb1987 Nov 25, 2016
e67ce48
[SPARK-17251][SQL] Improve `OuterReference` to be `NamedExpression`
dongjoon-hyun Nov 26, 2016
9070bd3
[SPARK-18118][SQL] fix a compilation error due to nested JavaBeans
kiszk Nov 28, 2016
759bd4a
[SPARK-18118][SQL] fix a compilation error due to nested JavaBeans
hvanhovell Nov 28, 2016
f158045
[SPARK-18597][SQL] Do not push-down join conditions to the left side …
hvanhovell Nov 28, 2016
9ff03fa
[SPARK-18553][CORE][BRANCH-2.0] Fix leak of TaskSetManager following …
JoshRosen Nov 28, 2016
bdd27d1
[SPARK-17783][SQL][BACKPORT-2.0] Hide Credentials in CREATE and DESC …
gatorsmile Nov 29, 2016
8b33aa0
[SPARK-17843][WEB UI] Indicate event logs pending for processing on h…
vijoshi Nov 30, 2016
1b1c849
[SPARK-18640] Add synchronization to TaskScheduler.runningTasksByExec…
JoshRosen Nov 30, 2016
5ecd3c2
[SPARK][EXAMPLE] Added missing semicolon in quick-start-guide example
Nov 30, 2016
6e3fd2b
[SPARK-18617][BACKPORT] Follow up PR to Close "kryo auto pick" featur…
uncleGen Dec 1, 2016
729cadb
[SPARK-18674][SQL] improve the error message of using join
cloud-fan Dec 1, 2016
254e33f
[SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWrapper
techaddict Dec 1, 2016
0758df6
[SPARK-18617][SPARK-18560][TESTS] Fix flaky test: StreamingContextSui…
zsxwing Dec 1, 2016
5f71d13
[SPARK-18677] Fix parsing ['key'] in JSON path expressions.
rdblue Dec 2, 2016
1f57385
[SPARK-18685][TESTS] Fix URI and release resources after opening in t…
HyukjinKwon Dec 3, 2016
dc61ed4
[SPARK-18091][SQL] Deep if expressions cause Generated SpecificUnsafe…
Dec 4, 2016
bde1d41
[SPARK-18634][PYSPARK][SQL] Corruption and Correctness issues with ex…
viirya Dec 6, 2016
f5c5a07
[SPARK-18634][SQL][TRIVIAL] Touch-up Generate
hvanhovell Dec 6, 2016
e05ad88
[SPARK-18208][SHUFFLE] Executor OOM due to a growing LongArray in Byt…
Dec 7, 2016
7fbb073
[SPARK-17760][SQL][BACKPORT] AnalysisException with dataframe pivot w…
aray Dec 7, 2016
44df6d2
[SPARK-18762][WEBUI] Web UI should be http:4040 instead of https:4040
sarutak Dec 7, 2016
65b4b05
[SPARK-17822][R] Make JVMObjectTracker a member variable of RBackend
mengxr Dec 9, 2016
2c342e5
[SPARK-18745][SQL] Fix signed integer overflow due to toInt cast
kiszk Dec 9, 2016
06f592c
[SQL][MINOR] simplify a test to fix the maven tests
cloud-fan Dec 11, 2016
1d5c7f4
[SPARK-18843][CORE] Fix timeout in awaitResultInForkJoinSafely (branc…
zsxwing Dec 13, 2016
2b18bd4
[SPARK-18853][SQL] Project (UnaryNode) is way too aggressive in estim…
rxin Dec 14, 2016
1ff738a
[SPARK-18854][SQL] numberedTreeString and apply(i) inconsistent for s…
rxin Dec 15, 2016
a5c178b
Revert "[SPARK-18854][SQL] numberedTreeString and apply(i) inconsiste…
rxin Dec 15, 2016
2085a10
Revert "Revert "[SPARK-18854][SQL] numberedTreeString and apply(i) in…
rxin Dec 15, 2016
a323178
Fix compilation error
rxin Dec 15, 2016
669815d
[SPARK-18869][SQL] Add TreeNode.p that returns BaseType
rxin Dec 15, 2016
d36ed9e
[SPARK-18875][SPARKR][DOCS] Fix R API doc generation by adding `DESCR…
dongjoon-hyun Dec 15, 2016
1935bf4
[SPARK-18897][SPARKR] Fix SparkR SQL Test to drop test table
dongjoon-hyun Dec 16, 2016
b416683
[SPARK-18827][CORE] Fix cannot read broadcast on disk
wangyum Dec 18, 2016
2a5ab14
Fix test case for SubquerySuite.
rxin Dec 19, 2016
1f0c5fa
[SPARK-18281] [SQL] [PYSPARK] Remove timeout for reading data through…
viirya Dec 20, 2016
678d91c
[SPARK-18761][BRANCH-2.0] Introduce "task reaper" to oversee task kil…
JoshRosen Dec 20, 2016
2aae220
[SPARK-18928][BRANCH-2.0] Check TaskContext.isInterrupted() in FileSc…
JoshRosen Dec 21, 2016
ef206ac
[SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.list…
cloud-fan Dec 21, 2016
5f8c0b7
[SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsS…
zsxwing Dec 21, 2016
53cd99f
[SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for each table's rel…
xuanyuanking Dec 21, 2016
080ac37
[SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation…
maropu Dec 22, 2016
542be40
[SPARK-18031][TESTS] Fix flaky test ExecutorAllocationManagerSuite.ba…
zsxwing Dec 21, 2016
2d72160
[SPARK-18972][CORE] Fix the netty thread names for RPC
zsxwing Dec 23, 2016
30e6d46
[SPARK-17807][CORE] split test-tags into test-JAR
ryan-williams Dec 22, 2016
f124d35
[SPARK-18237][SPARK-18703][SPARK-18675][SQL][BACKPORT-2.0] CTAS for h…
gatorsmile Dec 26, 2016
5ed2f1c
[SPARK-18993][BUILD] Unable to build/compile Spark in IntelliJ due to…
srowen Dec 28, 2016
93549ff
[SPARK-18877][SQL][BACKPORT-2.0] CSVInferSchema.inferField` on Decima…
dongjoon-hyun Jan 5, 2017
56998f3
[SPARK-19110][ML][MLLIB] DistributedLDAModel returns different logPri…
wangmiao1981 Jan 7, 2017
e70c419
[SPARK-18941][SQL][DOC] Add a new behavior document on `CREATE/DROP T…
dongjoon-hyun Jan 8, 2017
6fe676c
[SPARK-18997][CORE] Recommended upgrade libthrift to 0.9.3
srowen Jan 10, 2017
ec2fe92
[SPARK-19133][SPARKR][ML][BACKPORT-2.0] fix glm for Gamma, clarify gl…
felixcheung Jan 12, 2017
c94288b
[SPARK-18857][SQL] Don't use `Iterator.duplicate` for `incrementalCol…
dongjoon-hyun Jan 10, 2017
3566e40
[SPARK-18969][SQL] Support grouping by nondeterministic expressions
cloud-fan Jan 12, 2017
55d2a11
[SPARK-19055][SQL][PYSPARK] Fix SparkSession initialization when Spar…
viirya Jan 12, 2017
be527dd
Fix missing close-parens for In filter's toString
ash211 Jan 13, 2017
449231c
[SPARK-18687][PYSPARK][SQL] Backward compatibility - creating a Dataf…
vijoshi Jan 13, 2017
f56819f
[SPARK-19178][SQL] convert string of large numbers to int should retu…
cloud-fan Jan 13, 2017
08385b7
[SPARK-19180] [SQL] the offset of short should be 2 in OffHeapColumn
Jan 13, 2017
ee4e8fa
[SPARK-17237][SPARK-17458][SQL][BACKPORT-2.0] Preserve aliases that a…
maropu Jan 15, 2017
9fc053c
[SPARK-16968][SQL][BACKPORT-2.0] Add additional options in jdbc when …
GraceH Jan 19, 2017
4c2065d
[SPARK-19314][SS][CATALYST] Do not allow sort before aggregation in S…
tdas Jan 20, 2017
886f737
[SPARK-19155][ML] MLlib GeneralizedLinearRegression family and link s…
yanboliang Jan 22, 2017
2d9e8d5
[SPARK-18750][YARN] Avoid using "mapValues" when allocating containers.
Jan 25, 2017
00a4807
[SPARK-18750][YARN] Follow up: move test to correct directory in 2.1 …
Jan 25, 2017
48a8dc8
[SPARK-14804][SPARK][GRAPHX] Fix checkpointing of VertexRDD/EdgeRDD
tdas Jan 26, 2017
93d5887
[SPARK-19220][UI] Make redirection to HTTPS apply to all URIs. (branc…
Jan 27, 2017
b41294b
[SPARK-19333][SPARKR] Add Apache License headers to R files
felixcheung Jan 27, 2017
8bf6422
[SPARK-19472][SQL] Parser should not mistake CASE WHEN(...) for a fun…
hvanhovell Feb 6, 2017
00803cd
[SPARK-19509][SQL] Grouping Sets do not respect nullable grouping col…
Feb 9, 2017
23050c8
[SPARK-17897][SQL][BACKPORT-2.0] Fixed IsNotNull Constraint Inference…
gatorsmile Feb 12, 2017
f50c437
[SPARK-19529] TransportClientFactory.createClient() shouldn't call aw…
JoshRosen Feb 13, 2017
2926812
[SPARK-19501][YARN] Reduce the number of HDFS RPCs during YARN deploy…
jongwook Feb 14, 2017
5c3e56f
[SPARK-19500] [SQL] Fix off-by-one bug in BytesToBytesMap
Feb 17, 2017
ddd432d
[SPARK-19646][CORE][STREAMING] binaryRecords replicates records in sc…
srowen Feb 20, 2017
8cdd121
[SPARK-19652][UI] Do auth checks for REST API access (branch-2.0).
Feb 22, 2017
a6af60f
[SPARK-19038][YARN] Avoid overwriting keytab configuration in yarn-cl…
jerryshao Feb 24, 2017
dcfb05c
[SPARK-19677][SS] Committing a delta file atop an existing one should…
vitillo Feb 28, 2017
c9c45d9
[SPARK-19769][DOCS] Update quickstart instructions
elmiko Feb 28, 2017
e30fe1c
[SPARK-19766][SQL][BRANCH-2.0] Constant alias columns in INNER JOIN s…
stanzhai Mar 2, 2017
491b47a
[SPARK-19750][UI][BRANCH-2.1] Fix redirect issue from http to https
jerryshao Mar 3, 2017
7380188
[SPARK-19779][SS] Delete needless tmp file after restart structured s…
gf53520 Mar 3, 2017
c7e7b04
[SPARK-19822][TEST] CheckpointSuite.testCheckpointedOperation: should…
uncleGen Mar 6, 2017
0cc992c
[SPARK-16845][SQL][BRANCH-2.0] GeneratedClass$SpecificOrdering` grows…
lw-lin Mar 6, 2017
e699028
[SPARK-19348][PYTHON] PySpark keyword_only decorator is not thread-safe
BryanCutler Mar 8, 2017
da3dfaf
[SPARK-18055][SQL] Use correct mirror in ExpresionEncoder
marmbrus Mar 8, 2017
c561e6c
[SPARK-19481] [REPL] [MAVEN] Avoid to leak SparkContext in Signaling.…
zsxwing Feb 9, 2017
e8426cb
[SPARK-19893][SQL] should not run DataFrame set oprations with map type
cloud-fan Mar 11, 2017
fd5149a
hot fix for compilation error caused by PR#17236
cloud-fan Mar 15, 2017
6ee7d5b
[SPARK-19986][TESTS] Make pyspark.streaming.tests.CheckpointTests mor…
zsxwing Mar 17, 2017
3983b3d
[SPARK-19994][SQL] Wrong outputOrdering for right/full outer smj
wzhfy Mar 20, 2017
72a0ee3
[SPARK-19994][HOTFIX][BRANCH-2.0] Change InnerLike to Inner
wzhfy Mar 21, 2017
b45940e
[SPARK-17204][CORE] Fix replicated off heap storage
Mar 24, 2017
90eb373
[SPARK-19959][SQL] Fix to throw NullPointerException in df[java.lang…
kiszk Mar 24, 2017
15ea5ea
[SPARK-20223][SQL] Fix typo in tpcds q77.sql
wzhfy Apr 5, 2017
9016e17
[SPARK-20214][ML] Make sure converted csc matrix has sorted indices
viirya Apr 6, 2017
a0b499f
[SPARK-20246][SQL] should not push predicate down through aggregate w…
cloud-fan Apr 8, 2017
87be965
[SPARK-20285][TESTS] Increase the pyspark streaming test timeout to 3…
zsxwing Apr 10, 2017
735e203
[SPARK-18555][SQL] DataFrameNaFunctions.fill miss up original values …
Dec 6, 2016
aec3752
[SPARK-20270][SQL] na.fill should not change the values in long or in…
Apr 10, 2017
123a758
[MINOR][SQL] Fix the @since tag when backporting SPARK-18555 from 2.2…
dbtsai Apr 11, 2017
24f6ef2
[SPARK-20291][SQL][BACKPORT] NaNvl(FloatType, NullType) should not be…
Apr 12, 2017
84be4c8
[SPARK-19019][PYTHON][BRANCH-2.0] Fix hijacked `collections.namedtupl…
HyukjinKwon Apr 17, 2017
ddf6dd8
[SPARK-20451] Filter out nested mapType datatypes from sort order in …
sameeragarwal Apr 25, 2017
068500a
[SPARK-20239][CORE][2.1-BACKPORT] Improve HistoryServer's ACL mechanism
jerryshao Apr 25, 2017
4665997
[SPARK-20558][CORE] clear InheritableThreadLocal variables in SparkCo…
cloud-fan May 3, 2017
d86dae8
[SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsisten…
zero323 May 10, 2017
b2d0ed2
[SPARK-20665][SQL] Bround" and "Round" function return NULL
10110346 May 12, 2017
9b145c6
[SPARK-17424] Fix unsound substitution bug in ScalaReflection.
rdblue May 12, 2017
4dd34d0
[SPARK-20756][YARN] yarn-shuffle jar references unshaded guava
markgrover May 22, 2017
72e1f83
[SPARK-20862][MLLIB][PYTHON] Avoid passing float to ndarray.reshape i…
MrBago May 24, 2017
79fbfbb
[SPARK-18406][CORE][BACKPORT-2.0] Race between end-of-task and comple…
jiangxb1987 May 24, 2017
ef0ebdd
[SPARK-20250][CORE] Improper OOM error when a task been killed while …
ConeyLiu May 25, 2017
9846a3c
[SPARK-20868][CORE] UnsafeShuffleWriter should verify the position af…
cloud-fan May 26, 2017
cd870c0
[SPARK-20940][CORE] Replace IllegalAccessError with IllegalStateExcep…
zsxwing Jun 1, 2017
f7cbf90
[SPARK-20922][CORE] Add whitelist of classes that can be deserialized…
Jun 1, 2017
9952b53
[SPARK-20922][CORE][HOTFIX] Don't use Java 8 lambdas in older branches.
Jun 1, 2017
0f35988
[SPARK-20974][BUILD] we should run REPL tests if SQL module has code …
cloud-fan Jun 3, 2017
0239b16
[SPARK-20211][SQL][BACKPORT-2.2] Fix the Precision and Scale of Decim…
gatorsmile Jun 14, 2017
7efd475
[SPARK-16251][SPARK-20200][CORE][TEST] Flaky test: org.apache.spark.r…
jiangxb1987 Jun 15, 2017
333924e
[SPARK-19688][STREAMING] Not to read `spark.yarn.credentials.file` fr…
Jun 19, 2017
44a97f7
[SPARK-21138][YARN] Cannot delete staging dir when the clusters of "s…
Jun 19, 2017
7 changes: 7 additions & 0 deletions .gitignore
@@ -22,6 +22,7 @@
/lib/
R-unit-tests.log
R/unit-tests.out
R/cran-check.out
build/*.jar
build/apache-maven*
build/scala*
@@ -72,7 +73,13 @@ metastore/
metastore_db/
sql/hive-thriftserver/test_warehouses
warehouse/
spark-warehouse/

# For R session data
.RData
.RHistory
.Rhistory
*.Rproj
*.Rproj.*

.Rproj.user
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -6,7 +6,7 @@ It lists steps that are required before creating a PR. In particular, consider:

- Is the change important and ready enough to ask the community to spend time reviewing?
- Have you searched for existing, related JIRAs and pull requests?
- Is this a new feature that can stand alone as a package on http://spark-packages.org ?
- Is this a new feature that can stand alone as a [third party project](https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects) ?
- Is the change being proposed clearly explained and motivated?

When you contribute code, you affirm that the contribution is your original work and that you
3 changes: 2 additions & 1 deletion LICENSE
@@ -263,7 +263,7 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(New BSD license) Protocol Buffer Java API (org.spark-project.protobuf:protobuf-java:2.4.1-shaded - http://code.google.com/p/protobuf)
(The BSD License) Fortran to Java ARPACK (net.sourceforge.f2j:arpack_combined_all:0.1 - http://f2j.sourceforge.net)
(The BSD License) xmlenc Library (xmlenc:xmlenc:0.52 - http://xmlenc.sourceforge.net)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.9.2 - http://py4j.sourceforge.net/)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.10.3 - http://py4j.sourceforge.net/)
(Two-clause BSD-style license) JUnit-Interface (com.novocode:junit-interface:0.10 - http://github.com/szeiger/junit-interface/)
(BSD licence) sbt and sbt-launch-lib.bash
(BSD 3 Clause) d3.min.js (https://github.com/mbostock/d3/blob/master/LICENSE)
@@ -296,3 +296,4 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(MIT License) blockUI (http://jquery.malsup.com/block/)
(MIT License) RowsGroup (http://datatables.net/license/mit)
(MIT License) jsonFormatter (http://www.jqueryscript.net/other/jQuery-Plugin-For-Pretty-JSON-Formatting-jsonFormatter.html)
(MIT License) modernizr (https://github.com/Modernizr/Modernizr/blob/master/LICENSE)
13 changes: 5 additions & 8 deletions NOTICE
@@ -1,5 +1,5 @@
Apache Spark
Copyright 2014 The Apache Software Foundation.
Copyright 2014 and onwards The Apache Software Foundation.

This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
@@ -12,7 +12,9 @@ Common Development and Distribution License 1.0
The following components are provided under the Common Development and Distribution License 1.0. See project link for details.

(CDDL 1.0) Glassfish Jasper (org.mortbay.jetty:jsp-2.1:6.1.14 - http://jetty.mortbay.org/project/modules/jsp-2.1)
(CDDL 1.0) JAX-RS (https://jax-rs-spec.java.net/)
(CDDL 1.0) Servlet Specification 2.5 API (org.mortbay.jetty:servlet-api-2.5:6.1.14 - http://jetty.mortbay.org/project/modules/servlet-api-2.5)
(CDDL 1.0) (GPL2 w/ CPE) javax.annotation API (https://glassfish.java.net/nonav/public/CDDL+GPL.html)
(COMMON DEVELOPMENT AND DISTRIBUTION LICENSE (CDDL) Version 1.0) (GNU General Public Library) Streaming API for XML (javax.xml.stream:stax-api:1.0-2 - no url defined)
(Common Development and Distribution License (CDDL) v1.0) JavaBeans Activation Framework (JAF) (javax.activation:activation:1.1 - http://java.sun.com/products/javabeans/jaf/index.jsp)

@@ -22,15 +24,10 @@ Common Development and Distribution License 1.1

The following components are provided under the Common Development and Distribution License 1.1. See project link for details.

(CDDL 1.1) (GPL2 w/ CPE) org.glassfish.hk2 (https://hk2.java.net)
(CDDL 1.1) (GPL2 w/ CPE) JAXB API bundle for GlassFish V3 (javax.xml.bind:jaxb-api:2.2.2 - https://jaxb.dev.java.net/)
(CDDL 1.1) (GPL2 w/ CPE) JAXB RI (com.sun.xml.bind:jaxb-impl:2.2.3-1 - http://jaxb.java.net/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-core (com.sun.jersey:jersey-core:1.8 - https://jersey.dev.java.net/jersey-core/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-core (com.sun.jersey:jersey-core:1.9 - https://jersey.java.net/jersey-core/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-guice (com.sun.jersey.contribs:jersey-guice:1.9 - https://jersey.java.net/jersey-contribs/jersey-guice/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-json (com.sun.jersey:jersey-json:1.8 - https://jersey.dev.java.net/jersey-json/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-json (com.sun.jersey:jersey-json:1.9 - https://jersey.java.net/jersey-json/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-server (com.sun.jersey:jersey-server:1.8 - https://jersey.dev.java.net/jersey-server/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-server (com.sun.jersey:jersey-server:1.9 - https://jersey.java.net/jersey-server/)
(CDDL 1.1) (GPL2 w/ CPE) Jersey 2 (https://jersey.java.net)

========================================================================
Common Public License 1.0
2 changes: 2 additions & 0 deletions R/.gitignore
@@ -4,3 +4,5 @@
lib
pkg/man
pkg/html
SparkR.Rcheck/
SparkR_*.tar.gz
12 changes: 6 additions & 6 deletions R/DOCUMENTATION.md
@@ -1,12 +1,12 @@
# SparkR Documentation

SparkR documentation is generated using in-source comments annotated using using
`roxygen2`. After making changes to the documentation, to generate man pages,
SparkR documentation is generated by using in-source comments and annotated by using
[`roxygen2`](https://cran.r-project.org/web/packages/roxygen2/index.html). After making changes to the documentation and generating man pages,
you can run the following from an R console in the SparkR home directory

library(devtools)
devtools::document(pkg="./pkg", roclets=c("rd"))

```R
library(devtools)
devtools::document(pkg="./pkg", roclets=c("rd"))
```
You can verify if your changes are good by running

R CMD check pkg/
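
A minimal sketch of that workflow run from a shell rather than an R console (the `R -e` wrapper is a convenience assumption; the underlying commands are exactly the ones documented above):

```bash
# Regenerate SparkR man pages, then verify them, from the SparkR home directory
cd "$SPARK_HOME/R"
R -e 'library(devtools); devtools::document(pkg="./pkg", roclets=c("rd"))'
R CMD check pkg/
```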
32 changes: 18 additions & 14 deletions R/README.md
@@ -1,12 +1,13 @@
# R on Spark

SparkR is an R package that provides a light-weight frontend to use Spark from R.

### Installing sparkR

Libraries of sparkR need to be created in `$SPARK_HOME/R/lib`. This can be done by running the script `$SPARK_HOME/R/install-dev.sh`.
By default the above script uses the system wide installation of R. However, this can be changed to any user installed location of R by setting the environment variable `R_HOME` the full path of the base directory where R is installed, before running install-dev.sh script.
Example:
```
```bash
# where /home/username/R is where R is installed and /home/username/R/bin contains the files R and RScript
export R_HOME=/home/username/R
./install-dev.sh
@@ -17,8 +18,9 @@ export R_HOME=/home/username/R
#### Build Spark

Build Spark with [Maven](http://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn) and include the `-Psparkr` profile to build the R package. For example to use the default Hadoop versions you can run
```
build/mvn -DskipTests -Psparkr package

```bash
build/mvn -DskipTests -Psparkr package
```

#### Running sparkR
@@ -37,8 +39,8 @@ To set other options like driver memory, executor memory etc. you can pass in th

#### Using SparkR from RStudio

If you wish to use SparkR from RStudio or other R frontends you will need to set some environment variables which point SparkR to your Spark installation. For example
```
If you wish to use SparkR from RStudio or other R frontends you will need to set some environment variables which point SparkR to your Spark installation. For example
```R
# Set this to where Spark is installed
Sys.setenv(SPARK_HOME="/Users/username/spark")
# This line loads SparkR from the installed directory
@@ -55,23 +57,25 @@ Once you have made your changes, please include unit tests for them and run exis

#### Generating documentation

The SparkR documentation (Rd files and HTML files) are not a part of the source repository. To generate them you can run the script `R/create-docs.sh`. This script uses `devtools` and `knitr` to generate the docs and these packages need to be installed on the machine before using the script.
The SparkR documentation (Rd files and HTML files) are not a part of the source repository. To generate them you can run the script `R/create-docs.sh`. This script uses `devtools` and `knitr` to generate the docs and these packages need to be installed on the machine before using the script. Also, you may need to install these [prerequisites](https://github.com/apache/spark/tree/master/docs#prerequisites). See also, `R/DOCUMENTATION.md`

### Examples, Unit tests

SparkR comes with several sample programs in the `examples/src/main/r` directory.
To run one of them, use `./bin/spark-submit <filename> <args>`. For example:

./bin/spark-submit examples/src/main/r/dataframe.R

You can also run the unit-tests for SparkR by running (you need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first):

R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
./R/run-tests.sh
```bash
./bin/spark-submit examples/src/main/r/dataframe.R
```
You can also run the unit tests for SparkR by running. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:
```bash
R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
./R/run-tests.sh
```

### Running on YARN

The `./bin/spark-submit` can also be used to submit jobs to YARN clusters. You will need to set YARN conf dir before doing so. For example on CDH you can run
```
```bash
export YARN_CONF_DIR=/etc/hadoop/conf
./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
```
20 changes: 20 additions & 0 deletions R/WINDOWS.md
@@ -11,3 +11,23 @@ include Rtools and R in `PATH`.
directory in Maven in `PATH`.
4. Set `MAVEN_OPTS` as described in [Building Spark](http://spark.apache.org/docs/latest/building-spark.html).
5. Open a command shell (`cmd`) in the Spark directory and run `mvn -DskipTests -Psparkr package`

## Unit tests

To run the SparkR unit tests on Windows, the following steps are required —assuming you are in the Spark root directory and do not have Apache Hadoop installed already:

1. Create a folder to download Hadoop related files for Windows. For example, `cd ..` and `mkdir hadoop`.

2. Download the relevant Hadoop bin package from [steveloughran/winutils](https://github.com/steveloughran/winutils). While these are not official ASF artifacts, they are built from the ASF release git hashes by a Hadoop PMC member on a dedicated Windows VM. For further reading, consult [Windows Problems on the Hadoop wiki](https://wiki.apache.org/hadoop/WindowsProblems).

3. Install the files into `hadoop\bin`; make sure that `winutils.exe` and `hadoop.dll` are present.

4. Set the environment variable `HADOOP_HOME` to the full path to the newly created `hadoop` directory.

5. Run unit tests for SparkR by running the command below. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:

```
R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
.\bin\spark-submit2.cmd --conf spark.hadoop.fs.default.name="file:///" R\pkg\tests\run-all.R
```

64 changes: 64 additions & 0 deletions R/check-cran.sh
@@ -0,0 +1,64 @@
#!/bin/bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

set -o pipefail
set -e

FWDIR="$(cd `dirname $0`; pwd)"
pushd $FWDIR > /dev/null

if [ ! -z "$R_HOME" ]
then
R_SCRIPT_PATH="$R_HOME/bin"
else
# if system wide R_HOME is not found, then exit
if [ ! `command -v R` ]; then
echo "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed."
exit 1
fi
R_SCRIPT_PATH="$(dirname $(which R))"
fi
echo "USING R_HOME = $R_HOME"

# Build the latest docs
$FWDIR/create-docs.sh

# Build a zip file containing the source package
"$R_SCRIPT_PATH/"R CMD build $FWDIR/pkg

# Run check as-cran.
VERSION=`grep Version $FWDIR/pkg/DESCRIPTION | awk '{print $NF}'`

CRAN_CHECK_OPTIONS="--as-cran"

if [ -n "$NO_TESTS" ]
then
CRAN_CHECK_OPTIONS=$CRAN_CHECK_OPTIONS" --no-tests"
fi

if [ -n "$NO_MANUAL" ]
then
CRAN_CHECK_OPTIONS=$CRAN_CHECK_OPTIONS" --no-manual"
fi

echo "Running CRAN check with $CRAN_CHECK_OPTIONS options"

"$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz

popd > /dev/null
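
For reviewers, a hypothetical invocation sketch for the new `check-cran.sh` (the `NO_TESTS` and `NO_MANUAL` variables are the ones the script reads; the values shown are illustrative, since the script only tests whether they are non-empty):

```bash
# Full CRAN-style check: builds the docs, the source package, and runs all tests
./R/check-cran.sh

# Faster check that skips the tests and the PDF manual
NO_TESTS=1 NO_MANUAL=1 ./R/check-cran.sh
```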
30 changes: 28 additions & 2 deletions R/create-docs.sh
@@ -17,17 +17,26 @@
# limitations under the License.
#

# Script to create API docs for SparkR
# This requires `devtools` and `knitr` to be installed on the machine.
# Script to create API docs and vignettes for SparkR
# This requires `devtools`, `knitr` and `rmarkdown` to be installed on the machine.

# After running this script the html docs can be found in
# $SPARK_HOME/R/pkg/html
# The vignettes can be found in
# $SPARK_HOME/R/pkg/vignettes/sparkr_vignettes.html

set -o pipefail
set -e

# Figure out where the script is
export FWDIR="$(cd "`dirname "$0"`"; pwd)"
export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"

# Required for setting SPARK_SCALA_VERSION
. "${SPARK_HOME}"/bin/load-spark-env.sh

echo "Using Scala $SPARK_SCALA_VERSION"

pushd $FWDIR

# Install the package (this will also generate the Rd files)
@@ -43,4 +52,21 @@ Rscript -e 'libDir <- "../../lib"; library(SparkR, lib.loc=libDir); library(knit

popd

# Find Spark jars.
if [ -f "${SPARK_HOME}/RELEASE" ]; then
SPARK_JARS_DIR="${SPARK_HOME}/jars"
else
SPARK_JARS_DIR="${SPARK_HOME}/assembly/target/scala-$SPARK_SCALA_VERSION/jars"
fi

# Only create vignettes if Spark JARs exist
if [ -d "$SPARK_JARS_DIR" ]; then
# render creates SparkR vignettes
Rscript -e 'library(rmarkdown); paths <- .libPaths(); .libPaths(c("lib", paths)); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); render("pkg/vignettes/sparkr-vignettes.Rmd"); .libPaths(paths)'

find pkg/vignettes/. -not -name '.' -not -name '*.Rmd' -not -name '*.md' -not -name '*.pdf' -not -name '*.html' -delete
else
echo "Skipping R vignettes as Spark JARs not found in $SPARK_HOME"
fi

popd
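
A usage sketch for the updated `create-docs.sh` (the package names come from the script's own comments; the `Rscript` install line is an assumption for convenience):

```bash
# Install the documentation toolchain the script expects
Rscript -e 'install.packages(c("devtools", "knitr", "rmarkdown"), repos="http://cran.us.r-project.org")'

# Generate API docs (R/pkg/html) and, when Spark JARs are present, the vignettes
./R/create-docs.sh
```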
7 changes: 6 additions & 1 deletion R/install-dev.sh
@@ -38,7 +38,12 @@ pushd $FWDIR > /dev/null
if [ ! -z "$R_HOME" ]
then
R_SCRIPT_PATH="$R_HOME/bin"
else
else
# if system wide R_HOME is not found, then exit
if [ ! `command -v R` ]; then
echo "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed."
exit 1
fi
R_SCRIPT_PATH="$(dirname $(which R))"
fi
echo "USING R_HOME = $R_HOME"
5 changes: 5 additions & 0 deletions R/pkg/.Rbuildignore
@@ -0,0 +1,5 @@
^.*\.Rproj$
^\.Rproj\.user$
^\.lintr$
^src-native$
^html$
27 changes: 18 additions & 9 deletions R/pkg/DESCRIPTION
@@ -1,20 +1,25 @@
Package: SparkR
Type: Package
Title: R frontend for Spark
Version: 2.0.0
Date: 2013-09-09
Author: The Apache Software Foundation
Maintainer: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
Imports:
methods
Title: R Frontend for Apache Spark
Version: 2.0.3
Date: 2016-08-27
Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),
email = "shivaram@cs.berkeley.edu"),
person("Xiangrui", "Meng", role = "aut",
email = "meng@databricks.com"),
person("Felix", "Cheung", role = "aut",
email = "felixcheung@apache.org"),
person(family = "The Apache Software Foundation", role = c("aut", "cph")))
URL: http://www.apache.org/ http://spark.apache.org/
BugReports: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-ContributingBugReports
Depends:
R (>= 3.0),
methods,
methods
Suggests:
testthat,
e1071,
survival
Description: R frontend for Spark
Description: The SparkR package provides an R frontend for Apache Spark.
License: Apache License (== 2.0)
Collate:
'schema.R'
@@ -26,16 +31,20 @@ Collate:
'pairRDD.R'
'DataFrame.R'
'SQLContext.R'
'WindowSpec.R'
'backend.R'
'broadcast.R'
'client.R'
'context.R'
'deserialize.R'
'functions.R'
'install.R'
'jvm.R'
'mllib.R'
'serialize.R'
'sparkR.R'
'stats.R'
'types.R'
'utils.R'
'window.R'
RoxygenNote: 5.0.1