pull latest apache spark #10

Merged 1,045 commits on Sep 30, 2017
be72b15
[SPARK-21803][TEST] Remove the HiveDDLCommandSuite
gatorsmile Aug 22, 2017
3ed1ae1
[SPARK-20641][CORE] Add missing kvstore module in Laucher and SparkSu…
jerryshao Aug 22, 2017
43d71d9
[SPARK-21499][SQL] Support creating persistent function for Spark UDA…
gatorsmile Aug 22, 2017
01a8e46
[SPARK-21769][SQL] Add a table-specific option for always respecting …
gatorsmile Aug 22, 2017
d56c262
[SPARK-21681][ML] fix bug of MLOR do not work correctly when featureS…
WeichenXu123 Aug 22, 2017
41bb1dd
[SPARK-10931][ML][PYSPARK] PySpark Models Copy Param Values from Esti…
BryanCutler Aug 23, 2017
3c0c2d0
[SPARK-21765] Set isStreaming on leaf nodes for streaming plans.
Aug 23, 2017
3429619
[ML][MINOR] Make sharedParams update.
yanboliang Aug 23, 2017
d58a350
[SPARK-19326] Speculated task attempts do not get launched in few sce…
janewangfb Aug 23, 2017
d6b30ed
[SPARK-12664][ML] Expose probability in mlp model
WeichenXu123 Aug 23, 2017
1662e93
[SPARK-21501] Change CacheLoader to limit entries based on memory foo…
Aug 23, 2017
6942aee
[SPARK-21603][SQL][FOLLOW-UP] Change the default value of maxLinesPer…
maropu Aug 23, 2017
b8aaef4
[SPARK-21807][SQL] Override ++ operation in ExpressionSet to reduce c…
eatoncys Aug 24, 2017
43cbfad
[SPARK-21805][SPARKR] Disable R vignettes code on Windows
felixcheung Aug 24, 2017
ce0d3bb
[SPARK-21694][MESOS] Support Mesos CNI network labels
susanxhuynh Aug 24, 2017
846bc61
[MINOR][SQL] The comment of Class ExchangeCoordinator exist a typing …
figo77 Aug 24, 2017
95713eb
[SPARK-21804][SQL] json_tuple returns null values within repeated col…
jmchung Aug 24, 2017
dc5d34d
[SPARK-19165][PYTHON][SQL] PySpark APIs using columns as arguments sh…
HyukjinKwon Aug 24, 2017
9e33954
[SPARK-21745][SQL] Refactor ColumnVector hierarchy to make ColumnVect…
ueshin Aug 24, 2017
183d4cb
[SPARK-21759][SQL] In.checkInputDataTypes should not wrongly report u…
viirya Aug 24, 2017
2dd37d8
[SPARK-21826][SQL] outer broadcast hash join should not throw NPE
cloud-fan Aug 24, 2017
d3abb36
[SPARK-21788][SS] Handle more exceptions when stopping a streaming query
zsxwing Aug 24, 2017
763b83e
[SPARK-21701][CORE] Enable RPC client to use ` SO_RCVBUF` and ` SO_SN…
Aug 24, 2017
05af2de
[SPARK-21830][SQL] Bump ANTLR version and fix a few issues.
hvanhovell Aug 24, 2017
f3676d6
[SPARK-21108][ML] convert LinearSVC to aggregator framework
YY-OnCall Aug 25, 2017
7d16776
[SPARK-21255][SQL][WIP] Fixed NPE when creating encoder for enum
mike0sv Aug 25, 2017
574ef6c
[SPARK-21527][CORE] Use buffer limit in order to use JAVA NIO Util's …
caneGuy Aug 25, 2017
de7af29
[MINOR][BUILD] Fix build warnings and Java lint errors
srowen Aug 25, 2017
1f24cee
[SPARK-21832][TEST] Merge SQLBuilderTest into ExpressionSQLBuilderSuite
dongjoon-hyun Aug 25, 2017
1813c4a
[SPARK-21714][CORE][YARN] Avoiding re-uploading remote resources in y…
jerryshao Aug 25, 2017
628bdea
[SPARK-17742][CORE] Fail launcher app handle if child process exits w…
Aug 25, 2017
51620e2
[SPARK-21756][SQL] Add JSON option to allow unquoted control characters
vinodkc Aug 25, 2017
1a598d7
[SPARK-21837][SQL][TESTS] UserDefinedTypeSuite Local UDTs not actuall…
srowen Aug 25, 2017
522e1f8
[SPARK-21831][TEST] Remove `spark.sql.hive.convertMetastoreOrc` confi…
dongjoon-hyun Aug 26, 2017
3b66b1c
[MINOR][DOCS] Minor doc fixes related with doc build and uses script …
HyukjinKwon Aug 26, 2017
07142cf
[SPARK-21843] testNameNote should be "(minNumPostShufflePartitions: 5)"
iamhumanbeing Aug 27, 2017
0456b40
[SPARK-21818][ML][MLLIB] Fix bug of MultivariateOnlineSummarizer.vari…
WeichenXu123 Aug 28, 2017
24e6c18
[SPARK-21798] No config to replace deprecated SPARK_CLASSPATH config …
Aug 28, 2017
73e64f7
[SPARK-19662][SCHEDULER][TEST] Add Fair Scheduler Unit Test coverage …
erenavsarogullari Aug 28, 2017
c7270a4
[SPARK-17139][ML] Add model summary for MultinomialLogisticRegression
WeichenXu123 Aug 28, 2017
32fa0b8
[SPARK-21781][SQL] Modify DataSourceScanExec to use concrete ColumnVe…
ueshin Aug 29, 2017
8fcbda9
[SPARK-21848][SQL] Add trait UserDefinedExpression to identify user-d…
gengliangwang Aug 29, 2017
6327ea5
[SPARK-21255][SQL] simplify encoder for java enum
cloud-fan Aug 29, 2017
6077e3e
[SPARK-21801][SPARKR][TEST] unit test randomly fail with randomforest
felixcheung Aug 29, 2017
840ba05
[MINOR][ML] Document treatment of instance weights in logreg summary
jkbradley Aug 29, 2017
d7b1fcf
[SPARK-21728][CORE] Allow SparkSubmit to use Logging.
Aug 29, 2017
fba9cc8
[SPARK-21813][CORE] Modify TaskMemoryManager.MAXIMUM_PAGE_SIZE_BYTES …
Geek-He Aug 29, 2017
3d0e174
[SPARK-21845][SQL] Make codegen fallback of expressions configurable
gatorsmile Aug 30, 2017
e47f48c
[SPARK-20886][CORE] HadoopMapReduceCommitProtocol to handle FileOutpu…
steveloughran Aug 30, 2017
d4895c9
[MINOR][TEST] Off -heap memory leaks for unit tests
10110346 Aug 30, 2017
8f0df6b
[SPARK-21873][SS] - Avoid using `return` inside `CachedKafkaConsumer.…
Aug 30, 2017
734ed7a
[SPARK-21806][MLLIB] BinaryClassificationMetrics pr(): first point (0…
srowen Aug 30, 2017
b30a11a
[SPARK-21764][TESTS] Fix tests failures on Windows: resources not bei…
HyukjinKwon Aug 30, 2017
4133c1b
[SPARK-21469][ML][EXAMPLES] Adding Examples for FeatureHasher
BryanCutler Aug 30, 2017
32d6d9d
Revert "[SPARK-21845][SQL] Make codegen fallback of expressions confi…
gatorsmile Aug 30, 2017
235d283
[MINOR][SQL][TEST] Test shuffle hash join while is not expected
heary-cao Aug 30, 2017
6949a9c
[SPARK-21834] Incorrect executor request in case of dynamic allocation
Aug 30, 2017
d8f4540
[SPARK-21839][SQL] Support SQL config for ORC compression
dongjoon-hyun Aug 30, 2017
313c6ca
[SPARK-21875][BUILD] Fix Java style bugs
ash211 Aug 31, 2017
cd5d0f3
[SPARK-11574][CORE] Add metrics StatsD sink
Aug 31, 2017
4482ff2
[SPARK-17321][YARN] Avoid writing shuffle metadata to disk if NM reco…
jerryshao Aug 31, 2017
ecf437a
[SPARK-21534][SQL][PYSPARK] PickleException when creating dataframe f…
viirya Aug 31, 2017
964b507
[SPARK-21583][SQL] Create a ColumnarBatch from ArrowColumnVectors
BryanCutler Aug 31, 2017
19b0240
[SPARK-21878][SQL][TEST] Create SQLMetricsTestUtils
gatorsmile Aug 31, 2017
9696580
[SPARK-21886][SQL] Use SparkSession.internalCreateDataFrame to create…
jaceklaskowski Aug 31, 2017
fc45c2c
[SPARK-20812][MESOS] Add secrets support to the dispatcher
Aug 31, 2017
501370d
[SPARK-21583][HOTFIX] Removed intercept in test causing failures
BryanCutler Aug 31, 2017
7ce1108
[SPARK-17107][SQL][FOLLOW-UP] Remove redundant pushdown rule for Union
gatorsmile Aug 31, 2017
cba69ae
[SPARK-21110][SQL] Structs, arrays, and other orderable datatypes sho…
aray Aug 31, 2017
96028e3
[SPARK-17139][ML][FOLLOW-UP] Add convenient method `asBinary` for cas…
WeichenXu123 Aug 31, 2017
f5e10a3
[SPARK-21862][ML] Add overflow check in PCA
WeichenXu123 Aug 31, 2017
5cd8ea9
[SPARK-21779][PYTHON] Simpler DataFrame.sample API in Python
HyukjinKwon Sep 1, 2017
648a862
[SPARK-21789][PYTHON] Remove obsolete codes for parsing abstract sche…
HyukjinKwon Sep 1, 2017
0bdbefe
[SPARK-21728][CORE] Follow up: fix user config, auth in SparkSubmit l…
Sep 1, 2017
12f0d24
[SPARK-21880][WEB UI] In the SQL table page, modify jobs trace inform…
Geek-He Sep 1, 2017
12ab7f7
[SPARK-14280][BUILD][WIP] Update change-version.sh and pom.xml to add…
srowen Sep 1, 2017
aba9492
[SPARK-21895][SQL] Support changing database in HiveClient
gatorsmile Sep 1, 2017
900f14f
[SPARK-21729][ML][TEST] Generic test for ProbabilisticClassifier to e…
WeichenXu123 Sep 2, 2017
acb7fed
[SPARK-21891][SQL] Add TBLPROPERTIES to DDL statement: CREATE TABLE U…
gatorsmile Sep 2, 2017
07fd68a
[SPARK-21897][PYTHON][R] Add unionByName API to DataFrame in Python a…
HyukjinKwon Sep 3, 2017
9f30d92
[SPARK-21654][SQL] Complement SQL predicates expression description
viirya Sep 4, 2017
ca59445
[SPARK-21418][SQL] NoSuchElementException: None.get in DataSourceScan…
srowen Sep 4, 2017
4e7a29e
[SPARK-21913][SQL][TEST] withDatabase` should drop database with CASCADE
dongjoon-hyun Sep 5, 2017
7f3c6ff
[SPARK-21903][BUILD] Upgrade scalastyle to 1.0.0.
HyukjinKwon Sep 5, 2017
02a4386
[SPARK-20978][SQL] Bump up Univocity version to 2.5.4
HyukjinKwon Sep 5, 2017
2974406
[SPARK-21845][SQL][TEST-MAVEN] Make codegen fallback of expressions c…
gatorsmile Sep 5, 2017
8c954d2
[SPARK-21925] Update trigger interval documentation in docs with beha…
brkyvz Sep 5, 2017
fd60d4f
[SPARK-21652][SQL] Fix rule confliction between InferFiltersFromConst…
jiangxb1987 Sep 5, 2017
9e451bc
[MINOR][DOC] Update `Partition Discovery` section to enumerate all av…
dongjoon-hyun Sep 5, 2017
6a23254
[SPARK-18061][THRIFTSERVER] Add spnego auth support for ThriftServer …
jerryshao Sep 6, 2017
445f179
[SPARK-9104][CORE] Expose Netty memory metrics in Spark
jerryshao Sep 6, 2017
4ee7dfe
[SPARK-21924][DOCS] Update structured streaming programming guide doc
Sep 6, 2017
16c4c03
[SPARK-19357][ML] Adding parallel model evaluation in ML tuning
BryanCutler Sep 6, 2017
64936c1
[SPARK-21903][BUILD][FOLLOWUP] Upgrade scalastyle-maven-plugin and sc…
HyukjinKwon Sep 6, 2017
f2e22ae
[SPARK-21835][SQL] RewritePredicateSubquery should not produce unreso…
viirya Sep 6, 2017
36b48ee
[SPARK-21801][SPARKR][TEST] set random seed for predictable test
felixcheung Sep 6, 2017
acdf45f
[SPARK-21765] Check that optimization doesn't affect isStreaming bit.
jose-torres Sep 6, 2017
fa0092b
[SPARK-21901][SS] Define toString for StateOperatorProgress
jaceklaskowski Sep 6, 2017
aad2125
Fixed pandoc dependency issue in python/setup.py
Sep 7, 2017
ce7293c
[SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery should not pro…
viirya Sep 7, 2017
eea2b87
[SPARK-21912][SQL] ORC/Parquet table should not create invalid column…
dongjoon-hyun Sep 7, 2017
b9ab791
[SPARK-21890] Credentials not being passed to add the tokens
Sep 7, 2017
e00f1a1
[SPARK-13656][SQL] Delete spark.sql.parquet.cacheMetadata from SQLCon…
dongjoon-hyun Sep 7, 2017
c26976f
[SPARK-21939][TEST] Use TimeLimits instead of Timeouts
dongjoon-hyun Sep 8, 2017
57bc1e9
[SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.tests.SQLTests2 should s…
ueshin Sep 8, 2017
f62b20f
[SPARK-21949][TEST] Tables created in unit tests should be dropped af…
10110346 Sep 8, 2017
6e37524
[SPARK-21726][SQL] Check for structural integrity of the plan in Opti…
viirya Sep 8, 2017
dbb8241
[SPARK-21936][SQL] backward compatibility test framework for HiveExte…
cloud-fan Sep 8, 2017
0dfc1ec
[SPARK-21726][SQL][FOLLOW-UP] Check for structural integrity of the p…
viirya Sep 8, 2017
8a4f228
[SPARK-21946][TEST] fix flaky test: "alter table: rename cached table…
kiszk Sep 8, 2017
8598d03
[SPARK-15243][ML][SQL][PYTHON] Add missing support for unicode in Par…
HyukjinKwon Sep 8, 2017
31c74fe
[SPARK-19866][ML][PYSPARK] Add local version of Word2Vec findSynonyms…
keypointt Sep 8, 2017
8a5eb50
[SPARK-21941] Stop storing unused attemptId in SQLTaskMetrics
ash211 Sep 9, 2017
6b45d7e
[SPARK-21954][SQL] JacksonUtils should verify MapType's value type in…
viirya Sep 9, 2017
e4d8f9a
[MINOR][SQL] Correct DataFrame doc.
yanboliang Sep 9, 2017
f767905
[SPARK-4131] Support "Writing data into the filesystem from queries"
janewangfb Sep 9, 2017
520d92a
[SPARK-20098][PYSPARK] dataType's typeName fix
szalai1 Sep 10, 2017
6273a71
[SPARK-21610][SQL] Corrupt records are not handled properly when crea…
jmchung Sep 11, 2017
828fab0
[BUILD][TEST][SPARKR] add sparksubmitsuite to appveyor tests
felixcheung Sep 11, 2017
4bab8f5
[SPARK-21856] Add probability and rawPrediction to MLPC for Python
chunshengji Sep 11, 2017
dc74c0e
[MINOR][SQL] remove unuse import class
heary-cao Sep 11, 2017
e2ac2f1
[SPARK-21976][DOC] Fix wrong documentation for Mean Absolute Error.
FavioVazquez Sep 12, 2017
dd78167
[SPARK-14516][ML] Adding ClusteringEvaluator with the implementation …
mgaido91 Sep 12, 2017
7d0a3ef
[SPARK-21610][SQL][FOLLOWUP] Corrupt records are not handled properly…
jmchung Sep 12, 2017
9575582
[DOCS] Fix unreachable links in the document
sarutak Sep 12, 2017
515910e
[SPARK-17642][SQL] support DESC EXTENDED/FORMATTED table column commands
wzhfy Sep 12, 2017
720c94f
[SPARK-21027][ML][PYTHON] Added tunable parallelism to one vs. rest i…
ajaysaini725 Sep 12, 2017
b9b54b1
[SPARK-21368][SQL] TPCDSQueryBenchmark can't refer query files.
sarutak Sep 12, 2017
c5f9b89
[SPARK-18608][ML] Fix double caching
zhengruifeng Sep 12, 2017
1a98574
[SPARK-21979][SQL] Improve QueryPlanConstraints framework
gengliangwang Sep 12, 2017
371e4e2
[SPARK-21513][SQL] Allow UDF to_json support converting MapType to json
goldmedal Sep 13, 2017
f6c5d8f
[SPARK-21027][MINOR][FOLLOW-UP] add missing since tag
WeichenXu123 Sep 13, 2017
dd88fa3
[BUILD] Close stale PRs
srowen Sep 13, 2017
a1d98c6
[SPARK-21982] Set locale to US
Gschiavon Sep 13, 2017
4fbf748
[SPARK-21893][BUILD][STREAMING][WIP] Put Kafka 0.8 behind a profile
srowen Sep 13, 2017
ca00cc7
[SPARK-21963][CORE][TEST] Create temp file should be delete after use
heary-cao Sep 13, 2017
0fa5b7c
[SPARK-21690][ML] one-pass imputer
zhengruifeng Sep 13, 2017
b6ef1f5
[SPARK-21970][CORE] Fix Redundant Throws Declarations in Java Codebase
original-brownbear Sep 13, 2017
21c4450
[SPARK-21980][SQL] References in grouping functions should be indexed…
DonnyZone Sep 13, 2017
8c7e19a
[SPARK-4131] Merge HiveTmpFile.scala to SaveAsHiveFile.scala
janewangfb Sep 13, 2017
17edfec
[SPARK-20427][SQL] Read JDBC table use custom schema
wangyum Sep 13, 2017
8be7e6b
[SPARK-21973][SQL] Add an new option to filter queries in TPC-DS
maropu Sep 14, 2017
dcbb229
[MINOR][SQL] Only populate type metadata for required types such as C…
dilipbiswal Sep 14, 2017
8d8641f
[SPARK-21854] Added LogisticRegressionTrainingSummary for Multinomial…
Sep 14, 2017
66cb72d
[MINOR][DOC] Add missing call of `update()` in examples of PeriodicGr…
zhengruifeng Sep 14, 2017
c76153c
[SPARK-18608][ML][FOLLOWUP] Fix double caching for PySpark OneVsRest.
yanboliang Sep 14, 2017
4e6fc69
[SPARK-4131][FOLLOW-UP] Support "Writing data into the filesystem fro…
gatorsmile Sep 14, 2017
4b88393
[SPARK-21922] Fix duration always updating when task failed but statu…
caneGuy Sep 14, 2017
ddd7f5e
[SPARK-17642][SQL][FOLLOWUP] drop test tables and improve comments
wzhfy Sep 14, 2017
054ddb2
[SPARK-21988] Add default stats to StreamingExecutionRelation.
jose-torres Sep 14, 2017
a28728a
[SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support converting Map…
goldmedal Sep 15, 2017
8866174
[SPARK-22018][SQL] Preserve top-level alias metadata when collapsing …
tdas Sep 15, 2017
22b111e
[SPARK-21902][CORE] Print root cause for BlockManager#doPut
caneGuy Sep 15, 2017
4decedf
[SPARK-22002][SQL] Read JDBC table use custom schema support specify …
wangyum Sep 15, 2017
3c6198c
[SPARK-21987][SQL] fix a compatibility issue of sql event logs
cloud-fan Sep 15, 2017
79a4dab
[SPARK-21958][ML] Word2VecModel save: transform data in the cluster
travishegner Sep 15, 2017
c7307ac
[SPARK-15689][SQL] data source v2 read path
cloud-fan Sep 15, 2017
0bad10d
[SPARK-22017] Take minimum of all watermark execs in StreamExecution.
jose-torres Sep 16, 2017
73d9067
[SPARK-21967][CORE] org.apache.spark.unsafe.types.UTF8String#compareT…
original-brownbear Sep 16, 2017
f407302
[SPARK-22032][PYSPARK] Speed up StructType conversion
maver1ck Sep 17, 2017
6adf67d
[SPARK-21985][PYSPARK] PairDeserializer is broken for double-zipped RDDs
aray Sep 17, 2017
6308c65
[SPARK-21953] Show both memory and disk bytes spilled if either is pr…
ash211 Sep 18, 2017
7c72662
[SPARK-22043][PYTHON] Improves error message for show_profiles and du…
HyukjinKwon Sep 18, 2017
1e978b1
[SPARK-21113][CORE] Read ahead input stream to amortize disk IO cost …
Sep 18, 2017
894a756
[SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSuite
cloud-fan Sep 18, 2017
3b049ab
[SPARK-22003][SQL] support array column in vectorized reader with UDF
Sep 18, 2017
c66d64b
[SPARK-14878][SQL] Trim characters string function support
kevinyu98 Sep 18, 2017
94f7e04
[SPARK-22030][CORE] GraphiteSink fails to re-connect to Graphite inst…
Sep 19, 2017
10f45b3
[SPARK-22047][FLAKY TEST] HiveExternalCatalogVersionsSuite
cloud-fan Sep 19, 2017
a11db94
[SPARK-21923][CORE] Avoid calling reserveUnrollMemoryForThisTask for …
ConeyLiu Sep 19, 2017
7c92351
[MINOR][CORE] Cleanup dead code and duplication in Mem. Management
original-brownbear Sep 19, 2017
1bc17a6
[SPARK-22052] Incorrect Metric assigned in MetricsReporter.scala
Taaffy Sep 19, 2017
581200a
[SPARK-21428][SQL][FOLLOWUP] CliSessionState should point to the actu…
yaooqinn Sep 19, 2017
8319432
[SPARK-21917][CORE][YARN] Supporting adding http(s) resources in yarn…
jerryshao Sep 19, 2017
2f96242
[MINOR][ML] Remove unnecessary default value setting for evaluators.
yanboliang Sep 19, 2017
d5aefa8
[SPARK-21338][SQL] implement isCascadingTruncateTable() method in Agg…
huaxingao Sep 19, 2017
ee13f3e
[SPARK-21969][SQL] CommandUtils.updateTableStats should call refreshT…
Sep 19, 2017
718bbc9
[SPARK-22067][SQL] ArrowWriter should use position when setting UTF8S…
BryanCutler Sep 20, 2017
c6ff59a
[SPARK-18838][CORE] Add separate listener queues to LiveListenerBus.
Sep 20, 2017
280ff52
[SPARK-21977] SinglePartition optimizations break certain Streaming S…
brkyvz Sep 20, 2017
3d4dd14
[SPARK-22066][BUILD] Update checkstyle to 8.2, enable it, fix violations
srowen Sep 20, 2017
2b6ff0c
[SPARK-22066][BUILD][HOTFIX] Revert scala-maven-plugin to 3.2.2 to wo…
srowen Sep 20, 2017
e17901d
[SPARK-22049][DOCS] Confusing behavior of from_utc_timestamp and to_u…
srowen Sep 20, 2017
ce6a71e
[SPARK-22076][SQL] Expand.projections should not be a Stream
cloud-fan Sep 20, 2017
bb9c069
[SPARK-18838][HOTFIX][YARN] Check internal context state before stopp…
Sep 20, 2017
55d5fa7
[SPARK-21384][YARN] Spark + YARN fails with LocalFileSystem as defaul…
Sep 20, 2017
352bea5
[SPARK-22076][SQL][FOLLOWUP] Expand.projections should not be a Stream
cloud-fan Sep 21, 2017
1da5822
[SPARK-21934][CORE] Expose Shuffle Netty memory usage to MetricsSystem
jerryshao Sep 21, 2017
a8d9ec8
[SPARK-21780][R] Simpler Dataset.sample API in R
HyukjinKwon Sep 21, 2017
1d1a09b
[SPARK-17997][SQL] Add an aggregation function for counting distinct …
wzhfy Sep 21, 2017
1270e71
[SPARK-22086][DOCS] Add expression description for CASE WHEN
viirya Sep 21, 2017
f10cbf1
[SPARK-21977][HOTFIX] Adjust EnsureStatefulOpPartitioningSuite to use…
srowen Sep 21, 2017
b75bd17
[SPARK-21928][CORE] Set classloader on SerializerManager's private kryo
squito Sep 21, 2017
f7ad0db
[INFRA] Close stale PRs.
Sep 21, 2017
9cac249
[SPARK-22088][SQL] Incorrect scalastyle comment causes wrong styles i…
viirya Sep 21, 2017
b21b806
[SPARK-22075][ML] GBTs unpersist datasets cached by Checkpointer
zhengruifeng Sep 21, 2017
a8a5cd2
[SPARK-22009][ML] Using treeAggregate improve some algs
zhengruifeng Sep 21, 2017
f32a842
[SPARK-22053][SS] Stream-stream inner join in Append Mode
tdas Sep 21, 2017
fedf696
[SPARK-22094][SS] processAllAvailable should check the query state
zsxwing Sep 22, 2017
5ac9685
[SPARK-21981][PYTHON][ML] Added Python interface for ClusteringEvaluator
mgaido91 Sep 22, 2017
5960686
[SPARK-21998][SQL] SortMergeJoinExec did not calculate its outputOrde…
maryannxue Sep 22, 2017
8f130ad
[SPARK-22072][SPARK-22071][BUILD] Improve release build scripts
holdenk Sep 22, 2017
27fc536
[SPARK-21190][PYSPARK] Python Vectorized UDFs
BryanCutler Sep 22, 2017
10e37f6
[UI][STREAMING] Modify the title, 'Records' instead of 'Input Size'
Sep 22, 2017
d2b2932
[SPARK-22092] Reallocation in OffHeapColumnVector.reserveInternal cor…
ala Sep 22, 2017
3e6a714
[SPARK-21766][PYSPARK][SQL] DataFrame toPandas() raises ValueError wi…
viirya Sep 22, 2017
f180b65
[SPARK-22060][ML] Fix CrossValidator/TrainValidationSplit param persi…
WeichenXu123 Sep 23, 2017
c11f24a
[SPARK-18136] Fix SPARK_JARS_DIR for Python pip install on Windows
jsnowacki Sep 23, 2017
3920af7
[SPARK-22099] The 'job ids' list style needs to be changed in the SQL…
Sep 23, 2017
50ada2a
[SPARK-22033][CORE] BufferHolder, other size checks should account fo…
srowen Sep 23, 2017
04975a6
[SPARK-22109][SQL] Resolves type conflicts between strings and timest…
HyukjinKwon Sep 23, 2017
c792aff
[SPARK-20448][DOCS] Document how FileInputDStream works with object s…
steveloughran Sep 23, 2017
4a8c9e2
[SPARK-22110][SQL][DOCUMENTATION] Add usage and improve documentation…
kevinyu98 Sep 23, 2017
2274d84
[SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTruncateTable() me…
viirya Sep 24, 2017
9d48bd0
[SPARK-22093][TESTS] Fixes `assume` in `UtilsSuite` and `HiveDDLSuite`
HyukjinKwon Sep 24, 2017
4943ea5
[SPARK-22058][CORE] the BufferedInputStream will not be closed if an …
Sep 24, 2017
576c43f
[SPARK-22087][SPARK-14650][WIP][BUILD][REPL][CORE] Compile Spark REPL…
srowen Sep 24, 2017
20adf9a
[SPARK-22107] Change as to alias in python quickstart
Sep 25, 2017
365a29b
[SPARK-22100][SQL] Make percentile_approx support date/timestamp type…
wzhfy Sep 25, 2017
2c5b9b1
[SPARK-22083][CORE] Release locks in MemoryStore.evictBlocksToFreeSpace
squito Sep 25, 2017
038b185
[SPARK-22103] Move HashAggregateExec parent consume to a separate fun…
juliuszsompolski Sep 25, 2017
ce20478
[SPARK-22120][SQL] TestHiveSparkSession.reset() should clean out Hive…
Sep 25, 2017
d8e825e
[SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_udf and add do…
BryanCutler Sep 26, 2017
64fbd1c
[SPARK-22124][SQL] Sample and Limit should also defer input evaluatio…
viirya Sep 26, 2017
f21f6ce
[SPARK-22103][FOLLOWUP] Rename addExtraCode to addInnerClass
juliuszsompolski Sep 26, 2017
ceaec93
[BUILD] Close stale PRs
HyukjinKwon Sep 27, 2017
1fdfe69
[SPARK-22112][PYSPARK] Supports RDD of strings as input in spark.read…
goldmedal Sep 27, 2017
9c5935d
[SPARK-22141][SQL] Propagate empty relation before checking Cartesian…
gengliangwang Sep 27, 2017
74daf62
[SPARK-20642][CORE] Store FsHistoryProvider listing data in a KVStore.
Sep 27, 2017
d2b8b63
[SAPRK-20785][WEB-UI][SQL] Spark should provide jump links and add (c…
Sep 27, 2017
12e740b
[SPARK-22130][CORE] UTF8String.trim() scans " " twice
kiszk Sep 27, 2017
09cbf3d
[SPARK-22125][PYSPARK][SQL] Enable Arrow Stream format for vectorized…
ueshin Sep 27, 2017
9b98aef
[HOTFIX][BUILD] Fix finalizer checkstyle error and re-disable checkstyle
srowen Sep 27, 2017
02bb068
[SPARK-22143][SQL] Fix memory leak in OffHeapColumnVector
hvanhovell Sep 27, 2017
9244957
[SPARK-22140] Add TPCDSQuerySuite
gatorsmile Sep 28, 2017
7bf4da8
[MINOR] Fixed up pandas_udf related docs and formatting
BryanCutler Sep 28, 2017
3b117d6
[SPARK-22123][CORE] Add latest failure reason for task set blacklist
caneGuy Sep 28, 2017
f20be4d
[SPARK-22135][MESOS] metrics in spark-dispatcher not being registered…
Sep 28, 2017
01bd00d
[SPARK-22128][CORE] Update paranamer to 2.8 to avoid BytecodeReadingP…
srowen Sep 28, 2017
d74dee1
[SPARK-22153][SQL] Rename ShuffleExchange -> ShuffleExchangeExec
rxin Sep 28, 2017
d29d1e8
[SPARK-22159][SQL] Make config names consistently end with "enabled".
rxin Sep 28, 2017
323806e
[SPARK-22160][SQL] Make sample points per partition (in range partiti…
rxin Sep 29, 2017
161ba7e
[SPARK-22146] FileNotFoundException while reading ORC files containin…
mgaido91 Sep 29, 2017
0fa4dbe
[SPARK-22141][FOLLOWUP][SQL] Add comments for the order of batches
gengliangwang Sep 29, 2017
a2516f4
[SPARK-22142][BUILD][STREAMING] Move Flume support behind a profile
srowen Sep 29, 2017
ecbe416
[SPARK-22129][SPARK-22138] Release script improvements
holdenk Sep 29, 2017
9ed7394
[SPARK-22161][SQL] Add Impala-modified TPC-DS queries
gatorsmile Sep 29, 2017
4728640
Revert "[SPARK-22142][BUILD][STREAMING] Move Flume support behind a p…
gatorsmile Sep 29, 2017
530fe68
[SPARK-21904][SQL] Rename tempTables to tempViews in SessionCatalog
gatorsmile Sep 30, 2017
c6610a9
[SPARK-22122][SQL] Use analyzed logical plans to count input rows in …
maropu Sep 30, 2017
3 changes: 3 additions & 0 deletions .gitignore
@@ -25,6 +25,7 @@ R-unit-tests.log
R/unit-tests.out
R/cran-check.out
R/pkg/vignettes/sparkr-vignettes.html
R/pkg/tests/fulltests/Rplots.pdf
build/*.jar
build/apache-maven*
build/scala*
@@ -46,6 +47,8 @@ dev/pr-deps/
dist/
docs/_site
docs/api
sql/docs
sql/site
lib_managed/
lint-r-report.log
log/
12 changes: 6 additions & 6 deletions LICENSE
@@ -249,11 +249,11 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(Interpreter classes (all .scala files in repl/src/main/scala
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala),
and for SerializableMapWrapper in JavaUtils.scala)
-(BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.7 - http://www.scala-lang.org/)
-(BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.7 - http://www.scala-lang.org/)
-(BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.7 - http://www.scala-lang.org/)
-(BSD-like) Scala Library (org.scala-lang:scala-library:2.11.7 - http://www.scala-lang.org/)
-(BSD-like) Scalap (org.scala-lang:scalap:2.11.7 - http://www.scala-lang.org/)
+(BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.8 - http://www.scala-lang.org/)
+(BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.8 - http://www.scala-lang.org/)
+(BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.8 - http://www.scala-lang.org/)
+(BSD-like) Scala Library (org.scala-lang:scala-library:2.11.8 - http://www.scala-lang.org/)
+(BSD-like) Scalap (org.scala-lang:scalap:2.11.8 - http://www.scala-lang.org/)
(BSD-style) scalacheck (org.scalacheck:scalacheck_2.11:1.10.0 - http://www.scalacheck.org)
(BSD-style) spire (org.spire-math:spire_2.11:0.7.1 - http://spire-math.org)
(BSD-style) spire-macros (org.spire-math:spire-macros_2.11:0.7.1 - http://spire-math.org)
@@ -263,7 +263,7 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(New BSD license) Protocol Buffer Java API (org.spark-project.protobuf:protobuf-java:2.4.1-shaded - http://code.google.com/p/protobuf)
(The BSD License) Fortran to Java ARPACK (net.sourceforge.f2j:arpack_combined_all:0.1 - http://f2j.sourceforge.net)
(The BSD License) xmlenc Library (xmlenc:xmlenc:0.52 - http://xmlenc.sourceforge.net)
-(The New BSD License) Py4J (net.sf.py4j:py4j:0.10.4 - http://py4j.sourceforge.net/)
+(The New BSD License) Py4J (net.sf.py4j:py4j:0.10.6 - http://py4j.sourceforge.net/)
(Two-clause BSD-style license) JUnit-Interface (com.novocode:junit-interface:0.10 - http://github.com/szeiger/junit-interface/)
(BSD licence) sbt and sbt-launch-lib.bash
(BSD 3 Clause) d3.min.js (https://github.com/mbostock/d3/blob/master/LICENSE)
6 changes: 1 addition & 5 deletions R/README.md
@@ -66,11 +66,7 @@ To run one of them, use `./bin/spark-submit <filename> <args>`. For example:
```bash
./bin/spark-submit examples/src/main/r/dataframe.R
```
-You can also run the unit tests for SparkR by running. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:
-```bash
-R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
-./R/run-tests.sh
-```
+You can run R unit tests by following the instructions under [Running R Tests](http://spark.apache.org/docs/latest/building-spark.html#running-r-tests).

### Running on YARN

3 changes: 1 addition & 2 deletions R/WINDOWS.md
@@ -34,10 +34,9 @@ To run the SparkR unit tests on Windows, the following steps are required —ass

4. Set the environment variable `HADOOP_HOME` to the full path to the newly created `hadoop` directory.

-5. Run unit tests for SparkR by running the command below. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:
+5. Run unit tests for SparkR by running the command below. You need to install the needed packages following the instructions under [Running R Tests](http://spark.apache.org/docs/latest/building-spark.html#running-r-tests) first:

```
-R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
.\bin\spark-submit2.cmd --conf spark.hadoop.fs.defaultFS="file:///" R\pkg\tests\run-all.R
```

1 change: 1 addition & 0 deletions R/pkg/.Rbuildignore
@@ -6,3 +6,4 @@
^README\.Rmd$
^src-native$
^html$
^tests/fulltests/*
4 changes: 2 additions & 2 deletions R/pkg/DESCRIPTION
@@ -1,8 +1,8 @@
Package: SparkR
Type: Package
-Version: 2.2.0
+Version: 2.3.0
Title: R Frontend for Apache Spark
-Description: The SparkR package provides an R Frontend for Apache Spark.
+Description: Provides an R Frontend for Apache Spark.
Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),
email = "shivaram@cs.berkeley.edu"),
person("Xiangrui", "Meng", role = "aut",
19 changes: 18 additions & 1 deletion R/pkg/NAMESPACE
@@ -63,6 +63,7 @@ exportMethods("glm",
"spark.als",
"spark.kstest",
"spark.logit",
"spark.decisionTree",
"spark.randomForest",
"spark.gbt",
"spark.bisectingKmeans",
@@ -74,7 +75,8 @@ exportMethods("glm",
# Job group lifecycle management methods
export("setJobGroup",
"clearJobGroup",
-"cancelJobGroup")
+"cancelJobGroup",
+"setJobDescription")

# Export Utility methods
export("setLogLevel")
Expand All @@ -84,6 +86,7 @@ exportClasses("SparkDataFrame")
exportMethods("arrange",
"as.data.frame",
"attach",
"broadcast",
"cache",
"checkpoint",
"coalesce",
@@ -123,6 +126,7 @@ exportMethods("arrange",
"group_by",
"groupBy",
"head",
"hint",
"insertInto",
"intersect",
"isLocal",
@@ -165,6 +169,7 @@ exportMethods("arrange",
"transform",
"union",
"unionAll",
"unionByName",
"unique",
"unpersist",
"where",
@@ -249,12 +254,15 @@ exportMethods("%<=>%",
"getField",
"getItem",
"greatest",
"grouping_bit",
"grouping_id",
"hex",
"histogram",
"hour",
"hypot",
"ifelse",
"initcap",
"input_file_name",
"instr",
"isNaN",
"isNotNull",
@@ -279,6 +287,8 @@ exportMethods("%<=>%",
"lower",
"lpad",
"ltrim",
"map_keys",
"map_values",
"max",
"md5",
"mean",
@@ -351,6 +361,7 @@ exportMethods("%<=>%",
"to_utc_timestamp",
"translate",
"trim",
"trunc",
"unbase64",
"unhex",
"unix_timestamp",
@@ -409,6 +420,8 @@ export("as.DataFrame",
"print.summary.GeneralizedLinearRegressionModel",
"read.ml",
"print.summary.KSTest",
"print.summary.DecisionTreeRegressionModel",
"print.summary.DecisionTreeClassificationModel",
"print.summary.RandomForestRegressionModel",
"print.summary.RandomForestClassificationModel",
"print.summary.GBTRegressionModel",
@@ -419,6 +432,7 @@ export("structField",
"structField.character",
"print.structField",
"structType",
"structType.character",
"structType.jobj",
"structType.structField",
"print.structType")
@@ -447,11 +461,14 @@ S3method(print, structField)
S3method(print, structType)
S3method(print, summary.GeneralizedLinearRegressionModel)
S3method(print, summary.KSTest)
S3method(print, summary.DecisionTreeRegressionModel)
S3method(print, summary.DecisionTreeClassificationModel)
S3method(print, summary.RandomForestRegressionModel)
S3method(print, summary.RandomForestClassificationModel)
S3method(print, summary.GBTRegressionModel)
S3method(print, summary.GBTClassificationModel)
S3method(structField, character)
S3method(structField, jobj)
S3method(structType, character)
S3method(structType, jobj)
S3method(structType, structField)