[SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression #27237

Closed
wants to merge 8,730 commits into from
8730 commits
e8f4401
revert [SPARK-29680][SQL] Remove ALTER TABLE CHANGE COLUMN syntax
cloud-fan Jan 2, 2020
da8b26c
[SPARK-30341][SQL] Overflow check for interval arithmetic operations
yaooqinn Jan 2, 2020
0c150fa
[SPARK-30387] Improving stop hook log message
jobitmathew Jan 2, 2020
fb4bc25
[SPARK-30285][CORE] Fix deadlock between LiveListenerBus#stop and Asy…
wangshuo128 Jan 3, 2020
fc92cce
[SPARK-29930][SQL][FOLLOW-UP] Allow only default value to be set for …
MaxGekk Jan 3, 2020
b12d06e
[SPARK-30412][SQL][TESTS] Eliminate warnings in Java tests regarding …
MaxGekk Jan 3, 2020
c854da7
[SPARK-30384][WEBUI] Needs to improve the Column name and Add tooltip…
07ARB Jan 3, 2020
fb98de2
[SPARK-30214][SQL] A new framework to resolve v2 commands
yaooqinn Jan 3, 2020
b2eac86
[SPARK-30225][CORE] Correct read() behavior past EOF in NioBufferedFi…
Jan 3, 2020
e15ff06
[SPARK-29768][SQL][FOLLOW-UP] Improve handling non-deterministic filt…
Ngone51 Jan 3, 2020
b97ac5b
[SPARK-29947][SQL] Improve ResolveRelations performance
wangyum Jan 3, 2020
fc2e3e4
[SPARK-30359][CORE] Don't clear executorsPendingToRemove at the begin…
Ngone51 Jan 3, 2020
57c455a
[SPARK-30406] OneForOneStreamManager ensure that compound operations …
ajithme Jan 3, 2020
a0366c0
[SPARK-30358][ML][PYSPARK][FOLLOWUP] ML expose predictRaw and predict…
huaxingao Jan 3, 2020
795b086
[SPARK-30144][ML][PYSPARK] Make MultilayerPerceptronClassificationMod…
huaxingao Jan 3, 2020
2e1ca9d
[SPARK-30267][SQL] Avro arrays can be of any List
steven-aerts Jan 3, 2020
dd54265
Revert "[SPARK-23264][SQL] Make INTERVAL keyword optional when ANSI e…
cloud-fan Jan 3, 2020
a948d73
[SPARK-30390][MLLIB] Avoid double caching in mllib.KMeans#runWithWeights
amanomer Jan 4, 2020
9cdf187
[SPARK-30398][ML] PCA/RegressionMetrics/RowMatrix avoid unnecessary c…
zhengruifeng Jan 4, 2020
9a25970
[SPARK-30415][SQL] Improve Readability of SQLConf Doc
iRakson Jan 4, 2020
dec2298
[SPARK-30418][ML] Make FM call super class method extractLabeledPoints
huaxingao Jan 6, 2020
eb990d9
[SPARK-9612][ML][FOLLOWUP] fix GBT support weights if subsamplingRate<1
zhengruifeng Jan 6, 2020
4576351
[SPARK-30426][SS][DOC] Fix the disorder of structured-streaming-kafka…
xuanyuanking Jan 6, 2020
3ed89a0
[SPARK-29800][SQL] Rewrite non-correlated EXISTS subquery use ScalaSu…
AngersZhuuuu Jan 6, 2020
9d74f85
[SPARK-30226][SQL] Remove withXXX functions in WriteBuilder
edrevo Jan 6, 2020
40e6922
[SPARK-30313][CORE] Ensure EndpointRef is available MasterWebUI/Worke…
HeartSaVioR Jan 6, 2020
e91613a
[SPARK-30154][ML] PySpark UDF to convert MLlib vectors to dense arrays
WeichenXu123 Jan 7, 2020
746f2c2
[SPARK-30430][PYTHON][DOCS] Add a note that UserDefinedFunction's con…
HyukjinKwon Jan 7, 2020
9df5450
[SPARK-19784][SPARK-25403][SQL] Refresh the table even table stats is…
wangyum Jan 7, 2020
02aa5c3
[SPARK-30433][SQL] Make conflict attributes resolution more scalable …
Ngone51 Jan 7, 2020
7913139
[SPARK-30338][SQL] Avoid unnecessary InternalRow copies in ParquetRow…
JoshRosen Jan 7, 2020
1b36abd
[SPARK-30414][SQL] ParquetRowConverter optimizations: arrays, maps, p…
JoshRosen Jan 7, 2020
ce49a81
[SPARK-30335][SQL][DOCS] Add a note first, last, collect_list and col…
HyukjinKwon Jan 7, 2020
d6f1725
[SPARK-30214][SQL] V2 commands resolves namespaces with new resolutio…
imback82 Jan 7, 2020
369e26d
[SPARK-30431][SQL] Update SqlBase.g4 to create commentSpec pattern li…
yaooqinn Jan 7, 2020
0cd1863
[SPARK-30173] Tweak stale PR message
nchammas Jan 7, 2020
f512ed1
[SPARK-30039][SQL] CREATE FUNCTION should do multi-catalog resolution
planga82 Jan 7, 2020
8f0733f
[SPARK-30382][SQL] Remove Hive LogUtils usage to prevent ClassNotFoun…
ajithme Jan 7, 2020
85a203b
[SPARK-30450][INFRA] Exclude .git folder for python linter
ericfchang Jan 7, 2020
ed338aa
[SPARK-28825][SQL][DOC] Documentation for Explain Command
PavithraRamachandran Jan 8, 2020
69f57f6
[SPARK-30381][ML] Refactor GBT to reuse treePoints for all trees
zhengruifeng Jan 8, 2020
b263233
[SPARK-30302][SQL] Complete info for show create table for views
wzhfy Jan 8, 2020
d1f461e
[SPARK-30453][BUILD][R] Update AppVeyor R version to 3.6.2
HyukjinKwon Jan 8, 2020
f4a7d03
[SPARK-30429][SQL] Optimize catalogString and usage in ValidateExtern…
viirya Jan 8, 2020
bb7aaac
[SPARK-30267][SQL][FOLLOWUP] Use while loop in Avro Array Deserializer
gengliangwang Jan 8, 2020
16e2641
[SPARK-30214][SQL][FOLLOWUP] Remove statement logical plans for names…
imback82 Jan 8, 2020
b5aedb3
[SPARK-30215][SQL] Remove PrunedInMemoryFileIndex and merge its funct…
fuwhu Jan 8, 2020
ad7698b
[SPARK-30410][SQL] Calculating size of table with large number of par…
wzhfy Jan 8, 2020
93a9320
[MINOR][CORE] Process bar should print new line to avoid polluting logs
Ngone51 Jan 8, 2020
4585acf
[MINOR][ML][INT] Array.fill(0) -> Array.ofDim; Array.empty -> Array.e…
zhengruifeng Jan 8, 2020
95dd7a6
[SPARK-30445][CORE] Accelerator aware scheduling handle setting confi…
tgravescs Jan 8, 2020
8e50e22
[SPARK-30281][SS] Consider partitioned/recursive option while verifyi…
HeartSaVioR Jan 8, 2020
2c1099e
[SPARK-30417][CORE] Task speculation numTaskThreshold should be great…
yuchenhuo Jan 8, 2020
6bf7661
[SPARK-30315][SQL] Add adaptive execution context
maryannxue Jan 9, 2020
d5e0e37
[SPARK-30440][CORE][TESTS] Avoid race condition in TaskSetManagerSuit…
ajithme Jan 9, 2020
38695a9
[SPARK-30434][PYTHON][SQL] Move pandas related functionalities into '…
HyukjinKwon Jan 9, 2020
96699d3
[SPARK-30464][PYTHON][DOCS] Explicitly note that we don't add "pandas…
HyukjinKwon Jan 9, 2020
ee65efd
[SPARK-30183][SQL] Disallow to specify reserved properties in CREATE/…
yaooqinn Jan 9, 2020
fce1416
[SPARK-30450][INFRA][FOLLOWUP] Fix git folder regex for windows file …
ericfchang Jan 9, 2020
d3fde04
[SPARK-28198][PYTHON][FOLLOW-UP] Run the tests of MAP ITER UDF in Jen…
HyukjinKwon Jan 9, 2020
d229fb3
[SPARK-30428][SQL] File source V2: support partition pruning
gengliangwang Jan 9, 2020
044e486
[SPARK-30452][ML][PYSPARK] Add predict and numFeatures in Python Isot…
huaxingao Jan 9, 2020
5e6ee16
[SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
brkyvz Jan 9, 2020
e761f2a
[SPARK-30459][SQL] Fix ignoreMissingFiles/ignoreCorruptFiles in data …
Ngone51 Jan 9, 2020
a0ee3e0
[MINOR][SQL][TEST-HIVE1.2] Fix scalastyle error due to length line in…
shaneknapp Jan 9, 2020
9ffb56c
[SPARK-30416][SQL] Log a warning for deprecated SQL config in `set()`…
MaxGekk Jan 10, 2020
a301cf4
[SPARK-30439][SQL] Support non-nullable column in CREATE TABLE, ADD C…
cloud-fan Jan 10, 2020
98df052
[SPARK-30480][PYSPARK][TESTS] Fix 'test_memory_limit' on pyspark test
HeartSaVioR Jan 10, 2020
8f57470
[SPARK-30018][SQL] Support ALTER DATABASE SET OWNER syntax
yaooqinn Jan 10, 2020
e53c60d
[SPARK-30447][SQL] Constant propagation nullability issue
peter-toth Jan 10, 2020
8ac99d7
Revert "[SPARK-30480][PYSPARK][TESTS] Fix 'test_memory_limit' on pysp…
HyukjinKwon Jan 10, 2020
b24f547
[SPARK-30234][SQL] ADD FILE cannot add directories from sql CLI
iRakson Jan 10, 2020
0fdfdc3
[SPARK-30448][CORE] accelerator aware scheduling enforce cores as lim…
tgravescs Jan 10, 2020
b69e13a
[SPARK-30343][SQL] Skip unnecessary checks in RewriteDistinctAggregates
maropu Jan 10, 2020
d4210b5
[SPARK-30468][SQL] Use multiple lines to display data columns for sho…
wzhfy Jan 10, 2020
2072afd
[SPARK-29779][CORE] Compact old event log files and cleanup
HeartSaVioR Jan 10, 2020
81e6e83
[SPARK-30312][SQL] Preserve path permission and acl when truncate table
viirya Jan 10, 2020
234d162
[SPARK-29748][PYTHON][SQL] Remove Row field sorting in PySpark for ve…
BryanCutler Jan 10, 2020
9f52a8c
[SPARK-30489][BUILD] Make build delete pyspark.zip file properly
jeff303 Jan 11, 2020
aeda417
[SPARK-30312][SQL][FOLLOWUP] Use inequality check instead to be robust
viirya Jan 11, 2020
998301e
[SPARK-30478][CORE][DOCS] Fix Memory Package documentation
sddyljsx Jan 12, 2020
79e2a5e
[SPARK-30458][WEBUI] Fix Wrong Executor Computing Time in Time Line o…
sddyljsx Jan 12, 2020
fdc9794
[SPARK-30353][SQL] Add IsNotNull check in SimplifyBinaryComparison op…
ulysses-you Jan 12, 2020
85d4c67
[SPARK-27296][SQL] Allows Aggregator to be registered as a UDF
erikerlandson Jan 12, 2020
747d9ec
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL b…
MaxGekk Jan 12, 2020
88fbbff
[SPARK-30409][TEST][FOLLOWUP][HOTFIX] Remove dangling JSONBenchmark-j…
dongjoon-hyun Jan 12, 2020
64296d3
[SPARK-28752][BUILD][DOCS][FOLLOW-UP] Render examples imported from J…
HyukjinKwon Jan 13, 2020
5bb490b
[SPARK-30457][ML] Use PeriodicRDDCheckpointer instead of NodeIdCache
zhengruifeng Jan 13, 2020
3f17d3a
[SPARK-30245][SQL] Add cache for Like and RLike when pattern is not s…
ulysses-you Jan 13, 2020
49d2341
[SPARK-28152][SQL][FOLLOWUP] Add a legacy conf for old MsSqlServerDia…
dongjoon-hyun Jan 13, 2020
b5f36e7
[SPARK-21869][SS][DOCS][FOLLOWUP] Document Kafka producer pool config…
HeartSaVioR Jan 13, 2020
c423f74
[SPARK-30480][PYTHON][TESTS] Increases the memory limit being tested …
HyukjinKwon Jan 13, 2020
014d65d
[SPARK-30493][PYTHON][ML] Remove OneVsRestModel setClassifier, setLab…
zero323 Jan 13, 2020
e5cee65
[SPARK-30377][ML] Make Regressors extend abstract class Regressor
huaxingao Jan 13, 2020
6da323a
[SPARK-30351][ML][PYSPARK] BisectingKMeans support instance weighting
huaxingao Jan 13, 2020
37907f0
[SPARK-30188][SQL] Resolve the failed unit tests when enable AQE
JkSelf Jan 13, 2020
2faeaca
[SPARK-30234][SQL][DOCS][FOLOWUP] Update Documentation for ADD FILE a…
iRakson Jan 14, 2020
e892f4e
Revert "[SPARK-28670][SQL] create function should thrown Exception if…
HyukjinKwon Jan 14, 2020
3235226
[SPARK-30500][SPARK-30501][SQL] Remove SQL configs deprecated in Spar…
MaxGekk Jan 14, 2020
6afbc96
[SPARK-30482][SQL][CORE][TESTS] Add sub-class of `AppenderSkeleton` r…
MaxGekk Jan 14, 2020
17340d9
[SPARK-30292][SQL] Throw Exception when invalid string is cast to num…
iRakson Jan 14, 2020
7dc62ee
[SPARK-30325][CORE] markPartitionCompleted cause task status inconsis…
Jan 14, 2020
3cca7f2
[SPARK-30498][ML][PYSPARK] Fix some ml parity issues between python a…
huaxingao Jan 14, 2020
56fec44
[SPARK-29544][SQL] optimize skewed partition based on data size
JkSelf Jan 14, 2020
227d249
[SPARK-30423][SQL] Deprecate UserDefinedAggregateFunction
erikerlandson Jan 14, 2020
40b068d
[SPARK-9478][ML][PYSPARK] Add sample weights to Random Forest
zhengruifeng Jan 14, 2020
c0c294f
[SPARK-27142][SQL] Provide REST API for SQL information
ajithme Jan 14, 2020
4e86366
[SPARK-30509][SQL] Fix deprecation log warning in Avro schema inferring
MaxGekk Jan 14, 2020
d1172ff
[MINOR][TESTS] Remove unsupported `header` option in AvroSuite
MaxGekk Jan 14, 2020
0dd2ccc
[SPARK-30452][ML][PYSPARK][FOLLOWUP] Change IsotonicRegressionModel.n…
zero323 Jan 15, 2020
0fb507d
[SPARK-30505][DOCS] Deprecate Avro option `ignoreExtension` in sql-da…
MaxGekk Jan 15, 2020
57b3295
Support offset
beliefer Jan 15, 2020
e1fd933
Support offset
beliefer Jan 15, 2020
d900e06
Fix scala style
beliefer Jan 15, 2020
ac3a31b
[SPARK-30515][SQL] Refactor SimplifyBinaryComparison to reduce the ti…
gengliangwang Jan 15, 2020
0393cc6
[SPARK-29708][SQL] Correct aggregated values when grouping sets are d…
maropu Jan 15, 2020
b93c7b7
[SPARK-30504][PYTHON][ML] Set weightCol in OneVsRest(Model) _to_java …
zero323 Jan 15, 2020
91c7785
[SPARK-30378][ML][PYSPARK][FOLLOWUP] Remove Param fields provided by …
zero323 Jan 15, 2020
05f3622
[SPARK-30479][SQL] Apply compaction of event log to SQL events
HeartSaVioR Jan 15, 2020
dbfa026
[SPARK-30495][SS] Consider spark.security.credentials.kafka.enabled a…
gaborgsomogyi Jan 15, 2020
ba55c53
[SPARK-30246][CORE] OneForOneStreamManager might leak memory in conne…
hensg Jan 15, 2020
d4d89e0
[SPARK-26736][SQL] Partition pruning through nondeterministic express…
maropu Jan 15, 2020
a704a57
[SPARK-30497][SQL] migrate DESCRIBE TABLE to the new framework
cloud-fan Jan 16, 2020
68e04b7
[SPARK-27986][SQL][FOLLOWUP] Respect filter in sql/toString of Aggreg…
maropu Jan 16, 2020
a7ee438
[SPARK-30518][SQL] Precision and scale should be same for values betw…
Ngone51 Jan 16, 2020
b38e1f4
[SPARK-30502][ML][CORE] PeriodicRDDCheckpointer support storageLevel
zhengruifeng Jan 16, 2020
bee05cc
[SPARK-30434][FOLLOW-UP][PYTHON][SQL] Make the parameter list consist…
HyukjinKwon Jan 16, 2020
65f396c
test
beliefer Jan 16, 2020
2a41f75
[SPARK-30312][SQL][FOLLOWUP] Rename conf by adding `.enabled`
viirya Jan 16, 2020
f325106
[SPARK-30323][SQL] Support filters pushdown in CSV datasource
MaxGekk Jan 16, 2020
3ae10d5
[SPARK-30491][INFRA] Enable dependency audit files to tell dependency…
mengCareers Jan 16, 2020
d1203fc
[SPARK-30521][SQL][TESTS] Eliminate deprecation warnings for Expressi…
MaxGekk Jan 16, 2020
8f96dfe
Fix bug
beliefer Jan 16, 2020
8fb6da6
Fix bug
beliefer Jan 16, 2020
3cc194f
Fix bug
beliefer Jan 16, 2020
686d4a8
Fix bug
beliefer Jan 16, 2020
d77957d
[SPARK-29565][FOLLOWUP] add setInputCol/setOutputCol in OHEModel
huaxingao Jan 16, 2020
34fce08
[SPARK-30507][SQL] TableCalalog reserved properties shoudn't be chang…
yaooqinn Jan 16, 2020
bc5d463
[SPARK-30524][SQL] Disable OptimizeSkewedJoin rule when introducing a…
JkSelf Jan 16, 2020
6261b98
[SPARK-29950][K8S] Blacklist deleted executors in K8S with dynamic al…
Jan 16, 2020
ae489f4
[SPARK-30534][INFRA] Use mvn in `dev/scalastyle`
dongjoon-hyun Jan 17, 2020
5467b20
[MINOR][ML] Change DecisionTreeClassifier to FMClassifier in OneVsRes…
huaxingao Jan 17, 2020
cdf4981
[SPARK-30499][SQL] Remove SQL config spark.sql.execution.pandas.respe…
MaxGekk Jan 17, 2020
2e3cdce
[SPARK-29188][PYTHON][FOLLOW-UP] Explicitly disable Arrow execution f…
HyukjinKwon Jan 17, 2020
0d4bba5
[SPARK-29572][SQL] add v1 read fallback API in DS v2
cloud-fan Jan 17, 2020
975b276
[SPARK-29188][PYTHON][FOLLOW-UP] Explicitly disable Arrow execution f…
HyukjinKwon Jan 17, 2020
a2f0611
[SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework
imback82 Jan 17, 2020
eb90424
[SPARK-29306][CORE] Stage Level Sched: Executors need to track what R…
tgravescs Jan 17, 2020
b1fe13d
[SPARK-30310][CORE] Resolve missing match case in SparkUncaughtExcept…
tinhto-000 Jan 17, 2020
34eb214
[SPARK-30041][SQL][WEBUI] Add Codegen Stage Id to Stage DAG visualiza…
LucaCanali Jan 17, 2020
4b73091
[SPARK-27868][CORE][FOLLOWUP] Recover the default value to -1 again
xCASx Jan 17, 2020
bd9c349
[SPARK-29876][SS] Delete/archive file source completed files in separ…
gaborgsomogyi Jan 17, 2020
9385c93
[SPARK-30312][DOCS][FOLLOWUP] Add a migration guide
dongjoon-hyun Jan 17, 2020
c8a7124
[SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE wi…
kevinyu98 Jan 18, 2020
c72bd3e
[SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer J…
dongjoon-hyun Jan 18, 2020
5e72ece
[SPARK-30533][ML][PYSPARK] Add classes to represent Java Regressors a…
zero323 Jan 18, 2020
2986096
[SPARK-30544][BUILD] Upgrade the version of Genjavadoc to 0.15
sarutak Jan 18, 2020
e41bc12
[SPARK-30539][PYTHON][SQL] Add DataFrame.tail in PySpark
HyukjinKwon Jan 18, 2020
49fe73e
[MINOR][DOCS] Remove note about -T for parallel build
srowen Jan 18, 2020
74a94f6
[MINOR][HIVE] Pick up HIVE-22708 HTTP transport fix
srowen Jan 18, 2020
1ccee04
[SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments
JkSelf Jan 19, 2020
ba9e152
[SPARK-30551][SQL] Disable comparison for interval type
yaooqinn Jan 19, 2020
03ec655
[SPARK-30530][SQL] Fix filter pushdown for bad CSV records
MaxGekk Jan 19, 2020
bb93a87
[SPARK-30371][K8S] Add spark.kubernetes.driver.master conf
wackxu Jan 19, 2020
3c677d7
[SPARK-30282][DOCS][FOLLOWUP] Update SQL migration guide for SHOW TBL…
imback82 Jan 19, 2020
626452f
[SPARK-30566][BUILD] Iterator doesn't refer outer identifier named "i…
sarutak Jan 20, 2020
27d8a54
[SPARK-30572][BUILD] Add a fallback Maven repository
dongjoon-hyun Jan 20, 2020
12add10
Add check analysis
beliefer Jan 20, 2020
90cae7f
Merge branch 'master' into support-ansi-offset
beliefer Jan 20, 2020
df35639
[SPARK-29290][CORE] Update to chill 0.9.5
srowen Jan 20, 2020
7cb9b96
[SPARK-30486][BUILD] Bump lz4-java version to 1.7.1
maropu Jan 20, 2020
d1d1641
[SPARK-30413][SQL] Avoid WrappedArray roundtrip in GenericArrayData c…
JoshRosen Jan 20, 2020
4039194
Fix conflict
beliefer Jan 20, 2020
7ab5f40
[SPARK-30547][SQL] Add unstable annotation to the CalendarInterval class
yaooqinn Jan 20, 2020
3c49ac3
[SPARK-30554][SQL] Return `Iterable` from `FailureSafeParser.rawParser`
MaxGekk Jan 20, 2020
a31c510
[SPARK-30558][SQL] Avoid rebuilding `AvroOptions` per each partition
MaxGekk Jan 20, 2020
d18c591
Adjust test cases
beliefer Jan 20, 2020
3f1afbc
[SPARK-30535][SQL] Migrate ALTER TABLE commands to the new framework
imback82 Jan 20, 2020
b0f3317
[SPARK-30578][SQL][TEST] Explicitly set conf to use DSv2 for orc in O…
Ngone51 Jan 20, 2020
2c7d4f2
[SPARK-30482][CORE][SQL][TESTS][FOLLOW-UP] Output caller info in log …
MaxGekk Jan 21, 2020
c0d9a41
[SPARK-30530][SQL][FOLLOW-UP] Remove unnecessary codes and fix commen…
HyukjinKwon Jan 21, 2020
da83998
[SPARK-30019][SQL] Add the owner property to v2 table
yaooqinn Jan 21, 2020
533427f
[SPARK-30568][SQL] Invalidate interval type as a field table schema
yaooqinn Jan 21, 2020
b45c5ea
[SPARK-30587][SQL][TESTS] Add test suites for CSV and JSON v1
MaxGekk Jan 21, 2020
8fd3178
[SPARK-30475][SQL] File source V2: Push data filters for file listing
guykhazma Jan 21, 2020
8ddfd9d
[SPARK-30433][SQL][FOLLOW-UP] Optimize collect conflict plans
Ngone51 Jan 21, 2020
bcb16e3
[SPARK-30571][CORE] fix splitting shuffle fetch requests
cloud-fan Jan 21, 2020
e7b4d43
Optimize code
beliefer Jan 21, 2020
cfae848
[MINOR][DOCS] Fix Jenkins build image and link in README.md
HyukjinKwon Jan 21, 2020
d2ad1a5
Revert "[SPARK-30534][INFRA] Use mvn in `dev/scalastyle`"
HyukjinKwon Jan 21, 2020
0cb99cd
[SPARK-30547][SQL][FOLLOWUP] Update since anotation for CalendarInter…
yaooqinn Jan 21, 2020
1cd2355
[SPARK-30593][SQL] Revert interval ISO/ANSI SQL Standard output since…
yaooqinn Jan 21, 2020
26b6b5a
[SPARK-30252][SQL] Disallow negative scale of Decimal
Ngone51 Jan 21, 2020
17f8275
[SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions
fuwhu Jan 21, 2020
14c9446
[SPARK-30599][CORE][TESTS] Increase the maximum number of log events …
MaxGekk Jan 21, 2020
4c9b6a1
[SPARK-30553][DOCS] fix structured-streaming java example error
Jan 22, 2020
ac79fef
[SPARK-28264][PYTHON][SQL] Support type hints in pandas UDF and renam…
HyukjinKwon Jan 22, 2020
f1070b7
[SPARK-30591][SQL] Remove the nonstandard SET OWNER syntax for namesp…
yaooqinn Jan 22, 2020
b78c436
[SPARK-30555][SQL] MERGE INTO insert action should only access column…
cloud-fan Jan 22, 2020
cc6ce8f
[SPARK-30503][ML] OnlineLDAOptimizer does not handle persistance corr…
zhengruifeng Jan 22, 2020
d036ef6
[SPARK-30573][DOC] Document WHERE Clause of SELECT statement in SQL R…
dilipbiswal Jan 22, 2020
208770a
[SPARK-30575][DOC] Document HAVING Clause of SELECT statement in SQL …
dilipbiswal Jan 22, 2020
eb6f7da
[SPARK-30583][DOC] Document LIMIT Clause of SELECT statement in SQL R…
dilipbiswal Jan 22, 2020
34f6ee0
[SPARK-30592][SQL] Interval support for csv and json funtions
yaooqinn Jan 22, 2020
315fcdd
[SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
JkSelf Jan 22, 2020
ee6e399
[SPARK-30604][CORE] Fix a log message by including hostLocalBlockByte…
Udbhav30 Jan 22, 2020
23a220e
[SPARK-30606][SQL] Fix the `like` function with 2 parameters
MaxGekk Jan 22, 2020
164018a
[SPARK-30574][DOC] Document GROUP BY Clause of SELECT statement in SQ…
dilipbiswal Jan 23, 2020
b7af3f7
[SPARK-28801][DOC] Document SELECT statement in SQL Reference (Main p…
dilipbiswal Jan 23, 2020
4cfcdad
[SPARK-30531][WEB UI] Do not render plan viz when it exists already
EnricoMi Jan 23, 2020
9abcd9e
[SPARK-30556][SQL] Copy sparkContext.localproperties to child thread …
ajithme Jan 23, 2020
4ed6bb7
[SPARK-30609] Allow default merge command resolution to be bypassed b…
tdas Jan 23, 2020
5252366
[SPARK-30535][SQL] Revert "[] Migrate ALTER TABLE commands to the new…
brkyvz Jan 23, 2020
2025a1f
[SPARK-30601][BUILD] Add a Google Maven Central as a primary repository
HyukjinKwon Jan 23, 2020
8fe1937
[SPARK-30607][SQL][PYSPARK][SPARKR] Add overlay wrappers for SparkR a…
zero323 Jan 23, 2020
4b86134
[SPARK-30543][ML][PYSPARK][R] RandomForest add Param bootstrap to con…
zhengruifeng Jan 23, 2020
49f5ef5
[SPARK-30575][DOCS][FOLLOWUP] Fix typos in documents
huaxingao Jan 23, 2020
cfa5a3f
[SPARK-27871][SQL][FOLLOW-UP] Remove the conf spark.sql.optimizer.rea…
gatorsmile Jan 23, 2020
e6aa283
[SPARK-30605][SQL] move defaultNamespace from SupportsNamespace to Ca…
cloud-fan Jan 23, 2020
2939253
[SPARK-30188][SQL][TESTS][FOLLOW-UP] Remove `sorted` in asserts of co…
MaxGekk Jan 23, 2020
89d4f70
[SPARK-29175][SQL][FOLLOW-UP] Rename the config name to spark.sql.mav…
xuanyuanking Jan 23, 2020
685595c
[SPARK-30620][SQL] avoid unnecessary serialization in AggregateExpres…
cloud-fan Jan 23, 2020
1f6bcfd
[SPARK-28794][SQL][DOC] Documentation for Create table Command
PavithraRamachandran Jan 23, 2020
91f8b3a
[SPARK-30570][BUILD] Update scalafmt plugin to 1.0.3 with onlyChanged…
koeninger Jan 23, 2020
2942711
[SPARK-29947][SQL][FOLLOWUP] Fix table lookup cache
cloud-fan Jan 23, 2020
a30e142
[SPARK-30603][SQL] Move RESERVED_PROPERTIES from SupportsNamespaces a…
yaooqinn Jan 23, 2020
156aceb
[SPARK-30298][SQL] Respect aliases in output partitioning of projects…
imback82 Jan 23, 2020
48a5a14
[SPARK-27083][SQL][FOLLOW-UP] Rename spark.sql.subquery.reuse to spar…
gatorsmile Jan 23, 2020
667cdb2
[SPARK-28962][SQL][FOLLOW-UP] Add the parameter description for the S…
gatorsmile Jan 24, 2020
08ce75d
[MINOR][DOCS] Fix src/dest type documentation for `to_timestamp`
deepyaman Jan 24, 2020
20197a5
[SPARK-30627][SQL] Disable all the V2 file sources by default
gengliangwang Jan 24, 2020
8ec0124
[SPARK-29924][DOCS] Document Apache Arrow JDK11 requirement
dongjoon-hyun Jan 24, 2020
2f6a914
[SPARK-30626][K8S] Add SPARK_APPLICATION_ID into driver pod env
Jeffwan Jan 24, 2020
42e5f85
[SPARK-30630][ML] Remove numTrees in GBT in 3.0.0
huaxingao Jan 24, 2020
579e4c1
[SPARK-29721][SQL] Prune unnecessary nested fields from Generate with…
viirya Jan 25, 2020
022ace7
[SPARK-30639][BUILD] Upgrade Jersey to 2.30
dongjoon-hyun Jan 25, 2020
5d29897
[SPARK-30579][DOC] Document ORDER BY Clause of SELECT statement in SQ…
dilipbiswal Jan 26, 2020
a1f2a88
[SPARK-30645][SPARKR][TESTS][WINDOWS] Move Unicode test data to exter…
zero323 Jan 26, 2020
5278a0e
[SPARK-29777][FOLLOW-UP][SPARKR] Remove no longer valid test for recu…
zero323 Jan 26, 2020
17b2c8a
Revert "[SPARK-25496][SQL] Deprecate from_utc_timestamp and to_utc_ti…
gatorsmile Jan 26, 2020
8f999b2
[SPARK-30644][SQL][TEST] Remove query index from the golden files of …
gatorsmile Jan 26, 2020
ee4d3ec
[SPARK-30314] Add identifier and catalog information to DataSourceV2R…
yuchenhuo Jan 26, 2020
5c09178
[SPARK-30640][PYTHON][SQL] Prevent unnecessary copies of data during …
BryanCutler Jan 26, 2020
5af0453
[SPARK-30581][DOC] Document SORT BY Clause of SELECT statement in SQL…
dilipbiswal Jan 27, 2020
a028088
[SPARK-30589][DOC] Document DISTRIBUTE BY Clause of SELECT statement …
dilipbiswal Jan 27, 2020
9c6fde4
[SPARK-30588][DOC] Document CLUSTER BY Clause of SELECT statement in …
dilipbiswal Jan 27, 2020
34d0e0f
[SPARK-30653][INFRA][SQL] EOL character enforcement for java/scala/xm…
HeartSaVioR Jan 27, 2020
a9b1124
[SPARK-30633][SQL] Append L to seed when type is LongType
patrickcording Jan 27, 2020
cc986a7
[SPARK-30625][SQL] Support `escape` as third parameter of the `like` …
MaxGekk Jan 27, 2020
7825bc3
Resolve conflict
beliefer Jan 28, 2020
The diff you're trying to view is too large. We only load the first 3000 changed files.
5 changes: 5 additions & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
@@ -1,2 +1,7 @@
 *.bat text eol=crlf
 *.cmd text eol=crlf
+*.java text eol=lf
+*.scala text eol=lf
+*.xml text eol=lf
+*.py text eol=lf
+*.R text eol=lf
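
As a rough illustration of what these attributes ask for, the sketch below picks the line ending for a file name from the seven patterns in this hunk and rewrites a byte string accordingly. It is a simplified model, not how Git implements the conversion: `RULES`, `eol_for`, and `normalize` are hypothetical names, matching is done on the base file name only, and the "last matching rule wins" loop mirrors the usual gitattributes precedence.

```python
from fnmatch import fnmatchcase

# Patterns copied from the .gitattributes hunk above.
RULES = [
    ("*.bat", "crlf"),
    ("*.cmd", "crlf"),
    ("*.java", "lf"),
    ("*.scala", "lf"),
    ("*.xml", "lf"),
    ("*.py", "lf"),
    ("*.R", "lf"),
]


def eol_for(filename):
    """Return 'lf', 'crlf', or None for a base file name (hypothetical helper)."""
    result = None
    for pattern, eol in RULES:
        if fnmatchcase(filename, pattern):
            result = eol  # a later matching rule overrides an earlier one
    return result


def normalize(data, filename):
    """Rewrite line endings in `data` the way the matching rule requests."""
    eol = eol_for(filename)
    if eol is None:
        return data  # no rule matches: leave the bytes untouched
    unix = data.replace(b"\r\n", b"\n")
    return unix if eol == "lf" else unix.replace(b"\n", b"\r\n")


if __name__ == "__main__":
    print(normalize(b"a\r\nb\n", "Foo.scala"))
    print(normalize(b"a\r\nb\n", "run.bat"))
```

In this model a `.scala` file always ends up with LF endings and a `.bat` file with CRLF endings, whatever mix of endings the input had.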
40 changes: 35 additions & 5 deletions .github/PULL_REQUEST_TEMPLATE
@@ -1,12 +1,42 @@
-## What changes were proposed in this pull request?
+<!--
+Thanks for sending a pull request! Here are some tips for you:
+1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
+2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
+3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
+4. Be sure to keep the PR description updated to reflect all changes.
+5. Please write your PR title to summarize what this PR proposes.
+6. If possible, provide a concise example to reproduce the issue for a faster review.
+-->

-(Please fill in changes proposed in this fix)
+### What changes were proposed in this pull request?
+<!--
+Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue.
+If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
+1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
+2. If you fix some SQL features, you can provide some references of other DBMSes.
+3. If there is design documentation, please add the link.
+4. If there is a discussion in the mailing list, please add the link.
+-->


-## How was this patch tested?
+### Why are the changes needed?
+<!--
+Please clarify why the changes are needed. For instance,
+1. If you propose a new API, clarify the use case for a new API.
+2. If you fix a bug, you can clarify why it is a bug.
+-->

-(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

+### Does this PR introduce any user-facing change?
+<!--
+If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
+If no, write 'No'.
+-->

-(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

+### How was this patch tested?
+<!--
+If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
+If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
+If tests were not added, please describe why they were not added and/or why it was difficult to add.
+-->
119 changes: 119 additions & 0 deletions .github/workflows/master.yml
@@ -0,0 +1,119 @@
name: master

on:
  push:
    branches:
    - master
  pull_request:
    branches:
    - master

jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      matrix:
        java: [ '1.8', '11' ]
        hadoop: [ 'hadoop-2.7', 'hadoop-3.2' ]
        hive: [ 'hive-1.2', 'hive-2.3' ]
        exclude:
        - java: '11'
          hive: 'hive-1.2'
        - hadoop: 'hadoop-3.2'
          hive: 'hive-1.2'
    name: Build Spark - JDK${{ matrix.java }}/${{ matrix.hadoop }}/${{ matrix.hive }}

    steps:
    - uses: actions/checkout@master
    # We split caches because GitHub Action Cache has a 400MB-size limit.
    - uses: actions/cache@v1
      with:
        path: build
        key: build-${{ hashFiles('**/pom.xml') }}
        restore-keys: |
          build-
    - uses: actions/cache@v1
      with:
        path: ~/.m2/repository/com
        key: ${{ matrix.java }}-${{ matrix.hadoop }}-maven-com-${{ hashFiles('**/pom.xml') }}
        restore-keys: |
          ${{ matrix.java }}-${{ matrix.hadoop }}-maven-com-
    - uses: actions/cache@v1
      with:
        path: ~/.m2/repository/org
        key: ${{ matrix.java }}-${{ matrix.hadoop }}-maven-org-${{ hashFiles('**/pom.xml') }}
        restore-keys: |
          ${{ matrix.java }}-${{ matrix.hadoop }}-maven-org-
    - uses: actions/cache@v1
      with:
        path: ~/.m2/repository/net
        key: ${{ matrix.java }}-${{ matrix.hadoop }}-maven-net-${{ hashFiles('**/pom.xml') }}
        restore-keys: |
          ${{ matrix.java }}-${{ matrix.hadoop }}-maven-net-
    - uses: actions/cache@v1
      with:
        path: ~/.m2/repository/io
        key: ${{ matrix.java }}-${{ matrix.hadoop }}-maven-io-${{ hashFiles('**/pom.xml') }}
        restore-keys: |
          ${{ matrix.java }}-${{ matrix.hadoop }}-maven-io-
    - name: Set up JDK ${{ matrix.java }}
      uses: actions/setup-java@v1
      with:
        java-version: ${{ matrix.java }}
    - name: Build with Maven
      run: |
        export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=1g -Dorg.slf4j.simpleLogger.defaultLogLevel=WARN"
        export MAVEN_CLI_OPTS="--no-transfer-progress"
        mkdir -p ~/.m2
        ./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pmesos -Pkubernetes -Phive -P${{ matrix.hive }} -Phive-thriftserver -P${{ matrix.hadoop }} -Phadoop-cloud -Djava.version=${{ matrix.java }} install
        rm -rf ~/.m2/repository/org/apache/spark


  lint:
    runs-on: ubuntu-latest
    name: Linters (Java/Scala/Python), licenses, dependencies
    steps:
    - uses: actions/checkout@master
    - uses: actions/setup-java@v1
      with:
        java-version: '11'
    - uses: actions/setup-python@v1
      with:
        python-version: '3.x'
        architecture: 'x64'
    - name: Scala
      run: ./dev/lint-scala
    - name: Java
      run: ./dev/lint-java
    - name: Python
      run: |
        pip install flake8 sphinx numpy
        ./dev/lint-python
    - name: License
      run: ./dev/check-license
    - name: Dependencies
      run: ./dev/test-dependencies.sh

  lintr:
    runs-on: ubuntu-latest
    name: Linter (R)
    steps:
    - uses: actions/checkout@master
    - uses: actions/setup-java@v1
      with:
        java-version: '11'
    - name: install R
      run: |
        echo 'deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/' | sudo tee -a /etc/apt/sources.list
        curl -sL "https://keyserver.ubuntu.com/pks/lookup?op=get&search=0xE298A3A825C0D65DFD57CBB651716619E084DAB9" | sudo apt-key add
        sudo apt-get update
        sudo apt-get install -y r-base r-base-dev libcurl4-openssl-dev
    - name: install R packages
      run: |
        sudo Rscript -e "install.packages(c('curl', 'xml2', 'httr', 'devtools', 'testthat', 'knitr', 'rmarkdown', 'roxygen2', 'e1071', 'survival'), repos='https://cloud.r-project.org/')"
        sudo Rscript -e "devtools::install_github('jimhester/lintr@v2.0.0')"
    - name: package and install SparkR
      run: ./R/install-dev.sh
    - name: lint-r
      run: ./dev/lint-r
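
To see which builds the `strategy.matrix` of the `build` job in this workflow actually produces, the following Python sketch enumerates every java/hadoop/hive combination and drops those matching an `exclude` entry. It assumes that an exclude entry removes every combination agreeing with it on all the keys it lists; `MATRIX`, `EXCLUDE`, and `expand_matrix` are hypothetical names used only for this illustration.

```python
from itertools import product

# Values copied from the `strategy.matrix` section of master.yml.
MATRIX = {
    "java": ["1.8", "11"],
    "hadoop": ["hadoop-2.7", "hadoop-3.2"],
    "hive": ["hive-1.2", "hive-2.3"],
}
EXCLUDE = [
    {"java": "11", "hive": "hive-1.2"},
    {"hadoop": "hadoop-3.2", "hive": "hive-1.2"},
]


def expand_matrix(matrix, exclude):
    """Return every combination that matches no exclude rule (hypothetical helper)."""
    keys = list(matrix)
    combos = [dict(zip(keys, values)) for values in product(*(matrix[k] for k in keys))]

    def is_excluded(combo):
        # A rule applies when the combination agrees with it on every listed key.
        return any(all(combo[k] == v for k, v in rule.items()) for rule in exclude)

    return [combo for combo in combos if not is_excluded(combo)]


if __name__ == "__main__":
    for combo in expand_matrix(MATRIX, EXCLUDE):
        # Same shape as the job name template in the workflow.
        print("Build Spark - JDK{java}/{hadoop}/{hive}".format(**combo))
```

Under that reading, five of the eight combinations remain, and `hive-1.2` is built only with JDK 1.8 and `hadoop-2.7`.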
24 changes: 24 additions & 0 deletions .github/workflows/stale.yml
@@ -0,0 +1,24 @@
name: Close stale PRs

on:
  schedule:
  - cron: "0 0 * * *"

jobs:
  stale:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/stale@v1.1.0
      with:
        repo-token: ${{ secrets.GITHUB_TOKEN }}
        stale-pr-message: >
          We're closing this PR because it hasn't been updated in a while.
          This isn't a judgement on the merit of the PR in any way. It's just
          a way of keeping the PR queue manageable.

          If you'd like to revive this PR, please reopen it and ask a
          committer to remove the Stale tag!
        days-before-stale: 100
        # Setting this to 0 is the same as setting it to 1.
        # See: https://github.com/actions/stale/issues/28
        days-before-close: 0
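
The timing these two settings imply can be sketched in a few lines of Python. This is a simplified model, not the action's implementation: it assumes a PR is marked stale `days-before-stale` days after its last update and closed `days-before-close` days after that, with 0 treated as 1 as the comment in the workflow states. `stale_schedule` is a hypothetical helper, and the daily cron run may shift the real dates slightly.

```python
from datetime import date, timedelta

# Values copied from stale.yml.
DAYS_BEFORE_STALE = 100
DAYS_BEFORE_CLOSE = 0


def stale_schedule(last_updated,
                   days_before_stale=DAYS_BEFORE_STALE,
                   days_before_close=DAYS_BEFORE_CLOSE):
    """Return (stale_on, close_on) dates for a PR last updated on `last_updated`."""
    # Per the comment in the workflow, a value of 0 behaves like 1.
    effective_close = max(1, days_before_close)
    stale_on = last_updated + timedelta(days=days_before_stale)
    close_on = stale_on + timedelta(days=effective_close)
    return stale_on, close_on


if __name__ == "__main__":
    stale_on, close_on = stale_schedule(date(2020, 1, 1))
    print("marked stale on", stale_on, "and closed on", close_on)
```

Under this model a PR last touched on 2020-01-01 would be marked stale on 2020-04-10 and closed one day later.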
18 changes: 17 additions & 1 deletion .gitignore
@@ -24,6 +24,8 @@
R-unit-tests.log
R/unit-tests.out
R/cran-check.out
R/pkg/vignettes/sparkr-vignettes.html
R/pkg/tests/fulltests/Rplots.pdf
build/*.jar
build/apache-maven*
build/scala*
@@ -41,9 +43,12 @@ dependency-reduced-pom.xml
derby.log
dev/create-release/*final
dev/create-release/*txt
dev/pr-deps/
dist/
docs/_site
docs/_site/
docs/api
sql/docs
sql/site
lib_managed/
lint-r-report.log
log/
@@ -56,17 +61,25 @@ project/plugins/project/build.properties
project/plugins/src_managed/
project/plugins/target/
python/lib/pyspark.zip
python/.eggs/
python/deps
python/docs/_site/
python/test_coverage/coverage_data
python/test_coverage/htmlcov
python/pyspark/python
reports/
scalastyle-on-compile.generated.xml
scalastyle-output.xml
scalastyle.txt
spark-*-bin-*.tgz
spark-resources/
spark-tests.log
src_managed/
streaming-tests.log
target/
unit-tests.log
work/
docs/.jekyll-metadata

# For Hive
TempStatsStore/
@@ -84,3 +97,6 @@ spark-warehouse/
*.Rproj.*

.Rproj.user

# For SBT
.jvmopts
51 changes: 0 additions & 51 deletions .travis.yml

This file was deleted.

4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -1,12 +1,12 @@
 ## Contributing to Spark

 *Before opening a pull request*, review the
-[Contributing to Spark wiki](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark).
+[Contributing to Spark guide](https://spark.apache.org/contributing.html).
 It lists steps that are required before creating a PR. In particular, consider:

 - Is the change important and ready enough to ask the community to spend time reviewing?
 - Have you searched for existing, related JIRAs and pull requests?
-- Is this a new feature that can stand alone as a [third party project](https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects) ?
+- Is this a new feature that can stand alone as a [third party project](https://spark.apache.org/third-party-projects.html) ?
 - Is the change being proposed clearly explained and motivated?

 When you contribute code, you affirm that the contribution is your original work and that you