Commit

Squashed commits:
  - 0483890a2be0d6e9118b5ecf20ca2cde9188ad2a Code style by zhipeng.mao <zhipeng.mao@databricks.com>
  - 0a840b144bd48fb88e26703b8a04ca8229630f5a Error message, comments, function usage by zhipeng.mao <zhipeng.mao@databricks.com>
  - e628afe416e6d88df69525425284cd0aeef7ec24 Use file ID in bitmap aggregation by zhipeng.mao <zhipeng.mao@databricks.com>
  - fe6efbb1cfd45d778718b877a824f730a2dd8ef2 Check UniForm Hudi table must be UC by Annie Wang <170372889+anniewang-db@users.noreply.github.com>
  - 8a391f7155895b979866455055741a42808020a8 [LC-4634] Enable IcebergCompatV2 Delta Uniform By Alter T... by Zihao Xu <xzhseh@gmail.com>
  - 5028821d836b75c283b730bbf807fdfa5d28cc21 [CC][LC-4664][Test-Only] Make TrackingCommitCoordinatorCl... by Dhruv Arya <dhruv.arya@databricks.com>
  - d3fb4e4ae691f1ccf99274645cb3fea31d9a10ec [SC-166402] managed data skipping columns by Paul Loh <170377582+lohpaul99@users.noreply.github.com>
  - 8c98e895d4860387d216278168acd3977437d715 [LC-4668] Expose uncertainty in isDeltaTable resolution v... by Lars Kroll <lars.kroll@databricks.com>
  - 3376b76a114fac408e6363ac864921d7178d9804 [ES-1010221] Part 2 of Scala 2.13 migration for Delta tests by Jintian Liang <105243217+jintian-liang@users.noreply.github.com>
  - 7880534e5f814f768648699823731edb3eb86c77 [SC-163065] Make two copies of runQueriesWithOrderingFrom... by Thang Long Vu <107926660+longvu-db@users.noreply.github.com>
  - 8a59d4b4946aef5cfbc75594bd05ce979c249cd4 [SQL][COLLATION] Add COLLATE tag to structured QPL by Uros Bojanic <157381213+uros-db@users.noreply.github.com>
  - 91d9489acc1b15d4bea86baf3b75ba6e5e0b9abf [LC-4638] Implement more efficient SerDe for the row_inde... by Christos Stavrakakis <christos.stavrakakis@databricks.com>
  - 11d61890a1e26b8380e7cae9b1d750acf5a14f96 [ES-1177667][TEST-ONLY] Remove timeout source from withTe... by Johan Lasperas <johan.lasperas@databricks.com>
  - bc90067254419651895419cd0228d3cd5cbe4d8d [ES-1109125] Report accurate type when checking Iceberg c... by Hao Jiang <hao.jiang@databricks.com>
  - 2bf266859083ecfc1d46240638b7c421c24285ad [SC-160780][FOLLOW-UP] Add SQLConf flag for #96237 change by Vitalii Li <vitalii.li@databricks.com>
  - 23f0eb9982898acf167f3d3eb4d1905133dff27e [LC-4740][CoordinatedCommits] Extend NewTransactionSuite ... by Jun <85203301+junlee-db@users.noreply.github.com>
  - 66a20cfed0eaf369a8a2b35e709a23311e0a19f3 [LC-3943] Forbid automatic protocol upgrade when using co... by Stefan Kandic <154237371+stefankandic@users.noreply.github.com>
  - 876af9aa7e6de42c501b89eae58111e2df7ba8cc [LC-4491] Shade Hudi into DBR by Annie Wang <170372889+anniewang-db@users.noreply.github.com>
  - 673ca7fe43e3110c67143642f3a77f4f5106bc23 [LC-4754][Delta][MC] Fix error message formatting in Delt... by Sumeet Varma <sumeet.varma@databricks.com>
  - 321e72867fdc98d501963bccaf17f8577bc4c445 [LC-4192][Delta][MC] Use Coordinated Commits Properties f... by Sumeet Varma <sumeet.varma@databricks.com>
  - 84fe62011e78b5528adb70cc06a1be906ce97b02 [LC-4263][Delta] Add usage logging to UCCommitCooridnator... by Yumingxuan Guo <a.guo@databricks.com>
  - 4cde1a20dddc7408beb251cfbf62baed7bc29940 [LC-4392][Auto-Clustering-Key-Selection] Add additional t... by parimarjannegi-db <parimarjan.negi@databricks.com>
  - 7b5a138dd7388ac59a08528c0e27c622cf0bb2f3 [LC-4549] Delta structured logging migration for remainin... by Jiaheng Tang <jiaheng.tang@databricks.com>
  - c0e90e6bd23cc181e900ee4633b2eb213a913f9d [LC-4131] Validate that CDF works with collations by Stefan Kandic <154237371+stefankandic@users.noreply.github.com>
  - c472de831911ccd1cc0fd7b23e1fd417cecde195 [Delta][Copybara] Introduce `// EDGE-NO-INHERITANCE` tag by Paddy Xu <xupaddy@gmail.com>
  - 78f75a2bb3a05ad37b59e42eb35f4b61cb1c7acf [DELTA-OSS-AUTO][SC-170469][Hudi] Flesh out tests and upd... by Annie Wang <170372889+anniewang-db@users.noreply.github.com>
  - 49d07ae43b3dbc09c2679e35a4c751b99f53c60a [ES-1151956] Fixes for "show tables extended" by rayman7718 <rayman7718@gmail.com>
  - c35f0fdd59377faae33aa06f477fe5b066babe82 [ES-1159798][LC-4729][BEHAVE-74] Fix CDC Commit Timestamp... by Thang Long Vu <107926660+longvu-db@users.noreply.github.com>
  - cfd343651f3c00a6f551f3e2ef5180b80f2dd47b [LC-4661] Deprecate tableVersion field in type widening m... by Johan Lasperas <johan.lasperas@databricks.com>
  - 7014ad1ec51af3f55c2f9713b032c6c9d32ff755 [AUTO][SC-170122][SPARK-48697][LC-4703][SQL] Add collatio... by Stefan Kandic <stefan.kandic@databricks.com>
  - 35628a58efa0c704bf2e2ddfcba833ff9b40f07a [Delta][CC] Add integration test for DROP/UNDROP. by Yumingxuan Guo <a.guo@databricks.com>
  - 3351fa6531e3e370ebe41f931c3946fa00206d5f [LC-4693][Uniform] Make refresh on UFI no-op when there's... by ChengJi-db <cheng.ji@databricks.com>
  - 10d743671c381ea09131f3519be7fc90c460db75 [LC-4143][Delta][CC] Adds asynchronous backfilling in UC ... by Lukas Rupprecht <lukas.rupprecht@databricks.com>
  - d55d080248832d881d33e52d9b1d9efd66a197bf [LC-4698] add logging to UniForm Iceberg conversion num o... by Fred Storage Liu <fred.liu@databricks.com>
  - f992112f04b14423f323d4f171040ec8cca1b3ad [LC-4499] All Delta Commands should have all operation me... by sabir-akhadov <52208605+sabir-akhadov@users.noreply.github.com>
  - 7d0c410dba226ae191f51280c382b272fba459b1 [LC-3778] Log duration of DeltaSource operations by Tom van Bussel <Tom.VanBussel@databricks.com>
  - e5e29b59cab4218ed77903079eb2b08216e6f982 Import changes from OSS PR delta-io#3326 by Annie Wang <170372889+anniewang-db@users.noreply.github.com>
  - 4c08927c6eee0da4a9a00795b36a4b734f2c8a04 Import changes from OSS PR delta-io#3323 by Annie Wang <170372889+anniewang-db@users.noreply.github.com>
  - 1fd8ee89f3b7398db6f13486bd42ca2535023976 Import changes from OSS PR delta-io#3320 by Annie Wang <170372889+anniewang-db@users.noreply.github.com>
  - db210bb4f2bd0d682863c99ca4c385ed0d0d0118 [LC-4692] Regular delta min/max stats should always use U... by Stefan Kandic <154237371+stefankandic@users.noreply.github.com>
  - e873cab62acf71139d89f4019f59fb5aeec81f14 Import changes from OSS PR delta-io#3309 by Annie Wang <170372889+anniewang-db@users.noreply.github.com>
  - 6d1aca32ebc1d9009e77ca26f4fd0c73ba8e7f6e Import changes from OSS PR delta-io#3310 by Annie Wang <170372889+anniewang-db@users.noreply.github.com>
  - 0070442d167c53884e083127eeaa7e69c71f28d9 Import changes from OSS PR delta-io#3311 by Annie Wang <170372889+anniewang-db@users.noreply.github.com>
  - 8b8df4f43b5a85d1626ea4a2ce612b34ea67544d [ES-1089258][ES-1043275][Delta] Creates unresolved tables... by Lukas Rupprecht <lukas.rupprecht@databricks.com>
  - 602f7e429fd3c73d43c24ac7f231a9a10bfe8830 [LC-4043] Allow auto REFRESH on UniForm tables by Hao Jiang <hao.jiang@databricks.com>
  - 26755b6dba8caeca1b7c8797eb634abcb5532e35 [ES-1154731] Disable managed Hive Metastore related Conve... by Ming DAI <ming.dai@databricks.com>
  - 568c954c44eafe752d6c614ca36dc3f2a0a651d4 [ES-1150026][Delta] Handle concurrent CREATE TABLE IF NOT... by Sumeet Varma <sumeet.varma@databricks.com>
  - 195a5fbcd168a0d1e7e992dfdb226b3a9e21a6a3 [LC-4657][Delta][MC] Throw exception when additional list... by Sumeet Varma <sumeet.varma@databricks.com>
  - 438d79963e76478a641e256ee53edca557024c99 [LC-4524][Delta][MC] Fetch unbackfilled deltas to detect ... by Sumeet Varma <sumeet.varma@databricks.com>
  - 3c6800ac28f998e3183a54d4ff6cf058607cad16 [LC-4996][BEHAVE-60] Introduce stable type widening table... by Johan Lasperas <johan.lasperas@databricks.com>
  - c0d24087192bcbbd80f42fa83e2229482c877d66 [ES-1154366][ES-1154367] Fix ConvertToDeltaSuite in serve... by Ming DAI <ming.dai@databricks.com>
  - 6b7e4d0f9004d044fd88248c4b024d61e1d4b91d [Liquid] Improve error message when clustering column not... by Chirag Singh <137233133+chirag-s-db@users.noreply.github.com>
  - 75e21adf355c43bbafeb6dffefe7627614ca6338 [LC-4242] History truncation/validation support for write... by Andreas Chatzistergiou <93710326+andreaschat-db@users.noreply.github.com>
  - 71aec8cc7f136db13de264f661446526a9e7a2c2 [ES-1162704] Disable some Python Delta tests in serverless by Christos Stavrakakis <christos.stavrakakis@databricks.com>
  - 96b6ce441e287e9b008261844017eb9d09ffe692 update delta-sharing client to version 1.1.0 by Jade Wang <111902719+jadewang-db@users.noreply.github.com>
  - 510e7d69ce180c869031affff075bc163077fa82 [SC-160780] Support Delta API for DML commands when table... by Vitalii Li <vitalii.li@databricks.com>
  - 562322d9ab5ecf7fb237fa0e950470d83a252ed1 [LC-4549] Delta structured logging migration by Jiaheng Tang <jiaheng.tang@databricks.com>
  - 850e368261d2cb89cc732139c8ee1574978366d5 [LC-4526] Support snapshot expiration for uniform iceberg... by ChengJi-db <cheng.ji@databricks.com>
  - 6e11dd7f11e79a429068a6daa59d4e201fa894c8 [ES-1151640] Break down ImplicitDMLCastingSuite by Lars Kroll <lars.kroll@databricks.com>
  - 837d86463cef6bbb1e7571f06431cedd67b76c7e [ES-1154733][ES-1154732][ES-1154381][ES-1154382][TEST-ONL... by Paddy Xu <xupaddy@gmail.com>
  - 51bfb0c60bb2b2d64f01b7c5f96187e65b0ad4b1 [LC-4534] Add ThreadLocal trait for underlying execution ... by leonwind-db <leon.windheuser@databricks.com>
  - ade2f1670859abf5cad4f4a3f0240566ec8c3a0f [LS-989][Delta] Add checkpointType, numColumns and numFla... by Sumeet Varma <sumeet.varma@databricks.com>
  - 62c938e3eb626e04000e0acd4d37953a24d3c52d [LS-985][Delta] Add detailed mismatch reasons for Increme... by Sumeet Varma <sumeet.varma@databricks.com>
  - bfda87427e6c9c019d83ee108e8e70a5e5950d8e [SPARK-48576][SQL] Rename UTF8_BINARY_LCASE to UTF8_LCASE by Uros Bojanic <157381213+uros-db@users.noreply.github.com>
  - 2ff6142a85ef692a055071fa722ac10caeaf685f [LC-4301] Fix the predicates used by metadata queries by Wei Luo <143362963+weiluo-db@users.noreply.github.com>
  - 6340541a0bbc4c9d71e04aff99a3cb21720dd975 [DELTA-OSS-AUTO][SC-168789][Spark] Optimize batching / in... by Adam Binford <adamq43@gmail.com>
  - 1c691ed0e5df46402a4ebe44b16b89b748f2915e [SC-162602] Add per-kdtree scan stats (only scanned files) by Eric Liang <ekhliang@gmail.com>
  - 895a09ff141e810f87d2c5b353d3963d8ac00586 [SC-167647] map lookup for createPhysicalSchema by andrewxue-db <169104436+andrewxue-db@users.noreply.github.com>
  - d89308d177f5e269e7cb6bdfbe6c0978bd9d1cd2 [Delta OSS] Fix compile error on Spark master due to Pars... by Jiaheng Tang <jiaheng.tang@databricks.com>
  - c7ba28b7e35319db1a41d5252506d82b39933ec2 [STATS-356] Improve recomputeDeltaFileStats logic with ME... by Pat Sukprasert <pat.sukprasert@databricks.com>
  - 750eeac292c99c0003b6011038b99e2ee92083db [LC-4499] All Delta Commands should have all operation me... by sabir-akhadov <52208605+sabir-akhadov@users.noreply.github.com>
  - ebe3e3e296c593fa6e5020e8127b410cd9d4c4c6 [LC-3942] Add historical schema read compatibility checks... by Nikola Mandic <nikola.mandic@databricks.com>
  - 20dbe80816413dcfa565777fdd9878f4f00c1af2 [SC-167257][Auto-Cluster-Key-Selection] Show operational ... by Supun Nakandala <supun.nakandala@databricks.com>
  - 2460f7604cba40783e44a43cbeb4bc39956427d8 [ES-1155687][LC-812] Make DPO track read files by Christos Stavrakakis <christos.stavrakakis@databricks.com>
  - 59e3a4f57aaa7a7298e978ace615ff04b4507418 [LC-4484] Match RangeBloomFilterMightContain expressions ... by Tom van Bussel <Tom.VanBussel@databricks.com>
  - 01c09b8556e57b2fa94c3e6f850b2ae426364976 [ES-1077008][LC-4411][BEHAVE-32] Add a Delta config to en... by Paddy Xu <xupaddy@gmail.com>
  - 7bd2395d454a1f2383d5b58f16e7de643cacae8b [ES-1154374][TEST-ONLY] Fix INSERT with delta view tests ... by Johan Lasperas <johan.lasperas@databricks.com>
  - d8f90b3560e6fe8d95bf757538660a995cf185a0 [LC-4323] Delta metadata queries should be QPL tagged wit... by sabir-akhadov <52208605+sabir-akhadov@users.noreply.github.com>
  - e673c6972eb230234b8206309dcffc135af6387e [LC-4132][SQL] Disable changing collation of clustering c... by Nikola Mandic <nikola.mandic@databricks.com>
  - 9c34481f90d4e526a0f5c3671e5e59aae969c44c [LC-3259][Delta][MC][Pt. 3] Creates separate CommitStore ... by Lukas Rupprecht <lukas.rupprecht@databricks.com>
  - 0249b90ffe247bc689586891e90f81bd0568ad6d [LC-4564][LC-4565][Liquid OSS] Support show tblproperties... by Jiaheng Tang <jiaheng.tang@databricks.com>
  - 24a50e9f98a540ba72c9a7a585ddad0a8196ee92 [LC-4548] Support Spark Structured Logging in Delta by Jiaheng Tang <jiaheng.tang@databricks.com>
  - f4b3d9042b1ffebc8ed78a3fe2634e6672213cc0 [ES-1154365][ES-1154357] Disable test cases not supported... by Chirag Singh <137233133+chirag-s-db@users.noreply.github.com>
  - a39c357817393943cc2997bbc9f020b8e02c14a8 [Liquid][LC-3844] Do not attempt lazy clustering on files... by Chirag Singh <137233133+chirag-s-db@users.noreply.github.com>
  - b8ff879f8471592e7d08312be088152c575ecb3f [Liquid][LC-4335][LC-4314] Execute eager clustering batch... by Chirag Singh <137233133+chirag-s-db@users.noreply.github.com>
  - 1bce3b75784df6960e1c8261dea5c97f86f0e3bd [Delta] Rename managed commit to coordinated commits by Dhruv Arya <dhruv.arya@databricks.com>
  - f727c3e6717ed8f235560055ade0848a2333c6dd [DELTA-OSS-AUTO][SC-167757][SPARK][MINOR] Code cleanup (r... by Jacek Laskowski <jacek@japila.pl>
  - edf995cf281c10917d747620a4edc8ab37bf032d [Liquid][LC-2605] Change verifyClusteringColumns to not p... by Chirag Singh <137233133+chirag-s-db@users.noreply.github.com>
  - 04129352860bfd2973d2717a69b793a4ae880217 [BEHAVE-43][SC-168642][Spark] Support dropping the CHECK ... by Tom van Bussel <Tom.VanBussel@databricks.com>
  - 863b8c7f7af9d720949a9b098f89f8bc7e3a1279 [STATS-368] Make statistics on load work with tables with... by Satya Valluri <satya.valluri@databricks.com>
  - 75de39d97097df2b75596073f2df19d83fda4ac0 [SC-168457] Also track auto table and clustering column s... by Eric Liang <ekhliang@gmail.com>
  - 635eb6b63cbc3809b51ceaf023334bcbe5158ecd [LC-4146][MAC][Liquid-PO Integration] Handle large tables... by Stella Wan <42614699+stella-wan@users.noreply.github.com>
  - 493c1415599a7910da34b6a37e3023e6cf95ce25 [DELTA-OSS-AUTO][SC-167314] Update flink mima checks by Qianru Lao <55441375+EstherBear@users.noreply.github.com>
  - c8ce5d5360467dc9c01e851a7ed539f3d0792096 [ES-1115071][LC-4301][Liquid] Make metadata query optimiz... by Wei Luo <143362963+weiluo-db@users.noreply.github.com>
  - 261d280360847c040db11e4e1ac31dd9cfa58e69 [LC-4507] Run delete and insert steps in parallel during ... by Christos Stavrakakis <christos.stavrakakis@databricks.com>
  - aae17ff4806845f0f50e0fbd61808ed4e67b07f9 [DBRRM-1031][SC-168577][ES-1145536] Fix Kryo serializatio... by Josh Rosen <joshrosen@databricks.com>
  - 320d96e453fb6de07460ac524e6ed6843b667d37 [DELTA-OSS-AUTO][SC-168636]Improve documentation in PROTO... by Andreas Chatzistergiou <93710326+andreaschat-db@users.noreply.github.com>
  - 9f7979357c735e5b0be9158eccf16d5fc19b76a6 [LS-986][Delta] Handle multi-byte UTF-8 characters while ... by Sumeet Varma <sumeet.varma@databricks.com>
  - c447349ec894c1a54ac4118dc4d470dc12c684df [ES-1151654] Improve the running time of DeletionVectorsS... by Andreas Chatzistergiou <93710326+andreaschat-db@users.noreply.github.com>
  - 307cd1089a3389a6a9a0aa099529255761e7e107 [AUTOSTATS][ES-1147462] fallback to old style column sele... by mohamedzait <113953232+mohamedzait@users.noreply.github.com>
  (And 4593 more changes)

GitOrigin-RevId: 0483890a2be0d6e9118b5ecf20ca2cde9188ad2a
zhipengmao-db committed Jul 17, 2024
0 parents commit b2a1526
Showing 999 changed files with 285,876 additions and 0 deletions.
3 changes: 3 additions & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
*.bat text eol=crlf
*.cmd text eol=crlf
*.bin binary
131 changes: 131 additions & 0 deletions .gitignore
@@ -0,0 +1,131 @@
*#*#
*.#*
*.iml
*.ipr
*.iws
*.pyc
*.pyo
*.swp
*~
.DS_Store
.ammonite
.bloop
.bsp
.cache
.classpath
.ensime
.ensime_cache/
.ensime_lucene
.generated-mima*
.idea/
.idea_modules/
.metals
.project
.pydevproject
.scala_dependencies
.settings
/lib/
R-unit-tests.log
R/unit-tests.out
R/cran-check.out
R/pkg/vignettes/sparkr-vignettes.html
R/pkg/tests/fulltests/Rplots.pdf
build/*.jar
build/apache-maven*
build/scala*
build/zinc*
cache
checkpoint
conf/*.cmd
conf/*.conf
conf/*.properties
conf/*.sh
conf/*.xml
conf/java-opts
dependency-reduced-pom.xml
derby.log
dev/create-release/*final
dev/create-release/*txt
dev/pr-deps/
dist/
docs/_site
docs/api
sql/docs
sql/site
lib_managed/
lint-r-report.log
log/
logs/
metals.sbt
out/
project/boot/
project/build/target/
project/plugins/lib_managed/
project/plugins/project/build.properties
project/plugins/src_managed/
project/plugins/target/
python/lib/pyspark.zip
python/deps
docs/python/_static/
docs/python/_templates/
docs/python/_build/
python/test_coverage/coverage_data
python/test_coverage/htmlcov
python/pyspark/python
reports/
scalastyle-on-compile.generated.xml
scalastyle-output.xml
scalastyle.txt
spark-*-bin-*.tgz
spark-tests.log
src_managed/
streaming-tests.log
target/
unit-tests.log
work/
docs/.jekyll-metadata

# For Hive
TempStatsStore/
metastore/
metastore_db/
sql/hive-thriftserver/test_warehouses
warehouse/
spark-warehouse/

# For R session data
.RData
.RHistory
.Rhistory
*.Rproj
*.Rproj.*

.Rproj.user

**/src/main/resources/js

# For SBT
.jvmopts
sbt-launch-*.jar

# For Python linting
pep8*.py
pycodestyle*.py

# For IDE settings
.vscode

# For Terraform
**/.terraform/*
*.tfstate
*.tfstate.*
crash.log
crash.*.log
*.tfvars
*.tfvars.json
override.tf
override.tf.json
*_override.tf
*_override.tf.json
.terraformrc
.terraform.rc
1 change: 1 addition & 0 deletions .sbtopts
@@ -0,0 +1 @@
-J-Xmx4G
79 changes: 79 additions & 0 deletions CODE_OF_CONDUCT.md
@@ -0,0 +1,79 @@
# Delta Lake Code of Conduct

## Our Pledge

In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to making participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, sex characteristics, gender identity and expression,
level of experience, education, socio-economic status, nationality, personal
appearance, race, religion, or sexual identity and orientation.

## Our Standards

Examples of behavior that contributes to creating a positive environment
include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting

## Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.

## Scope

This Code of Conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community. Examples of
representing a project or community include using an official project e-mail
address, posting via an official social media account, or acting as an appointed
representative at an online or offline event. Representation of a project may be
further defined and clarified by project maintainers.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the Technical Steering Committee defined [here](https://github.com/delta-io/delta/blob/master/CONTRIBUTING.md#governance). All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq

## Linux Foundation Code of Conduct
Your use is additionally subject to the [Linux Foundation Code of Conduct](https://lfprojects.org/policies/code-of-conduct/).
75 changes: 75 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,75 @@
We happily welcome contributions to Delta Lake. We use [GitHub Issues](/../../issues/) to track community-reported issues and [GitHub Pull Requests](/../../pulls/) for accepting changes.

# Governance
Delta Lake is an independent open-source project and is not controlled by any single company. To emphasize this, we joined the [Delta Lake Project](https://community.linuxfoundation.org/delta-lake/) in 2019, which is a sub-project of the Linux Foundation Projects. Within the project, we make decisions based on [these rules](https://delta.io/pdfs/delta-charter.pdf).

Delta Lake is supported by a wide set of developers from over 50 organizations across multiple repositories. Since 2019, more than 190 developers have contributed to Delta Lake! The Delta Lake community is growing by leaps and bounds, with more than 6000 members in the [Delta Users Slack](https://go.delta.io/slack).

For more information, please refer to the [founding technical charter](https://delta.io/pdfs/delta-charter.pdf).

# Communication
- Before starting work on a major feature, please reach out to us via [GitHub](https://github.com/delta-io/delta/issues), [Slack](https://go.delta.io/slack), [email](https://groups.google.com/g/delta-users), etc. We will make sure no one else is already working on it and ask you to open a GitHub issue.
- A "major feature" is defined as any change that alters more than 100 lines of code (not including tests) or changes any user-facing behavior.
- We will use the GitHub issue to discuss the feature and come to agreement.
- This is to prevent your time being wasted, as well as ours.
- The GitHub review process for major features is also important so that organizations with commit access can come to agreement on design.
- If it is appropriate to write a design document, the document must be hosted either in the GitHub tracking issue, or linked to from the issue and hosted in a world-readable location. Examples of design documents include [sample 1](https://docs.google.com/document/d/16S7xoAmXpSax7W1OWYYHo5nZ71t5NvrQ-F79pZF6yb8), [sample 2](https://docs.google.com/document/d/1MJhmW_H7doGWY2oty-I78vciziPzBy_nzuuB-Wv5XQ8), and [sample 3](https://docs.google.com/document/d/19CU4eJuBXOwW7FC58uSqyCbcLTsgvQ5P1zoPOPgUSpI).
- Specifically, if the goal is to add a new extension, please read the extension policy.
- Small patches and bug fixes don't need prior communication. If you have identified a bug and have ways to solve it, please create an [issue](https://github.com/delta-io/delta/issues) or create a [pull request](https://github.com/delta-io/delta/pulls).
- If you have example code that explains a use case or a feature, create a pull request to post it under [examples](https://github.com/delta-io/delta/tree/master/examples).


# Coding style
We generally follow the [Apache Spark Scala Style Guide](https://spark.apache.org/contributing.html).

# Sign your work
The sign-off is a simple line at the end of the explanation for the patch. Your signature certifies that you wrote the patch or otherwise have the right to pass it on as an open-source patch. The rules are pretty simple: if you can certify the below (from developercertificate.org):

```
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.

Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.
```

Then you just add a line to every git commit message:

```
Signed-off-by: Jane Smith <jane.smith@email.com>
```

Use your real name (sorry, no pseudonyms or anonymous contributions).

If you set your `user.name` and `user.email` git configs, you can sign your commit automatically with `git commit -s`.
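
To check locally that a commit message carries the sign-off line described above, something like the following could be used, for example from a `commit-msg` hook. This is only a minimal sketch: the helper name `has_signoff` and the regular expression are assumptions made for illustration and are not part of the Delta Lake tooling.

```python
import re

# Hypothetical helper for illustration; not part of the Delta Lake repository.
# Matches a full line of the form "Signed-off-by: Name <user@host>".
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$",
    re.MULTILINE,
)


def has_signoff(commit_message: str) -> bool:
    """Return True if any line of the message is a Signed-off-by line."""
    return SIGNOFF_RE.search(commit_message) is not None


if __name__ == "__main__":
    message = (
        "Fix typo in README\n"
        "\n"
        "Signed-off-by: Jane Smith <jane.smith@email.com>\n"
    )
    print(has_signoff(message))
```

The check is deliberately loose: it only confirms that a line in the expected shape is present, not that the name and address match your git identity.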
60 changes: 60 additions & 0 deletions Dockerfile
@@ -0,0 +1,60 @@
#
# Copyright (2021) The Delta Lake Project Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
FROM ubuntu:focal-20221019

ENV DEBIAN_FRONTEND=noninteractive
ENV DEBCONF_NONINTERACTIVE_SEEN=true

RUN apt-get update
RUN apt-get install -y software-properties-common
RUN apt-get install -y curl
RUN apt-get install -y wget
RUN apt-get install -y openjdk-8-jdk
RUN apt-get install -y python3.8
RUN apt-get install -y python3-pip
RUN apt-get install -y git

# Upgrade pip. This is needed to use prebuilt wheels for packages cffi (dep of cryptography) and
# cryptography. Otherwise, building wheels for these packages fails.
RUN pip3 install --upgrade pip

RUN pip3 install pyspark==3.5.0

RUN pip3 install mypy==0.982

RUN pip3 install pydocstyle==3.0.0

RUN pip3 install pandas==1.0.5

RUN pip3 install pyarrow==8.0.0

RUN pip3 install numpy==1.20.3

RUN pip3 install importlib_metadata==3.10.0

RUN pip3 install cryptography==37.0.4

# We must install cryptography before twine. Else, twine will pull a newer version of
# cryptography that requires a newer version of Rust and may break tests.
RUN pip3 install twine==4.0.1

RUN pip3 install wheel==0.33.4

RUN pip3 install setuptools==41.0.1

# Do not add any non-deterministic changes (e.g., copying files
# from the repo) to this Dockerfile, so that the Docker image
# generated from it can be reused across builds.