Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support _rowid meta column for spark connector in java #3194

Merged
merged 4 commits into from
Dec 6, 2024

Conversation

SaintBacchus
Copy link
Collaborator

As discussion in PR, I had implement the _rowid meta column just in java package.

@github-actions github-actions bot added enhancement New feature or request java labels Dec 4, 2024
@SaintBacchus
Copy link
Collaborator Author

@LuQQiu @eddyxu Can you help review this PR?

@eddyxu eddyxu requested review from wjones127, westonpace and LuQQiu and removed request for westonpace and wjones127 December 4, 2024 07:55
Copy link
Contributor

@wjones127 wjones127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good to me. My only serious worry is the licensing.

@SaintBacchus SaintBacchus force-pushed the SupportRowIdInJava branch 2 times, most recently from b329f52 to d324d62 Compare December 5, 2024 02:47
@SaintBacchus
Copy link
Collaborator Author

@wjones127 I had added the license file in lancedb. Plz review it

LICENSE Outdated
Comment on lines 438 to 437
------------------------------------------------------------------------------------
This product bundles various third-party components under other open source licenses.
This section summarizes those components and their licenses. See licenses/
for text of these licenses.


Apache Software Foundation License 2.0
--------------------------------------

common/network-common/src/main/java/org/apache/spark/network/util/LimitedInputStream.java
core/src/main/java/org/apache/spark/util/collection/TimSort.java
core/src/main/resources/org/apache/spark/ui/static/bootstrap*
core/src/main/resources/org/apache/spark/ui/static/vis*
docs/js/vendor/bootstrap.js
connector/spark-ganglia-lgpl/src/main/java/com/codahale/metrics/ganglia/GangliaReporter.java
core/src/main/resources/org/apache/spark/ui/static/d3-flamegraph.min.js
core/src/main/resources/org/apache/spark/ui/static/d3-flamegraph.css

Python Software Foundation License
----------------------------------

python/pyspark/loose_version.py

BSD 3-Clause
------------

python/lib/py4j-*-src.zip
python/pyspark/cloudpickle/*.py
python/pyspark/join.py

The CSS style for the navigation sidebar of the documentation was originally
submitted by Óscar Nájera for the scikit-learn project. The scikit-learn project
is distributed under the 3-Clause BSD license.


MIT License
-----------

core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js
core/src/main/resources/org/apache/spark/ui/static/*dataTables*
core/src/main/resources/org/apache/spark/ui/static/graphlib-dot.min.js
core/src/main/resources/org/apache/spark/ui/static/jquery*
core/src/main/resources/org/apache/spark/ui/static/sorttable.js
docs/js/vendor/anchor.min.js
docs/js/vendor/jquery*
docs/js/vendor/modernizer*

ISC License
-----------

core/src/main/resources/org/apache/spark/ui/static/d3.min.js


Creative Commons CC0 1.0 Universal Public Domain Dedication
-----------------------------------------------------------
(see LICENSE-CC0.txt)

data/mllib/images/kittens/29.5.a_b_EGDP022204.jpg
data/mllib/images/kittens/54893.jpg
data/mllib/images/kittens/DP153539.jpg
data/mllib/images/kittens/DP802813.jpg
data/mllib/images/multi-channel/chr30.4.184.jpg

https://github.com/apache/spark/blob/master/LICENSE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can remove all of this, since none of this applies to the files you brought over.

Suggested change
------------------------------------------------------------------------------------
This product bundles various third-party components under other open source licenses.
This section summarizes those components and their licenses. See licenses/
for text of these licenses.
Apache Software Foundation License 2.0
--------------------------------------
common/network-common/src/main/java/org/apache/spark/network/util/LimitedInputStream.java
core/src/main/java/org/apache/spark/util/collection/TimSort.java
core/src/main/resources/org/apache/spark/ui/static/bootstrap*
core/src/main/resources/org/apache/spark/ui/static/vis*
docs/js/vendor/bootstrap.js
connector/spark-ganglia-lgpl/src/main/java/com/codahale/metrics/ganglia/GangliaReporter.java
core/src/main/resources/org/apache/spark/ui/static/d3-flamegraph.min.js
core/src/main/resources/org/apache/spark/ui/static/d3-flamegraph.css
Python Software Foundation License
----------------------------------
python/pyspark/loose_version.py
BSD 3-Clause
------------
python/lib/py4j-*-src.zip
python/pyspark/cloudpickle/*.py
python/pyspark/join.py
The CSS style for the navigation sidebar of the documentation was originally
submitted by Óscar Nájera for the scikit-learn project. The scikit-learn project
is distributed under the 3-Clause BSD license.
MIT License
-----------
core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js
core/src/main/resources/org/apache/spark/ui/static/*dataTables*
core/src/main/resources/org/apache/spark/ui/static/graphlib-dot.min.js
core/src/main/resources/org/apache/spark/ui/static/jquery*
core/src/main/resources/org/apache/spark/ui/static/sorttable.js
docs/js/vendor/anchor.min.js
docs/js/vendor/jquery*
docs/js/vendor/modernizer*
ISC License
-----------
core/src/main/resources/org/apache/spark/ui/static/d3.min.js
Creative Commons CC0 1.0 Universal Public Domain Dedication
-----------------------------------------------------------
(see LICENSE-CC0.txt)
data/mllib/images/kittens/29.5.a_b_EGDP022204.jpg
data/mllib/images/kittens/54893.jpg
data/mllib/images/kittens/DP153539.jpg
data/mllib/images/kittens/DP802813.jpg
data/mllib/images/multi-channel/chr30.4.184.jpg
https://github.com/apache/spark/blob/master/LICENSE

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I had removed it

Comment on lines 25 to 36
/**
SPDX-License-Identifier: Apache-2.0
SPDX-FileCopyrightText: Copyright The Lance Authors

The following code is originally from https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/util/ArrowUtilsSuite.scala
and is licensed under the Apache license:

License: Apache License 2.0, Copyright 2014 and onwards The Apache Software Foundation.
https://github.com/apache/spark/blob/master/LICENSE

It has been modified by the Lance developers to fit the needs of the Lance project.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move this up? It's confusing with the other license above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just put it in the apache license header.

Comment on lines 25 to 36
/**
SPDX-License-Identifier: Apache-2.0
SPDX-FileCopyrightText: Copyright The Lance Authors

The following code is originally from https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/vectorized/ArrowColumnVectorSuite.scala
and is licensed under the Apache license:

License: Apache License 2.0, Copyright 2014 and onwards The Apache Software Foundation.
https://github.com/apache/spark/blob/master/LICENSE

It has been modified by the Lance developers to fit the needs of the Lance project.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing here.

Copy link
Contributor

@wjones127 wjones127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working with me on the license stuff. Looks good now :)

@wjones127 wjones127 merged commit e4ab9a8 into lancedb:main Dec 6, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request java
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants