feat(java): support alter columns for dataset #3259

yanghua · 2024-12-17T08:34:09Z

No description provided.

yanghua · 2024-12-17T09:29:33Z

java/core/src/main/java/com/lancedb/lance/Dataset.java

java/core/src/test/java/com/lancedb/lance/DatasetTest.java

SaintBacchus · 2024-12-19T02:28:32Z

java/core/src/main/java/com/lancedb/lance/schema/ColumnAlteration.java

+/** Column alteration used to alter dataset columns. */
+public class ColumnAlteration {
+
+  private String path;


`path` is an Arrow concept. Should we name it column name in the Java module?

IMO, this suggestion makes sense to some extent.

However, the same issue (the ambiguity of concepts across different contexts) also exists in the Rust module. This class simply aligns with the definition in the Rust module.

I am OK with it if we all agree to change the naming in all modules.

path can be the path to a nested column, such as outer_struct.inner_struct.field_a. There, the column name is field_a, but the full path to it in the schema is outer_struct.inner_struct.field_a. That's why it's called path.
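
To make the distinction between a column's name and its path concrete, here is a minimal sketch in plain Java. The `ColumnPathExample` class and its methods are hypothetical helpers written for illustration, not part of Lance or Arrow, and the sketch assumes that field names themselves contain no `.` characters.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only; not part of the Lance API. */
final class ColumnPathExample {

  /** Splits a dotted schema path into its segments (assumes no '.' inside field names). */
  static List<String> segments(String path) {
    return Arrays.asList(path.split("\\."));
  }

  /** Returns the last segment, i.e. the column's own name. */
  static String leafName(String path) {
    List<String> parts = segments(path);
    return parts.get(parts.size() - 1);
  }

  public static void main(String[] args) {
    String path = "outer_struct.inner_struct.field_a";
    System.out.println(segments(path)); // [outer_struct, inner_struct, field_a]
    System.out.println(leafName(path)); // field_a
    // For a top-level column, the path and the name coincide.
    System.out.println(leafName("id")); // id
  }
}
```

Only for top-level columns do the two notions coincide, which is why a field named `path` describes the nested case more accurately than a field named after the column name would.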

SaintBacchus · 2024-12-19T02:40:07Z

java/core/src/main/java/com/lancedb/lance/schema/ColumnAlteration.java

+
+import org.apache.arrow.vector.types.pojo.ArrowType;
+
+import java.util.Optional;


It's better to use the Optional in the java-core module, since the JDK Optional is not serializable and the Spark connector needs a serializable class.

It's better to use the Optional in the java-core module

What does this mean? Which Optional do you prefer? I see that other core classes, for example Fragment, FragmentOperation, WriteParams, and ReadOptions, also use java.util.Optional.

OK, maybe the Optional in the Java module should be named SeriableOptional for Spark to use.
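
The serialization concern can be illustrated with a short sketch. `java.util.Optional` does not implement `java.io.Serializable`, so a class holding one in a field cannot go through Java serialization as-is. The `SerializableOptional` wrapper below is a hypothetical illustration of the kind of class being discussed; its name and shape are assumptions, not the class that exists in Lance or its Spark connector.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;
import java.util.Optional;

/** Hypothetical serializable stand-in for java.util.Optional; not Lance's actual class. */
final class SerializableOptional<T extends Serializable> implements Serializable {
  private static final long serialVersionUID = 1L;

  private final T value; // null represents the empty case

  private SerializableOptional(T value) {
    this.value = value;
  }

  static <T extends Serializable> SerializableOptional<T> of(T value) {
    return new SerializableOptional<>(Objects.requireNonNull(value));
  }

  static <T extends Serializable> SerializableOptional<T> empty() {
    return new SerializableOptional<>(null);
  }

  boolean isPresent() {
    return value != null;
  }

  /** Converts back to the JDK type at the API boundary. */
  Optional<T> toOptional() {
    return Optional.ofNullable(value);
  }
}

final class SerializableOptionalDemo {

  /** Serializes and deserializes an object in memory using Java serialization. */
  static <T extends Serializable> T roundTrip(T obj) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(obj);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        @SuppressWarnings("unchecked")
        T copy = (T) in.readObject();
        return copy;
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("serialization round trip failed", e);
    }
  }

  public static void main(String[] args) {
    // The JDK Optional is not Serializable.
    System.out.println(Serializable.class.isAssignableFrom(Optional.class)); // false

    // The wrapper survives a round trip.
    SerializableOptional<String> copy = roundTrip(SerializableOptional.of("new_name"));
    System.out.println(copy.toOptional()); // Optional[new_name]
  }
}
```

One possible design, under these assumptions, is to keep `java.util.Optional` in the core API for consistency with the other core classes and convert to a serializable form only where the Spark connector needs it.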

SaintBacchus · 2024-12-19T02:42:29Z

java/core/lance-jni/src/blocking_dataset.rs

+    let mut dataset_guard =
+        unsafe { env.get_rust_field::<_, _, BlockingDataset>(java_dataset, NATIVE_DATASET) }?;
+
+    RT.block_on(dataset_guard.inner.alter_columns(&column_alterations))?;


If there are unsupported alter operations, will this code raise an exception?

unsupported alter operations

What does this mean? Do you mean the case where a dataset's schema does not support evolution for some conversion between two types?

If yes, it would throw an exception.

It might be worth adding a unit test verifying you get an exception and that it has a meaningful message.
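
As a rough sketch of what such a test checks, the plain-Java example below asserts both that an exception is thrown and that its message names the column and the types involved. The `alterColumnType` method is a hypothetical stand-in, not the Lance API. The real call, its exception type, and its message text are not shown in this thread, so they are assumptions here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class AlterColumnsErrorCheck {

  // Hypothetical stand-in for the real alter-columns call. The set of allowed
  // casts and the exception type are assumptions made for this sketch.
  static void alterColumnType(String path, String fromType, String toType) {
    Set<String> allowed = new HashSet<>(Arrays.asList("int32->int64", "float->double"));
    if (!allowed.contains(fromType + "->" + toType)) {
      throw new IllegalArgumentException(
          "Cannot cast column '" + path + "' from " + fromType + " to " + toType);
    }
  }

  /** Runs the action and returns the exception message; fails if nothing was thrown. */
  static String expectFailure(Runnable action) {
    try {
      action.run();
    } catch (RuntimeException e) {
      return e.getMessage();
    }
    throw new AssertionError("expected an exception but none was thrown");
  }

  public static void main(String[] args) {
    String message = expectFailure(() -> alterColumnType("name", "utf8", "int32"));
    // A meaningful message should identify the column and both types.
    if (!message.contains("name") || !message.contains("utf8") || !message.contains("int32")) {
      throw new AssertionError("message is not descriptive: " + message);
    }
    System.out.println(message); // Cannot cast column 'name' from utf8 to int32
  }
}
```

In a JUnit 5 test, `assertThrows` returns the thrown exception, so the same message check can be written against its result.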

wjones127

Looks good to me. I'll give @SaintBacchus a chance to give a final review before I merge.

SaintBacchus · 2024-12-20T02:12:49Z

Also LGTM.

github-actions bot added the enhancement (New feature or request) and java labels Dec 17, 2024

yanghua force-pushed the 3249-alter-col branch 5 times, most recently from 312bf18 to ba412eb on December 17, 2024 09:13

wjones127 requested changes Dec 17, 2024

java/core/src/main/java/com/lancedb/lance/Dataset.java (outdated, resolved)

yanghua force-pushed the 3249-alter-col branch from 27b05ae to 97c6240 on December 18, 2024 09:38

yanghua added 3 commits December 18, 2024 18:51

feat(java): support alter columns for dataset

bbe4eda

feat(java): support alter columns for dataset

1b31e72

feat(java): support alter columns for dataset

8f8936b

yanghua force-pushed the 3249-alter-col branch from 3f54adb to 8f8936b on December 18, 2024 10:51

wjones127 reviewed Dec 18, 2024

java/core/src/test/java/com/lancedb/lance/DatasetTest.java (outdated, resolved)

feat(java): support alter columns for dataset

34294cc

SaintBacchus reviewed Dec 19, 2024

wjones127 approved these changes Dec 19, 2024

SaintBacchus approved these changes Dec 20, 2024

wjones127 merged commit 2b29487 into lancedb:main Dec 20, 2024
8 checks passed

yanghua mentioned this pull request Dec 23, 2024

Support altering columns in java module #3249

Closed
