iceberg TableDefinition from table identifier #5868

Closed
devinrsmith opened this issue Jul 30, 2024 · 4 comments · Fixed by #5891
Labels: core, feature request, iceberg, query engine

Comments

@devinrsmith
Member

It should be possible to get a TableDefinition (and potentially a meta_table representation) from a given table identifier without needing to read the table first.

@devinrsmith
Member Author

devinrsmith commented Jul 30, 2024

It probably needs to take IcebergInstructions into account and use the same logic that read_table uses to resolve the returned definition. If IcebergInstructions has a table definition that is incompatible, I would expect both get_definition and read_table to give similar error messages.
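
One way to get matching error messages is to route both entry points through a single resolution step. The sketch below is only an illustration of that shape: `Column`, `ResolutionSketch`, `resolve`, `getDefinition`, and `readTable` are hypothetical stand-ins, not Deephaven or Iceberg APIs, and the compatibility rule (exact name and type match) is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a column definition; not a Deephaven class.
final class Column {
    final String name;
    final String type;

    Column(String name, String type) {
        this.name = name;
        this.type = type;
    }
}

// Hypothetical sketch: both entry points share one resolution step.
final class ResolutionSketch {

    // Checks a user-requested definition (may be null) against the columns
    // derived from the table schema. Assumed rule: name and type must match.
    static List<Column> resolve(List<Column> fromSchema, List<Column> requested) {
        if (requested == null) {
            return fromSchema; // nothing requested: use the schema as-is
        }
        for (Column want : requested) {
            boolean found = false;
            for (Column have : fromSchema) {
                if (have.name.equals(want.name) && have.type.equals(want.type)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                throw new IllegalArgumentException(
                        "Incompatible column: " + want.name + " (" + want.type + ")");
            }
        }
        return requested;
    }

    // Returns the resolved definition without touching any data.
    static List<Column> getDefinition(List<Column> schema, List<Column> requested) {
        return resolve(schema, requested);
    }

    // Resolves the definition first, then would go on to read data.
    static String readTable(List<Column> schema, List<Column> requested) {
        List<Column> definition = resolve(schema, requested);
        return "table with " + definition.size() + " column(s)";
    }

    public static void main(String[] args) {
        List<Column> schema = Arrays.asList(new Column("id", "long"), new Column("name", "String"));
        System.out.println(readTable(schema, null));
    }
}
```

Because the check lives in `resolve`, an incompatible requested definition fails identically whether the caller asked for the definition or for the table.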

@rcaudy
Member

rcaudy commented Jul 30, 2024

I think it makes sense to offer getDefinition(TableIdentifier) and getDefinition(TableIdentifier, Snapshot).
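
A rough sketch of how the two overloads could relate, assuming the single-argument form simply delegates to the most recent snapshot. Everything here is hypothetical: `SnapshotDefinitionSketch`, `commit`, the string table identifier, the numeric snapshot id, and the list-of-column-names "definition" are stand-ins chosen to keep the example self-contained, not the Deephaven or Iceberg types.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a catalog adapter; not the Deephaven or Iceberg API.
final class SnapshotDefinitionSketch {

    // tableId -> (snapshotId -> column names), snapshots kept in commit order.
    private final Map<String, LinkedHashMap<Long, List<String>>> tables = new LinkedHashMap<>();

    // Records the columns visible as of a snapshot (illustration only).
    void commit(String tableId, long snapshotId, String... columns) {
        tables.computeIfAbsent(tableId, k -> new LinkedHashMap<>())
                .put(snapshotId, Collections.unmodifiableList(Arrays.asList(columns)));
    }

    // Overload without a snapshot: delegate to the most recently committed one.
    List<String> getDefinition(String tableId) {
        long latest = -1L;
        for (long id : snapshotsOf(tableId).keySet()) {
            latest = id; // last key in insertion order wins
        }
        return getDefinition(tableId, latest);
    }

    // Overload with an explicit snapshot: return the definition as of that snapshot.
    List<String> getDefinition(String tableId, long snapshotId) {
        List<String> columns = snapshotsOf(tableId).get(snapshotId);
        if (columns == null) {
            throw new IllegalArgumentException(
                    "Unknown snapshot " + snapshotId + " for table " + tableId);
        }
        return columns;
    }

    private LinkedHashMap<Long, List<String>> snapshotsOf(String tableId) {
        LinkedHashMap<Long, List<String>> snapshots = tables.get(tableId);
        if (snapshots == null || snapshots.isEmpty()) {
            throw new IllegalArgumentException("Unknown table " + tableId);
        }
        return snapshots;
    }
}
```

Keeping the lookup in the two-argument overload means there is only one code path to validate, and neither call needs to read any table data.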

@rcaudy rcaudy added query engine core Core development tasks and removed triage labels Jul 30, 2024
@rcaudy rcaudy modified the milestones: 3. Triage, 0.36.0 Jul 30, 2024
@rcaudy
Member

rcaudy commented Jul 31, 2024

Prior art in core+, which we should match in spirit:

    @Override
    public TableDefinition getTableDefinition(@NotNull final String namespace, @NotNull final String tableName) {
        // return a definition derived from the Iceberg schema
    }

    @Override
    public Table getTableDefinitionTable(@NotNull final String namespace, @NotNull final String tableName) {
        return TableTools.newTable(getTableDefinition(namespace, tableName)).meta();
    }

@rcaudy
Member

rcaudy commented Jul 31, 2024

Index: engine/table/src/main/java/io/deephaven/engine/util/TableTools.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/engine/table/src/main/java/io/deephaven/engine/util/TableTools.java b/engine/table/src/main/java/io/deephaven/engine/util/TableTools.java
--- a/engine/table/src/main/java/io/deephaven/engine/util/TableTools.java	(revision b9e2c6e3ba9f43e2ec69abe9176433503c87d054)
+++ b/engine/table/src/main/java/io/deephaven/engine/util/TableTools.java	(date 1722293086196)
@@ -14,6 +14,7 @@
 import io.deephaven.engine.rowset.RowSetFactory;
 import io.deephaven.engine.table.*;
 import io.deephaven.engine.table.impl.perf.QueryPerformanceRecorder;
+import io.deephaven.engine.table.impl.sources.NullValueColumnSource;
 import io.deephaven.internal.log.LoggerFactory;
 import io.deephaven.time.DateTimeUtils;
 import io.deephaven.engine.table.impl.QueryTable;
@@ -730,7 +731,7 @@
     public static Table newTable(TableDefinition definition) {
         Map<String, ColumnSource<?>> columns = new LinkedHashMap<>();
         for (ColumnDefinition<?> columnDefinition : definition.getColumns()) {
-            columns.put(columnDefinition.getName(), ArrayBackedColumnSource.getMemoryColumnSource(0,
+            columns.put(columnDefinition.getName(), NullValueColumnSource.getInstance(
                     columnDefinition.getDataType(), columnDefinition.getComponentType()));
         }
         return new QueryTable(definition, RowSetFactory.empty().toTracking(), columns) {

3 participants