fix: check label list list column type earlier #2846

Merged
LuQQiu merged 3 commits into lancedb:main from the verifyEarlier branch
Sep 12, 2024


LuQQiu
Contributor

@LuQQiu LuQQiu commented Sep 9, 2024

fixes #2844

@github-actions github-actions bot added the bug Something isn't working label Sep 9, 2024
@codecov-commenter

codecov-commenter commented Sep 9, 2024

Codecov Report

Attention: Patch coverage is 8.33333% with 11 lines in your changes missing coverage. Please review.

Project coverage is 78.11%. Comparing base (c0e1f15) to head (e083822).

Files with missing lines Patch % Lines
rust/lance-index/src/scalar/label_list.rs 8.33% 10 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2846      +/-   ##
==========================================
- Coverage   78.12%   78.11%   -0.02%     
==========================================
  Files         229      229              
  Lines       70492    70501       +9     
  Branches    70492    70501       +9     
==========================================
- Hits        55073    55070       -3     
- Misses      12305    12314       +9     
- Partials     3114     3117       +3     
Flag Coverage Δ
unittests 78.11% <8.33%> (-0.02%) ⬇️


@LuQQiu
Contributor Author

LuQQiu commented Sep 9, 2024

Error thrown on the main branch:

make: *** [test] Illegal instruction: 4
python/tests/test_filter.py::test_duckdb

_____________________ test_duckdb_pushdown_extension_types _____________________

tmp_path = PosixPath('/tmp/pytest-of-runner/pytest-0/test_duckdb_pushdown_extension0')

def test_duckdb_pushdown_extension_types(tmp_path):
    # large_binary is reported by pyarrow as a substrait extension type.  Datafusion
    # does not currently handle these extension types.  This should be ok as long
    # as the filter isn't accessing the column with the extension type.
    #
    # Lance works around this by removing any columns with extension types from the
    # schema it gives to duckdb.
    tab = pa.table(
        {
            "filterme": [1, 2, 3],
            "largebin": pa.array([b"123", b"456", b"789"], pa.large_binary()),
            "othercol": [4, 5, 6],
        }
    )
    ds = lance.write_dataset(tab, str(tmp_path))  # noqa: F841
    expected = tab.slice(1, 1)
    actual = duckdb.query("SELECT * FROM ds WHERE filterme = 2").fetch_arrow_table()
    assert actual.to_pydict() == expected.to_pydict()

    expected = tab.slice(0, 1)
    actual = duckdb.query("SELECT * FROM ds WHERE othercol = 4").fetch_arrow_table()
    assert actual.to_pydict() == expected.to_pydict()

    # Not the best error message but hopefully this is short lived until datafusion
    # supports substrait extension types.
    with pytest.raises(
        duckdb.InvalidInputException,
        match="referenced a field that is not yet supported by Substrait conversion",
    ):
        duckdb.query("SELECT * FROM ds WHERE largebin = '456'").fetchall()

E duckdb.duckdb.Error: OSError: Invalid user input: pushdown filter referenced a field that is not yet supported by Substrait conversion, /home/runner/work/lance/lance/rust/lance-datafusion/src/substrait.rs:225:157
E
E At:
E /opt/hostedtoolcache/Python/3.12.5/x64/lib/python3.12/site-packages/lance/dataset.py(2504): to_scanner
E /opt/hostedtoolcache/Python/3.12.5/x64/lib/python3.12/site-packages/lance/dataset.py(370): scanner

python/tests/test_integration.py:39: Error

python/tests/test_integration.py::test_duckdb_pushdown_extension_types FAILED

@LuQQiu
Contributor Author

LuQQiu commented Sep 10, 2024

The rust linux-arm job failed with "Error: The operation was canceled."

@LuQQiu
Contributor Author

LuQQiu commented Sep 11, 2024

@westonpace @wjones127 PTAL, thanks!

@LuQQiu
Contributor Author

LuQQiu commented Sep 11, 2024

New version: (0, 17, 1)
File "/home/runner/work/lance/lance/pr/../base/ci/check_versions.py", line 64, in
new_version[1] == last_version[1] + 1
AssertionError: Minor version should have been bumped because there was a breaking change.
Last version: (0, 17, 0)
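
For readers unfamiliar with this CI check, the failing assertion can be illustrated with a minimal sketch. Only the compared expression (`new_version[1] == last_version[1] + 1`), the two version tuples, and the error message come from the log above. The function name `minor_was_bumped` is hypothetical, and the real `ci/check_versions.py` may be structured differently.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def minor_was_bumped(last_version: Version, new_version: Version) -> bool:
    """Hypothetical helper: True if the minor component grew by exactly one.

    Mirrors the expression shown in the CI log; it is not the actual
    implementation of ci/check_versions.py.
    """
    return new_version[1] == last_version[1] + 1


# Version tuples taken from the log output above.
last_version = (0, 17, 0)
new_version = (0, 17, 1)

if not minor_was_bumped(last_version, new_version):
    # Same wording as the AssertionError reported by CI.
    message = (
        "Minor version should have been bumped "
        "because there was a breaking change."
    )
    print(message)
```

Under this reading, a patch-only bump from (0, 17, 0) to (0, 17, 1) fails the check, while a bump to (0, 18, 0) would pass it.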

Contributor

@westonpace westonpace left a comment


Thanks for cleaning this up!

@LuQQiu
Contributor Author

LuQQiu commented Sep 11, 2024

> Thanks for cleaning this up!

Thanks for reviewing and providing good suggestions!

@LuQQiu LuQQiu merged commit 940345e into lancedb:main Sep 12, 2024
20 of 22 checks passed
@LuQQiu LuQQiu deleted the verifyEarlier branch September 12, 2024 15:56
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: entered unreachable code: Should verify that the first column is a list earlier
3 participants