[fix](parquet)Fix data column and null map column not equal when reading Parquet complex type cross-page data (#47734)

### What problem does this PR solve?

Related PR: #23277

Problem Summary:

Previously, reading Parquet complex types could fail with the error below. This PR fixes that problem.

```
[fragment_mgr.cpp:549] report error status: cur path: xxx. parquet. Read parquet file xxxx.parquet failed, reason = [E-1721][E-1721] Size of filter doesn't match size of column: size=3156, filter.size=12544
0# doris::Exception::Exception(int, std::basic_string_view<char, std::char_traits<char> > const&) at //ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:173
1# doris::Exception::Exception<unsigned long&, unsigned long&>(int, std::basic_string_view<char, std::char_traits<char> > const&, unsigned long&, unsigned long&) at //ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
2# doris::vectorized::ColumnVector<unsigned char>::filter(doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul> const&) at /be/src/vec/columns/columns_common.h:86
3# doris::vectorized::ColumnNullable::filter(doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul> const&) at /be/src/vec/columns/column_nullable.cpp:373
4# doris::vectorized::ColumnStruct::filter(doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul> const&) at /be/src/vec/columns/column_struct.cpp:289
5# doris::vectorized::ColumnArray::filter_generic(doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul> const&) at /be/src/vec/common/cow.h:402
6# doris::vectorized::ColumnArray::filter_nullable(doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul> const&) at /be/src/vec/columns/column_array.cpp:877
7# doris::vectorized::ColumnNullable::filter(doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul> const&) at /be/src/vec/columns/column_nullable.cpp:371
8# doris::vectorized::Block::filter_block_internal(doris::vectorized::Block*, std::vector<unsigned int, std::allocator<unsigned int> > const&, doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul> const&) at /be/src/vec/core/block.cpp:790
9# doris::vectorized::RowGroupReader::next_batch(doris::vectorized::Block*, unsigned long, unsigned long*, bool*) at /be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:0
10# doris::vectorized::ParquetReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /be/src/common/status.h:486
11# doris::vectorized::IcebergTableReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /be/src/common/status.h:491
12# doris::vectorized::VFileScanner::_get_block_wrapped(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /be/src/common/status.h:491
13# doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /be/src/common/status.h:491
14# doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /be/src/vec/exec/scan/vscanner.cpp:0
15# doris::vectorized::VScanner::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /be/src/vec/exec/scan/vscanner.cpp:101
16# doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /be/src/common/status.h:378
```
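
To illustrate the class of failure reported above, here is a minimal standalone sketch of filtering a nullable column whose data and null map have drifted out of sync. It is not the Doris implementation: `NullableColumn`, `filter_vector`, `filter_nullable`, and `filter_fails` are hypothetical names invented for this example, and the real column classes are considerably more involved. The sketch only assumes that a filter must have exactly one entry per element of the column it is applied to.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a nullable column:
// a vector of values plus a parallel null map (1 = NULL, 0 = not NULL).
struct NullableColumn {
    std::vector<int32_t> data;
    std::vector<uint8_t> null_map;
};

// Keeps element i when filter[i] != 0.
// Throws when the filter does not have one entry per element.
template <typename T>
std::vector<T> filter_vector(const std::vector<T>& column,
                             const std::vector<uint8_t>& filter) {
    if (column.size() != filter.size()) {
        throw std::runtime_error(
                "Size of filter doesn't match size of column: size=" +
                std::to_string(column.size()) +
                ", filter.size=" + std::to_string(filter.size()));
    }
    std::vector<T> result;
    for (std::size_t i = 0; i < column.size(); ++i) {
        if (filter[i] != 0) {
            result.push_back(column[i]);
        }
    }
    return result;
}

// Applies the same filter to both the data and the null map.
// If the two parts have different lengths, at least one call must throw.
NullableColumn filter_nullable(const NullableColumn& col,
                               const std::vector<uint8_t>& filter) {
    return NullableColumn{filter_vector(col.data, filter),
                          filter_vector(col.null_map, filter)};
}

// Returns true when filtering is rejected because of a size mismatch.
bool filter_fails(const NullableColumn& col,
                  const std::vector<uint8_t>& filter) {
    try {
        filter_nullable(col, filter);
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

In this model, a column with four data values but only two null map entries cannot be filtered by a four-entry filter, because the null map part fails the size check. A reader that appends to the data and to the null map at different rates would leave a column in that state.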