This issue in DataFusion, apache/datafusion#12041, showcases a scenario where an empty Dictionary is effectively not null, and filtering by null doesn't return that row.
This indeed looks like a bug, and it should be a relatively straightforward fix. The issue can be clearly seen when one compares the logic for StringArray with that for DictionaryArray. The StringArray path checks each value against the null regex:
rows.iter()
    .map(|row| {
        let s = row.get(i);
        (!null_regex.is_null(s)).then_some(s)
    })
    .collect::<StringArray>(),
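To make the contrast concrete, here is a minimal std-only sketch, not the actual arrow-csv code. It assumes that the dictionary path keeps every field as a value without consulting the null check, which is what the comparison above suggests. The names `is_null`, `string_column`, `dictionary_column_unchecked` and `null_count` are hypothetical, and treating only the empty string as null is an assumption made for illustration.

```rust
/// Hypothetical stand-in for the reader's null check. Assumption: only the
/// empty string counts as null.
fn is_null(s: &str) -> bool {
    s.is_empty()
}

/// Mirrors the StringArray snippet: fields that match the null check
/// become None, all others are kept as Some(value).
fn string_column<'a>(fields: &[&'a str]) -> Vec<Option<&'a str>> {
    fields.iter().map(|&s| (!is_null(s)).then_some(s)).collect()
}

/// Assumed shape of the dictionary path: every field is kept as a value,
/// so an empty field ends up as Some("") rather than None.
fn dictionary_column_unchecked<'a>(fields: &[&'a str]) -> Vec<Option<&'a str>> {
    fields.iter().map(|&s| Some(s)).collect()
}

/// Counts the null (None) entries in a column.
fn null_count(column: &[Option<&str>]) -> usize {
    column.iter().filter(|v| v.is_none()).count()
}
```

Under these assumptions, calling both functions on the fields `["a", "", "b"]` gives one null in the string column and none in the dictionary column, which would explain why filtering by null skips that row.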
I have tracked the problem down to Arrow CSV and have created a small test case to reproduce it: https://github.com/apache/arrow-rs/compare/main...edmondop:arrow-rs:datafusion-12041?expand=1
I am unsure whether we should close apache/datafusion#12041, change the behavior of arrow-csv, or perhaps provide this as an option on the reader.