Error on reading partitioned Parquets from an Azure Blob Storage #94

Open
juergend7lytix opened this issue Feb 11, 2025 · 0 comments

When reading a Hive-partitioned Parquet dataset from Azure Blob Storage, I get a JSON exception.

The query is as follows:

INSTALL azure;
LOAD azure;

CREATE SECRET azure_secret (
    TYPE AZURE,
    CONNECTION_STRING 'DefaultEndpointsProtocol=https;AccountName=xxx;AccountKey=xxx==;EndpointSuffix=core.windows.net'
  );

SET azure_transport_option_type = 'curl';

SELECT distinct(some_column)
FROM read_parquet('abfss://xxx/xxx/xxx.pq/**');

The exact error is:

Invalid Error:
[json.exception.type_error.302] type must be string, but is null

What did I try?

  • Reading the same partitioned Parquet dataset from the local disk, which works flawlessly,
  • using both the CLI and the Python installation of DuckDB,
  • downgrading DuckDB to 1.1.4 and 1.2.0.

Unfortunately, all of this was to no avail: the read from Azure still fails. I also don't know how to narrow down the source of the error any further. But since reading the files from disk works, I guess it has something to do with the Azure binding.
