application/octet-stream content-type when writing to Azure Blob Storage using ObjectStoragePath #39722

Open
1 of 2 tasks
pedro-cf opened this issue May 20, 2024 · 4 comments


pedro-cf commented May 20, 2024

Apache Airflow version

2.9.1

If "Other Airflow 2 version" selected, which one?

No response

What happened?

When using ObjectStoragePath to send a file to an Azure Blob Storage container, the content type is set to application/octet-stream.

Example code:

ObjectStoragePath("file:///opt/airflow/input.tif").copy(dst=base / "copy.tif")

Content types displayed in Azure Storage Explorer:
[screenshot not preserved]

(Note: input.tif is manually uploaded using Azure Storage Explorer)

I'm not sure whether this is a bug, a feature request, or something that already has a solution.

What you think should happen instead?

After uploading a file using ObjectStoragePath, the content type should match the file type (for example, image/tiff for a .tif file) rather than the generic application/octet-stream.
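
To illustrate what an "appropriate" content type could mean here, the following minimal sketch guesses a MIME type from the file name using Python's standard mimetypes module. The helper name guess_content_type is hypothetical and is not part of Airflow, fsspec, or adlfs; it only shows one way the expected value might be derived.

```python
import mimetypes


def guess_content_type(filename: str, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the file extension, falling back to `default`.

    Hypothetical helper for illustration only; not an Airflow or fsspec API.
    """
    content_type, _encoding = mimetypes.guess_type(filename)
    return content_type or default


if __name__ == "__main__":
    # File name taken from the example in this issue.
    print(guess_content_type("copy.tif"))
```

The fallback mirrors the behaviour reported above: when nothing better is known, the blob ends up as application/octet-stream.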

How to reproduce

Example code:

ObjectStoragePath("file:///opt/airflow/input.tif").copy(dst=base / "copy.tif")

Operating System

Ubuntu 22.04 (WSL2)

Versions of Apache Airflow Providers

  • apache-airflow-providers-microsoft-azure 10.1.0

Deployment

Docker-Compose

Deployment details

No response

Anything else?

Related Discussion:

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@pedro-cf pedro-cf added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels May 20, 2024
@bolkedebruin
Contributor

bolkedebruin commented May 21, 2024

It is better to open this issue at https://github.com/fsspec/filesystem_spec, as that is the underlying implementation. For ObjectStoragePath to be able to use this, it needs to be exposed there. It then typically becomes available automatically with a new fsspec release, because parameters are passed through to the library.

@Taragolis Taragolis added upstream-dependency and removed good first issue needs-triage label for new issues that we didn't triage yet labels May 22, 2024
@pedro-cf
Author

fsspec/adlfs#474

@pedro-cf
Author

pedro-cf commented May 27, 2024

@bolkedebruin @Taragolis @eladkal

fsspec/adlfs#474 (comment)

fsspec/adlfs#392

There appears to be a way, but how can we do this with ObjectStoragePath without writing Azure Blob Storage-specific code?

@bolkedebruin
Contributor

Have you tried it? As I mentioned earlier, ObjectStoragePath().open(xxx) passes additional parameters to the underlying implementation, so setting **{"content_settings": content_settings} works for adls.

As the content-type is not part of the spec, there is no generic way of setting it. Happy to accept a patch for it, but it is probably more beneficial to all downstream users of fsspec to supply such a patch to fsspec.
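
The pass-through described above can be sketched as a small helper that streams one path into another and forwards extra keyword arguments to the destination's open() call. This is an illustration under assumptions, not a confirmed recipe: copy_with_open_kwargs is a hypothetical name, and whether the backend honours content_settings rests on the statement in this thread rather than on anything verified here. The helper only relies on both objects having an open(mode, **kwargs) method.

```python
import shutil
from typing import Any


def copy_with_open_kwargs(src: Any, dst: Any, **write_kwargs: Any) -> None:
    """Stream `src` into `dst`, forwarding `write_kwargs` to `dst.open()`.

    Hypothetical helper: `src` and `dst` are any path-like objects that
    expose `open(mode, **kwargs)`, such as ObjectStoragePath.
    """
    with src.open("rb") as fin, dst.open("wb", **write_kwargs) as fout:
        shutil.copyfileobj(fin, fout)


# Possible usage with Airflow and the Azure SDK installed (not executed here).
# `base` is the destination ObjectStoragePath from the issue's example, and
# passing `content_settings` assumes the backend accepts it as described above.
#
#   from airflow.io.path import ObjectStoragePath
#   from azure.storage.blob import ContentSettings
#
#   src = ObjectStoragePath("file:///opt/airflow/input.tif")
#   copy_with_open_kwargs(
#       src,
#       base / "copy.tif",
#       content_settings=ContentSettings(content_type="image/tiff"),
#   )
```

Keeping the backend-specific argument at the call site leaves the copy logic itself generic, although the caller still has to know which keyword a given backend expects.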
