[Bug] Cannot create datasets in projects with model names containing multibyte characters #1197
Comments
@t0momi219 Hello! I'm here to assist you with any issues you encounter. I can help you solve bugs, answer questions, and even guide you on how to become a contributor. Let me know how I can assist you with the bug you've encountered.

The issue you're encountering is due to the `ValueError` that Airflow raises when a Dataset URI contains non-ASCII characters. The relevant part of the code is the `get_datasets` method:

```python
def get_datasets(self, source: Literal["inputs", "outputs"]) -> list[Dataset]:
    uris = []
    for completed in self.openlineage_events_completes:
        for output in getattr(completed, source):
            dataset_uri = output.namespace + "/" + output.name
            uris.append(dataset_uri)
    self.log.debug("URIs to be converted to Dataset: %s", uris)

    datasets = []
    try:
        datasets = [Dataset(uri) for uri in uris]
    except ValueError:
        raise AirflowCompatibilityError(
            """
            Apache Airflow 2.9.0 & 2.9.1 introduced a breaking change in Dataset URIs, to be fixed in newer versions:
            https://github.com/apache/airflow/issues/39486

            If you want to use Cosmos with one of these Airflow versions, you will have to disable emission of Datasets:
            By setting ``emit_datasets=False`` in ``RenderConfig``. For more information, see https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html.
            """
        )
    return datasets
```

To address this issue, you have a few options. For now, the quickest solution would be to disable the emission of datasets if that fits your use case. If you need further assistance or have any questions about implementing these changes, feel free to ask!
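As a rough illustration of the failure mode, the sketch below joins a namespace and a model name the same way `get_datasets` does and then applies an ASCII check. The `check_ascii` function is a hypothetical stand-in for Airflow's validation, not the real implementation, and the namespace and model names are made up for the example.

```python
def build_dataset_uri(namespace: str, name: str) -> str:
    """Join namespace and name with "/" as get_datasets does."""
    return namespace + "/" + name


def check_ascii(uri: str) -> str:
    """Hypothetical stand-in for Airflow's URI validation (not the real function)."""
    if not uri.isascii():
        raise ValueError("Dataset URI must only consist of ASCII characters")
    return uri


# Hypothetical namespace and model names, for illustration only.
namespace = "postgres://host:5432"
ascii_uri = build_dataset_uri(namespace, "db.schema.my_model")
multibyte_uri = build_dataset_uri(namespace, "db.schema.日本語名モデル")

# The ASCII-only URI passes the check unchanged.
accepted = check_ascii(ascii_uri)

# The URI containing multibyte characters is rejected.
try:
    check_ascii(multibyte_uri)
    rejected = False
except ValueError:
    rejected = True
```

Under these assumptions, only the URI built from the multibyte model name is rejected, which matches the behaviour reported in this issue.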
In projects containing models with names like the following, dataset creation fails, and an error occurs during execution.

```txt
└── dbt
    └── my_project
        └── models
            ├── 日本語名モデル.sql
            └── 日本語名モデル.yml
```

```
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/datasets/__init__.py", line 78, in _sanitize_uri
    raise ValueError("Dataset URI must only consist of ASCII characters")
ValueError: Dataset URI must only consist of ASCII characters
```

To support model names with multibyte characters, it might be good to URL encode the names.

closes: #1197

Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>
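The URL-encoding idea suggested above could look roughly like the following sketch, which percent-encodes the dataset name with the standard library's `urllib.parse.quote` before joining it to the namespace. The helper name `encode_dataset_uri` and the example namespace are assumptions made for illustration; this is not necessarily how the fix was implemented in Cosmos.

```python
from urllib.parse import quote, unquote


def encode_dataset_uri(namespace: str, name: str) -> str:
    """Hypothetical helper: percent-encode the name so the URI is ASCII-only."""
    # quote() encodes non-ASCII characters as UTF-8 percent escapes and
    # leaves letters, digits, "_", ".", "-", "~" and "/" untouched by default.
    return namespace + "/" + quote(name)


# Hypothetical namespace and model name, for illustration only.
uri = encode_dataset_uri("postgres://host:5432", "db.schema.日本語名モデル")

# The encoded URI contains only ASCII characters.
is_ascii = uri.isascii()

# The original name can still be recovered by decoding.
decoded = unquote(uri)
```

Because percent-encoding is reversible, the original model name remains recoverable from the URI, at the cost of a less readable dataset identifier.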
Astronomer Cosmos Version
main (development)
If "Other Astronomer Cosmos version" selected, which one?
No response
dbt-core version
1.8
Versions of dbt adapters
No response
LoadMode
AUTOMATIC
ExecutionMode
VIRTUALENV
InvocationMode
None
airflow version
2.10.0
Operating System
Debian GNU/Linux 12 (official airflow docker image)
If you think it's a UI issue, what browsers are you seeing the problem on?
No response
Deployment
Docker-Compose
Deployment details
No response
What happened?
In projects containing models with names like the following, dataset creation fails, and an error occurs during execution.
The error message is as follows:
Relevant log output
No response
How to reproduce
Add a model whose name contains multibyte characters.
e.g., 日本語名モデル.sql
Anything else :)?
No response
Are you willing to submit a PR?
Contact Details
No response