-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bigquery localisation #62
Conversation
3ebd7c7
to
ef87098
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remember to make lint
before you push the code
dlt/pipeline/typing.py
Outdated
@@ -48,20 +49,19 @@ def default_dataset(self, new_value: str) -> None: | |||
self.DEFAULT_DATASET = new_value | |||
|
|||
@classmethod | |||
def from_services_dict(cls, services: StrAny, dataset_prefix: str) -> "GCPPipelineCredentials": | |||
def from_services_dict(cls, services: StrAny, dataset_prefix: str, location: str) -> "GCPPipelineCredentials": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a default for location: str = "US"
to make it backward compatible with all our examples. this code will be removed in v2 anyway so we can take some shortcuts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see the linter output
./check-package.sh
poetry run mypy --config-file mypy.ini dlt examples
examples/sync_schema_example.py:4: error: Missing positional argument "location" in call to "from_services_file" of "GCPPipelineCredentials"
examples/singer_tap_jsonl_example.py:12: error: Missing positional argument "location" in call to "from_services_file" of "GCPPipelineCredentials"
examples/singer_tap_example.py:10: error: Missing positional argument "location" in call to "from_services_file" of "GCPPipelineCredentials"
examples/quickstart.py:25: error: Missing positional argument "location" in call to "from_services_dict" of "GCPPipelineCredentials"
examples/google_sheets.py:9: error: Missing positional argument "location" in call to "from_services_file" of "GCPPipelineCredentials"
examples/google_drive_csv.py:6[8](https://github.com/scale-vector/dlt/actions/runs/3052952945/jobs/4922951182#step:8:9): error: Missing positional argument "location" in call to "from_services_file" of "GCPPipelineCredentials"
examples/demo_example.py:84: error: Missing positional argument "location" in call to "from_services_file" of "GCPPipelineCredentials"
examples/rasa_example.py:3[9](https://github.com/scale-vector/dlt/actions/runs/3052952945/jobs/4922951182#step:8:10): error: Missing positional argument "location" in call to "from_services_dict" of "GCPPipelineCredentials"
Found 8 errors in 8 files (checked [11](https://github.com/scale-vector/dlt/actions/runs/3052952945/jobs/4922951182#step:8:12)4 source files)
make: *** [Makefile:41: lint] Error 1
Error: Process completed with exit code 2.
@@ -62,7 +62,7 @@ def open_connection(self) -> None: | |||
credentials = None | |||
else: | |||
credentials = service_account.Credentials.from_service_account_info(self.C.as_credentials()) | |||
self._client = bigquery.Client(self.C.PROJECT_ID, credentials=credentials) | |||
self._client = bigquery.Client(self.C.PROJECT_ID, credentials=credentials, location=self.C.LOCATION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you do an experiment?
line: 204
return BigQueryLoadJob(
JobClientBase.get_file_name_from_file_path(file_path),
self._create_load_job(table["name"], table["write_disposition"], file_path),
self.CREDENTIALS
)
replace self.CREDENTIALS with self.sql_client.native_connection()
and see if all works. then we are sure the same client contexts used in other places is passed to the load job
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are 2 places, start_file_load, and restore_file_load, should I just do the restore or both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I retract this :) all jobs are already using the client when created so they will be in the right location
dlt/load/bigquery/client.py
Outdated
@@ -203,7 +203,8 @@ def restore_file_load(self, file_path: str) -> LoadJob: | |||
return BigQueryLoadJob( | |||
JobClientBase.get_file_name_from_file_path(file_path), | |||
self._retrieve_load_job(file_path), | |||
self.CREDENTIALS | |||
#self.CREDENTIALS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I retract sorry :/
"sender_id":'90238094809sajlkjxoiewjhduuiuehd', | ||
"timestamp": str(pendulum.now()) | ||
} | ||
#job = expect_load_file(client, file_storage, json.dumps(load_json), user_table_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the problem with the test? let's set up the testing so you can run tests locally
1987430
to
bd92de8
Compare
No description provided.