-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fix] Handle non-JSON errors gracefully #741
Conversation
status_code = response.status_code | ||
is_http_unauthorized_or_forbidden = status_code in (401, 403) | ||
is_too_many_requests_or_unavailable = status_code in (429, 503) | ||
if is_http_unauthorized_or_forbidden: | ||
error.message = self._cfg.wrap_debug_info(error.message) | ||
if is_too_many_requests_or_unavailable: | ||
error.retry_after_secs = self._parse_retry_after(response) | ||
raise error from None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These used to be added in _make_nicer_error
. Now, _get_api_error
is basically identical to GetApiError in the Go SDK. This divergent codepath remains thus in the ApiClient to prevent any breaking changes.
@staticmethod | ||
def _make_sense_from_html(txt: str) -> str: | ||
matchers = [r'<pre>(.*)</pre>', r'<title>(.*)</title>'] | ||
for attempt in matchers: | ||
expr = re.compile(attempt, re.MULTILINE) | ||
match = expr.search(txt) | ||
if not match: | ||
continue | ||
return match.group(1).strip() | ||
return txt | ||
|
||
def _make_nicer_error(self, *, response: requests.Response, **kwargs) -> DatabricksError: | ||
status_code = response.status_code | ||
message = kwargs.get('message', 'request failed') | ||
is_http_unauthorized_or_forbidden = status_code in (401, 403) | ||
is_too_many_requests_or_unavailable = status_code in (429, 503) | ||
if is_http_unauthorized_or_forbidden: | ||
message = self._cfg.wrap_debug_info(message) | ||
if is_too_many_requests_or_unavailable: | ||
kwargs['retry_after_secs'] = self._parse_retry_after(response) | ||
kwargs['message'] = message | ||
return error_mapper(response, kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are moved to databricks/sdk/error/parser.py
, albeit in different form.
if not logger.isEnabledFor(logging.DEBUG): | ||
return | ||
request = response.request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation and remaining functions are moved to databricks/sdk/logger/round_trip_logger.py
.
to_log = self._only_n_bytes(body, self._debug_truncate_bytes) | ||
log_lines = [prefix + x.strip('\r') for x in to_log.split("\n")] | ||
return '\n'.join(log_lines) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only change made in this implementation, so that we can see the actual, truncated response when it is not JSON.
if _is_private_link_redirect(response): | ||
return _get_private_link_validation_error(response.url) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to _get_api_error
.
# Handle API 1.2-style errors | ||
if 'error' in resp: | ||
error_args['message'] = resp['error'] | ||
|
||
# Handle SCIM Errors | ||
detail = resp.get('detail') | ||
status = resp.get('status') | ||
scim_type = resp.get('scimType') | ||
if detail: | ||
# Handle SCIM error message details | ||
# @see https://tools.ietf.org/html/rfc7644#section-3.7.3 | ||
error_args[ | ||
'message'] = f"{scim_type} {error_args.get('message', 'SCIM API Internal Error')}".strip(" ") | ||
error_args['error_code'] = f"SCIM_{status}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is moved here from the DatabricksError constructor.
return None | ||
|
||
error_args = { | ||
'message': resp.get('message', 'request failed'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is taken from _make_nicer_error()
.
__HTML_ERROR_REGEXES = [re.compile(r'<pre>(.*)</pre>'), re.compile(r'<title>(.*)</title>'), ] | ||
|
||
def parse_error(self, response: requests.Response, response_body: bytes) -> Optional[dict]: | ||
payload_str = response_body.decode('utf-8') | ||
for regex in self.__HTML_ERROR_REGEXES: | ||
match = regex.search(payload_str) | ||
if match: | ||
message = match.group(1) if match.group(1) else response.reason | ||
return { | ||
'status': response.status_code, | ||
'message': message, | ||
'error_code': response.reason.upper().replace(' ', '_') | ||
} | ||
logging.debug('_HtmlErrorParser: no <pre> tag found in error response') | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is meant to replicate _make_sense_from_html
.
|
||
@pytest.mark.parametrize( | ||
'response, expected_error, expected_message', subclass_test_cases + | ||
[(fake_response('GET', 400, ''), errors.BadRequest, 'Bad Request'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a new case
tests/test_errors.py
Outdated
(fake_response('GET', 400, 'MALFORMED_REQUEST: vpc_endpoints malformed parameters: VPC Endpoint ... with use_case ... cannot be attached in ... list'), errors.BadRequest, 'vpc_endpoints malformed parameters: VPC Endpoint ... with use_case ... cannot be attached in ... list'), | ||
(fake_response('GET', 400, '<pre>Worker environment not ready</pre>'), errors.BadRequest, 'Worker environment not ready'), | ||
(fake_response('GET', 400, 'this is not a real response'), errors.BadRequest, | ||
('unable to parse response. This is likely a bug in the Databricks SDK for Python or the underlying API. ' | ||
'Please report this issue with the following debugging information to the SDK issue tracker at ' | ||
'https://github.com/databricks/databricks-sdk-go/issues. Request log:```GET /api/2.0/service\n' | ||
'< 400 Bad Request\n' | ||
'< this is not a real response```')), ]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are new cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, left a couple of comments.
databricks/sdk/core.py
Outdated
@@ -12,8 +10,8 @@ | |||
from .config import * | |||
# To preserve backwards compatibility (as these definitions were previously in this module) | |||
from .credentials_provider import * | |||
from .errors import DatabricksError, error_mapper | |||
from .errors.private_link import _is_private_link_redirect | |||
from .errors import DatabricksError, _get_api_error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we remove the _
to indicate that the function is not internal to the errors
module?
See Pep 8:
_single_leading_underscore
: weak “internal use” indicator. E.g.from M import *
does not import objects whose names start with an underscore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
What do you think is a good way to indicate that the function is internal to the SDK but not to the errors
module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I don't have a definite answer here. Without changing the code structure, I would simply make that explicit in the function's doc. Otherwise, I would try to organize the core
package as a single "umbrella" interface to internal modules that start with a _
(e.g. core/_errors.py
). I've also seen people explicitly put private modules in a private
directory (e.g. core/private/errors.py
). I'll do some research to see if this is covered by our internal style guide.
### Bug Fixes * Handle non-JSON errors gracefully ([#741](#741)). ### Documentation * Add Data Plane access documentation ([#732](#732)). ### Internal Changes * Fix test_iam::test_scim_error_unmarshall integration test ([#743](#743)). ### API Changes: * Added `regenerate_dashboard()` method for [w.quality_monitors](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/quality_monitors.html) workspace-level service. * Added `databricks.sdk.service.catalog.RegenerateDashboardRequest` and `databricks.sdk.service.catalog.RegenerateDashboardResponse` dataclasses. * Added `databricks.sdk.service.jobs.QueueDetails`, `databricks.sdk.service.jobs.QueueDetailsCodeCode`, `databricks.sdk.service.jobs.RunLifecycleStateV2State`, `databricks.sdk.service.jobs.RunStatus`, `databricks.sdk.service.jobs.TerminationCodeCode`, `databricks.sdk.service.jobs.TerminationDetails` and `databricks.sdk.service.jobs.TerminationTypeType` dataclasses. * Added `status` field for `databricks.sdk.service.jobs.BaseRun`. * Added `status` field for `databricks.sdk.service.jobs.RepairHistoryItem`. * Added `status` field for `databricks.sdk.service.jobs.Run`. * Added `status` field for `databricks.sdk.service.jobs.RunTask`. * Added `max_provisioned_throughput` and `min_provisioned_throughput` fields for `databricks.sdk.service.serving.ServedModelInput`. * Added `columns_to_sync` field for `databricks.sdk.service.vectorsearch.DeltaSyncVectorIndexSpecRequest`. * Changed `workload_size` field for `databricks.sdk.service.serving.ServedModelInput` to no longer be required. OpenAPI SHA: d05898328669a3f8ab0c2ecee37db2673d3ea3f7, Date: 2024-09-04
### Bug Fixes * Handle non-JSON errors gracefully ([#741](#741)). ### Documentation * Add Data Plane access documentation ([#732](#732)). ### Internal Changes * Fix test_iam::test_scim_error_unmarshall integration test ([#743](#743)). ### API Changes: * Added `regenerate_dashboard()` method for [w.quality_monitors](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/quality_monitors.html) workspace-level service. * Added `databricks.sdk.service.catalog.RegenerateDashboardRequest` and `databricks.sdk.service.catalog.RegenerateDashboardResponse` dataclasses. * Added `databricks.sdk.service.jobs.QueueDetails`, `databricks.sdk.service.jobs.QueueDetailsCodeCode`, `databricks.sdk.service.jobs.RunLifecycleStateV2State`, `databricks.sdk.service.jobs.RunStatus`, `databricks.sdk.service.jobs.TerminationCodeCode`, `databricks.sdk.service.jobs.TerminationDetails` and `databricks.sdk.service.jobs.TerminationTypeType` dataclasses. * Added `status` field for `databricks.sdk.service.jobs.BaseRun`. * Added `status` field for `databricks.sdk.service.jobs.RepairHistoryItem`. * Added `status` field for `databricks.sdk.service.jobs.Run`. * Added `status` field for `databricks.sdk.service.jobs.RunTask`. * Added `max_provisioned_throughput` and `min_provisioned_throughput` fields for `databricks.sdk.service.serving.ServedModelInput`. * Added `columns_to_sync` field for `databricks.sdk.service.vectorsearch.DeltaSyncVectorIndexSpecRequest`. * Changed `workload_size` field for `databricks.sdk.service.serving.ServedModelInput` to no longer be required. OpenAPI SHA: d05898328669a3f8ab0c2ecee37db2673d3ea3f7, Date: 2024-09-04
## Changes #741 introduced a regression in retrieving error details from SCIM APIs. This PR addresses this and adds a regression test for this case. The implementation should now match the Go SDK's here: https://github.com/databricks/databricks-sdk-go/blob/main/apierr/errors.go#L220-L224. Fixes #749. ## Tests Added a unit test based on the supplied response in the ticket. - [ ] `make test` run locally - [ ] `make fmt` applied - [ ] relevant integration tests applied
## Changes #741 introduced a change to how an error message was modified in `ApiClient._perform`. Previously, arguments to the DatabricksError constructor were modified as a dictionary in `_perform`. After that change, `get_api_error` started to return a `DatabricksError` instance whose attributes were modified. The `message` attribute referred to in that change does not exist in the DatabricksError class: there is a `message` constructor parameter, but it is not set as an attribute. This PR refactors the error handling logic slightly to restore the original behavior. In doing this, we decouple all error-parsing and customizing logic out of ApiClient. This also sets us up to allow for further extension of error parsing and customization in the future, a feature that I have seen present in other SDKs. Fixes #755. ## Tests <!-- How is this tested? Please see the checklist below and also describe any other relevant tests --> - [ ] `make test` run locally - [ ] `make fmt` applied - [ ] relevant integration tests applied
## Changes databricks#741 introduced a change to how an error message was modified in `ApiClient._perform`. Previously, arguments to the DatabricksError constructor were modified as a dictionary in `_perform`. After that change, `get_api_error` started to return a `DatabricksError` instance whose attributes were modified. The `message` attribute referred to in that change does not exist in the DatabricksError class: there is a `message` constructor parameter, but it is not set as an attribute. This PR refactors the error handling logic slightly to restore the original behavior. In doing this, we decouple all error-parsing and customizing logic out of ApiClient. This also sets us up to allow for further extension of error parsing and customization in the future, a feature that I have seen present in other SDKs. Fixes databricks#755. ## Tests <!-- How is this tested? Please see the checklist below and also describe any other relevant tests --> - [ ] `make test` run locally - [ ] `make fmt` applied - [ ] relevant integration tests applied
Changes
Some errors returned by the platform are not serialized using JSON (see databricks/databricks-sdk-go#998 for an example). They are instead serialized in the form "<ERROR_CODE>: ". Today, the SDK cannot parse these error messages well, resulting in a poor user experience.
This PR adds support for parsing these error messages from the platform to the SDK. This should reduce bug reports for the SDK with respect to unexpected response parsing. This PR also refactors the error deserialization logic somewhat to make it more extensible in the future for other potential error formats that are not currently handled.
As a side-effect of this change, I've refactored the structure of the error handling in the Python SDK to more closely reflect how errors are handled in the Go SDK. This should make maintenance more straightforward in the future. It also introduces a new error message to the Python SDK to refer users to our issue tracker when the SDK receives an error response that it cannot parse, like what we do in the Go SDK.
Ports databricks/databricks-sdk-go#1031 to the Python SDK.
Deprecations
This PR deprecates several fields in the constructor for DatabricksError. Going forward, SCIM-specific and API 1.2-specific parameters should not be specified in the constructor; instead, they will be handled in error parsers.
Breaking Changes
The introduction of a different message for non-JSON responses may be a breaking change if users matched on the message structure used before.
Tests
Existing tests still pass, adding tests before merging this.
make test
run locallymake fmt
applied