Service Unavailable (503) gRPC error from datastore occurs every 30 minutes #2896

Closed
quom opened this issue Dec 21, 2016 · 11 comments
Labels: api: datastore (Issues related to the Datastore API), grpc

@quom
Contributor

quom commented Dec 21, 2016

  1. OS type and version
    Running on docker centos image on GCE

  2. Python version and virtual environment information python --version
    Python 3.5.2

  3. google-cloud-python version pip show google-cloud, pip show google-<service> or pip freeze
    google-cloud==0.21.1

  4. Stacktrace if available

INFO:werkzeug:10.36.0.1 - - [21/Dec/2016 06:24:20] "GET /endpoint HTTP/1.1" 500 -
INFO:werkzeug:10.132.0.6 - - [21/Dec/2016 06:54:21] "GET /endpoint HTTP/1.1" 500 -
INFO:werkzeug:10.132.0.6 - - [21/Dec/2016 07:24:22] "GET /endpoint HTTP/1.1" 500 -
INFO:werkzeug:10.36.0.1 - - [21/Dec/2016 07:54:23] "GET /endpoint HTTP/1.1" 500 -
INFO:werkzeug:10.36.0.1 - - [21/Dec/2016 08:24:25] "GET /endpoint HTTP/1.1" 500 -
INFO:werkzeug:10.132.0.6 - - [21/Dec/2016 08:54:27] "GET /endpoint HTTP/1.1" 500 -
INFO:werkzeug:10.132.0.6 - - [21/Dec/2016 09:24:28] "GET /endpoint HTTP/1.1" 500 -
INFO:werkzeug:10.132.0.6 - - [21/Dec/2016 09:54:29] "GET /endpoint HTTP/1.1" 500 -
INFO:werkzeug:10.36.0.1 - - [21/Dec/2016 10:24:31] "GET /endpoint HTTP/1.1" 500 -
INFO:werkzeug:10.36.0.1 - - [21/Dec/2016 10:54:33] "GET /endpoint HTTP/1.1" 500 -
INFO:werkzeug:10.132.0.6 - - [21/Dec/2016 11:24:34] "GET /endpoint HTTP/1.1" 500 -

Stacktrace from one of the errors:

Traceback (most recent call last):
  File "/opt/python3/lib/python3.5/site-packages/google/cloud/datastore/_http.py", line 253, in _grpc_catch_rendezvous
    yield
  File "/opt/python3/lib/python3.5/site-packages/google/cloud/datastore/_http.py", line 321, in run_query
    return self._stub.RunQuery(request_pb)
  File "/opt/python3/lib/python3.5/site-packages/grpc/_channel.py", line 481, in __call__
    return _end_unary_response_blocking(state, False, deadline)
  File "/opt/python3/lib/python3.5/site-packages/grpc/_channel.py", line 432, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, {"created":"@1482319474.777747614","description":"Secure read failed","file":"src/core/lib/security/transport/secure_endpoint.c","file_line":157,"grpc_status":14,"referenced_errors":[{"created":"@1482319474.777714092","description":"EOF","file":"src/core/lib/iomgr/tcp_posix.c","file_line":235}]})>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/code/src/main.py", line 18, in getIPs
    black_ips = get_blacklisted_ip_addresses()
  File "/code/src/service.py", line 11, in get_blacklisted_ip_addresses
    return self.datastore_client.get_all_keys_of_kind(ENTITY_KIND_BLACKLIST)
  File "/code/src/datastore_client.py", line 15, in get_all_keys_of_kind
    for entity in query_iter:
  File "/opt/python3/lib/python3.5/site-packages/google/cloud/iterator.py", line 210, in _items_iter
    for page in self._page_iter(increment=False):
  File "/opt/python3/lib/python3.5/site-packages/google/cloud/iterator.py", line 239, in _page_iter
    page = self._next_page()
  File "/opt/python3/lib/python3.5/site-packages/google/cloud/datastore/query.py", line 499, in _next_page
    transaction_id=transaction and transaction.id,
  File "/opt/python3/lib/python3.5/site-packages/google/cloud/datastore/_http.py", line 574, in run_query
    response = self._datastore_api.run_query(project, request)
  File "/opt/python3/lib/python3.5/site-packages/google/cloud/datastore/_http.py", line 321, in run_query
    return self._stub.RunQuery(request_pb)
  File "/opt/python3/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/opt/python3/lib/python3.5/site-packages/google/cloud/datastore/_http.py", line 260, in _grpc_catch_rendezvous
    raise error_class(exc.details())
google.cloud.exceptions.ServiceUnavailable: 503 {"created":"@1482319474.777747614","description":"Secure read failed","file":"src/core/lib/security/transport/secure_endpoint.c","file_line":157,"grpc_status":14,"referenced_errors":[{"created":"@1482319474.777714092","description":"EOF","file":"src/core/lib/iomgr/tcp_posix.c","file_line":235}]}

  5. Steps to reproduce
    Occurs regularly every 30 mins.

  6. Code example

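from google.cloud import datastore

self.client = datastore.Client(project=project)
query = self.client.query(kind=kind)
# fetch() returns a lazy iterator; the RPC is issued when it is iterated.
query_iter = query.fetch()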
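
The 30-minute cadence can be checked directly from the werkzeug timestamps in the log above. A minimal sketch using only the standard library; the `timestamps` list is copied from those log lines, and the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the werkzeug log lines in this report.
timestamps = [
    "21/Dec/2016 06:24:20",
    "21/Dec/2016 06:54:21",
    "21/Dec/2016 07:24:22",
    "21/Dec/2016 07:54:23",
    "21/Dec/2016 08:24:25",
    "21/Dec/2016 08:54:27",
    "21/Dec/2016 09:24:28",
    "21/Dec/2016 09:54:29",
    "21/Dec/2016 10:24:31",
    "21/Dec/2016 10:54:33",
    "21/Dec/2016 11:24:34",
]

parsed = [datetime.strptime(t, "%d/%b/%Y %H:%M:%S") for t in timestamps]

# Seconds elapsed between each pair of consecutive failures.
gaps = [int((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]

print(gaps)  # [1801, 1801, 1801, 1802, 1802, 1801, 1801, 1802, 1802, 1801]
```

Every gap is 1801 or 1802 seconds, so there is one failure per 30-minute interval. That is consistent with something expiring on a fixed schedule, although the log alone does not say what.
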
@daspecster daspecster added api: datastore Issues related to the Datastore API. grpc labels Dec 21, 2016
@daspecster
Contributor

@quom thanks for the report!

There's a new auth module that is used in google-cloud at the latest version (0.22.0) that may help with this.

Could you update google-cloud and let us know if the issue persists?

@quom
Contributor Author

quom commented Dec 21, 2016

Thanks - I've updated the library and so far it looks to have worked.

@quom quom closed this as completed Dec 21, 2016
@quom quom reopened this Dec 22, 2016
@quom
Contributor Author

quom commented Dec 22, 2016

Reopened: after a few hours of appearing stable, the exact same issue has recurred.

@daspecster
Contributor

Sorry to hear that @quom.
Is this a long-lived process that uses query_iter?
If this happens, you may have to create a new client, or you could try refreshing the credentials.

@dhermes or @jonparrott, I'm guessing the token is expiring? If so, what's the right way to refresh the token? (client.credentials.refresh()?)

I don't see any issues with Bigtable on https://status.cloud.google.com.
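
The "create a new client" suggestion could be sketched as a small wrapper that rebuilds the client once when a call fails. This is an illustration only: `ReconnectingClient` and its `factory` argument are hypothetical names, and the `ServiceUnavailable` class below is a stand-in for `google.cloud.exceptions.ServiceUnavailable` so the sketch runs without the library installed.

```python
class ServiceUnavailable(Exception):
    """Stand-in for google.cloud.exceptions.ServiceUnavailable (assumption)."""


class ReconnectingClient:
    """Holds a client built by `factory` and rebuilds it once when a call fails."""

    def __init__(self, factory, retryable=(ServiceUnavailable,)):
        self._factory = factory
        self._retryable = retryable
        self._client = factory()

    def call(self, operation):
        """Run operation(client); on a retryable error, rebuild and try once more."""
        try:
            return operation(self._client)
        except self._retryable:
            # Discard the possibly stale client and retry on a fresh one.
            self._client = self._factory()
            return operation(self._client)
```

With the real library, the factory would be something like `lambda: datastore.Client(project=project)` and the operation `lambda client: list(client.query(kind=kind).fetch())`. A second failure is not caught and propagates to the caller.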

@daspecster
Contributor

@quom if you immediately retry the action, does it work the second time?

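from google.cloud import datastore
from google.cloud.exceptions import ServiceUnavailable

self.client = datastore.Client(project=project)
query = self.client.query(kind=kind)
try:
    # fetch() is lazy; the RPC runs during iteration, so consume it inside the try.
    entities = list(query.fetch())
except ServiceUnavailable:
    # Retry once, immediately.
    entities = list(query.fetch())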

@sadovnychyi

@daspecster retrying immediately worked for me.

I got this error while trying to update an existing entity.

>>> entity['test'] = 0
>>> 
>>> client.put(entity)
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/google/cloud/datastore/_http.py", line 253, in _grpc_catch_rendezvous
    yield
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/google/cloud/datastore/_http.py", line 356, in commit
    return self._stub.Commit(request_pb)
  File "/usr/local/lib/python3.5/site-packages/grpc/_channel.py", line 481, in __call__
    return _end_unary_response_blocking(state, False, deadline)
  File "/usr/local/lib/python3.5/site-packages/grpc/_channel.py", line 432, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, {"created":"@1482491988.272167000","description":"Secure read failed","file":"src/core/lib/security/transport/secure_endpoint.c","file_line":157,"grpc_status":14,"referenced_errors":[{"created":"@1482491988.272010000","description":"EOF","file":"src/core/lib/iomgr/tcp_posix.c","file_line":235}]})>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/google/cloud/datastore/client.py", line 335, in put
    self.put_multi(entities=[entity])
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/google/cloud/datastore/client.py", line 362, in put_multi
    current.commit()
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/google/cloud/datastore/batch.py", line 265, in commit
    self._commit()
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/google/cloud/datastore/batch.py", line 242, in _commit
    self.project, self._commit_request, self._id)
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/google/cloud/datastore/_http.py", line 628, in commit
    response = self._datastore_api.commit(project, request)
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/google/cloud/datastore/_http.py", line 356, in commit
    return self._stub.Commit(request_pb)
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/google/cloud/datastore/_http.py", line 260, in _grpc_catch_rendezvous
    raise error_class(exc.details())
google.cloud.exceptions.ServiceUnavailable: 503 {"created":"@1482491988.272167000","description":"Secure read failed","file":"src/core/lib/security/transport/secure_endpoint.c","file_line":157,"grpc_status":14,"referenced_errors":[{"created":"@1482491988.272010000","description":"EOF","file":"src/core/lib/iomgr/tcp_posix.c","file_line":235}]}

@quom
Contributor Author

quom commented Jan 3, 2017

@daspecster Yep, it does work if you immediately retry, but this logic should ideally be in the client library.

@daspecster
Contributor

@quom, I agree. I added this issue to #2694 as another example. I'm not sure when retry logic will get added but there is support to do so.

I'm glad retrying is working for you. It sounds like adding retry logic is all that's left for this issue and since we're tracking that in #2694, is it ok if we close this issue?

@quom quom closed this as completed Jan 3, 2017
@ghost

ghost commented Jan 17, 2017

Same error seen here, using the library in a django app via mod_wsgi and apache2.

End of the trace I get is here:

  File "/usr/lib/python2.7/site-packages/google/cloud/datastore/client.py", line 335, in put
    self.put_multi(entities=[entity])
  File "/usr/lib/python2.7/site-packages/google/cloud/datastore/client.py", line 362, in put_multi
    current.commit()
  File "/usr/lib/python2.7/site-packages/google/cloud/datastore/batch.py", line 265, in commit
    self._commit()
  File "/usr/lib/python2.7/site-packages/google/cloud/datastore/batch.py", line 242, in _commit
    self.project, self._commit_request, self._id)
  File "/usr/lib/python2.7/site-packages/google/cloud/datastore/_http.py", line 627, in commit
    response = self._datastore_api.commit(project, request)
  File "/usr/lib/python2.7/site-packages/google/cloud/datastore/_http.py", line 356, in commit
    return self._stub.Commit(request_pb)
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/lib/python2.7/site-packages/google/cloud/datastore/_http.py", line 260, in _grpc_catch_rendezvous
    raise error_class(exc.details())
ServiceUnavailable: 503 {"created":"@1484640492.298628143","description":"Secure read failed","file":"src/core/lib/security/transport/secure_endpoint.c","file_line":157,"grpc_status":14,"referenced_errors":[{"created":"@xxxxx.xxx","description":"EOF","file":"src/core/lib/iomgr/tcp_posix.c","file_line":235}]}

@mbyio

mbyio commented Jan 19, 2017

I'm seeing the error as well - though I imagine there's nothing this library can do about it other than add retry logic.

@hir3npatel you can handle this by retrying with exponential backoff yourself since the library isn't doing it for us.
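
A minimal sketch of retrying with exponential backoff, for anyone who wants a starting point. The helper name `retry_with_backoff`, its parameters, and the default delays are all assumptions chosen for illustration, and `ServiceUnavailable` here is a stand-in for `google.cloud.exceptions.ServiceUnavailable` so the snippet runs on its own.

```python
import random
import time


class ServiceUnavailable(Exception):
    """Stand-in for google.cloud.exceptions.ServiceUnavailable (assumption)."""


def retry_with_backoff(func, retryable=(ServiceUnavailable,), max_attempts=5,
                       base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call func() until it succeeds, backing off between retryable failures."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The cap doubles each attempt (up to max_delay); the actual
            # sleep is drawn uniformly from [0, cap] to add jitter.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

Usage would look like `retry_with_backoff(lambda: client.put(entity))`. Retrying this way is only safe for operations that can be repeated without side effects, so writes deserve a closer look than reads.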

cobookman added a commit to cobookman/python-docs-samples that referenced this issue Mar 10, 2017
Generating a datastore grpc client for each request is not required and can lead to auth issues.
Here is one of the auth issues in question: googleapis/google-cloud-python#3085

Also note the following issue which is causing the client to need resetting every 30 mins: googleapis/google-cloud-python#2896. This will hopefully be fixed soon.
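
One way to share a single client across requests while still resetting it periodically is sketched below. It is not taken from the referenced commit: the `SharedClient` name, the `factory` callable, and the 25-minute default (picked only to stay under the 30-minute interval reported here) are assumptions.

```python
import threading
import time


class SharedClient:
    """Builds one client via `factory`, shares it, and rebuilds it after `max_age` seconds."""

    def __init__(self, factory, max_age=25 * 60, clock=time.monotonic):
        self._factory = factory
        self._max_age = max_age
        self._clock = clock
        self._lock = threading.Lock()
        self._client = None
        self._created_at = None

    def get(self):
        """Return the shared client, rebuilding it first if it is missing or too old."""
        with self._lock:
            now = self._clock()
            expired = (self._client is None
                       or now - self._created_at >= self._max_age)
            if expired:
                self._client = self._factory()
                self._created_at = now
            return self._client
```

Request handlers would call `shared.get()` instead of constructing a client each time, with `factory` being something like `lambda: datastore.Client(project=project)`. The lock keeps concurrent threads from rebuilding the client at the same moment.
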
@vsoch

vsoch commented Aug 27, 2017

We are seeing this error quite a bit too, especially since increasing the number of threads the worker uses to interact with Datastore:

[image not preserved]

I'm using exponential backoff, and in this particular case I had overlooked a simple "get" to retrieve one entity that resulted in the exception:

[image not preserved]

I added an additional retry to this function, and I hope this resolves the issue! But +1 from me that it would be great if this retry could be built into the client.


5 participants