This repository has been archived by the owner on Feb 16, 2023. It is now read-only.

More robust error handling #151

Closed
jvmncs opened this issue Mar 18, 2018 · 1 comment

jvmncs commented Mar 18, 2018

Clients need to be notified of errors on worker nodes. This isn't specific to the TorchService, but it's likely to happen there often, since there's no way to robustly prevent incorrect Torch code from being sent to a worker. We need a way of notifying the Client when they send a bad command, likely by sending a return-to-sender message that contains either a Grid-specific error message (e.g. 'command' isn't a torch command, 'obj' isn't a torch object, etc.) or a normal Python error message from a stack trace (e.g. RuntimeError: cannot call .data on a torch.Tensor: did you intend to use autograd.Variable?).
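To make the two kinds of error message concrete, here is a rough sketch of what a return-to-sender payload might look like. Everything in it is hypothetical and not existing Grid code: `GridCommandError`, `build_error_message`, and the JSON field names are placeholders chosen for illustration.

```python
import json
import traceback


class GridCommandError(Exception):
    """Hypothetical Grid-specific error, e.g. an unknown torch command."""


def build_error_message(exc, sender_id):
    """Build a JSON payload describing `exc`, addressed back to `sender_id`.

    The field names ("type", "to", "kind", "detail") are assumptions.
    """
    if isinstance(exc, GridCommandError):
        # Grid-specific problem: pass the message through unchanged.
        kind, detail = "grid", str(exc)
    else:
        # Ordinary Python error: reuse the final line of a stack trace,
        # e.g. "RuntimeError: ...".
        kind = "python"
        lines = traceback.format_exception_only(type(exc), exc)
        detail = "".join(lines).strip()
    return json.dumps(
        {"type": "error", "to": sender_id, "kind": kind, "detail": detail}
    )
```

The `kind` field would let a client tell a rejected command apart from a failure inside Torch itself, which may or may not be worth the extra field.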

@jvmncs jvmncs added the torch label Mar 18, 2018

jvmncs commented Mar 18, 2018

It shouldn't be too hard to do something general with try/except in targeted places (like process_command for torch). When an exception is raised, the except block compiles the error into a message and sends that message back to the client. Ideally, the message would be identical to the error a stack trace would display in a normal Python process.
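A minimal sketch of that try/except idea, assuming a handler shaped like `process_command` and some callable that returns a message to the sender. `handle_command`, `send_to_client`, and the message fields are hypothetical names, not the actual Grid API.

```python
import traceback


def handle_command(process_command, command, send_to_client):
    """Run `process_command(command)`; on failure, report the error to the client.

    All three parameters are hypothetical stand-ins: the worker's handler,
    the incoming message, and the transport that replies to the sender.
    """
    try:
        return process_command(command)
    except Exception:
        # format_exc() renders the active exception in the usual
        # "Traceback (most recent call last): ..." form, starting at this frame.
        send_to_client({"type": "error", "traceback": traceback.format_exc()})
        return None


# Usage: a handler that fails the way bad Torch code might.
sent = []


def bad_command(command):
    raise RuntimeError("cannot call .data on a torch.Tensor")


result = handle_command(bad_command, {"command": "data"}, sent.append)
```

Catching `Exception` rather than a bare `except:` leaves `KeyboardInterrupt` and `SystemExit` alone, so the worker can still be shut down normally.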
