More robust error handling #151

jvmncs · 2018-03-18T02:31:10Z

Clients need to be notified of errors on worker nodes. This isn't specific to the TorchService, but it's likely to happen there very often, and there's no way to robustly prevent incorrect Torch code from being sent to a worker. We need a way of notifying the Client when they send a bad command, likely by sending a return-to-sender message that contains either a Grid-specific error message (e.g. 'command' isn't a torch command, 'obj' isn't a torch object, etc.) or a normal Python error message from a stack trace (e.g. RuntimeError: cannot call .data on a torch.Tensor: did you intend to use autograd.Variable?).
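As a rough illustration of the Grid-specific half of this, here is a minimal sketch of validating a command before running it and building a return-to-sender reply. Everything named here is hypothetical and not taken from the Grid codebase: the message fields (`command`, `obj`, `sender`, `id`), the `KNOWN_COMMANDS` whitelist, and both helper functions.

```python
import json

# Hypothetical whitelist of accepted commands; a real worker would
# presumably derive this from the torch API rather than hard-code it.
KNOWN_COMMANDS = {"add", "mul", "matmul"}


def validate_command(message, known_objects):
    """Return None if the command looks valid, else a Grid-specific error dict.

    `message` is assumed to be a dict with "command" and "obj" keys, and
    `known_objects` a collection of object ids the worker currently holds.
    """
    command = message.get("command")
    if command not in KNOWN_COMMANDS:
        return {"type": "grid_error",
                "error": f"'{command}' isn't a torch command"}
    obj_id = message.get("obj")
    if obj_id not in known_objects:
        return {"type": "grid_error",
                "error": f"'{obj_id}' isn't a torch object"}
    return None


def return_to_sender(message, error):
    """Serialize an error reply addressed to whoever sent `message`."""
    reply = {"to": message.get("sender"), "in_reply_to": message.get("id")}
    reply.update(error)
    return json.dumps(reply)
```

A worker could call `validate_command` first and only execute the command when it returns `None`; otherwise it would publish the string from `return_to_sender` back to the client.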

jvmncs · 2018-03-18T02:35:18Z

Shouldn't be too bad to do something general with try/except in targeted places (like process_command for torch). An exception triggers the except block, and there we compile the error from the exception into a message and then send that message back to the client. Ideally, the message would be identical to whatever error the stack trace would display for a normal Python process.
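A minimal sketch of that try/except idea, under stated assumptions: the `process_command` signature below is hypothetical (the real one in the torch service may differ), and `handler` and `send_to_client` are stand-ins for whatever executes the command and whatever publishes a message back to the client.

```python
import traceback


def process_command(command, handler, send_to_client):
    """Run `handler(command)` and report the outcome to the client.

    On failure, the reply carries the same text a normal Python process
    would print for the exception, taken from traceback.format_exc().
    """
    try:
        result = handler(command)
    except Exception as exc:
        # Compile the exception into a message instead of letting it
        # die silently on the worker.
        send_to_client({
            "status": "error",
            "error_type": type(exc).__name__,
            "error": str(exc),
            "traceback": traceback.format_exc(),
        })
        return None
    send_to_client({"status": "ok", "result": result})
    return result
```

Catching `Exception` rather than using a bare `except` keeps `KeyboardInterrupt` and `SystemExit` from being swallowed, so the worker can still be shut down normally.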

jvmncs added the torch label Mar 18, 2018

jvmncs mentioned this issue Mar 31, 2018

Distributed PyTorch on Grid (via IPFS/PubSub) #166

Merged

iamtrask closed this as completed Jul 22, 2018
