Exception handler python hook #2496
Comments
(happy to work on this patch ourselves, and apologies if there is a way to do this already that I missed) |
hey @nicku33 - I buy the use case, but I think we might want to opt for a different implementation than the one you're describing. Just an FYI you can run dbt with
When you do that, any exceptions will be present in the log lines. I think you should find an |
OK. I'll try that; JSON is easier to parse anyway. However, will that emit at the time of the exception or only at the end? Also, what different implementation were you thinking of? You don't like code injection, I take it? Slightly related (let me know if it's worth a ticket): we've found a need to define transient exceptions in our own script so we know when to retry (transaction conflicts, transient network issues; we're on Redshift now). I think others have mentioned the issue in #1630. These could be hard-coded per database, or possibly use SQLSTATE codes (https://en.wikipedia.org/wiki/SQLSTATE), but there is some judgement involved as well, depending on the nature of the job. In our own scripts we've implemented an error handler which passes back whether the error is believed to be transient and how often to retry. |
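A minimal sketch of the kind of handler described above: it classifies an exception as transient or not and says how often to retry. It is an illustration only, not the commenter's actual script and not part of dbt. The names `RetryDecision`, `classify`, and `run_with_retry` are hypothetical, and the choice of SQLSTATE classes `08` (connection exception) and `40` (transaction rollback) as "transient" is an assumption that would need tuning per job. It also assumes the driver exposes the SQLSTATE on a `pgcode` attribute, as psycopg2 does.

```python
import time
from dataclasses import dataclass
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: which SQLSTATE classes count as transient is a judgement call.
# "08" = connection exception, "40" = transaction rollback (e.g. 40001).
TRANSIENT_SQLSTATE_CLASSES = {"08", "40"}


@dataclass
class RetryDecision:
    transient: bool   # is the error believed to be transient?
    max_retries: int  # how many times to retry if it is


def classify(exc: Exception) -> RetryDecision:
    """Hypothetical classifier based on the first two SQLSTATE characters."""
    # psycopg2 errors carry the SQLSTATE in `pgcode`; other drivers may differ.
    code = getattr(exc, "pgcode", None)
    if code and code[:2] in TRANSIENT_SQLSTATE_CLASSES:
        return RetryDecision(transient=True, max_retries=3)
    return RetryDecision(transient=False, max_retries=0)


def run_with_retry(
    job: Callable[[], T],
    classify: Callable[[Exception], RetryDecision] = classify,
    sleep: Callable[[float], None] = time.sleep,
    base_delay: float = 1.0,
) -> T:
    """Run `job`, retrying with exponential backoff while errors look transient."""
    attempt = 0
    while True:
        try:
            return job()
        except Exception as exc:
            decision = classify(exc)
            if not decision.transient or attempt >= decision.max_retries:
                raise  # permanent error, or retries exhausted
            sleep(base_delay * 2 ** attempt)
            attempt += 1
```

Passing `classify` in as a parameter keeps the judgement about what is transient outside the retry loop, so different jobs can supply different rules.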
hey @nicku33 - these exceptions should be logged immediately when they occur, not at the end of the run! We don't really design for dbt's programmatic Python API, and while I do think we'll do a better job here in the future, we basically provide no guarantees or helpful interfaces to calling dbt from Python today. I do think that a Python hook for model completion (be it a success or error) would be a really natural part of this API when we do build it out though. See our current thinking on retrying failed runs over here #2465 |
Yes, you are right, of course. We switched to JSON output and are parsing STDOUT as the lines are emitted. A bit icky, and I look forward to the API hook, but much better than before. Regarding retry on transient issues, let me look closer at how dbt works, and if I have an idea I'll fork, try it, and open an issue if it works. I guess #2465 would allow one to rerun just the failed models until the exit code was zero? That might be fine, but if there's a more elegant way to retry earlier it would save some grief with dependent jobs. Thanks for your response, it helped us. Closing the ticket. |
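For readers wanting to do the same, here is a rough sketch of filtering error events out of a stream of JSON log lines as they arrive. It is not the commenter's code. The function name `iter_error_events` and the key name `level` are assumptions; the actual keys depend on what your dbt version emits, so check a sample line first.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_error_events(
    lines: Iterable[str],
    level_key: str = "level",  # assumption: adjust to the real log schema
    error_levels: Tuple[str, ...] = ("error", "critical"),
) -> Iterator[dict]:
    """Yield parsed JSON log events whose level marks them as errors.

    `lines` can be any iterable of text lines, including a subprocess's
    stdout, so events are yielded as soon as each line is read.
    """
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip any non-JSON output mixed into the stream
        if not isinstance(event, dict):
            continue
        if str(event.get(level_key, "")).lower() in error_levels:
            yield event
```

Feeding it the `stdout` of a `subprocess.Popen(..., stdout=subprocess.PIPE, text=True)` object lets each error be forwarded to an alerting system while the run is still in progress.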
Describe the feature
I'd like to be able to register a function to be called whenever an exception occurs, with details about the context it occurred in as well as the raw Exception instance. This is so that we can pass exceptions on to our alerting systems as soon as they happen, with the data structure of the actual exception. Rollbar, for example, uses the stack trace itself to fingerprint exceptions.
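A rough sketch of what such a registration hook might look like. None of these names exist in dbt; `register_exception_hook`, `notify_exception`, and `run_node` are hypothetical and only illustrate the shape of the request: the hook receives the raw exception instance (with its traceback intact) plus a context dict, at the moment the failure happens.

```python
import traceback
from typing import Any, Callable, Dict, List

# Hypothetical hook signature: (exception instance, context dict) -> None
ExceptionHook = Callable[[BaseException, Dict[str, Any]], None]

_hooks: List[ExceptionHook] = []


def register_exception_hook(hook: ExceptionHook) -> ExceptionHook:
    """Register a callable to be invoked whenever a node raises."""
    _hooks.append(hook)
    return hook  # returning the hook allows use as a decorator


def notify_exception(exc: BaseException, context: Dict[str, Any]) -> None:
    """Call every registered hook; a failing hook must not mask the error."""
    for hook in list(_hooks):
        try:
            hook(exc, context)
        except Exception:
            traceback.print_exc()


def run_node(name: str, fn: Callable[[], Any]) -> Any:
    """Stand-in for wherever the tool executes a unit of work."""
    try:
        return fn()
    except Exception as exc:
        # Hooks fire immediately, then the original exception propagates.
        notify_exception(exc, {"node": name})
        raise
```

An alerting integration would register a hook that forwards `exc` to its client library, which can then fingerprint on `exc.__traceback__`.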
Describe alternatives you've considered
We can parse job output, but that seems fragile. As well, with long dbt jobs we have to wait for the end to get alert output. Getting it as they happen would be better.
Additional context
Any database, really.
Who will this benefit?
Data engineers and ops who have pre-existing error processing and logging systems.