Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Particle webhooks should follow web standard http response codes #65

Closed
markterrill opened this issue Jan 10, 2020 · 6 comments
Closed

Comments

@markterrill
Copy link

Bugs

Particle webhooks should acknowledge standard http response codes (3xx, 4xx, 5xx).

Expected behavior

If you receive an error or redirection etc, then return the message to the user and DO NOT retry it!

Observed behavior

Currently webhooks are retried if they encounter an error. A webhook is then disabled if a number of errors are encountered after the retries expire.

Steps to reproduce

Put in a destination that doesn't exist and hammer it a few times from a few devices. It'll get disabled, but it'll retry first.

Commentary

For years we've been filtering webhooks through AWS HTTP Gateway so that no matter what the actual meaningful server response is, Particle webhooks only ever receives a 200 response code.

The system should adhere to web standards and pass back the code and message provided.

@monkbroc
Copy link
Member

Hi Mark,

The Particle webhook system is not a general purpose web request maker. It has a specific requirement that the receiving server must accept the event payload with a 2xx response. It retries twice to make sure that temporary server issues don't lead to lost data. This behavior is not going to change.

@markterrill
Copy link
Author

markterrill commented Jan 15, 2020

@monkbroc

How is it not a general purpose web request maker? What is it but that? You can configure it to point at any HTTP/S service you like.

  • It doesn't react to HTTP codes per convention and 99.9% likely the RFCs
  • It blindly hits the service again even if the receiving service said the data was invalid / incorrect / lacked permissions. Per industry standard the user should be able to handle a webhook failure in userland code.
  • It then disables the whole webhook if sufficient error codes are encountered by the webhook proxy. Which means if one device happened to be sending garbage then the entire fleet using that webhook will be ignored.

I thought I'd raise the issue as I presumed it was simply more code that had been rushed out the door and after a few years it was a good time to fix prior oversights.

@markterrill
Copy link
Author

Since it looks like it won't be fixed, I've submitted a PR for the docs.
particle-iot/docs#1064

@markterrill
Copy link
Author

markterrill commented Jan 15, 2020

As a quick enlightening exercise on HTTP codes and how web services are meant to behave, here are the RFC's. https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

Skipping down to the 400 codes you get to the good stuff:

10.4.1 400 Bad Request
The request could not be understood by the server due to malformed syntax. 10.4.1 400 Bad Request
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.

.....

10.4.4 403 Forbidden
The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.

It's a really interesting 'engineering' choice to deliberately break web standards. It means the webhook system is half cooked, and it wouldn't take a lot of effort to actually make it exponentially better. It'd likely massively reduce outbound webhook requests, which translates to server expenditure, and it would make a big difference across customers if the choke point in their architecture worked the way it should.

The example is a wifi power plug. Appliances may decide to turn it on or off via the app, it'd be a simple webhook request post to do so, but as the plug may actually be offline the API will respond with a 4xx saying the device is online. A few of those and no customers can toggle their wifi plugs because the Particle system has been hammering return messages at 4xx's in the face of RFC standards. That would be bad, but liveable if there wasn't then a risk that the whole webhook would be suspended across clients! The only way to work around Particle's broken implementation is with a cloud function or AWS HTTP gateway implementation that placates the webhook with a forced 200.

@markterrill
Copy link
Author

PS, my latest use case that really triggered this all is IFTTT.

  • Customer inputs their IFTTT webhook key via the app, device then sends a post to IFTTT with their key.
  • IFTTT responds with a 401 if you get the key wrong. AFAIK there's no real way on the device to test whether a webhook worked, and stop publishing if there's an issue. The risk there is the whole webhook is taken down and customers start calling.
  • There are ways around it. Validating it via a cloud function before truly accepting the client input, using the tried and tested (but costing $ to develop and maintain) middleware that placates particle webhook with 200's. All of those options wouldn't be required if the webhook system adhered to RFC standards - which is to change the authorisation attempt and try again, not hammer it twice more with invalid details and then disable the webhook.

@markterrill
Copy link
Author

markterrill commented Apr 2, 2020

@monkbroc strangely enough today I think I tripped over the very likely reason why the Particle system behaves the way it does. You're probably using request promise with default options. From their github docs:

Request and Bluebird are pretty awesome, but I found myself using the same design pattern. Request-Promise adds a Bluebird-powered .then(...) method to Request call objects. By default, http response codes other than 2xx will cause the promise to be rejected. This can be overwritten by setting options.simple = false.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants