[Feature request] Support dropping conflicting types on writes #4856

Closed
otoolep opened this issue Nov 20, 2015 · 10 comments

Comments

@otoolep
Contributor

otoolep commented Nov 20, 2015

Started with #3460

Should we support a shard-level option to simply drop (and count) points that conflict by type? If the option is set and a batch contains thousands of points, but one point is of the wrong type, should the system drop that point and return success?

@otoolep otoolep added the RFC label Nov 20, 2015
@otoolep
Contributor Author

otoolep commented Nov 20, 2015

PM call required @pauldix

@otoolep
Contributor Author

otoolep commented Nov 20, 2015

Other time-series and indexing systems that determine value types on the fly support similar options.

That said, InfluxDB does not have a single source of truth for field types, so even with this in place, one could still have differences between shards. The only way to truly lock down types is through #3006.

@pauldix
Member

pauldix commented Nov 20, 2015

This one actually isn't an issue anymore now that the line protocol requires a trailing i to explicitly call out an integer. I think it's fine to throw an error.
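
For readers unfamiliar with the rule mentioned here, this is a minimal sketch of how a field value's type can be read off its literal form in line protocol: a trailing `i` marks an integer, a bare number is a float, double quotes mark a string. The function name `field_type` is hypothetical and the parsing is deliberately simplified; it is not InfluxDB's actual parser.

```python
# Hypothetical helper, not part of InfluxDB: classify one line-protocol
# field value by its literal form (simplified, no escape handling).
BOOLEANS = {"t", "T", "true", "True", "TRUE", "f", "F", "false", "False", "FALSE"}


def field_type(raw: str) -> str:
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        return "string"
    if raw in BOOLEANS:
        return "boolean"
    if raw.endswith("i"):
        int(raw[:-1])  # raises ValueError if the digits are malformed
        return "integer"
    float(raw)  # raises ValueError if the value is not numeric
    return "float"
```

Under this rule `value=42i` and `value=42` carry different types, so a client that sends both for the same field will hit a type conflict.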

@otoolep
Contributor Author

otoolep commented Nov 20, 2015

I was thinking about the case @pauldix where one builds and runs a service for other uses (multi-tenant SaaS, in-house monitoring system, etc.), and InfluxDB is not directly accessible to the users. The designers and operators may wish to offer this flexibility to their users.

For high-volume systems, the code that accepts the data for ingestion may be decoupled from InfluxDB (a Kafka-based pipeline is the canonical example these days). The system returns OK to the client once the pipeline accepts the data. By the time it is detected that 1 point in the batch of 10K points is bad, the system components upstream may have responded OK to the client. The client sees the entire batch of 10K points missing, when they might have preferred losing just 1 point.

@otoolep
Contributor Author

otoolep commented Nov 20, 2015

I do need to check the latest code to determine how much of the batch would actually be affected (assuming it's all destined for a single shard -- perhaps all good points are actually written).

@jwilder
Contributor

jwilder commented Nov 20, 2015

The other endpoints and the PointsWriter already support partial write semantics. I think it would be good to have the lower-level shard writes also support this. So, if you write 1000 points, but 2 fail because of type conflicts, 998 should succeed and we return a partial write error (with details of what failed) back up the stack.

In the SQL world, this is analogous to auto-commit mode.
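
A rough sketch of what shard-level partial write semantics could look like, under stated assumptions: the `Shard` class, `PartialWriteError`, and the use of Python type names as field types are all hypothetical stand-ins, not InfluxDB's Go implementation. Points that conflict with a known field type are rejected individually, the rest are stored, and the error reports what failed.

```python
class PartialWriteError(Exception):
    """Hypothetical error: some points were written, some were rejected."""

    def __init__(self, written, rejected):
        super().__init__(f"partial write: {written} written, {len(rejected)} rejected")
        self.written = written
        self.rejected = rejected  # list of (batch index, reason)


class Shard:
    """Toy shard that remembers the first type seen for each field."""

    def __init__(self):
        self.field_types = {}  # (measurement, field name) -> type name
        self.points = []

    def _conflict(self, measurement, fields):
        # Return a reason string if any field disagrees with the known type.
        for name, value in fields.items():
            known = self.field_types.get((measurement, name))
            got = type(value).__name__
            if known is not None and known != got:
                return f"field {name!r} on {measurement!r} is {known}, got {got}"
        return None

    def write_points(self, points):
        rejected = []
        for index, (measurement, fields) in enumerate(points):
            reason = self._conflict(measurement, fields)
            if reason is not None:
                rejected.append((index, reason))
                continue  # skip only this point; keep going
            for name, value in fields.items():
                self.field_types.setdefault((measurement, name), type(value).__name__)
            self.points.append((measurement, fields))
        if rejected:
            raise PartialWriteError(len(points) - len(rejected), rejected)
```

With a batch of 1000 points where 2 carry the wrong type, this sketch stores 998 points and raises a `PartialWriteError` listing the 2 rejected indexes, which a caller can pass back up the stack.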

@pauldix
Member

pauldix commented Nov 20, 2015

We do already have the partial write semantics since writes can hit multiple shards. So we'd just need to push it down to the shard level.

@phemmer
Contributor

phemmer commented Aug 8, 2016

Just got bit by this. We had some hosts writing one field of a single measurement as a different type than other hosts were. Because Telegraf does a batch write, and only the metrics before the bad one were getting written (the bad one was near the top of the batch), the majority of our metrics went missing, including measurements completely different from the one that was erroring.
I think this is a bad user experience. If the metrics before the bad one get inserted, then the ones after it should be inserted too. And if the solution is a change so that metrics before the failing one don't get inserted, a failure of one measurement type shouldn't affect other measurements.
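
To make the difference concrete, here is a small model of the two behaviours described in this comment. Both function names and the `is_valid` callback are hypothetical; this models the reported symptom, not the actual InfluxDB or Telegraf code.

```python
def write_until_first_error(batch, is_valid):
    """Model of the reported behaviour: stop at the first bad point."""
    written = []
    for point in batch:
        if not is_valid(point):
            break  # everything after the bad point is lost
        written.append(point)
    return written


def write_skipping_errors(batch, is_valid):
    """Model of the requested behaviour: drop bad points, keep the rest."""
    written, dropped = [], []
    for point in batch:
        (written if is_valid(point) else dropped).append(point)
    return written, dropped
```

For a batch of 10 points with one bad point in the second position, the first function keeps 1 point while the second keeps 9 and reports the 1 it dropped.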

@phemmer
Contributor

phemmer commented Nov 21, 2016

Any status on this? We're writing in batches of 1 point to avoid potentially losing data because of this.

@sparrc
Contributor

sparrc commented Jan 14, 2017

closing this because #7814 is a dupe of it

should have closed #7814 but we started discussing there so 🤷‍♂️

@sparrc sparrc closed this as completed Jan 14, 2017
5 participants