Old fields and tags show up after dropping measurement and rewriting #10052
Comments
@cheribral could you provide some steps for us to reproduce this issue? Which index type are you using? |
This is using the disk based tsi1 index. I went back to make a test measurement to copy and paste the steps, and I can't reproduce it for some reason. I have no idea what the difference is other than that I don't have writes coming in to the measurement while I delete. |
@cheribral thanks, we will need example data and steps so we can follow along and understand the issue better. |
@e-dard
All these fields shown are old fields I am not using anymore but I can't seem to get them to go away. I have gotten them to go away before by running this: Hope this helps. |
Thanks @dustin96080, do you have a set of data to insert that will reproduce this bug? |
Also, were you using TSI previous to |
@e-dard We have been using TSI from the beginning (about 1 year). Not sure how much data you want or how i would get that to you as i don't want it public. I have included a small subset of the data.
|
Also seeing this issue on Influx Cloud |
I'm running into this behavior of influxdb as well. I'm using logstash to write json into an influx measurement. I'm actually only trying to change the field type from float to int - since this seems to be impossible I tried to |
Same issue for me - why can't we drop and wipe out a measurement; Influxdb - why are you caching the OLD fields .... grr .... If I insert the same data into a new table, insert goes through; Otherwise I get this error: influxdb.exceptions.InfluxDBClientError: 400: {"error":"partial write: field type conflict: input field "BLAH_007" on measurement "BB2" is type float, already exists as type integer dropped=1"} |
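The error quoted above can be pictured with a toy model: if a stale entry still records `BLAH_007` as an integer for `BB2`, a float write to that measurement is rejected, while the same point written to a fresh measurement is accepted. The sketch below is only an illustration of that behaviour; `check_write`, `FieldTypeConflict`, and the `known_types` dictionary are hypothetical names and not part of InfluxDB or its Python client.

```python
class FieldTypeConflict(Exception):
    """Raised by this toy model when a field's type differs from the recorded one."""


def type_name(value):
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    return "string"


def check_write(known_types, measurement, fields):
    """Validate a write against recorded field types, then record any new fields."""
    schema = known_types.setdefault(measurement, {})
    # First pass: reject the whole write if any field conflicts.
    for name, value in fields.items():
        existing = schema.get(name)
        incoming = type_name(value)
        if existing is not None and existing != incoming:
            raise FieldTypeConflict(
                f'input field "{name}" on measurement "{measurement}" '
                f"is type {incoming}, already exists as type {existing}"
            )
    # Second pass: remember the types of fields seen for the first time.
    for name, value in fields.items():
        schema.setdefault(name, type_name(value))
```

With `known_types = {"BB2": {"BLAH_007": "integer"}}` standing in for the stale entry, `check_write(known_types, "BB2", {"BLAH_007": 1.5})` raises, whereas the same call against a hypothetical new measurement name succeeds.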
I have the same issue too. |
What I've found is that If I do a backup, and then restore this backup somewhere else (in another docker container, for example), the new database will work fine. |
The only way that I managed to deal with this (very hackish workaround) was creating a backup to the database that I want, drop this database and finally restoring the backup. |
I am still not able to drop measurements, and I have been waiting quite some time for an update that fixes this. How is it even possible that you are releasing versions that are so buggy??? drop measurement "collectd" influxdb-1.7.1-1.x86_64 |
Yes, it is quite bad here. I think it also has to do with the Go language, such a 'low entry threshold' language compared to something like C and C++. It is attracting the 'left over' programmers who were 'unable' to adapt to more complex and more effort-demanding languages. This difference is also noticeable with something like PHP. But that is a personal opinion from years of experience; maybe I was just not too lucky with my contacts. |
Currently experiencing this issue with InfluxCloud - really painful as I can't even try the backup/restore option someone else mentioned |
We think this could be fixed with the 1.7.2 release, where we have fixed a few bugs to do with concurrent writes and deletes. @nibynool please drop a ticket into support where you will be assisted. |
@f1-outsourcing would you upgrade to 1.7.2 and see if that resolves your issue? If you have a dataset that we can use to reproduce your issue that would be great too. |
This seemed to work (just posting it here also): `find . -type d -name index -exec rm -Rf {} \;` followed by `influx_inspect buildtsi -datadir data/ -waldir wal/` |
Running 1.7.2, experiencing the same issue here.. |
@garceri do you have a definitive way to reproduce this issue on 1.7.2? |
@e-dard I'm experiencing this problem too. I tried to reproduce it on a lean install, but I was not able to. But I have a database where it consistently happens. It's 75 Mb and contains info that is not confidential, but I'm not comfortable sharing publicly. I can send it your way so you can check what's going on. A few more details:
|
Experiencing this issue as well on 1.7.2. I'm unable to reproduce on demand. I just have a screwed up measurement I can't get fixed. |
I have spent quite some time trying to find a sure way to reproduce the problem, without success. The same procedure (with little data) only seldom reproduced the problem. However, I have a large number of larger datasets where I almost always found the problem. Therefore, I have the feeling that this only occurs if large amounts of data are accumulated, e.g. a datapoint every 5 or 10 seconds for several days or weeks (maybe if data is spread over different files or shards?). Thanks for looking into this. It gives us a huge headache, as sometimes datapoints are written with the wrong datatype and we have to rename the values and keep the old fields. Currently, the only way is to copy the whole measurement without the unwanted fields, drop the measurement, copy it back, and then back up and restore the whole database. It can take hours just to get rid of a single field. |
I just emailed you, hope that helps. |
Any news on this issue? |
Just got slammed with this in production. Really really frustrating. |
Same here with InfluxDB 1.6.6 on Linux. I wrote some fields as strings instead of floats. The old schema survives a "drop measurement" if you reuse the measurement name. It is really surprising a database company:
|
We are actively investigating this issue. Thanks for your patience, and to those of you who have provided me with data to reproduce the issue. |
This commit fixes an issue where field keys would reappear in results when querying previously dropped measurements. The issue manifests itself when duplicates of a new series are inserted into the `inmem` index. In this case, a map that tracks the number of series belonging to a measurement was incorrectly incremented once for each duplication of the series. Then, when it came time to drop the measurement, the index assumed there were several series belonging to the measurement left in the index (because the counter was higher than it should be). The result of that was that the `fields.idx` file (which stores a mapping between measurements and field keys) was not truncated and rebuilt. This left old field keys in that file, which were then returned in subsequent queries over all field keys.
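The mechanism described in this commit message can be illustrated with a small toy model. This is a minimal sketch under stated assumptions, not InfluxDB's actual code: `ToyIndex`, its attributes, and `demo` are hypothetical names, the `fields_idx` dictionary merely stands in for the `fields.idx` file, and the real index is considerably more involved.

```python
class ToyIndex:
    """Toy model of an in-memory index; not InfluxDB's actual data structures."""

    def __init__(self, count_duplicates):
        self.count_duplicates = count_duplicates  # True reproduces the buggy counting
        self.series = {}        # measurement -> set of series keys
        self.series_count = {}  # measurement -> counter consulted on drop
        self.fields_idx = {}    # measurement -> field keys (stands in for fields.idx)

    def write(self, measurement, series_key, field_keys):
        known = self.series.setdefault(measurement, set())
        is_new = series_key not in known
        known.add(series_key)
        # Buggy variant: the counter also grows for duplicates of the series.
        if is_new or self.count_duplicates:
            self.series_count[measurement] = self.series_count.get(measurement, 0) + 1
        self.fields_idx.setdefault(measurement, set()).update(field_keys)

    def drop_measurement(self, measurement):
        removed = len(self.series.pop(measurement, set()))
        remaining = self.series_count.get(measurement, 0) - removed
        if remaining <= 0:
            # No series left: forget the counter and rebuild the field mapping.
            self.series_count.pop(measurement, None)
            self.fields_idx.pop(measurement, None)
        else:
            # The counter claims series remain, so the stale field keys survive.
            self.series_count[measurement] = remaining

    def field_keys(self, measurement):
        return sorted(self.fields_idx.get(measurement, set()))


def demo(count_duplicates):
    idx = ToyIndex(count_duplicates)
    # The same new series is inserted twice before the measurement is dropped.
    idx.write("cpu", "cpu,host=a", {"old_field"})
    idx.write("cpu", "cpu,host=a", {"old_field"})
    idx.drop_measurement("cpu")
    idx.write("cpu", "cpu,host=a", {"new_field"})
    return idx.field_keys("cpu")
```

In this model `demo(True)` still reports `old_field` alongside `new_field` after the drop, while `demo(False)`, which counts each series once, reports only `new_field`.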
Update

Hello everyone affected by this issue. Firstly, I would like to apologise that it's been almost a year since this issue was filed. Yesterday I was able to really dig into what was causing this issue, mainly due to the

This is a pretty difficult issue to reproduce without an existing dataset. Triggering the issue relies on duplicates of a new series being inserted into a database using the

The cause of the issue is that the

I believe I have fixed this issue in #14266

The fix will be available in the

Operational Mitigation Steps

Here are some operational steps you could take to try and resolve this issue without waiting for 1.8 or 1.7.8.

Use the TSI index

I was unable to reproduce this issue using the TSI index. Even when I triggered the issue on the

You can find out more information about how to upgrade to TSI here. In the simplest case, you bring your server down and then do something like:

Remove invalid fields.idx files

The bug is caused because the

$ rm -i ~/.influxdb/data/<db_name>/<rp_name>/*/fields.idx |
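Before deleting anything, it may help to list which files the `rm -i` pattern above would match. The following is a minimal sketch, assuming the directory layout implied by that pattern (the `*` level is taken here to be a per-shard directory, which is an assumption); `find_fields_idx` and `remove_fields_idx` are hypothetical helper names, not InfluxDB tooling, and nothing is deleted unless `dry_run=False` is passed explicitly.

```python
from pathlib import Path


def find_fields_idx(data_dir, db_name, rp_name):
    """Return the fields.idx files for one database and retention policy, sorted."""
    rp_dir = Path(data_dir).expanduser() / db_name / rp_name
    # One directory level sits between the retention policy and the file,
    # mirroring the "*" in the shell pattern.
    return sorted(rp_dir.glob("*/fields.idx"))


def remove_fields_idx(data_dir, db_name, rp_name, dry_run=True):
    """List the matching files and delete them only when dry_run is False."""
    paths = find_fields_idx(data_dir, db_name, rp_name)
    for path in paths:
        if not dry_run:
            path.unlink()
    return paths
```

A call such as `remove_fields_idx("~/.influxdb/data", "mydb", "autogen")` (with placeholder database and retention policy names) only returns the candidate paths, so the list can be reviewed first.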
That's some great news! Will we need to do the manual cleanup if we just update to 1.7.8 when it's released? |
@1ma that's a great point. We will have to add something to the release notes. You will have to either do a manual cleanup, or the issue will resolve itself if you re-drop the measurement. The manual cleanup would involve removing the stale |
The official Docker image influxdb:1.7.7 still has this issue. |
Any news on this? I am not sure the issue mentioned by @e-dard above is the same as what everyone is encountering here. The issue is not "hard to reproduce"; for me it happens systematically with any measurement that I drop. |
also experienced this issue on influx v1.7.7. |
This issue should be fixed in 1.7.8. Please open a new issue if you see problems on 1.7.8. |
We are having the same issue with InfluxDB. We use the tag 2.5.1-alpine inside a container. I dropped the old table and Influx seems to work fine. I checked my metrics in Kafka and there are no old messages; it seems Influx somehow caches the old data somewhere, and we are not sure how to resolve it. |
Using version influxdb-1.5.2, if I drop a measurement, and then write again using the same name, the measurement is recreated but it contains all the tags and keys from the previous measurement.
I noticed this after writing a measurement with an identifier as a field, then deciding to make it a tag. I could no longer query the data without a ::tag cast because the database still retained the old field key. I started playing around with arbitrary keys and values, and dropping the measurement. Everything I've sent is retained across 'drop measurement's.
I've even tried to drop the measurement, stop the database and rebuild the index, but this doesn't work either.