
Old fields and tags show up after dropping measurement and rewriting #10052

Closed
cheribral opened this issue Jul 6, 2018 · 60 comments
@cheribral

Using version influxdb-1.5.2, if I drop a measurement, and then write again using the same name, the measurement is recreated but it contains all the tag and field keys from the previous measurement.

I noticed this after writing a measurement with an identifier as a field, then deciding to make it a tag. I could no longer query the data without a ::tag cast because the database still retained the old field key. I started playing around with arbitrary keys and values, and dropping the measurement. Everything I've sent is retained across DROP MEASUREMENTs.

I've even tried to drop the measurement, stop the database and rebuild the index, but this doesn't work either.

@cheribral cheribral changed the title Old fileds and tags show up after dropping measurement and rewriting Old fields and tags show up after dropping measurement and rewriting Jul 6, 2018
@e-dard (Contributor) commented Jul 9, 2018

@cheribral could you provide some steps for us to reproduce this issue? Which index type are you using?

@cheribral (Author)

This is using the disk based tsi1 index.
This came right after I deleted the measurement and let it sit for a day before writing again.

I went back to make a test measurement to copy and paste the steps, and I can't reproduce it for some reason. I have no idea what the difference is, other than that I don't have writes coming in to the measurement while I delete.
I'll see if I can get it to happen again tomorrow.

@e-dard (Contributor) commented Jul 10, 2018

@cheribral thanks, we will need example data and steps so we can follow along and understand the issue better.

@dustin96080 commented Jul 13, 2018

@e-dard
I'm having the same issue. I'm using version 1.5.2. Here is an example:
> drop measurement memory
> show field keys from memory    -- returns nothing
> select * from memory           -- returns nothing

> insert memory,host=test,type=memory value=0
> show field keys from memory
name: memory
fieldKey       fieldType
--------       ---------
buffered       float
cached         float
free           float
heap_usage     float
non_heap_usage float
slab_recl      float
slab_unrecl    float
used           float
value          float
> select * from memory limit 10
name: memory
time                buffered cached free heap_usage host non_heap_usage slab_recl slab_unrecl type   used value
----                -------- ------ ---- ---------- ---- -------------- --------- ----------- ----   ---- -----
1531445331027521830                                 test                                      memory      0

All these fields shown are old fields I am not using anymore, but I can't seem to get them to go away. I have gotten them to go away before by running this:
influx_inspect buildtsi -database graphite -datadir /var/lib/influxdb/data/ -waldir /var/lib/influxdb/wal/
That is not fixing the issue this time.

Hope this helps.

@e-dard (Contributor) commented Jul 13, 2018

Thanks @dustin96080,

do you have a set of data to insert that will reproduce this bug?

@e-dard (Contributor) commented Jul 13, 2018

Also, were you using TSI previous to 1.5.2?

@dustin96080 commented Jul 13, 2018

@e-dard We have been using TSI from the beginning (about 1 year). Not sure how much data you want or how I would get it to you, as I don't want it public. I have included a small subset of the data.

memory,host=dustintest01,type=memory buffered=4.272128e+06 1531180067000000000
memory,host=dustintest01,type=memory buffered=4.272128e+06 1531180127000000000
memory,host=dustintest01,type=memory cached=9.96225024e+08 1531179767000000000
memory,host=dustintest01,type=memory cached=9.9631104e+08 1531179827000000000
memory,host=dustintest01,type=memory cached=9.9631104e+08 1531179887000000000
memory,host=dustintest01,type=memory cached=9.9631104e+08 1531179947000000000
memory,host=dustintest01,type=memory cached=9.9631104e+08 1531180007000000000
memory,host=dustintest01,type=memory cached=9.9631104e+08 1531180067000000000
memory,host=dustintest01,type=memory cached=9.9631104e+08 1531180127000000000
memory,host=dustintest01,type=memory free=1.43478784e+09 1531179767000000000
memory,host=dustintest01,type=memory free=1.572323328e+09 1531179827000000000
memory,host=dustintest01,type=memory free=1.572294656e+09 1531179887000000000
memory,host=dustintest01,type=memory free=1.57231104e+09 1531179947000000000
memory,host=dustintest01,type=memory free=1.57231104e+09 1531180007000000000
memory,host=dustintest01,type=memory free=1.572327424e+09 1531180067000000000
memory,host=dustintest01,type=memory free=1.571733504e+09 1531180127000000000
memory,host=dustintest01,type=memory slab_recl=1.668096e+08 1531179767000000000
memory,host=dustintest01,type=memory slab_recl=1.66854656e+08 1531179827000000000
memory,host=dustintest01,type=memory slab_recl=1.66854656e+08 1531179887000000000
memory,host=dustintest01,type=memory slab_recl=1.66854656e+08 1531179947000000000
memory,host=dustintest01,type=memory slab_recl=1.66854656e+08 1531180007000000000
memory,host=dustintest01,type=memory slab_recl=1.66854656e+08 1531180067000000000
memory,host=dustintest01,type=memory slab_recl=1.66846464e+08 1531180127000000000
memory,host=dustintest01,type=memory slab_unrecl=4.1725952e+07 1531179767000000000
memory,host=dustintest01,type=memory slab_unrecl=4.1197568e+07 1531179827000000000
memory,host=dustintest01,type=memory slab_unrecl=4.093952e+07 1531179887000000000
memory,host=dustintest01,type=memory slab_unrecl=4.0833024e+07 1531179947000000000
memory,host=dustintest01,type=memory slab_unrecl=4.0833024e+07 1531180007000000000
memory,host=dustintest01,type=memory slab_unrecl=4.0833024e+07 1531180067000000000
memory,host=dustintest01,type=memory slab_unrecl=4.0833024e+07 1531180127000000000
memory,host=dustintest01,type=memory used=4.396711936e+09 1531179767000000000
memory,host=dustintest01,type=memory used=4.25957376e+09 1531179827000000000
memory,host=dustintest01,type=memory used=4.25986048e+09 1531179887000000000
memory,host=dustintest01,type=memory used=4.259950592e+09 1531179947000000000
memory,host=dustintest01,type=memory used=4.259950592e+09 1531180007000000000
memory,host=dustintest01,type=memory used=4.259934208e+09 1531180067000000000
memory,host=dustintest01,type=memory used=4.26053632e+09 1531180127000000000
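The sample above uses InfluxDB line protocol: `measurement,tag=value,... field=value,... timestamp`, with tag keys written in lexicographic order. As a point of reference, here is a minimal Go sketch of assembling such a line (a hypothetical helper, not part of any InfluxDB client library):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildLine assembles one line-protocol entry:
//   measurement,tag1=v1,tag2=v2 field1=v1,field2=v2 timestamp
// Tag and field keys are emitted in sorted order, as the protocol expects.
// Note: this sketch omits the escaping rules for spaces and commas.
func buildLine(measurement string, tags map[string]string, fields map[string]float64, ts int64) string {
	var b strings.Builder
	b.WriteString(measurement)

	tagKeys := make([]string, 0, len(tags))
	for k := range tags {
		tagKeys = append(tagKeys, k)
	}
	sort.Strings(tagKeys)
	for _, k := range tagKeys {
		fmt.Fprintf(&b, ",%s=%s", k, tags[k])
	}

	fieldKeys := make([]string, 0, len(fields))
	for k := range fields {
		fieldKeys = append(fieldKeys, k)
	}
	sort.Strings(fieldKeys)
	sep := " "
	for _, k := range fieldKeys {
		fmt.Fprintf(&b, "%s%s=%g", sep, k, fields[k])
		sep = ","
	}

	fmt.Fprintf(&b, " %d", ts)
	return b.String()
}

func main() {
	line := buildLine("memory",
		map[string]string{"host": "dustintest01", "type": "memory"},
		map[string]float64{"buffered": 4.272128e+06},
		1531180067000000000)
	fmt.Println(line)
}
```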

@shakefu commented Jul 27, 2018

Also seeing this issue on Influx Cloud 1.5.3-c1.5.3.

@cha87de commented Oct 6, 2018

I'm running into this behavior of InfluxDB as well. I'm using Logstash to write JSON into an influx measurement. I'm actually only trying to change the field type from float to int - since this seems to be impossible I tried DROP MEASUREMENT ... - but the field with the old type reappears. Seems I'm going from one issue to another :-( I'm using the docker image influxdb:1.4-alpine.

@sada-narayanappa

Same issue for me - why can't we drop and wipe out a measurement? InfluxDB - why are you caching the OLD fields .... grr ....

If I insert the same data into a new table, the insert goes through; otherwise I get this error:

influxdb.exceptions.InfluxDBClientError: 400: {"error":"partial write: field type conflict: input field "BLAH_007" on measurement "BB2" is type float, already exists as type integer dropped=1"}

@abbasqamar

Me too, I have the same issue.

@derrix060 commented Dec 5, 2018

What I've found is that if I do a backup, and then restore this backup somewhere else (in another docker container, for example), the new database will work fine.

@derrix060

The only way I managed to deal with this (a very hackish workaround) was creating a backup of the database I want, dropping the database, and finally restoring the backup.

@f1-outsourcing

I am still not able to drop measurements, and I have been waiting quite some time for an update that fixes this. How is it even possible that you are releasing versions that are so buggy?
I would be ashamed if I put clients in such a position; there is also no support at all on your community forum.

drop measurement "collectd"
drop series from "collectd"
delete from "collectd"

influxdb-1.7.1-1.x86_64
CentOS Linux release 7.5.1804 (Core)
Linux db1 3.10.0-862.11.6.el7.x86_64 #1 SMP Tue Aug 14 21:49:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

@f1-outsourcing

Seems I'm going from one issue to another :-( I'm using the docker image influxdb:1.4-alpine.

Yes, it is quite bad here. I think it also has to do with the Go language: such a 'low entry threshold' language, compared to something like C and C++, attracts the 'left over' programmers who were 'unable' to adapt to more complex languages that take more effort. This difference is also noticeable with something like PHP. But that is a personal opinion from years of experience; maybe I was just not too lucky with my contacts.
Here some rookie accepts this as being normal. I guess it is a sign of the times.
https://community.influxdata.com/t/or-rention-policies-are-not-correctly-dropped-or-there-is-something-wrong-with-the-cli/7461/4?u=f1outsourcing

@nibynool

Currently experiencing this issue with InfluxCloud - really painful as I can't even try the backup/restore option someone else mentioned

@e-dard (Contributor) commented Dec 19, 2018

We think this could be fixed with the 1.7.2 release, where we have fixed a few bugs to do with concurrent writes and deletes.

@nibynool please drop a ticket into support where you will be assisted.

@e-dard (Contributor) commented Dec 19, 2018

@f1-outsourcing would you upgrade to 1.7.2 and see if that resolves your issue? If you have a dataset that we can use to reproduce your issue that would be great too.

@f1-outsourcing

This seemed to work, (just posting it here also)

find . -type d -name index -exec rm -Rf {} \;

influx_inspect buildtsi -datadir data/ -waldir wal/

@dgnorton dgnorton added the 1.x label Jan 7, 2019
@garceri commented Jan 9, 2019

Running 1.7.2, experiencing the same issue here..

@e-dard (Contributor) commented Jan 11, 2019

@garceri do you have a definitive way to reproduce this issue on 1.7.2?

@silviot commented Jan 18, 2019

@e-dard I'm experiencing this problem too.

I tried to reproduce it on a lean install, but I was not able to.

But I have a database where it consistently happens. It's 75 MB and contains info that is not confidential, but that I'm not comfortable sharing publicly. I can send it your way so you can check what's going on.

A few more details:

  • I had a type mismatch after changing telegraf config, so I dropped the measurement to start clean
  • I can only insert new rows using the old types
  • if I restart the database, the measurement does not show up in SHOW MEASUREMENTS until I try to insert a new row. After that it does show up, and SHOW FIELD KEYS shows the incorrect, old types

@phemmer (Contributor) commented Jan 21, 2019

Experiencing this issue as well on 1.7.2.
Tried the index delete & influx_inspect mentioned by @f1-outsourcing but it didn't work for me :-/

I'm unable to reproduce on demand. I just have a screwed up measurement I can't get fixed.

@drb-germany

I have spent quite some time trying to find a sure way to reproduce the problem. It did not work. The same procedure (with little data) only seldom reproduced the problem. However, I have a large number of larger datasets where I almost always found the problem.

Therefore, I have the feeling that this only occurs if large amounts of data are accumulated, e.g. a datapoint every 5 or 10 seconds for several days or weeks (maybe if data is spread over different files or shards?).

Thanks for looking into this; it gives us a huge headache, as sometimes datapoints are written with the wrong datatype and we have to rename the values and keep the old fields. Currently, the only way is to copy the whole measurement without the unwanted fields, drop the measurement, copy it back, and then backup and restore the whole database. It can take hours just to get rid of a single field.

@wollew commented May 7, 2019

Quoting @e-dard: "@wollew that would be great. Please email me edd@<nameoftherepo>.com and I will provide you with some credentials where you can securely upload data to our company SFTP server. If that doesn't work for you we can figure something else out."

I just emailed you, hope that helps.

@wollew commented Jun 4, 2019

Any news on this issue?

@kezsto commented Jun 27, 2019

Just got slammed with this in production. Really really frustrating.

@jfcg (Contributor) commented Jul 4, 2019

Same here with InfluxDB 1.6.6 on Linux

I wrote some fields as strings instead of floats. The old schema survives a DROP MEASUREMENT if you reuse the measurement name. It is really surprising that a database company:

  • fails to test a fundamental feature
  • fails to reproduce a simple and dire bug

@e-dard (Contributor) commented Jul 4, 2019

We are actively investigating this issue. Thanks for your patience, and to those of you who have provided me with data to reproduce the issue.

@e-dard e-dard self-assigned this Jul 4, 2019
e-dard added a commit that referenced this issue Jul 5, 2019
This commit fixes an issue where field keys would reappear in results
when querying previously dropped measurements.

The issue manifests itself when duplicates of a new series are inserted
into the `inmem` index. In this case, a map that tracks the number of
series belonging to a measurement was incorrectly incremented once for
each duplication of the series. Then, when it came time to drop the
measurement, the index assumed there were several series belonging to
the measurement left in the index (because the counter was higher than
it should be). The result of that was that the `fields.idx` file (which
stores a mapping between measurements and field keys) was not truncated
and rebuilt. This left old field keys in that file, which were then
returned in subsequent queries over all field keys.
e-dard added a commit that referenced this issue Jul 5, 2019
Fixes #10052 (same commit message as above)
e-dard added a commit that referenced this issue Jul 5, 2019
Fixes #10052 (same commit message as above)
e-dard added a commit that referenced this issue Jul 5, 2019
Fixes #10052 (same commit message as above)
@e-dard e-dard removed the area/tsi label Jul 5, 2019
@e-dard (Contributor) commented Jul 5, 2019

Update

Hello everyone affected by this issue. Firstly, I would like to apologise that it's been almost a year since this issue was filed. Yesterday I was able to really dig into what was causing this issue, mainly thanks to the .influxdb directory that @ragnarkurmwunder sent me a while back.

This is a pretty difficult issue to reproduce without an existing dataset. Triggering the issue relies on duplicates of a new series being inserted into a database using the inmem index within the same batch. Further, it looks like they need to sit inside the WAL so that when the database is restarted they will be replayed and the problem will continue...

The cause of the issue is that the inmem index, in this rare case, over-counts how many series belong to the measurement (it counts duplicate points for the same series as different series). Then, when you go to delete the measurement, the index thinks there are still some series around for the measurement and it does not clean up the fields.idx file. This file contains mappings from measurements to field keys, and if it's not cleaned up properly, those old field keys can be returned in some cases.
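The over-counting described above can be sketched roughly like this (a toy model in Go, not the actual inmem index code; the type and method names are invented for illustration):

```go
package main

import "fmt"

// seriesIndex is a toy model of the bookkeeping described above: a per-measurement
// series counter that gates whether fields.idx would be rebuilt on drop.
type seriesIndex struct {
	seriesCount map[string]int      // measurement -> number of series
	seen        map[string]struct{} // series key dedup set (used by the fixed path)
}

func newSeriesIndex() *seriesIndex {
	return &seriesIndex{seriesCount: map[string]int{}, seen: map[string]struct{}{}}
}

// addSeriesBuggy increments the counter for every point in a batch, so a new
// series that appears twice in one batch is counted as two series.
func (idx *seriesIndex) addSeriesBuggy(measurement, seriesKey string) {
	idx.seriesCount[measurement]++ // duplicates counted as distinct series
}

// addSeriesFixed only counts a series the first time it is seen.
func (idx *seriesIndex) addSeriesFixed(measurement, seriesKey string) {
	if _, ok := idx.seen[seriesKey]; ok {
		return
	}
	idx.seen[seriesKey] = struct{}{}
	idx.seriesCount[measurement]++
}

// dropSeries removes n series; only when the counter reaches zero would the
// measurement's field mappings (fields.idx) be truncated and rebuilt.
func (idx *seriesIndex) dropSeries(measurement string, n int) (fieldsIdxRebuilt bool) {
	idx.seriesCount[measurement] -= n
	return idx.seriesCount[measurement] <= 0
}

func main() {
	buggy := newSeriesIndex()
	// The same new series is written twice in one batch.
	buggy.addSeriesBuggy("memory", "memory,host=a")
	buggy.addSeriesBuggy("memory", "memory,host=a")
	// Dropping the one real series leaves the counter at 1, so fields.idx
	// is never rebuilt and the stale field keys survive.
	fmt.Println(buggy.dropSeries("memory", 1)) // false

	fixed := newSeriesIndex()
	fixed.addSeriesFixed("memory", "memory,host=a")
	fixed.addSeriesFixed("memory", "memory,host=a")
	fmt.Println(fixed.dropSeries("memory", 1)) // true
}
```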

I believe I have fixed this issue in #14266

The fix will be available in the 1.8 release, and also in a future 1.7.8 release.

Operational Mitigation Steps

Here are some operational steps you could take to try and resolve this issue without waiting for 1.8 or 1.7.8.

Use the TSI index

I was unable to reproduce this issue using the TSI index. Even when I triggered the issue on the inmem index, and then upgraded to the TSI index, I saw the issue disappear. Whilst we will of course continue to support the inmem index on the 1.x line, from 2.x onwards the TSI index will be the main index InfluxDB uses, and all our development effort will continue on that.

You can find out more information about how to upgrade to TSI here. In the simplest case, you bring your server down and then do something like:

influx_inspect buildtsi -datadir ~/.influxdb/data -waldir ~/.influxdb/wal

Remove invalid fields.idx files

The bug is caused because the fields.idx files (there is one per shard directory) are not properly rebuilt when the measurement is deleted. However, InfluxDB will rebuild these files if they're missing. If you are currently suffering from fields that appear in queries when they shouldn't, then I recommend that you delete all of the fields.idx files for the problematic database/retention policy. You will need to bring down your server first, then:

$ rm -i ~/.influxdb/data/<db_name>/<rp_name>/*/fields.idx

@1ma commented Jul 5, 2019

That's some great news!

Will we need to do the manual cleanup if we just update to 1.7.8 when it's released?

@e-dard (Contributor) commented Jul 5, 2019

@1ma that's a great point. We will have to add something to the release notes. You will have to either do a manual cleanup, or the issue will resolve itself if you re-drop the measurement.

The manual cleanup would involve removing the stale fields.idx files.

e-dard added a commit that referenced this issue Jul 9, 2019
Fixes #10052 (same commit message as above)
@ghost commented Aug 23, 2019

The official docker image influxdb:1.7.7 still has this issue, so I had to drop the database.

@lovasoa commented Oct 11, 2019

Any news on this? I am not sure the issue mentioned by @e-dard above is the same as what everyone is encountering here. The issue is not "hard to reproduce"; for me it happens systematically with any measurement that I drop.

@jeankarunadewi

Also experienced this issue on Influx v1.7.7.
Just now, I'm experimenting with using telegraf to load a CSV into my InfluxDB database.
At first the drop is successful, but after 3 attempts the drops suddenly didn't work. The points are gone from the measurement, but every time I run "show measurements", it's still there. Also when I use "show tag keys" and "show field keys", the tags and fields are still there. And until now, the measurement cannot be dropped at all.
I hope there is a follow-up from the InfluxDB team to solve this "bug".

@e-dard (Contributor) commented Nov 11, 2019

This issue should be fixed in 1.7.8. Please open a new issue if you see problems on 1.7.8.

@rakopoul commented Dec 5, 2023

We are having the same issue with InfluxDB. We use tag 2.5.1-alpine inside a container.
I have a table where metrics coming from Kafka are inserted.
Initially the table had three tags and several fields. To make measurement faster I changed the telegraf agent config to turn some of the fields into tags.
So instead of having measurement like:
tag1,tag2,tag3 field1,field2,field3,field4,field5
I made them
tag1,tag2,tag3,tag4 (name of field1),tag5(name of field2) field3,field4,field5

I dropped the old table and Influx seemed to work fine.
After some time I wanted to change my dashboards showing the metrics and I redeployed the whole stack (changing only the dashboard code). After this it seemed like Influx somehow brought back the old metrics as well, creating conflicts and renaming the new tags to tag_1 etc., of course making the queries show nothing. Dropping the tables again fixed the problem, until a new restart reproduces it.

I checked my metrics in Kafka and there is no old message; it seems influx caches the old data somewhere, and we are not sure how to resolve it.
