Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature request] Drop field #6150

Open
dmke opened this issue Mar 29, 2016 · 52 comments
Open

[Feature request] Drop field #6150

dmke opened this issue Mar 29, 2016 · 52 comments

Comments

@dmke
Copy link

dmke commented Mar 29, 2016

Due to inattention, I have two fields, rx_pkt and rx_packets, each with a few million measurements, and I only need either of those.

Is there an equivalent to the following SQL to get rid of a single field in InfluxDB?

alter table traffic drop `rx_pkt`, drop `tx_pkt`
@dmke
Copy link
Author

dmke commented Mar 29, 2016

On a side note: I also expected one of these inserts/updates

traffic,id=42 tx_pkt=null,rx_pkt=null 1459265806000000000
traffic,id=42 tx_pkt=,rx_pkt= 1459265806000000000

to delete the field key/value from the field set (in #2429 it was stated that TSM doesn't handle null values). Would that be a feasible feature request?

@dmke dmke changed the title Drop field? [0.11] Drop field? Mar 30, 2016
@gvohra

This comment has been minimized.

2 similar comments
@PaulKuiper

This comment has been minimized.

@dstreppa

This comment has been minimized.

@dmke dmke changed the title [0.11] Drop field? [Feature request] Drop field Aug 9, 2016
@retorquere
Copy link

+1. I don't so much want to store nulls myself, but I'd love to have a way to signal "don't use this value". I store enriched data with my measurements, and if the enrichment changes, sometimes fields need to disappear.

@dandv
Copy link
Contributor

dandv commented Sep 14, 2016

Another use case: I've been storing a field that I later realized can be approximated well enough from two other fields. I'd like to remove that field from the measurement.

While this isn't possible yet, are there some migration tools that can assist with "filtering" a database into a new one, sans undesired fields and tags?

@drmclean
Copy link

+1 if a random field gets added to a measurement by accident I'd definitely like to be able to remove it, at the moment it stays even if that field has 0 non-null datapoints

@dandv
Copy link
Contributor

dandv commented Sep 25, 2016

Until/if this gets implemented, what is the recommended workaround? SELECT INTO? How do you make sure that all wanted points were copied to the new measurement, if you have points containing NULLs, given than COUNT ignores nulls?

@dandv
Copy link
Contributor

dandv commented Sep 26, 2016

StackOverflow question: How to delete columns of a series in InfluxDB.

@superbool

This comment has been minimized.

@dmke
Copy link
Author

dmke commented Oct 5, 2016

Please stop commenting "+1" or similar non-comments. It becomes a little annoying and doesn't really add to the discussion.

If you want to share your reaction, hit the little icon:
add your raction

If you want to get notified about updates, click the subscribe button on the sidebar:
Subscribe

Thanks.

@anazia
Copy link

anazia commented Oct 5, 2016

adding onto what @dandv said: SELECT / INTO is not very efficient (been running for 10h for 15G of data with 16 processors and still not done). Would love to see this feature implemented.

Occasionally an input plugin will fail and insert bad data, which ruins your entire measurement.

@JeremySTX
Copy link

One of the great attractions of InfluxDB for us was the free-form data storage: you can store any field/value pairs you like. The trap seems to be that once set it's impossible to remove field/value pairs from a measurement without recreating the entire measurement series using SELECT .. INTO.

Aside from being very slow, it seems that the INTO part must specify a different database/rp/measurement (i.e. one or more of those must be different to the source data), which creates another problem: you can't rename any of those, once written. So you either have to write a 'temporary' measurement then delete the original and copy the temporary back to the original name, or modify the application to use the new database/dp/measurement name.

Being able to DROP FIELD <fieldname> FROM MEASUREMENT <name> would save a lot of hassle when it's necessary to perform what should be a fairly simple repair.

@lqueryvg
Copy link

DROP TAG <tagname> FROM MEASUREMENT <name>
would be really useful too.

@mvadu
Copy link
Contributor

mvadu commented Nov 3, 2017

If you drop all series which have the column (in my case only few series had an extra column due to an error) InfluxDB automatically removes that column. Show Field Keys no longer lists that column.

@7c

This comment has been minimized.

@NateZimmer
Copy link

This is kinda crippling. I'll do the following:

DROP SERIES WHERE "myTagName" = 'PeskyTagIWantToDelete'

and theres only like a 50/50 that it gets removed when I do

SHOW TAG VALUES WITH KEY "myTagName"

@geometrybase
Copy link

@NateZimmer if you drop series, and restart influxdb. the fields will be gone.

@Dwood15
Copy link

Dwood15 commented Jul 5, 2018

Can we get a status update on this issue from the Influx team? This issue really is pretty crippling.

@JeremySTX
Copy link

I had a thought about this which may or may not be helpful ...
I have to assume that the lack of a DROP FIELD or DROP TAG operation and the lack of any sign of activity to provide such an operation is because it's hard to do, or is estimated to have significant performance impact on the data store. (How the performance impact could be worse than DROP SERIES and then recreating the series, I don't know!)

Anyway perhaps an alternative would be HIDE FIELD or HIDE TAG, which means "don't return this field (or tag) unless specifically requested in the query". So for example

HIDE FIELD battery_charge FROM meters

would add a flag to the field key for 'battery_charge' in 'meters' so that any future query such as

SELECT * FROM meters WHERE meter_id='grid1'

would return every field except the 'battery_charge' field, but

SELECT *,battery_charge FROM meters WHERE meter_id='grid1'

would return every field including the 'battery_charge' field, and
SELECT battery_charge FROM meters WHERE meter_id='grid1'
would return only the 'battery_charge' field.

I suppose it might be extended to the SERIES level but for our purposes having it at the MEASUREMENT level would be sufficient.

Just a suggestion.

@shakefu
Copy link

shakefu commented Jul 31, 2018

I keep running into the need for this because there's no way to effectively search for data points with certain fields to remove them manually, without first querying all the data.

@candlerb
Copy link

Another example is with the new telegraf syslog receiver, which creates fields dynamically from structured data. To test this I did the following:

# echo '173 <165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] An application event log entry...' | nc 127.0.0.1 6514

It works fine, and generates dynamic fields from the structured data.

The trouble is, when I drop the test record, those dynamic fields live forever.

> delete from syslog where "appname" = 'evntslog'
> show field keys from syslog
name: syslog
fieldKey                      fieldType
--------                      ---------
exampleSDID@32473_eventID     string
exampleSDID@32473_eventSource string
exampleSDID@32473_iut         string
facility_code                 integer
message                       string
msgid                         string
procid                        string
severity_code                 integer
timestamp                     integer
version                       integer

And now select * from syslog... always includes these fat columns which never contain data :-(

Aside: if those extra values had been tags then it would have been very easy, because you can just drop the series: e.g. drop series where hostname='mymachine.example.com'

@jfcg
Copy link
Contributor

jfcg commented Dec 13, 2018

I too need to delete an unnecessary field on Influxdb 1.6.4

@dgnorton dgnorton added the 1.x label Jan 7, 2019
@nathanielatom
Copy link

Could we at least hear from the developers about whether this is/is not on the roadmap?

@drb-germany
Copy link

drb-germany commented Mar 26, 2019

@pzystorm I was able to get this to work (at least for my measurement) by flipping the order:

> drop series from <measurement>;
> drop measurement <measurement>;

Flipping the order did not work for me. The only way to get rid of the fields was to copy into a new measurement and copy it back (as described by @pzystorm) thereby making the fields empty (not deleting them). Afterwards backup the whole db, drop the db, and restore the db. The whole procedure can take hours if you want to get rid of a single field (in my case because it was written once with a wrong datatype). Very very cumbersome...

@stale
Copy link

stale bot commented Jul 23, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jul 23, 2019
@maraswrona
Copy link

Wait, what? This is higly demanded feature and some stupid bot decides to close it just because there is no "recent activity"? What do you want us to do - talk about it all day? Will it make you implement it faster?

@stale stale bot removed the wontfix label Jul 23, 2019
@ivanscattergood
Copy link

ivanscattergood commented Jul 23, 2019 via email

@drmclean
Copy link

drmclean commented Jul 23, 2019

I'm happy to set up a bot that will request this feature once a week if that helps?

not all heroes wear capes.

@andsee
Copy link

andsee commented Sep 19, 2019

I just had this issue. I accidentally added some data as fields instead of the expected tags. So I ended up with a tag and field with the same name. This, unfortunately, meant that the original tag started being returned as tagname_1 in queries, breaking stuff expecting it to be called tagname.

I deleted all measurements that had values applied to the erroneous field 'tagname' but still InfluxDB insisted that the fields existed.

Interestingly the Chronograf explore via flux stopped showing them in the schema though.

I restarted InfluxDB but that didn't help.

I then spotted the following post #10052 and updated to 1.7.8 but this did not fix the issue.

However, I then deleted the .idx files as suggested in that post and this removed the fields! Happy days.

But please give us DROP FIELD & DROP TAG.

@avrhack
Copy link

avrhack commented Oct 18, 2019

Interesting. Complete radio silence from anyone in the development community.......

@positron96
Copy link

For almost 4 years(

@etcinitd
Copy link

etcinitd commented Dec 3, 2019

Can someone from the development community at least summarise the steps and/or challenges for implementing this? Maybe someone else can help to implement it because it's a much needed feature.

@mmihalev
Copy link

Is Indflux a dead project? This issue is 4 years old and highly demanded ... still .. no reply from devs at all.

@retorquere
Copy link

It's not dead. Last commit was just half an hour ago.

@tedleyem
Copy link

tedleyem commented Jan 16, 2020

Idk how similar this is but I had an issue with my redis telegraf input where it was giving errors on one of my linux servers saying that the filed rss_overhead_bytes already had a value so both values were dropping
Error:
[outputs.influxdb]: when writing to [influx-ip-address]: received error partial write: field type conflict: input field "mem_fragmentation_bytes" on measurement "redis" is type integer, already exists as type float dropped=2; discarding points

So I checked

SHOW FIELD KEYS FROM "redis
repl_backlog_size integer
rss_overhead_bytes float
rss_overhead_bytes integer

rss_overhead_ratio float

And my fix was to drop the redis field.
I figured it would repopulate and it did

DROP FIELD KEYS FROM "redis

SHOW FIELD KEYS FROM "redis
repl_backlog_size integer
rss_overhead_bytes float
rss_overhead_ratio float

This resolved my issue. not sure if it helps with others but hope it does.

@leedhi
Copy link

leedhi commented May 18, 2020

It would be nice to have this feature.

We use InfluxDB Cloud 1 and one of our developers (using the nginx telegraf plugin) pushed 87K unique fields to one of our DBs. This caused us to be upgraded to the next hosting tier when we dont need to be.

If we could remove the erroneous field then we can go back down to the previous tier we were on. The select INTO option is slow and a horrible workaround. Since there was so much data we ended up just removing the measurement all together.

@dbo-wizata
Copy link

We experienced this issue as well and got 1d of data not being written by telegraf because of two fields with the same name but different type. The risk of not having the drop field or equivalent feature is high - as written in one of the comments above - one small incident renders complete db useless and can trigger good business and monetary losses.

@JetForMe
Copy link

If you drop all series which have the column (in my case only few series had an extra column due to an error) InfluxDB automatically removes that column. Show Field Keys no longer lists that column.

I have not found this to be the case. I just inserted a row by accident and deleted it, but all the new columns remain.

@tomaszG247
Copy link

Same deal, multiple types (by mistake) in same field cause impossibilities to write new values in a series
Dropping responsible series does not suffice to fix the issue. We are losing a lot of time and effort on this seemingly minor issue. Please address this.

@lainy
Copy link

lainy commented May 3, 2021

Is there a least a documented workaround for when one encounters this?
Without at least that much, InfluxDB is a non-starter for my purposes.

@aldas
Copy link

aldas commented Jun 8, 2021

Will it be addressed in v2 or in Iox?

Our case - we have extracted data in wrong endian from some of the sensors and now have incorrect values stored for couple weeks. As we store more than one field on measurement it would be helpful to be able to drop fields.

@idontcare1996
Copy link

Have there been any developments with this issue?
I've found this issue in a staging environment, mere days before launch into production and still can't find a good solution to this problem. It's been more than 6 years...

@ser
Copy link

ser commented May 20, 2023

Another year, no delete.

Apparently there is a hack, but requires shutdown of the server:

https://github.com/Abc-Arbitrage/infix

@robertsLando
Copy link

Any news on this? A comment from developers would be higly appreciated

@TheXtremeKing
Copy link

Oh nooo... It seems this easiest fix is to just delete the bucket and start over... Since this is a hobby project of mine for some home sensors it doens't matter much, but professionally I'll make sure to stick to ms sql

@bvasunr
Copy link

bvasunr commented Aug 7, 2023

A workaround we use in production is adding a tag variable="FIELD_NAME" when writing data:

MEASUREMENT_NAME,variable=FIELD FIELD=123456 1690300868644000000

The variable label is present twice: as a field and as a tag.

You can not delete a field, but you can delete a tag.

E.g. in python :

drop='_measurement="XXX" AND variable="FIELD_TO_DROP"
delete_api = client.delete_api()
delete_api.delete(start,
				  stop, 
				  predicate = drop, 
			  	  bucket = 'YYY', 
			  	  org = 'ZZZ')

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests