Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add previousN and nextN to range #702

Closed
pauldix opened this issue Jun 15, 2018 · 13 comments
Closed

Add previousN and nextN to range #702

pauldix opened this issue Jun 15, 2018 · 13 comments

Comments

@pauldix
Copy link
Member

pauldix commented Jun 15, 2018

It's very common for users to want to get a range of data, but to also pull back some number of points before or after the time range specified. For example, query from 2018-06-15T11:47:00 to now and include the 3 points that occur before the start time. The default values for previousN and nextN should be 0. The function signature would look like:

range = (start, stop=now(), previousN=0, nextN=0)

@Kortenbach
Copy link

This would be a great addition and it would solve issue#6878 as mentioned by @vdwpsmt. Without it storing data with irregular intervals (think: events) using influxdb makes no sense.

@MikeTheTux
Copy link

MikeTheTux commented Oct 24, 2021

openHAB is using following as workaround:

from(bucket:"openhab_db/autogen")
	|> range(start:-100y, stop:2021-10-23T22:00:00.000000000Z)
	|> filter(fn: (r) => r["_measurement"] == "ElectricMeter_Total")
	|> sort(desc:true, columns:["_time"])
	|> limit(n:1, offset:0)

start:-100y 👎
You can imagine the performance on large databases -> timeout.

Using the following influxdb1 query was much more performant:

SELECT "value"::field,"item"::tag FROM autogen.ElectricMeter_Total WHERE time <= '2021-10-23T22:00:00Z' ORDER BY time DESC LIMIT 1;

@Kortenbach
Copy link

Suppose the size of the "ElectricMeter_Total" measurement is 1GB for the last 100y. Would the query above indeed have to browse through and sort 1GB of data before getting to the result?? That doesn't sound very efficient. Surely there is some optimization build in InfluxDB that prevents such inefficiencies?

@fetzerch
Copy link

fetzerch commented Sep 6, 2022

This feature request would also help to fix the values that difference/derivative return. See #5174

@SGStino
Copy link

SGStino commented Sep 25, 2022

I have tried to solve this with:

prefix = from(bucket: "Home")
  |> range(start: 0, stop: v.timeRangeStart)
  |> filter(fn: (r) => r["_measurement"] == "Zigbee")
  |> filter(fn: (r) => r["_field"] == "Temperature")
  |> last()
  |> map(fn: (r) => ({r with _time: v.timeRangeStart}))
data_range = from(bucket: "Home")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "Zigbee")
  |> filter(fn: (r) => r["_field"] == "Temperature") 


union(tables: [prefix, data_range])   
  |> group(columns:["Device", "EndPoint","_field","_measurement"])

But notice the map(...) after last(), it's invalid because v.timeRangeStart is a time , and for some reason flux expects it to be a duration:

 type error @13:24-13:34: expected duration but found time

So asside of that it's wildly inefficient to go through the last 50+ years of data (1970 - 2202), it also doesn't work.
If i omit the map(), the value is timestamped the last day, month or even year the value changed before the selected range. and most visualization tools (grafana, flux dashboards), include it inside the view.

And if window fuctions were to be used after the group(...), they'd generate a lot of window segements in that inteval before timeRangeStart

So there really needs to be an efficient way to include the previous value in the query. And the proposed change here looks to be the most efficient one.

@Kortenbach
Copy link

Kortenbach commented Sep 30, 2022

@SGStino
Your solution is not half bad I think.
If you supply a measurement AT LEAST every 24h then you can replace the start of "0" by "-24h".
That would bring the performance to an acceptable level.

I want to mention that I made a time-series database myself (before starting to use InfluxDB).
I ran into the same problem but the file structure was organized in a way that just by taking the previous record you could find the last stored value. I have no idea how InfluxDB data is stored, but could it be that simple in InfluxDB too?

@SGStino
Copy link

SGStino commented Sep 30, 2022

It's a workaround that works for some cases in grafana: if you select absolute timestamps, and the range isn't to big, because window functions can't be used.

I guess this should work with the 24h:

import "date"

prefix = from(bucket: "Home")
  |> range(start: date.sub(d: 24h, from: v.timeRangeStart), stop: v.timeRangeStart)
  |> filter(fn: (r) => r["_measurement"] == "Zigbee")
  |> filter(fn: (r) => r["_field"] == "Temperature")
  |> last()
  |> map(fn: (r) => ({r with _time: v.timeRangeStart}))
data_range = from(bucket: "Home")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "Zigbee")
  |> filter(fn: (r) => r["_field"] == "Temperature") 

union(tables: [prefix, data_range])   
  |> group(columns:["Device", "EndPoint","_field","_measurement"])

however it seems to complain about "sub" from the package date:

"invalid: type error @4:19-4:23: record is missing label sub"

EDIT: nevermind, this works great (and fast) in 2.4, sub didn't exist in 2.0 and i didn't upgrade my test instance ...

@Kortenbach
Copy link

I will definitely try that.
"...that works for some cases in grafana: if you select absolute timestamps, and the range isn't to big, because window functions can't be used." --> I don't understand this part. Can you explain this a bit more?

@SGStino
Copy link

SGStino commented Oct 3, 2022

when you set the timerange to now-1d to now in the datepicker , the v.timeRangeStart contains -1d, which is a duration, not a time. and you can't set durations in time fields (e.g. _time = -1d)

But I seem to have been mistaken, Grafana doesn't do this, it's the Influx's built-in dashboard.
Grafana always uses absolute timestamps, even for relative ranges.

@Thyraz
Copy link

Thyraz commented Feb 6, 2023

Is there a reason to map the last value to the beginning of the timerange?
Grafana seems to be able to consider points outside the graph and seems to draw the lines correct in my tests.

But maybe there's a limitation how far outside a value can be to be still used for the drawing engine ...

@Thyraz
Copy link

Thyraz commented Feb 6, 2023

Btw. this is also useful with date.add on the timeRangeStop when looking at time ranges that doesn't stop at now().

One last point that needs to be solved to satisfy my perfect self:
Any ideas how to add a value at "now"( if the timespan is including it) with the last value?

This was quite easy in SQL and I start regretting my switch over to Flux. :P

@Thyraz
Copy link

Thyraz commented Feb 6, 2023

Ok, that way it works for me.

For the sake of simplicity, I load the complete data of a few hours before and after the current timespan instead using multiple queries.
How long this is has to be adjusted on the sensor data. My temperature sensors report quite often, so I used 6h in this example.

I also requested data after the timespan in case you scrolled back in Grafana, to fill the end of the graph.

And at the end, I look for the last value that the sensor sent, and move it to "now()".
This way the lines get extended to the current moment if it is in the visible range.

import "date"

history = from(bucket: "homeassistant")
  |> range(start: date.sub(d: 6h, from: v.timeRangeStart), stop: date.sub(d: -6h, from: v.timeRangeStop))
  |> filter(fn: (r) => r["friendly_name"] == "Temperatur Wohnzimmer")
  |> filter(fn: (r) => r["_field"] == "value")

extend = from(bucket: "homeassistant")
  |> range(start: date.sub(d: 6h, from: now()), stop: now())
  |> filter(fn: (r) => r["friendly_name"] == "Temperatur Wohnzimmer")
  |> filter(fn: (r) => r["_field"] == "value")
  |> last()
  |> map(fn: (r) => ({r with _time: now()}))

union(tables: [history, extend])
  |> drop(columns: ["_start", "_stop"])

Copy link

github-actions bot commented Dec 6, 2024

This issue has had no recent activity and will be closed soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants