[0.9.4 & 0.9.5-nightly-6682752] Continuous Queries stop running #4646

dsw6 · 2015-11-03T20:16:38Z

Using the 0.9.5-nightly-6682752 build, I create two continuous queries. One query runs every 5m and one query runs every 1h. The log shows the queries are successfully created and the queries begin to run.

After about an hour (sometimes more, sometimes less) the queries stop running. The log shows no errors, just no continuous query activity. Using the admin console to list the continuous queries ("show continuous queries") returns no results. However, trying to recreate the query in the admin console reports an error indicating that the query already exists.
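The same check can also be issued over HTTP instead of through the admin console. This is a minimal sketch, not a confirmed reproduction step: it assumes the HTTP API is reachable at `http://localhost:8086` (host and port are assumptions), and the helper names `query_url` and `run_query` are made up for illustration. The only thing taken from this report is the `/query` endpoint with `q` and `db` parameters, as it appears in the `[http]` log lines.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(base, q, db):
    """Build a /query URL; urlencode turns spaces into '+', as in the [http] log lines."""
    return f"{base}/query?{urlencode({'q': q, 'db': db})}"


def run_query(base, q, db, timeout=10):
    """Send the query and return the decoded JSON body as-is."""
    with urlopen(query_url(base, q, db), timeout=timeout) as resp:
        return json.load(resp)


# Assumed base address; adjust to the server under test.
BASE = "http://localhost:8086"
print(query_url(BASE, "SHOW CONTINUOUS QUERIES", "esf"))

# Needs a running server, so it is left commented out:
# print(run_query(BASE, "SHOW CONTINUOUS QUERIES", "esf"))
```

Running the listing periodically and keeping the raw responses would show when the list goes empty relative to the last `[continuous_querier]` log entry. If authentication is enabled, credentials would have to be supplied as well.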

Below are snippets from the log file showing successful query creation and the queries executing for a period of time.

This has happened multiple times. Each time, I started fresh with a new database.

Log snippets:
=======================

2015/11/03 10:14:15 InfluxDB starting, version 0.9.5-nightly-6682752, branch master, commit 66827524081d1e97558d0384d84789a337c9cc87, built 2015-11-02T05:00:42+0000

[query] 2015/11/03 10:15:26 CREATE CONTINUOUS QUERY totals_5m ON esf BEGIN SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events GROUP BY time(5m), serviceKey, method, client END
[http] 2015/11/03 10:15:26 10.255.197.38 - influxAdmin [03/Nov/2015:10:15:26 -0700] GET /query?q=CREATE+CONTINUOUS+QUERY+totals_5m+ON+esf+BEGIN+SELECT+count(respTime)+AS+%22methodCount%22%2C+mean(respTime)+AS+%22respTime%22+INTO+%22esf%22.%22rp_30d%22.esf_totals_5m+FROM+%22esf%22.%22rp_7d%22.esf_events+GROUP+BY+time(5m)%2C+serviceKey%2C+method%2C+client+END&db=_internal HTTP/1.1 200 40 http://10.96.110.46:8083/ Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36 79f68f54-824e-11e5-8018-000000000000 2.819671ms
[continuous_querier] 2015/11/03 10:15:27 executing continuous query totals_5m
[query] 2015/11/03 10:15:27 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:15:00Z' AND time < '2015-11-03T17:20:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:15:27 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:10:00Z' AND time < '2015-11-03T17:15:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:15:27 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:05:00Z' AND time < '2015-11-03T17:10:00Z' GROUP BY time(5m), serviceKey, method, client

[http] 2015/11/03 10:15:34 10.255.197.38 - - [03/Nov/2015:10:15:34 -0700] OPTIONS /query?q=CREATE+CONTINUOUS+QUERY+totals_1h+ON+esf+BEGIN+SELECT+sum(methodCount)+AS+%22methodCount%22%2C+mean(respTime)+AS+%22respTime%22+INTO+%22esf%22.%22rp_60d%22.esf_totals_1h+FROM+%22esf%22.%22rp_30d%22.esf_totals_5m+GROUP+BY+time(1h)%2C+serviceKey%2C+method%2C+client+END&db=_internal HTTP/1.1 200 0 http://10.96.110.46:8083/ Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36 7e99f8ac-824e-11e5-801b-000000000000 73.739µs
[query] 2015/11/03 10:15:34 CREATE CONTINUOUS QUERY totals_1h ON esf BEGIN SELECT sum(methodCount) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_60d".esf_totals_1h FROM "esf"."rp_30d".esf_totals_5m GROUP BY time(1h), serviceKey, method, client END
[continuous_querier] 2015/11/03 10:15:35 executing continuous query totals_1h
[query] 2015/11/03 10:15:35 SELECT sum(methodCount) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_60d".esf_totals_1h FROM "esf"."rp_30d".esf_totals_5m WHERE time >= '2015-11-03T17:00:00Z' AND time < '2015-11-03T18:00:00Z' GROUP BY time(1h), serviceKey, method, client


[continuous_querier] 2015/11/03 10:17:27 executing continuous query totals_5m
[query] 2015/11/03 10:17:27 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:15:00Z' AND time < '2015-11-03T17:20:00Z' GROUP BY time(5m), serviceKey, method, client

[continuous_querier] 2015/11/03 10:19:27 executing continuous query totals_5m
[query] 2015/11/03 10:19:27 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:15:00Z' AND time < '2015-11-03T17:20:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:19:29 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:10:00Z' AND time < '2015-11-03T17:15:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:19:29 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:05:00Z' AND time < '2015-11-03T17:10:00Z' GROUP BY time(5m), serviceKey, method, client

[continuous_querier] 2015/11/03 10:21:28 executing continuous query totals_5m
[query] 2015/11/03 10:21:28 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:20:00Z' AND time < '2015-11-03T17:25:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:21:29 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:15:00Z' AND time < '2015-11-03T17:20:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:21:30 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:10:00Z' AND time < '2015-11-03T17:15:00Z' GROUP BY time(5m), serviceKey, method, client

[continuous_querier] 2015/11/03 10:21:35 executing continuous query totals_1h
[query] 2015/11/03 10:21:35 SELECT sum(methodCount) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_60d".esf_totals_1h FROM "esf"."rp_30d".esf_totals_5m WHERE time >= '2015-11-03T17:00:00Z' AND time < '2015-11-03T18:00:00Z' GROUP BY time(1h), serviceKey, method, client

[continuous_querier] 2015/11/03 10:23:28 executing continuous query totals_5m
[query] 2015/11/03 10:23:28 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:20:00Z' AND time < '2015-11-03T17:25:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:23:30 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:15:00Z' AND time < '2015-11-03T17:20:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:23:32 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:10:00Z' AND time < '2015-11-03T17:15:00Z' GROUP BY time(5m), serviceKey, method, client

[continuous_querier] 2015/11/03 10:25:29 executing continuous query totals_5m
[query] 2015/11/03 10:25:29 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:25:00Z' AND time < '2015-11-03T17:30:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:25:29 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:20:00Z' AND time < '2015-11-03T17:25:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:25:31 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:15:00Z' AND time < '2015-11-03T17:20:00Z' GROUP BY time(5m), serviceKey, method, client

[continuous_querier] 2015/11/03 10:27:29 executing continuous query totals_5m
[query] 2015/11/03 10:27:29 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:25:00Z' AND time < '2015-11-03T17:30:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:27:30 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:20:00Z' AND time < '2015-11-03T17:25:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:27:32 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:15:00Z' AND time < '2015-11-03T17:20:00Z' GROUP BY time(5m), serviceKey, method, client

[continuous_querier] 2015/11/03 10:27:36 executing continuous query totals_1h
[query] 2015/11/03 10:27:36 SELECT sum(methodCount) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_60d".esf_totals_1h FROM "esf"."rp_30d".esf_totals_5m WHERE time >= '2015-11-03T17:00:00Z' AND time < '2015-11-03T18:00:00Z' GROUP BY time(1h), serviceKey, method, client


[continuous_querier] 2015/11/03 10:29:29 executing continuous query totals_5m
[query] 2015/11/03 10:29:29 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:25:00Z' AND time < '2015-11-03T17:30:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:29:31 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:20:00Z' AND time < '2015-11-03T17:25:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:29:33 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:15:00Z' AND time < '2015-11-03T17:20:00Z' GROUP BY time(5m), serviceKey, method, client

[continuous_querier] 2015/11/03 10:31:30 executing continuous query totals_5m
[query] 2015/11/03 10:31:30 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:30:00Z' AND time < '2015-11-03T17:35:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:31:31 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:25:00Z' AND time < '2015-11-03T17:30:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:31:33 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:20:00Z' AND time < '2015-11-03T17:25:00Z' GROUP BY time(5m), serviceKey, method, client

.... 
<<<final log entries for the continuous_querier>>>
[continuous_querier] 2015/11/03 10:49:34 executing continuous query totals_5m
[query] 2015/11/03 10:49:34 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:45:00Z' AND time < '2015-11-03T17:50:00Z' GROUP BY time(5m), serviceKey, method, client
[query] 2015/11/03 10:49:36 SELECT count(respTime) AS "methodCount", mean(respTime) AS "respTime" INTO "esf"."rp_30d".esf_totals_5m FROM "esf"."rp_7d".esf_events WHERE time >= '2015-11-03T17:40:00Z' AND time < '2015-11-03T17:45:00Z' GROUP BY time(5m), serviceKey, method, client
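The time ranges in the logged SELECT statements follow a regular pattern: each run covers the GROUP BY interval containing the current time, and the 5m query often also re-runs the two intervals before it. The sketch below reproduces that arithmetic so gaps in the logged ranges are easier to spot. It is inferred from the log excerpt only, not from the InfluxDB source; the function names and the default of three windows are assumptions.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def cq_windows(now, interval, count=3):
    """Return `count` [start, end) windows, newest first, aligned to `interval`."""
    # Truncate `now` down to the start of the interval bucket it falls in.
    start = EPOCH + ((now - EPOCH) // interval) * interval
    return [(start - i * interval, start - (i - 1) * interval) for i in range(count)]


def fmt(ts):
    """Format a UTC timestamp the way it appears in the logged WHERE clauses."""
    return ts.strftime("%Y-%m-%dT%H:%M:%SZ")


# 10:15:27 local time (-0700) in the log is 17:15:27 UTC.
now = datetime(2015, 11, 3, 17, 15, 27, tzinfo=timezone.utc)
for start, end in cq_windows(now, timedelta(minutes=5)):
    print(f"time >= '{fmt(start)}' AND time < '{fmt(end)}'")
```

For the first `totals_5m` run at 10:15:27 this yields the same three ranges as the logged statements, and with `timedelta(hours=1)` and `count=1` it yields the 17:00 to 18:00 range used by `totals_1h`.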


dsw6 · 2015-11-03T20:20:18Z

Note: stopping and restarting the influxdb service starts the continuous queries running again.

beckettsean · 2015-11-03T20:26:31Z

@dgnorton Any ideas?

njurgens · 2015-11-06T15:33:32Z

I may be experiencing a similar issue on InfluxDB version 0.9.4.1. Continuous queries just stop running after a while. I have four continuous queries that roll up data into various retention policies. One query has GROUP BY time(5m), so I would expect that one to run fairly often. According to the logs, though, it's been about 9 hours since the last continuous query ran.
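A gap like this can be measured from the log instead of by eye. The following is a rough sketch that assumes the `[continuous_querier] ... executing continuous query NAME` line format shown in the first comment; the function names and the ten-minute threshold are hypothetical choices for illustration.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# [continuous_querier] 2015/11/03 10:49:34 executing continuous query totals_5m
CQ_LINE = re.compile(
    r"^\[continuous_querier\] (\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"executing continuous query (\S+)"
)


def last_runs(lines):
    """Map each continuous query name to the timestamp of its latest logged run."""
    latest = {}
    for line in lines:
        m = CQ_LINE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
        name = m.group(2)
        if name not in latest or ts > latest[name]:
            latest[name] = ts
    return latest


def stalled(latest, now, max_gap):
    """Return the names of queries whose latest run is older than `max_gap`."""
    return sorted(name for name, ts in latest.items() if now - ts > max_gap)


sample = [
    "[continuous_querier] 2015/11/03 10:27:36 executing continuous query totals_1h",
    "[continuous_querier] 2015/11/03 10:49:34 executing continuous query totals_5m",
]
runs = last_runs(sample)
# Log timestamps are local time, so `now` must be local time too.
print(stalled(runs, datetime(2015, 11, 3, 20, 0, 0), timedelta(minutes=10)))
```

Feeding it the full log file (for example `open(path)` as `lines`) would give the last execution time per query, which helps pin down when the querier went quiet.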

beckettsean · 2015-11-06T19:38:37Z

@njurgens that definitely seems wrong. If you restart the process do the CQs resume running?

beckettsean · 2015-11-06T19:40:17Z

@njurgens do ad hoc queries execute and return? Basically, is the system otherwise healthy and CQs have just stopped running periodically?

njurgens · 2015-11-06T20:51:38Z

@beckettsean I'm able to still execute queries as normal. Data is still being written to the measurement's default retention policy and I can query that no problem. If I query any of the retention policies that are populated by my CQs, I only have data from before about 2015-11-06T06:40:00Z (around the time CQs stopped appearing in the logs). I'll restart the database and see if they resume running.

Update: After I restarted InfluxDB, the CQs seem to be running again.

njurgens · 2015-11-16T21:13:00Z

Continuous queries started to hang again, and queries to one of my retention policies time out. This retention policy is the same one queried by the last CQ that ran (according to the logs).

After restarting the database, CQs run again, but that retention policy remains unqueryable.

beckettsean · 2015-11-16T23:26:11Z

> retention policy remains unqueryable

Can you be more specific? Do queries return bad values? Null values? Do queries not return at all? Does the process throw a stack trace?

brettdh · 2015-11-17T13:54:09Z

@beckettsean Queries to that retention policy don't return. (Queries to other retention policies do return, though.)

Other log items (potentially) of note:

- Queries to that RP seem to hang immediately on startup.
  - A continuous query on that RP runs within a second of server startup (HTTP listen) but never writes any points.
- Shortly before that CQ runs (within a second), we see this:

WAL writing to /path/to/influxdb/wal/dbname/4w/128

Not sure if that's a red herring, or a potential deadlock. The 4w retention policy is the one to which queries are hanging.

I had started wondering if it's related to #3469 due to the cascading continuous queries, but that seems to be about write timeouts. Also possibly related: #4203, #3158 (though we are using sum() rather than count())

njurgens · 2015-12-01T13:27:49Z

I haven't seen this issue since upgrading to InfluxDB 0.9.5 a week ago.

dgnorton · 2015-12-01T13:52:57Z

@brettdh are you still seeing this issue? If so, have you tried 0.9.5?

brettdh · 2015-12-01T14:42:33Z

I am still on 0.9.4.1, and I haven't been testing this actively, but I haven't noticed a hang since the last time I commented here. This issue is frustratingly intermittent, though, so I'm not confident it's gone until it can be reliably reproduced.

@njurgens has been steadily storing new measurements in his 0.9.5 deployment, though, so that gives me some hope. :-)

jsternberg · 2016-05-18T02:07:37Z

This is an old issue for a now unsupported version of InfluxDB. I'm going to close this, but please comment or make a new issue if you see this with 0.13 or newer. Thank you.

beckettsean added the area/continuous queries label Nov 6, 2015

beckettsean changed the title ~~[0.9.5-nightly-6682752] Continuous Queries stop running~~ [0.9.4 & 0.9.5-nightly-6682752] Continuous Queries stop running Nov 6, 2015

jsternberg closed this as completed May 18, 2016
