InfluxDB unable to write data (localhost) with 25 tags and 300 fields. #5826

Closed
deepujain opened this issue Feb 25, 2016 · 27 comments

@deepujain

One day of data: 74 files with 1000 points each; every point has 25 tags and 300 fields.

I was able to ingest one file correctly, without any syntax errors.
However, writing all of the files started to throw {"error":"timeout"}.

influxd:
After a while it started to go crazy.
Logs: http://pastebin.com/b2wJwFHT

I was planning to load 30 days of data. The same load ran quickly and without errors on Druid + Imply, and I wanted to compare that with InfluxDB + Grafana. The whole installation is on a local machine (Mac, with 24 GB RAM).

@jonseymour
Contributor

How were you loading the data? How many points in each batch? How many batches in parallel?

@jonseymour
Contributor

Ah, I see you were using curl. So, one curl command per file? How many curl commands in parallel?

@deepujain
Author

All in sequence, in a for loop. I am not testing the write path right now; I can afford for that to be slow, but I want correct ingestion and fast queries.
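For reference, the load pattern described above amounts to roughly the following; the glob pattern, file layout, and database name ("mydb") are assumptions from this thread, not the actual script used.

```go
// A rough reconstruction of the load described above: one sequential HTTP
// write per file, equivalent to a shell loop over curl commands. The glob
// pattern and database name ("mydb") are placeholders, not the real script.
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"os"
	"path/filepath"
)

func main() {
	files, err := filepath.Glob("/data/day1/*.txt") // 74 files, 1000 points each
	if err != nil {
		panic(err)
	}
	for _, f := range files {
		body, err := os.ReadFile(f)
		if err != nil {
			panic(err)
		}
		// Each file is posted as one batch of line-protocol points.
		resp, err := http.Post("http://localhost:8086/write?db=mydb",
			"text/plain", bytes.NewReader(body))
		if err != nil {
			panic(err)
		}
		resp.Body.Close()
		fmt.Println(f, resp.Status) // the 500 {"error":"timeout"} responses show up here
	}
}
```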

@deepujain
Author

In addition to the question above: how can I recover? InfluxDB won't start now.

[filestore]2016/02/24 20:00:06 /Users/dvasthimal/.influxdb/data/mydb/default/28/000000003-000000002.tsm (#2) opened in 123.004398ms
[filestore]2016/02/24 20:00:14 /Users/dvasthimal/.influxdb/data/mydb/default/28/000000002-000000002.tsm (#0) opened in 8.452389905s
[filestore]2016/02/24 20:00:15 /Users/dvasthimal/.influxdb/data/mydb/default/28/000000003-000000001.tsm (#1) opened in 8.57616921s
[cacheloader] 2016/02/24 20:00:15 reading file /Users/dvasthimal/.influxdb/wal/mydb/default/28/_00025.wal, size 29333172
[cacheloader] 2016/02/24 20:00:17 reading file /Users/dvasthimal/.influxdb/wal/mydb/default/28/_00026.wal, size 22579152
[cacheloader] 2016/02/24 20:00:18 reading file /Users/dvasthimal/.influxdb/wal/mydb/default/28/_00027.wal, size 14073637
...

@jonseymour
Contributor

How big is each file? 300 fields per point seems quite high, and it is possible this might be a factor. How many of these are numeric, and how many are strings?

@deepujain
Author

All fields (metrics) are numeric, mostly doubles; the dimensions (tags) are strings.

One day of data: 74 files with 1000 points each; every point has 25 tags and 300 fields.

@jonseymour
Contributor

Could you post the output of:

find '/Users/dvasthimal/.influxdb/' -ls 

I am interested to see the total size of all the .wal files.

There is a possibility that you have run out of RAM.

I'll leave it for the influx guys to advise on a recovery approach.

@jonseymour
Contributor

How big - in bytes - is each file?

@deepujain
Author

10MB

@jonseymour
Contributor

How many files are processed before you start seeing errors?

@deepujain
Author

20

@jonseymour
Contributor

Ok, here is a guess at what the problem is and perhaps a workaround, depending on whether you are in a position to rewrite the point tags.

This line of code:

https://github.com/influxdata/influxdb/blob/master/models/points.go#L331

uses an insertion sort to sort tags. If your tags are not in Go sort order, then there is a potentially O(n^2) sort for each point. If the tags were in Go sort order, the cost of this sort would be ~O(n). For n = 25, the difference between O(n^2) and O(n) is quite a lot.

(I am not 100% sure about this diagnosis - it is possible that something in the write path prior to this code normally sorts the tags in Go sort order, and so this isn't a factor, but it is worth testing if you are in a position to do so.)

Updated: my fears about a potential O(n^2) issue are actually unfounded. Influx sorts the tags on the way in, so that when they hit the insertion sort later on it executes with O(n) efficiency. Confirmed with a unit test.
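To illustrate what "Go sort order" means here - lexicographic byte order, as produced by Go's sort package - a minimal sketch (not code from InfluxDB itself; the tag keys are made up):

```go
// "Go sort order" = lexicographic byte order, as produced by sort.Strings
// (or bytes.Compare). Writing tags pre-sorted this way keeps a later
// insertion sort at ~O(n). Minimal illustration only.
package main

import (
	"fmt"
	"sort"
)

func main() {
	keys := []string{"region", "host", "Zone", "dc"}
	sort.Strings(keys) // byte order: uppercase sorts before lowercase
	fmt.Println(keys)  // [Zone dc host region]
}
```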

@deepujain
Author

What do you mean by Go sort order? Do you want each row (point) to be sorted by multiple tags? How can I do that?

@jonseymour
Contributor

Actually, it isn't going to help anyway because of the way Influx makes points. [Update: Influx does the right thing and sorts tags on construction, so that the binary form of the point is sorted in the optimal order.]

Are you in a position to build your own version of influx, or are you using a packaged version?

@deepujain
Author

Output of the command:
find '/Users/dvasthimal/.influxdb/' -ls

http://pastebin.com/QL6ZcKnw

@deepujain
Author

I am using the packaged version.

@deepujain
Author

What is the maximum tested data size? How many tags and how many fields in a given measurement?

@jonseymour
Contributor

When you say it doesn't start, does it fail with an error message, or do you kill it? How long do you wait? Can you paste the tail of the log file at the point where the restart fails/is terminated? Is the system showing any evidence of high memory or CPU usage at the point where the start fails/is terminated?

I'll have to defer to the influx support team regarding what they consider to be a reasonable number of tags and fields.

Updated: deleted a question that is not relevant.

@jonseymour
Contributor

@deepujain The O(n^2) issues I was worried about with point construction don't actually exist, as verified both by closer inspection of the code and by some actual unit tests. Still, 300 fields is a lot of fields, so that could still be a factor.

@jonseymour
Contributor

The number of WAL logs seems quite high. How much disk space do you have free in your home directory?

@jonseymour
Contributor

@deepujain According to this line:

https://github.com/influxdata/influxdb/blob/v0.10.1/tsdb/shard.go#L490-L492

influx only supports up to 255 fields per point. Perhaps violating this restriction is a contributing factor to your issues?

Updated: that restriction only applies to the b1 and bz1 engines - you are using tsm, so it shouldn't be an issue here.
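If that limit did apply, one possible workaround would be to split a wide point into two or more lines that share the same series and timestamp, since InfluxDB merges fields written to the same series at the same timestamp. A hypothetical sketch; the measurement name, tag set, and chunk size are illustrative:

```go
// Hypothetical workaround for a per-point field limit: emit the same
// series+timestamp as several line-protocol lines, each carrying a chunk
// of the fields. InfluxDB merges fields that share a series and timestamp.
package main

import (
	"fmt"
	"sort"
	"strings"
)

func chunkLines(measurement, tags string, fields map[string]float64, ts int64, max int) []string {
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var lines []string
	for start := 0; start < len(keys); start += max {
		end := start + max
		if end > len(keys) {
			end = len(keys)
		}
		parts := make([]string, 0, end-start)
		for _, k := range keys[start:end] {
			parts = append(parts, fmt.Sprintf("%s=%g", k, fields[k]))
		}
		lines = append(lines, fmt.Sprintf("%s,%s %s %d", measurement, tags, strings.Join(parts, ","), ts))
	}
	return lines
}

func main() {
	fields := map[string]float64{"m1": 1.5, "m2": 2, "m3": 3}
	for _, l := range chunkLines("ppw", "dc=ams,host=h1", fields, 1456380000000000000, 2) {
		fmt.Println(l)
	}
}
```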

@deepujain
Author

I reduced the number of fields to ~200. This time 95% of the files got ingested; the rest hit the same error.

Client side: {"error":"timeout"}

Server side:
[http] 2016/02/25 07:41:05 ::1 - - [25/Feb/2016:07:41:04 -0800] POST /write?db=ppw HTTP/1.1 204 0 - curl/7.43.0 2dfdca80-dbd6-11e5-805a-000000000000 1.394094614s
[http] 2016/02/25 07:41:11 ::1 - - [25/Feb/2016:07:41:05 -0800] POST /write?db=ppw HTTP/1.1 500 20 - curl/7.43.0 2ed6337a-dbd6-11e5-805b-000000000000 5.132936465s
[http] 2016/02/25 07:41:16 ::1 - - [25/Feb/2016:07:41:11 -0800] POST /write?db=ppw HTTP/1.1 500 20 - curl/7.43.0 31ea6c52-dbd6-11e5-805c-000000000000 5.171391369s
[http] 2016/02/25 07:41:20 ::1 - - [25/Feb/2016:07:41:16 -0800] POST /write?db=ppw HTTP/1.1 204 0 - curl/7.43.0 3506b06d-dbd6-11e5-805d-000000000000 4.353095492s

Is there a limit on the number of fields (metrics), and what are tsm / b1 / bz1?
This is only 1 day of data for POC.
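A common mitigation for intermittent {"error":"timeout"} responses like the 500s above is to retry the batch with backoff (or shrink it). A sketch, assuming the same /write endpoint used earlier in this thread; the retry policy is illustrative, not an InfluxDB recommendation:

```go
// Retry a write with exponential backoff when the server answers with
// anything other than 204, e.g. 500 {"error":"timeout"}. The endpoint and
// database name ("ppw") are taken from the logs above.
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"time"
)

func writeWithRetry(batch []byte, attempts int) error {
	delay := time.Second
	for i := 0; i < attempts; i++ {
		resp, err := http.Post("http://localhost:8086/write?db=ppw",
			"text/plain", bytes.NewReader(batch))
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == 204 {
				return nil // write accepted
			}
		}
		time.Sleep(delay) // give the cache/compactions time to drain
		delay *= 2
	}
	return fmt.Errorf("write failed after %d attempts", attempts)
}

func main() {
	batch := []byte("ppw,host=h1 m1=1.5 1456380000000000000\n")
	if err := writeWithRetry(batch, 5); err != nil {
		fmt.Println(err)
	}
}
```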

@deepujain
Author

InfluxQL is dead slow, and the disk starts spinning like crazy.

select * from ppw limit 10

I never saw the output.

@deepujain
Author

Better question:

I have a time series DB that is updated once every day. It has around 30 dimensions (used as filters; these will map to tags) and 175 to 200 metrics (used for aggregate functions; these will map to fields).

Each day, around 100 to 150 files, each with 1000 points, will be ingested.

Ingestion can take a few minutes, but query time has to be on the order of a few seconds.

I was able to do this with Druid (backend) and Imply (frontend), but I like Grafana better, hence wanting to try out InfluxDB.
Can InfluxDB support this kind of data?

@jackzampolin
Contributor

With 25 tags I would be surprised if your series cardinality (https://docs.influxdata.com/influxdb/v0.10/concepts/glossary/#series-cardinality) was under 10 million. Influx should be able to handle this data given proper schema design.
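Rough arithmetic behind that estimate: series cardinality is approximately the product of the number of distinct values per tag key, so 25 tags multiply out quickly even with very few values each. The per-tag counts below are illustrative, not from this issue:

```go
// Series cardinality ≈ product of distinct values per tag key.
// With just 2 distinct values per tag, 25 tags give 2^25 ≈ 33.5M series.
package main

import "fmt"

func main() {
	distinct := make([]int, 25) // one entry per tag key
	for i := range distinct {
		distinct[i] = 2 // assume 2 distinct values each (illustrative)
	}
	card := 1
	for _, n := range distinct {
		card *= n
	}
	fmt.Println(card) // 33554432
}
```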

@deepujain
Author

What do you mean by proper schema design? Should I have one measurement with 25 tags and the remaining 200 as fields?

Please share details on proper schema design.

@deepujain
Author

Any suggestions here? This issue was closed. https://influxdata.com/blog/announcing-influxdb-v0-10-100000s-writes-per-second-better-compression/ clearly says that with the columnar store there is no limit on the number of fields (0 to 100 to 1000). However, my use case does not seem to work.
