InfluxDB unable to write data (localhost) with 25 tags and 300 fields. #5826
How were you loading the data? How many points in each batch? How many batches in parallel?
Ah, I see you were using curl. So, one curl command per file? How many curl commands in parallel?
All in sequence in a for loop. I am not testing the write part right now; I can afford slow ingestion.
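(As a point of comparison, not what I actually ran: a rough Go sketch of batching many line-protocol points per HTTP write instead of one curl call per file. It assumes the default localhost:8086 endpoint and the mydb database name that appears in the data paths in this thread; the batch size is only illustrative.)

```go
// Sketch only: batch line-protocol points into fewer, larger HTTP writes.
// Assumes InfluxDB is listening on the default localhost:8086 and the
// target database is "mydb"; reads one line-protocol point per stdin line.
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"net/http"
	"os"
	"strings"
)

func flush(batch []string) error {
	body := bytes.NewBufferString(strings.Join(batch, "\n"))
	resp, err := http.Post("http://localhost:8086/write?db=mydb", "text/plain", body)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("write failed: %s", resp.Status)
	}
	return nil
}

func main() {
	const batchSize = 5000 // illustrative; tune for your data
	scanner := bufio.NewScanner(os.Stdin)
	var batch []string
	for scanner.Scan() {
		batch = append(batch, scanner.Text())
		if len(batch) == batchSize {
			if err := flush(batch); err != nil {
				fmt.Fprintln(os.Stderr, err)
			}
			batch = batch[:0]
		}
	}
	if len(batch) > 0 {
		if err := flush(batch); err != nil {
			fmt.Fprintln(os.Stderr, err)
		}
	}
}
```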
In addition to the above question, how can I recover? InfluxDB won't start now. [filestore] 2016/02/24 20:00:06 /Users/dvasthimal/.influxdb/data/mydb/default/28/000000003-000000002.tsm (#2) opened in 123.004398ms
How big is each file? 300 fields per point seems quite high and it is possible this might be a factor. How many of these are numeric, and how many are strings?
All fields (metrics) are numeric (doubles mostly). Dimensions (tags) are strings. One day of data: 74 files, 1000 points in each file, and each point has 25 tags and 300 fields.
Could you post the output of:
I am interested to see the total size of all the .wal files. There is a possibility that you have run out of RAM. I'll leave it for the influx guys to advise on a recovery approach.
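(If it helps, a rough sketch of one way to total them up, assuming the default ~/.influxdb location shown in the log line above; this is not necessarily the exact command I had in mind.)

```go
// Sketch only: sum the size of all .wal files under ~/.influxdb
// (the directory visible in the log line above).
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	root := filepath.Join(os.Getenv("HOME"), ".influxdb")
	var total int64
	filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err == nil && !info.IsDir() && strings.HasSuffix(path, ".wal") {
			total += info.Size()
		}
		return nil
	})
	fmt.Printf("total .wal size: %.1f MB\n", float64(total)/(1<<20))
}
```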
How big - in bytes - is each file?
10MB |
How many files are processed before you start seeing errors?
20 |
Ok, here is a guess at what the problem is and perhaps a workaround, depending on whether you are in a position to rewrite the point tags. This line of code: https://github.com/influxdata/influxdb/blob/master/models/points.go#L331 uses an insertion sort to sort tags. If your tags are not in Go sort order, then there is potentially an O(n^2) sort for each point. If the tags were in Go sort order, then the cost of this sort would be ~O(n). For n = 25, the difference between O(n^2) and O(n) is quite a lot. (I am not 100% sure about this diagnosis - it is possible that something in the write path prior to this code normally sorts the tags in Go sort order, so this isn't a factor, but it is worth testing if you are in a position to do so.) Updated: my fears about a potential O(n^2) issue are actually unfounded. Influx sorts the tags on the way in, so that when they hit the insertion sort later on it executes with O(n) efficiency. Confirmed with a unit test.
What do you mean by Go sort?
Actually, it isn't going to help anyway because of the way influx makes points. [Update: influx does the right thing and sorts tags on construction, so that the binary form of the point is sorted in the optimal order.] Are you in a position to build your own version of influx, or are you using a packaged version?
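By "Go sort order" I just mean the lexicographic, byte-wise string ordering that Go's sort package produces. A tiny illustration with made-up tag keys:

```go
// Sketch only: "Go sort order" for tag keys is plain lexicographic
// (byte-wise) string ordering, as produced by sort.Strings.
package main

import (
	"fmt"
	"sort"
)

func main() {
	tagKeys := []string{"region", "Host", "app", "datacenter"} // made-up keys
	sort.Strings(tagKeys)
	fmt.Println(tagKeys) // [Host app datacenter region] (uppercase sorts before lowercase byte-wise)
}
```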
Output of the command:
I am using the packaged version.
What is the maximum tested data size? How many tags and how many fields in a given measurement?
When you say it doesn't start, does it fail with an error message, or do you kill it? How long do you wait? Can you paste the tail of the log file at the point the restart fails or is terminated? Is the system showing any evidence of high memory or CPU usage at that point? I'll have to defer to the influx support team regarding what they consider to be a reasonable number of tags and fields. Updated: deleted a question that is not relevant.
@deepujain The O(n^2) issues I was worried about with point construction don't actually exist, verified both by closer inspection of the code and by some actual unit tests. Still, 300 fields is a lot of fields, so it could still be a factor.
The number of wal logs seems quite high. How much disk space do you have free in your home directory?
@deepujain According to this line: https://github.com/influxdata/influxdb/blob/v0.10.1/tsdb/shard.go#L490-L492 influx only supports up to 255 fields per point. Perhaps violating this restriction is a contributing factor to your issues? Updated: that's a restriction that only applies to the b1 and bz1 engines - you are using tsm, so that shouldn't be an issue here.
I reduced the number of fields to ~200. This time 95% of the files got ingested; the rest fail with the same error.
Client side: {"error":"timeout"}
Server side:
Is there a limit on the number of fields (metrics), and what are tsm / b1 / bz1?
InfluxQL is dead slow and the disk starts spinning like crazy. I never saw the output.
Better question: I have a time series DB that is updated once every day. It has around 30 dimensions (used as filters; these will map to tags) and 175 to 200 metrics (used for aggregate functions; these will map to fields). Each day, around 100 to 150 files, each with 1000 points, will be ingested. Ingestion time can take a few minutes, but query time has to be on the order of a few seconds. I was able to do this with Druid (backend) and Imply (front end), but I like Grafana better, hence I want to try out InfluxDB.
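(For concreteness, this is roughly the shape of point I have in mind; it is only a sketch of how dimensions map to tags and metrics map to fields in line protocol, and the measurement and key names below are made up.)

```go
// Sketch only: build one line-protocol point where dimensions become tags
// (strings) and metrics become fields (numeric values).
package main

import (
	"fmt"
	"sort"
	"strings"
)

func point(measurement string, tags map[string]string, fields map[string]float64, ts int64) string {
	var tagParts []string
	for k, v := range tags {
		tagParts = append(tagParts, fmt.Sprintf("%s=%s", k, v))
	}
	sort.Strings(tagParts) // keep tags sorted on the way in

	var fieldParts []string
	for k, v := range fields {
		fieldParts = append(fieldParts, fmt.Sprintf("%s=%g", k, v))
	}
	sort.Strings(fieldParts)

	return fmt.Sprintf("%s,%s %s %d",
		measurement, strings.Join(tagParts, ","), strings.Join(fieldParts, ","), ts)
}

func main() {
	tags := map[string]string{"country": "US", "device": "mobile"} // dimensions -> tags
	fields := map[string]float64{"clicks": 1042, "revenue": 3.14}  // metrics -> fields
	fmt.Println(point("daily_stats", tags, fields, 1456272000000000000))
}
```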
With 25 tags I would be surprised if your series cardinality was under 10 million. Influx should be able to handle this data given proper schema design. |
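(For a sense of scale: even if each of the 25 tags took only two distinct values, that already allows 2^25 ≈ 33.5 million possible series; the real cardinality depends on how many distinct values each tag has and which combinations actually occur in the data.)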
What do you mean by proper schema design? I will have one measurement (with the tags and fields described above). Please share details on proper schema design.
Any suggestions here? This issue was closed. https://influxdata.com/blog/announcing-influxdb-v0-10-100000s-writes-per-second-better-compression/ It clearly says that with the columnar store there is no limit on the number of fields (0 to 100 to 1000). However, my use case does not seem to work.
One day of data: 74 files, 1000 points in each file, and each point has 25 tags and 300 fields.
I was able to ingest 1 file correctly without any syntax errors.
However, writing all the files started to throw {"error":"timeout"}.
influxd:
After a while it started to go crazy.
Logs: http://pastebin.com/b2wJwFHT
I was planning to load 30 days of data. It loaded quickly and without errors on Druid + Imply; I wanted to compare that with InfluxDB + Grafana. All of this installation is on a local machine (Mac, with 24 GB RAM).