[Feature Request] Summarize data by interval in the logparser input plugin #1478

Closed
toni-moreno opened this issue Jul 11, 2016 · 3 comments

Comments

@toni-moreno
Contributor

Directions

On big infrastructures we would like to store and process only the data we need. Suppose a cluster of heavily loaded Apache servers from which we need only the number of hits per interval and the response time, both taken from their access logs.

Suppose our servers are processing up to 3 million hits/hour (across 10 servers) and we need only 4 metrics (hits, plus average, max, and p90 response time).

With one-minute intervals we would then store only 4 x 10 x 60 = 2,400 metrics/hour instead of 3 million inserts carrying a lot of unneeded data.

We can already do this with collectd and its Tail plugin:

https://collectd.org/wiki/index.php/Plugin:Tail

or collectd + apachelog plugin.

https://github.com/toni-moreno/collectd-apachelog-plugin

We could use Telegraf and its logparser plugin as the base for this work. This would also be an interesting way to get log processing on Windows systems.

Feature Request

We would like a per-file config option that switches the behaviour from "send all events" to "send only summarized data", along with options for the kind of summarization, how to group the data, and what to send.

Proposal:

The configuration could look something like this:

[[inputs.logparser]]
# files should be an array of "id"-"filename"
  files = [
        ["8080","/var/log/httpd/access8080.log"],
        ["80","/var/log/httpd/access80.log"],
        ["443","/var/log/httpd/access443.log"]
   ]
  from_beginning = false

  [inputs.logparser.grok]
    custom_patterns = '''
   APACHE_LOG_WITH_RT %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{INT:rt}
    '''
  # inputs.logparser.(id)

  [inputs.logparser."8080"]
     send_all_events = true
     # nothing more to configure here

  # inputs.logparser.(id)
  [inputs.logparser."80"]
     send_all_events = false
     id_tag = "log_port"
     # match.(grok field).(regex filter)
     [match."rawrequest"."/some/url.*[a-zA-Z]$"]
          # measurement in which the data is grouped
          extratags = { url = "myurl", othertag = "valuetag" }
          measurement = "http_stats"
          groupby_grok_field = "response"
          groupby_tag = "httpcode"

          [field "hits_x_interval"]
               # summarize_type should be one of "counter,sum,max,min,avg,percentile(N)"
               summarize_type = "counter"
               summarize_grok_field = "any"
          [field "rt_avg"]
               summarize_type = "avg"
               summarize_grok_field = "rt"
          [field "rt_max"]
               summarize_type = "max"
               summarize_grok_field = "rt"
          [field "rt_p90"]
               summarize_type = "percentile(90)"
               summarize_grok_field = "rt"

Desired behavior:

With this config we would get data in the form measurement [fields] tags, as follows:

"http_stats" [hits_x_interval, rt_avg, rt_max, rt_p90] httpcode=XXX, log_port=80/443, url="myurl", othertag="valuetag"

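To make the intended aggregation concrete, here is a minimal sketch in Python (not Telegraf code) of what the proposed summarization would compute: events whose request matches the URL filter are grouped by interval and response code, and each group is reduced to the four proposed fields. The event keys `ts` and `request`, the function names, and the nearest-rank percentile method are assumptions made for illustration; `response`, `rt`, and the output field names come from the proposal above. The regex uses `[a-zA-Z]` because `[a-Z]` is not a valid character range.

```python
import math
import re
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (assumed method)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(events, url_regex, interval=60):
    """Group matching events by (interval start, response code) and
    reduce each group to the fields named in the proposal."""
    pattern = re.compile(url_regex)
    groups = defaultdict(list)
    for ev in events:
        if not pattern.search(ev["request"]):
            continue  # event does not match the URL filter
        bucket = ev["ts"] - ev["ts"] % interval  # start of the interval
        groups[(bucket, ev["response"])].append(ev["rt"])
    return {
        key: {
            "hits_x_interval": len(rts),
            "rt_avg": sum(rts) / len(rts),
            "rt_max": max(rts),
            "rt_p90": percentile(rts, 90),
        }
        for key, rts in groups.items()
    }


# Hypothetical parsed log events: ts in seconds, rt in arbitrary units.
events = [
    {"ts": 0, "request": "/some/url/a", "response": "200", "rt": 100},
    {"ts": 10, "request": "/some/url/b", "response": "200", "rt": 300},
    {"ts": 20, "request": "/other", "response": "200", "rt": 999},
    {"ts": 30, "request": "/some/url/c", "response": "500", "rt": 50},
    {"ts": 70, "request": "/some/url/d", "response": "200", "rt": 200},
]
summary = summarize(events, r"/some/url.*[a-zA-Z]$")
for (bucket, code), fields in sorted(summary.items()):
    print(bucket, code, fields)
```

Each key of `summary` corresponds to one point per interval and `httpcode` tag value, which is where the reduction in stored data comes from.
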
What do you think?

@sparrc
Contributor

sparrc commented Jul 14, 2016

We will be doing a generic solution for this for all plugins, not just the logparser. See #1419. You may also be able to do some of this already using the pass/drop filters?

@sparrc sparrc closed this as completed Jul 14, 2016
@toni-moreno
Contributor Author

Is there a planned date for releasing a beta of this generic solution? I would like to help you test it.

@sparrc
Contributor

sparrc commented Jul 14, 2016

Not at the moment, but you can subscribe to #380 and get notified about any progress or status changes.
