Support multiple IPs in nginx module #4322
Comments
For reference, here is the pattern the nginx access log currently uses: https://github.com/elastic/beats/blob/master/filebeat/module/nginx/access/ingest/default.json#L7 I agree it is tricky to capture multiple IP addresses, as it raises the question of how they will be named. I assume the list of IPs could also be longer than 3? |
Yes, probably, but I haven't tested it. I should also mention that the X-Forwarded-For header is not standardized, but it is used by a lot of tools and services. There is also an RFC mentioning it: https://tools.ietf.org/html/rfc7239 |
If you find a grok solution to only pick the first one (instead of the last one) feel free to open a PR with it. I was thinking of something like |
So the correct pattern should actually be |
You probably need to delete the existing pipeline definition in Elasticsearch for the change to take effect. |
Thanks, that's what was missing. The pattern works and Elasticsearch now stores the correct IP address. I created a PR for the change: #4351 |
Thanks, I really appreciate that you opened a PR. |
I am currently analysing nginx logs that have been generated via the following configuration (xxxx are anonymisations):
e.g.
In these cases, the logic for deciding which IP to use for the GeoIP lookup is complex. Here, the best approach may be to scan the IPs from left to right and use the first one that is not private. This syntax requires different handling than the current PR. The naming of fields may also need to be reviewed. |
@stevedodson One tricky part of having a list of IPs is that either we need to find a smart way to assign each one a specific field name, or we will have an array in Elasticsearch, which is then hard to query. Any ideas? |
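The left-to-right rule described in this comment can be sketched in Python as follows. This is only an illustration, not the code used in the module: the function names are hypothetical, treating loopback as private is an assumption (made because the thread's example has 127.0.0.1 as the proxy address), and so is falling back to the first entry when every address is private.

```python
import ipaddress

# Ranges treated as "private" in this sketch: RFC 1918 plus loopback.
# Including loopback is an assumption of this example.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("127.0.0.0/8"),
]


def is_private(ip_text):
    """Return True if ip_text is an address inside one of PRIVATE_NETWORKS."""
    try:
        ip = ipaddress.ip_address(ip_text.strip())
    except ValueError:
        return False  # not an IP literal (e.g. a hostname): treat as public
    return any(ip in net for net in PRIVATE_NETWORKS)


def pick_geoip_candidate(ip_list):
    """Scan left to right and return the first non-private entry.

    Falls back to the first entry when every address is private;
    that fallback is an assumption of this sketch.
    """
    for ip_text in ip_list:
        if not is_private(ip_text):
            return ip_text.strip()
    return ip_list[0].strip() if ip_list else None
```

For a list such as `["10.0.0.5", "203.0.113.7", "127.0.0.1"]`, the scan skips the RFC 1918 address and returns the second entry as the GeoIP candidate.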
In summary, the issue seems to be: if
An option which involves minimal change would be to assign Longer term, additional configurations that could allow reporting such as number of private IPs that went through proxy x could also be defined. A sample configuration that can identify the first non-private IP (according to rfc1918) - with a fallback to
|
Thanks, I'm going to work on integrating a version of this in the nginx module. |
I came up with a version of the Painless script that doesn't need regexp support to be enabled:
|
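The Painless script from the comment above is not reproduced here. As a rough illustration of how a private-address check can work without regular expressions, the following Python sketch compares the first two octets numerically. The function names are hypothetical, and both the loopback handling and the fallback to the first list entry are assumptions.

```python
def is_private_ipv4(ip_text):
    """Check RFC 1918 and loopback ranges by comparing octets, without regex."""
    parts = ip_text.strip().split(".")
    if len(parts) != 4:
        return False  # not a dotted quad (hostname, IPv6, ...)
    try:
        first, second = int(parts[0]), int(parts[1])
    except ValueError:
        return False
    if first == 10 or first == 127:
        return True
    if first == 192 and second == 168:
        return True
    return first == 172 and 16 <= second <= 31


def select_remote_ip(remote_ip_list):
    """Return the first non-private entry of a non-empty list.

    Falls back to the first entry if all are private (an assumption).
    """
    for ip_text in remote_ip_list:
        if not is_private_ipv4(ip_text):
            return ip_text
    return remote_ip_list[0]
```

Splitting on dots and parsing integers keeps the check usable in environments where regular expressions are disabled, at the cost of handling IPv4 only.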
A common customization to the nginx logs is to add the contents of the X-Forwarded-For header in front of the remote IPs. This typically results in a list of remote IPs. This adds a new field `remote_ip_list` which is an array, and uses a Painless script to automatically select the first non-private IP for the `remote_ip` field, which is the field on which GeoIP is applied. Fixes elastic#4322.
…ess as remote_ip (#4703) * Nginx module: use first not private IP address as remote_ip (#4417) A common customization to the nginx logs is to add the contents of the X-Forwarded-For header in front of the remote IPs. This typically results in a list of remote IPs. This adds a new field `remote_ip_list` which is an array, and uses a Painless script to automatically select the first non-private IP for the `remote_ip` field, which is the field on which GeoIP is applied. Fixes #4322. (cherry picked from commit a2c162f)
Currently the nginx module only fetches one IP, but if you use a proxy, you might want to output the X-Forwarded-For header into the logs, which results in lines containing multiple IPs. Here is an example from our log file (with random client IPs).
In our case the first IP is the actual client IP, the second is the one from Cloudflare, and the localhost address comes from Varnish, which runs on the same host as nginx.
Filebeat only logs the localhost address from Varnish, i.e. 127.0.0.1, which is kind of useless. It would be cool to store all IPs in Elasticsearch, but I guess that raises some questions about how to geocode them.
As a quick hack I tried to create a custom pattern that fetches the first IP instead of the last, but didn't succeed. I would really appreciate it if someone could point me in the right direction.
My idea was to fetch the first IPORHOST by also matching a comma, but I don't get how to exclude it with a non-capturing group; at least on https://grokconstructor.appspot.com it doesn't seem to work:
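The original grok attempt is not reproduced here. As a separate illustration of the non-capturing-group idea, this Python sketch captures only the first address at the start of a line and lets a `(?: ... )` group consume any further comma-separated addresses. The pattern, the function name, and the assumed log layout (a comma-separated address list at the beginning of each line) are hypothetical, and Python's `re` syntax is not identical to grok.

```python
import re

# Hypothetical pattern: the first address is captured by name; any
# ", address" repeats are matched by a non-capturing group and discarded.
FIRST_IP = re.compile(r"^(?P<first_ip>[^\s,]+)(?:,\s*[^\s,]+)*\s")


def first_ip(line):
    """Return the first address of a leading comma-separated list, or None."""
    match = FIRST_IP.match(line)
    return match.group("first_ip") if match else None
```

Because the repeated group is non-capturing, the remaining addresses are matched but never stored, so a line with a single address and a line with several both yield exactly one captured value.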
Here are my specs:
Filebeat 5.4 (Running in docker)
OS: Debian 8
Steps to Reproduce:
$http_x_forwarded_for
variable. Here is our config: