error while marshaling: proto: field \"kolide.agent.LogCollection.Log.Data\" contains invalid UTF-8 #445
Comments
Osquery sometimes mis-encodes UTF-8 data (osquery/osquery#5288). This is a broad attempt to repair log files that exhibit that issue. It runs against the entire log file. Hopefully, there won’t be a case where it misfires. Fixes: #445
Hello, this bug still exists and blocks normal usage. I'm on Kolide Fleet. I confirm that the bug happens when a query is done on table. I think the bug is on the Kolide Launcher side, because when I run osquery with the exact same parameters that Kolide Launcher starts it with, it works like a charm. Here is the command I run for osqueryd:
Can you fix this bug please? I can do more tests if required.
I think the encoding bug is in osquery, as discussed in osquery/osquery#5288, but I agree that launcher shouldn't crash. How are you issuing that query? Is it a scheduled query, or is it being sent as a distributed query?
This is a scheduled query with snapshot logging. Yes, the Launcher crash is a serious blocker, because the only fix is to delete the local database.
I have not yet spun up a test environment for this, but I don't think we see this on the SaaS side of things. I'm wondering if launcher is still up but stuck in a loop trying to send the same log to the server. Do the launcher logs indicate that?
Yes, the same log occurs every minute:
I would say that indicates it's not a launcher bug. Launcher is sending a log, and fleet is rejecting it. Launcher is then resending it. The reason you need to drop the database is that launcher will always try to resend stored logs. I am not immediately sure whether launcher should drop the log when the server rejects it. That would cause logs to be dropped and could break any kind of monitoring. My gut sense is that fleet should accept these and either fix them and record the data, or discard them. @zwass what do you think here?
For reference, the original issue was created on fleet: kolide/fleet#2014. But since this works fine with osquery on its own, I think a fix should be made on the Kolide side to accept the log rather than discard it.
This does not look like an error on the Fleet side. Notice the error message:
rpc error: code = Internal desc = grpc: error while marshaling: proto: field "kolide.agent.LogCollection.Log.Data" contains invalid UTF-8
There wouldn't be any marshalling on the Fleet side, only unmarshalling.
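To illustrate why the sender hits this before anything reaches the server: proto3 `string` fields must hold valid UTF-8, so marshaling fails on the client when a log line contains stray bytes. The sketch below only shows how such a string can be detected with Go's standard library. The sample bytes are an assumption (a publisher name stored as single-byte Latin-1 style characters), and `firstInvalidByte` is a hypothetical helper, not launcher code.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstInvalidByte returns the offset of the first byte that is not part of
// a valid UTF-8 sequence, or -1 if s is entirely valid.
// Hypothetical helper for illustration only.
func firstInvalidByte(s string) int {
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		// RuneError with size 1 means the byte could not be decoded.
		// (A legitimately encoded U+FFFD decodes with size 3.)
		if r == utf8.RuneError && size == 1 {
			return i
		}
		i += size
	}
	return -1
}

func main() {
	// Assumed sample: accented letters stored as single bytes, not UTF-8.
	publisher := "Ma\xebl H\xf6rz"

	fmt.Println(utf8.ValidString(publisher)) // false
	fmt.Println(firstInvalidByte(publisher)) // 2
	fmt.Println(firstInvalidByte("plain"))   // -1
}
```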
Either escape it in some way, or drop the invalid characters I suppose. Probably best to do this only after receiving the error so we don't add an extra encoding/scanning step for every single log.
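A minimal sketch of the "only after receiving the error" idea, under stated assumptions: `sendFunc`, `publishWithFallback`, `rejectInvalidUTF8`, and `errInvalidUTF8` are all hypothetical names invented for this example, and the fake transport merely mimics the marshaling failure. The batch is sent untouched first; the sanitizing pass with `strings.ToValidUTF8` runs only when that send fails.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode/utf8"
)

// errInvalidUTF8 stands in for the marshaling error (hypothetical).
var errInvalidUTF8 = errors.New("contains invalid UTF-8")

// sendFunc is a hypothetical stand-in for the call that ships a batch of
// logs to the server; it is not launcher's real API.
type sendFunc func(logs []string) error

// rejectInvalidUTF8 is a fake transport that fails the way the marshaler
// does when any log line is not valid UTF-8.
func rejectInvalidUTF8(logs []string) error {
	for _, l := range logs {
		if !utf8.ValidString(l) {
			return errInvalidUTF8
		}
	}
	return nil
}

// publishWithFallback sends logs unchanged first. Only if that fails does
// it scan the batch, replace invalid byte runs with U+FFFD, and retry once.
func publishWithFallback(send sendFunc, logs []string) error {
	err := send(logs)
	if err == nil {
		return nil
	}
	cleaned := make([]string, len(logs))
	changed := false
	for i, l := range logs {
		cleaned[i] = strings.ToValidUTF8(l, "\uFFFD")
		if cleaned[i] != l {
			changed = true
		}
	}
	if !changed {
		// Nothing to sanitize, so the failure was not an encoding problem.
		return err
	}
	return send(cleaned)
}

func main() {
	err := publishWithFallback(rejectInvalidUTF8, []string{"ok", "Ma\xebl H\xf6rz"})
	fmt.Println(err) // <nil>
}
```

The trade-off is the one raised above: valid batches cost nothing extra, while a bad batch costs one failed send plus one scan, and the replaced characters are lost.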
I seem to be experiencing this error, based on logs I'm seeing from the service on a Windows host. Has a workaround or solution been determined yet? I'm running the most recent version of launcher (0.11.9) and fleet version 2.6.0.
In my case, the bug has been fixed by the latest osquery release (4.3.0). The related PR is osquery/osquery#6190. All encoding issues should be fixed once the PR osquery/osquery#6338 is released.
Is there any progress on this? 4.5.1 and all my event logs across all my Windows servers are seeing thousands of:
caller=log.go:124 ts=2021-01-19T16:50:26.127444Z caller=level.go:63 level=info caller=extension.go:494 err="sending string logs: writing logs: transport error sending logs: rpc error: code = Internal desc = grpc: error while marshaling: proto: field "kolide.agent.LogCollection.Log.Data" contains invalid UTF-8"
caller=log.go:124 ts=2021-01-19T16:50:26.1264442Z caller=level.go:63 level=info caller=publish_logs.go:179 method=PublishLogs uuid=6e2ccedd-1f13-4ea9-9bcc-37d9611e3eff logType=string log_count=1103 message= errcode= reauth=false err="rpc error: code = Internal desc = grpc: error while marshaling: proto: field "kolide.agent.LogCollection.Log.Data" contains invalid UTF-8" took=14.0018ms
@B3DTech FWIW we (github.com/fleetdm) are working on an osquery autoupdater that takes a different approach by letting osquery do the heavy lifting. This (I believe) will help resolve issues like this one with the grpc transport in Launcher.
@B3DTech The underlying issue here is that osquery sometimes produces data that is not valid UTF-8. This is generally believed to be a bug. What do you think launcher should do when that happens?
My problem is that it isn't "sometimes". My event logs on every machine are getting flooded with 2-4 of these messages every minute. I'm not sure what should happen, because I don't know why it's happening or what it's choking on. Osquery shouldn't be failing to produce UTF-8 messages. Additionally, my 4.5.1 clients don't seem to be sending event logs, but my 4.2 clients are.
Regardless of whether osquery "should" produce non-utf8, it does. So what do you think launcher should do? Attempt to repair the data? Drop it on the floor? Encode it?
I've updated #481. It should now attempt to repair the data, then redact the non-UTF-8 characters. I can't easily test it, though.
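For readers wondering what "repair, then redact" could look like, here is a rough sketch. It is not the code in #481. It assumes (and this is only an assumption) that the stray bytes are ISO-8859-1 characters, which map directly onto Unicode code points U+00A0 to U+00FF. Bytes that fit that guess are repaired; any other invalid byte is redacted to U+FFFD. `repairOrRedact` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// repairOrRedact returns s unchanged when it is already valid UTF-8.
// Otherwise each undecodable byte is either reinterpreted as an ISO-8859-1
// character (bytes 0xA0-0xFF, an assumed source encoding) or replaced with
// U+FFFD when that guess does not apply.
func repairOrRedact(s string) string {
	if utf8.ValidString(s) {
		return s
	}
	var b strings.Builder
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		switch {
		case r != utf8.RuneError || size > 1:
			// Valid sequence (including a real U+FFFD): copy as-is.
			b.WriteString(s[i : i+size])
		case s[i] >= 0xA0:
			// Repair: treat the byte as a Latin-1 code point.
			b.WriteRune(rune(s[i]))
		default:
			// Redact: no sensible interpretation for this byte.
			b.WriteRune(utf8.RuneError)
		}
		i += size
	}
	return b.String()
}

func main() {
	fmt.Println(repairOrRedact("Ma\xebl H\xf6rz")) // Maël Hörz
	fmt.Println(repairOrRedact("bad\x80byte"))     // bad�byte
}
```

This is a heuristic and can misfire: a truncated multi-byte UTF-8 sequence would be "repaired" into an unrelated accented letter, which is the kind of risk the PR description itself acknowledges.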
It was determined that there are mis-formatted osquery results in the osquery store from the previous version of osquery, and Launcher is still trying to send them. Removing the C:\Program Files\Kolide\Launcher-so-launcher\data\ directory and restarting the service fixed the issue: no more UTF-8 events.
I think this is as resolved as it's getting.
We are in the process of deploying Kolide to our customers, and we get this error on several of the Windows clients. Once the error is logged, launcher stops sending data to the server.
We have tried updating osqueryd (3.3.2) and launcher (9.0.2) to the latest versions, but the error persists.
If we delete the osquery.db files and restart launcher, it works for a while, and then the error reappears and the log traffic stops.
We have done some investigation and nailed it down to this query:
SELECT * FROM programs
This works perfectly in the Kolide fleet dashboard, but when it runs as a scheduled query, launcher fails to send the logs and reports the error message. We have seen the error message on multiple computers, but are still investigating which queries trigger it on those endpoints.
To narrow it down further, we found the software in the list whose characters triggered the error:
SELECT * FROM programs WHERE name = "HxD Hex Editor version 1.7.7.0"
What version of fleet are you using (fleet version --full)?
fleet - version 2.0.2
branch: master
revision: 8ca0358bf28173685815b79d8683a4239d629a14
build date: 2019-01-18T00:39:40Z
build user: zwass
go version: go1.11.3
What operating system are you using?
Windows 10 Pro 1893
What did you do?
What did you expect to see?
Data sent from launcher.exe to kolide fleet
What did you see instead?
The first two log entries are the error messages; the last one shows the software that we think is causing them.
{"caller":"publish_logs.go:157","err":"rpc error: code = Internal desc = grpc: error while marshaling: proto: field "kolide.agent.LogCollection.Log.Data" contains invalid UTF-8","errcode":"","logType":"string","log_count":3,"message":"","method":"PublishLogs","reauth":false,"severity":"info","took":"0s","ts":"2019-03-12T12:34:15.8004054Z","uuid":"bde64b33-697d-4782-b8a2-866e0a44e71a"}
{"caller":"extension.go:494","err":"sending string logs: writing logs: transport error sending logs: rpc error: code = Internal desc = grpc: error while marshaling: proto: field "kolide.agent.LogCollection.Log.Data" contains invalid UTF-8","severity":"info","ts":"2019-03-12T12:34:15.8014026Z"}
{"caller":"publish_results.go:168","err":null,"errcode":"","message":"","method":"PublishResults","reauth":false,"results":"[{"query_name":"kolide_distributed_query_352","status":0,"rows":[{"identifying_number":"","install_date":"20180625","install_location":"C:\\Program Files (x86)\\HxD\\","install_source":"","language":"","name":"HxD Hex Editor version 1.7.7.0","publisher":"Ma�l H�rz","uninstall_string":"\"C:\\Program Files (x86)\\HxD\\unins000.exe\"","version":"1.7.7.0"}]}]","severity":"debug","took":"16.9527ms","ts":"2019-03-12T22:34:46.991793Z","uuid":"7729e05c-71a1-4f7b-bd37-8fb37e76e6db"}