Wrong token used by Chef server to send data to Chef Automate #1281

Closed
ocsig opened this issue May 26, 2017 · 13 comments

@ocsig

ocsig commented May 26, 2017

Chef server doesn't pick up the data_collector["token"] value and uses the default token instead.

Expected Behavior

After changing the data_collector["token"] value, it should be sent as an HTTP header named 'x-data-collector-token'.

Current Behavior

Following this instruction:
https://learn.chef.io/modules/manage-a-node-chef-automate/ubuntu/automate/set-up-your-chef-server#/

  1. Configure Chef server to send data to Chef Automate
    To configure Chef server to send data to Chef Automate, you need to modify /etc/opscode/chef-server.rb on the Chef server to include the FQDN and a token.

After running the chef-server-ctl reconfigure command, the Chef server is still using the default token instead of the configured one.

Furthermore, by logging the "x-data-collector-token" header in the Chef Automate nginx access log, I can see that the cookbook upload service is sending a malformed version of the default token:

35.187.169.205 - - [25/May/2017:21:15:11 +0000] "POST /data-collector/v0/ HTTP/1.1" 204 "0.002" 0 "-" "-" "127.0.0.1:9611" "204" "<<\x2293a49a4f2482c64126f7b6015e6b0f30284287ee4054ff8807fb63d9cbd1c506\x22>>" "0.002" "-" "-" "-" "-" $

If I change the Chef Automate token to '<<"93a49a4f2482c64126f7b6015e6b0f30284287ee4054ff8807fb63d9cbd1c506">>', the Chef server is able to send cookbook information to the Automate server, but node information fails (401). (The " character is displayed as \x22 in the log.)
If I change the Chef Automate token to '93a49a4f2482c64126f7b6015e6b0f30284287ee4054ff8807fb63d9cbd1c506' (without the surrounding characters), the cookbook upload event communication fails (401), but node reporting (Chef runs on nodes) reaches Chef Automate correctly.

Note: data_collector["root_url"] is correctly picked up.

Steps to Reproduce (for bugs)

  1. New Chef server + Chef Automate installation.
  2. My DNS with SSL set up.
  3. Changed data_collector["token"] = "mytesttoken".
  4. Added an organisation to the Chef server.
  5. In /var/opt/delivery/nginx/etc/nginx.conf (edited with nano), added "$http_x_data_collector_token" to the log_format for visibility of the token sent.
  6. Uploaded a cookbook with knife to the Chef server to trigger the data-collector call. This shows the default token sent with the extra prefix and suffix characters.
  7. A chef-client run triggers a different data-collector call, which uses the default token (even though it should have been overridden).

Your Environment

  • Chef Server Version: chef-server 12.15.6
  • Operating System and Version: Ubuntu 14.04.5 LTS on Google Cloud
  • Running in a container? NO
@srenatus
Contributor

This is weird indeed.

"<<\x2293a49a4f2482c64126f7b6015e6b0f30284287ee4054ff8807fb63d9cbd1c506\x22>>"

It looks like the Erlang binary value <<"93...06">> was sent as-is, whereas 93...06 should have been sent.

Ok, it's probably a 🐛 here: This call to chef_secrets returns not a list(), but binary(), when the token is found.
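
The class of bug described here has a close analogue in Python, sketched below purely as an illustration. This is not the Chef server code, and `header_value_buggy` and `header_value_fixed` are hypothetical names. Formatting a byte string through its printed representation leaks the type's delimiters into the header value, much as `<<"...">>` leaked into the header above.

```python
def header_value_buggy(token) -> str:
    # Formats whatever it is given. For bytes this yields the printed
    # representation, delimiters included.
    return f"{token}"


def header_value_fixed(token) -> str:
    # Normalize bytes to text before building the header value.
    if isinstance(token, bytes):
        return token.decode("ascii")
    return str(token)
```

With `b"mytesttoken"` as input, the first function produces `b'mytesttoken'` while the second produces `mytesttoken`, which is the shape of the difference between the malformed and the expected header.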

Thank you for reporting this. We'll get it ironed out quickly, I bet. I'm sorry to say that there's no obvious, quick workaround for this (you've tried them already! 😉) that I can think of, besides changing the erlang code in the running system. (LMK if you'd like to engage in this instead of waiting for the next release.)

@srenatus
Contributor

I've dug into this a little further and come up with this: While this should definitely be fixed, the data collection that can't happen here is currently not used on the Automate side. However, node convergence data comes in through a different data path, and that doesn't have this 🐛.

So, this issue should not affect your Automate experience. ✨

@ocsig
Author

ocsig commented May 29, 2017

Can you guide me on how to pass cookbook, node run, and compliance data to Automate without using data collection?

@srenatus
Contributor

Sorry, my last comment might have been misleading. Data collection should work fine for you, it's just that this specific piece, i.e., uploading a cookbook, doesn't have any effect on the Nodes tab, and thus doesn't matter right now.

However, data posted to /organizations/<your_org>/data-collector will be forwarded properly if the data collector token is properly configured on both ends:

When a signed POST request comes in there, chef-server will inject the configured token and forward the request to Automate (see here).
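
As a rough mental model of that forwarding step (a sketch only, not the chef-server implementation), the proxy can be pictured as copying the incoming headers and setting `x-data-collector-token` to the configured value before passing the request on. The function name and the choice to discard any client-supplied token are assumptions made for illustration.

```python
TOKEN_HEADER = "x-data-collector-token"


def headers_for_automate(incoming: dict, configured_token: str) -> dict:
    """Return the headers to forward, with the configured token injected."""
    # Assumption: a token supplied by the client is not trusted, so any
    # existing header with that name (in any letter case) is dropped.
    forwarded = {
        name: value
        for name, value in incoming.items()
        if name.lower() != TOKEN_HEADER
    }
    forwarded[TOKEN_HEADER] = configured_token
    return forwarded
```

Under this model, Automate accepts the request only when the injected value equals its own configured token, which is why both ends need the same setting.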

So, node run data should work out of the box, I think; compliance data is sent using the audit cookbook -- and there are various ways to have your data reach Automate, but you're probably interested in Reporting to Chef Automate via Chef Server.

We've also got a community slack, which maybe hosts a better audience for specific questions around that audit cookbook, for example, I suppose. 😃

@ocsig
Author

ocsig commented May 29, 2017

Node updates weren't working for me either. They only work if I add the weird token to Chef Automate.
So part of the bug is that Chef server is not picking up my custom token. (But true, the workaround works for me for both the node and cookbook data.)

@srenatus
Contributor

Could you elaborate on "doesn't work", please?

I've spun up an Automate + Chef Server pair, and have them both share the same data-collector token:

# grep token /etc/opscode/chef-server.rb /etc/delivery/delivery.rb
/etc/opscode/chef-server.rb:data_collector["token"] = "7dLxb1mPoJFoCMDp9lwuN5iRiR4yv-XD9rBinA0jKnY"
/etc/delivery/delivery.rb:data_collector["token"] = "7dLxb1mPoJFoCMDp9lwuN5iRiR4yv-XD9rBinA0jKnY"
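
The same comparison can be scripted. The following is a minimal sketch, assuming both files set the token with a plain `data_collector["token"] = "..."` line as in the grep output above; the function names are hypothetical. It only compares what is written in the files and says nothing about what the running services have loaded.

```python
import re
from typing import Optional

# Matches lines such as: data_collector["token"] = "abc"
# Commented-out lines do not match because only whitespace may precede the key.
_TOKEN_LINE = re.compile(
    r'''^\s*data_collector\[["']token["']\]\s*=\s*["']([^"']*)["']''',
    re.MULTILINE,
)


def configured_token(config_text: str) -> Optional[str]:
    """Return the first data_collector token set in the config text, if any."""
    match = _TOKEN_LINE.search(config_text)
    return match.group(1) if match else None


def tokens_match(chef_server_rb: str, delivery_rb: str) -> bool:
    """True only if both configs set a token and the two values are equal."""
    server = configured_token(chef_server_rb)
    automate = configured_token(delivery_rb)
    return server is not None and server == automate
```

To use it, read the contents of /etc/opscode/chef-server.rb and /etc/delivery/delivery.rb on their respective hosts and pass both strings to `tokens_match`.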

Now, I've used knife bootstrap to bootstrap a node -- and the node's convergence data ends up in the Nodes tab:
[screenshot, 2017-05-29 at 11:27:59]

In the logs, I find chef-client POSTing the data to chef-server's /organizations/default/data-collector endpoint (/var/log/opscode/nginx/access.log):

127.0.0.1 - - [29/May/2017:08:56:10 +0000]  "POST /organizations/default/data-collector HTTP/1.1" 204 "0.007" 0 "-" "Chef Client/12.20.3 (ruby-2.3.1-p112; ohai-8.23.0; x86_64-linux; +https://chef.io)" "127.0.0.1:443" "204" "0.002" "12.20.3" "algorithm=sha1;version=1.1;" "ip-172-31-44-18.eu-west-1.compute.internal" "2017-05-29T08:56:10Z" "R6bk7peq/UyIN2ZkJDN+pLH+0Wk=" 77962

The request is proxied by chef-server's nginx and eventually consumed by Automate, as can be seen from its nginx (/var/log/delivery/nginx/delivery.access.log):

127.0.0.1 - - [29/May/2017:08:56:10 +0000]  "POST /data-collector/v0/ HTTP/1.0" 204 "0.002" 0 "-" "Chef Client/12.20.3 (ruby-2.3.1-p112; ohai-8.23.0; x86_64-linux; +https://chef.io)" "127.0.0.1:9611" "204" "0.002" "12.20.3" "algorithm=sha1;version=1.1;" "ip-172-31-44-18.eu-west-1.compute.internal" "2017-05-29T08:56:10Z" "R6bk7peq/UyIN2ZkJDN+pLH+0Wk=" 78004

This is with chef-server-core 12.15.6 and automate 0.7.315. Could you try following this path through the logs in your system?

@ocsig
Author

ocsig commented May 29, 2017

I've changed the value at the Chef config:
/etc/opscode/chef-server.rb:data_collector["token"] = "7dLxb1mPoJFoCMDp9lwuN5iRiR4yv-XD9rBinA0jKnY"

Then I ran 'chef-server-ctl reconfigure'.

In the output I can see:
- update content in file /etc/opscode/chef-server-running.json [....]
  "token": "7dLxb1mPoJFoCMDp9lwuN5iRiR4yv-XD9rBinA0jKnY"

This looks as I expect.

In the Chef Automate nginx logs, however, I can still see:
"<<\x2293a49a4f2482c64126f7b6015e6b0f30284287ee4054ff8807fb63d9cbd1c506\x22>>"

The issue was that I also had to run 'chef-server-ctl restart' before the new token was used.

The documentation should mention this:
https://learn.chef.io/modules/manage-a-node-chef-automate/ubuntu/automate/set-up-your-chef-server#/step7
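
For anyone scripting this, the two-step sequence reported in this comment can be sketched as follows. It is a minimal illustration, not an official procedure: the function name and the injectable `run` parameter are hypothetical, and the only commands used are the two named in this thread.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Both subcommands are the ones named in this thread.
COMMANDS: List[List[str]] = [
    ["chef-server-ctl", "reconfigure"],  # applies the edited chef-server.rb
    ["chef-server-ctl", "restart"],      # reported above as needed for the new token
]


def apply_chef_server_config(
    run: Optional[Callable[[Sequence[str]], object]] = None,
) -> List[List[str]]:
    """Run reconfigure, then restart; an exception from `run` stops the sequence."""
    if run is None:
        # check=True raises CalledProcessError on a non-zero exit status.
        run = lambda cmd: subprocess.run(cmd, check=True)
    executed = []
    for cmd in COMMANDS:
        run(cmd)
        executed.append(cmd)
    return executed
```

Passing a stand-in for `run` (for example a list's `append` method) makes it possible to dry-run the sequence on a machine without Chef server installed.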

@srenatus
Contributor

@ocsig 😃 Sorry, I wasn't clear: I can totally reproduce what you've seen -- and the fix is here: #1282. However, for using Automate with Chef-Server's data collector, it does not matter if that particular request fails. The happy path for seeing your node show up after a finished chef-client run is what I've outlined above. That path should work for you, without any changes (besides the proper configuration of both ends, Automate and Chef-Server).

But I've now also come to understand the other bug you've probably found -- nginx doesn't pick up the changed value on reconfigure. I'm sorry you've run into this, I'll get that bug on our board.

@ocsig
Author

ocsig commented May 30, 2017

Unfortunately, the issue comes up during the compliance check as well.
I've turned on the Automate Compliance beta feature and run a compliance profile check via the audit cookbook on a node.
The check runs nicely and communicates with the Automate server, however every second call fails. I assume it's some kind of internal loopback:
[29/May/2017:22:30:54 +0000] "POST /data-collector/v0/ HTTP/1.1" 401 "0.004" 26 "<<\x227dLxb1mPoJFoCMDp9lwuN5iRiR4yv-XD9rBinA0jKnY\x22>>" "-" "-" "127.0.0.1:9611" "401" "0.001" "-" "-" "-" "-" "-" 75902
[29/May/2017:22:30:54 +0000] "POST /data-collector/v0/ HTTP/1.0" 204 "0.003" 0 "7dLxb1mPoJFoCMDp9lwuN5iRiR4yv-XD9rBinA0jKnY" "-" "Chef Client/13.0.118 (ruby-2.4.1-p111; ohai-13.0.1; x86_64-linux; +https://chef.io)" "127.0.0.1:9611" "204" "0.001" "13.0.118" "algorithm=sha1;version=1.1;" "ubuntu14_test_node" "2017-05-29T22:30:53Z" "kGrM9IqtqanbuqFqy/0hFJMoOAA=" 76903

The issue is the same: the token gets an unwanted prefix and suffix.
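
Spotting such entries by eye gets tedious, so here is a minimal sketch that scans an access log line for quoted fields carrying the `<<\x22...\x22>>` wrapper. It assumes the token is logged as a double-quoted field and that nginx wrote the inner double quotes as `\x22`, as in the entries above; the function name is hypothetical.

```python
import re
from typing import List

# Any double-quoted field in the log line.
QUOTED_FIELD = re.compile(r'"([^"]*)"')
# A field of the form <<\x22TOKEN\x22>> (literal backslash, x, 2, 2).
WRAPPED = re.compile(r'^<<\\x22(.*)\\x22>>$')


def wrapped_tokens(log_line: str) -> List[str]:
    """Return the bare tokens of all wrapped token fields found in the line."""
    found = []
    for field in QUOTED_FIELD.findall(log_line):
        match = WRAPPED.match(field)
        if match:
            found.append(match.group(1))
    return found
```

Because it does not rely on the field's position, the same function works for log formats that place `$http_x_data_collector_token` at different points in the line.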

@srenatus
Contributor

@ocsig Thanks for adding to this -- Do you see the information in the Compliance tab, though?

@ocsig
Author

ocsig commented May 30, 2017

It's a bit early here...
No I don't see it in the Compliance tab, that's why I was looking into the logs :D
(But the initial communication when the node gets the compliance policy, is successful, I just don't have the result.)

@srenatus
Contributor

The check runs nicely and communicates with the Automate server, however every second call fails. I assume it's some kind of internal loopback

No, it's two different POSTs -- If I'm not mistaken, the data collection POST that counts for audit data collection is actually sent by the audit cookbook from here, "reporting through chef server to automate"; the failing POST could be unrelated to that. (And I think it would be, since that feature has been extensively tested (right? @alexpop))

@alexpop
Contributor

alexpop commented May 30, 2017

@ocsig I tested the following setup today:

  • audit cookbook 4.0 on Linux node with the following node attributes:
{
  "audit": {
    "reporter": "chef-server-automate",
    "fetcher": "chef-server",
    "insecure": true,
    "inspec_version": "1.25.1",
    "profiles": [
      {
        "name": "linux-baseline",
        "compliance": "admin/linux-baseline"
      }
    ]
  }
}
  • standalone chef-server 12.15.6
  • standalone chef-automate 0.8.5
  • with the default data-collector token and with a custom one (7dLxb1mPoJFoCMDp9lwuN5iRiR4yv-XD9rBinA0jKnY)

Both converge and compliance data got in successfully and the UI displayed it.

What is the operating system of the node you are converging?
Are you using a custom compliance profile? If so, can you reproduce the issue with one of the profiles shipped with Automate (e.g. linux-baseline)?
Can you tail the logstash logs while you run chef-client and see if any exception is reported?

automate-ctl tail logstash
