How do chef-client reports flow through an Automate1 cluster?

Sean Horn -

We're assuming here that you are proxying chef-client reports through a Chef Server, which is the default. If you don't see the words "token" or "data-collector" in a node's /etc/chef/client.rb, the node reports are either not being sent, or they are being proxied automatically through the Chef Server, if it and the Automate system have been set up for that function. More detail on that can be found here


Check the nodes' debug level log file for data_collector connections when you run it manually. There should be one at the beginning of the run to signify the start of the chef-client run, then one with the collected data reported from the run.

chef-client -l debug -L myrun.log.txt

[2017-07-26T17:36:42-04:00] DEBUG: Initiating POST to https://chef-432.lxc/organizations/delivery/data-collector

Additionally, if you are using the Audit cookbook to run compliance scans you should see a fetch of the profiles from the defined profile source, along with an additional post to the data_collector endpoint in the AuditReport Handler:

[2018-09-27T22:05:40+00:00] INFO: Using InSpec 2.2.102
[2018-09-27T22:05:40+00:00] DEBUG: Options are set to: {"report"=>true, "format"=>"json-automate", "reporter"=>["json-automate"], "output"=>"/dev/null", "logger"=>Chef::Log, :backend_cache=>true, :attributes=>{}}
[2018-09-27T22:05:40+00:00] INFO: Fetching profile from: https://chef-432.lxc/compliance/organizations/delivery/owners/admin/compliance/linux-baseline/tar
[2018-09-27T22:05:57+00:00] INFO: Reporting to chef-server-automate
[2018-09-27T22:05:58+00:00] INFO: Report to Chef Automate via Chef Server: https://chef-432.lxc/organizations/delivery/data-collector
- Chef::Handler::AuditReport
Running handlers complete

Be sure to check for any errors that could happen when fetching the profiles or when node is running Inspec, they might kill the Inspec run and prevent a completed Audit Report from being sent to the data_collector endpoint.


Are there problems showing up in /var/log/opscode/nginx/access.log on the Chef Server?
If your Chef Server is configured to proxy data collector connections from nodes back into the Automate system, then the connections from the nodes have to show up there first. Here are the posts for the data-collector in the access.log on my Chef Server system from the chef-client run above. They were successful, showing 204s, the first has very little content length, because it's just a start of run marker, and the final payload content length is large, because it holds the data proxied back to the Automate system. Basically, the full JSON content of the node. - - [26/Jul/2017:17:36:42 -0400]  "POST /organizations/delivery/data-collector HTTP/1.1" 204 "0.007" 0 "-" "Chef Client/12.19.36 (ruby-2.3.1-p112; ohai-8.23.0; x86_64-linux; +" "" "204" "0.002" "12.19.36" "algorithm=sha1;version=1.1;" "runner-2-432.lxc" "2017-07-26T21:36:42Z" "XAILfUH4CddKZAX1hI6hlROtWL8=" 1457 - - [26/Jul/2017:17:36:42 -0400]  "POST /organizations/delivery/environments/_default/cookbook_versions HTTP/1.1" 200 "0.020" 2 "-" "Chef Client/12.19.36 (ruby-2.3.1-p112; ohai-8.23.0; x86_64-linux; +" "" "200" "0.020" "12.19.36" "algorithm=sha1;version=1.1;" "runner-2-432.lxc" "2017-07-26T21:36:42Z" "cRLXPTAdvq6x5/h9QUVSIhGKjnc=" 1142 - - [26/Jul/2017:17:36:42 -0400]  "PUT /organizations/delivery/nodes/runner-2-432.lxc HTTP/1.1" 200 "0.035" 17470 "-" "Chef Client/12.19.36 (ruby-2.3.1-p112; ohai-8.23.0; x86_64-linux; +" "" "200" "0.032" "12.19.36" "algorithm=sha1;version=1.1;" "runner-2-432.lxc" "2017-07-26T21:36:42Z" "yMhaqLRzhlw5sNmLRCtWgwdE7pc=" 84043 - - [26/Jul/2017:17:36:42 -0400]  "POST /organizations/delivery/data-collector HTTP/1.1" 204 "0.016" 0 "-" "Chef Client/12.19.36 (ruby-2.3.1-p112; ohai-8.23.0; x86_64-linux; +" "" "204" "0.002" "12.19.36" "algorithm=sha1;version=1.1;" "runner-2-432.lxc" "2017-07-26T21:36:42Z" "iSh6HXt6s6sC9TqbJ3gwQcu3wIw=" 84605

Finally, the requests make it back into /var/log/delivery/nginx/delivery.access.log on the Automate system - - [26/Jul/2017:17:36:42 -0400]  "POST /data-collector/v0/ HTTP/1.0" 204 "0.001" 0 "-" "Chef Client/12.19.36 (ruby-2.3.1-p112; ohai-8.23.0; x86_64-linux; +" "" "204" "0.001" "12.19.36" "algorithm=sha1;version=1.1;" "runner-2-432.lxc" "2017-07-26T21:36:42Z" "XAILfUH4CddKZAX1hI6hlROtWL8=" 1509 - - [26/Jul/2017:17:36:42 -0400]  "POST /data-collector/v0/ HTTP/1.1" 204 "0.002" 0 "-" "-" "" "204" "0.001" "-" "-" "-" "-" "-" 83673 - - [26/Jul/2017:17:36:42 -0400]  "POST /data-collector/v0/ HTTP/1.0" 204 "0.001" 0 "-" "Chef Client/12.19.36 (ruby-2.3.1-p112; ohai-8.23.0; x86_64-linux; +" "" "204" "0.001" "12.19.36" "algorithm=sha1;version=1.1;" "runner-2-432.lxc" "2017-07-26T21:36:42Z" "iSh6HXt6s6sC9TqbJ3gwQcu3wIw=" 84657

However, the end of the story is that the data-collector POST request payload has to be submitted and ingested into ES on the Automate system. The first step in this ingestion process is that Automate's delivery process puts incoming data-collector items on the rabbitmq /insights/data-collector queue.

logstash is used to consume from this queue. It takes the queued items off the rabbitmq /insights/data-collector queue, manipulates the messages in specific ways, and then submits the messages to elasticsearch.

You can check your data-collector queue with 

export PATH=/opt/delivery/embedded/bin:$PATH
rabbitmqctl list_queues -p /insights

The submission of items to elasticsearch by logstash from the data-collector queue can be seen in one file on the Automate system

/var/log/delivery/nginx/es_proxy.access.log - - [26/Jul/2017:17:36:46 -0400]  "POST /elasticsearch//_bulk HTTP/1.1" 200 "0.033" 200 "-" "Manticore 0.6.0" "" "200" "0.033" "-" "-" "-" "-" "-" 107078

Only during the first ingestion will you see something like this in the ES logfile, as the dynamic mappings are created for the converge and node-state indices. The logfile name for ES is called /var/log/delivery/elasticsearch/current. There are other daily logfiles with more or less detail. ES is otherwise silent unless something major or bad happens.

2017-07-26_21:35:54.83263 [2017-07-26 17:35:54,832][INFO ][cluster.metadata         ] [automate-432] [insights-2017.07.26] update_mapping [converge]
2017-07-26_21:35:55.18765 [2017-07-26 17:35:55,187][INFO ][cluster.metadata         ] [automate-432] [node-state-1] update_mapping [node-state]
Have more questions? Submit a request


Powered by Zendesk