Known Issues

Sean Horn -

Automate 1.8.3+ Bad Cronjob Definition

  • Run the Reaper manually, using the command line from /etc/cron.d/reaper, to clean up.
  • Then, once cleaning up manually is no longer workable, add a newline to the end of the file /opt/delivery/embedded/cookbooks/delivery/templates/default/reaper.cron.erb and reconfigure the Automate system. A bug in Automate 1.8.3 keeps the Reaper from ever running, because the generated file /etc/cron.d/reaper is not in the format that most Linux cron daemons expect.
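
The newline fix above can be sketched as a small shell helper that appends a newline only when the file does not already end with one, so it is safe to run more than once. The function name ensure_trailing_newline is hypothetical; the path and the reconfigure command are the ones given in this article.

```shell
#!/bin/sh
# Append a newline to FILE only when its last byte is not already a newline.
ensure_trailing_newline() {
  file=$1
  # An empty or missing file has nothing to terminate.
  [ -s "$file" ] || return 0
  # tail -c 1 prints the last byte. Command substitution strips a trailing
  # newline, so an empty result means the file already ends with one.
  if [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
}

# Usage on the Automate server, as root:
#   ensure_trailing_newline /opt/delivery/embedded/cookbooks/delivery/templates/default/reaper.cron.erb
#   automate-ctl reconfigure
```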

You can check that the change to the cron job took effect and that the cron daemon accepted it by looking in /var/log/syslog, /var/log/messages, or /var/log/cron. My /var/log/cron has this, for example. The first message shows that the system read and accepted the crontab file. The second repeats each time the system runs the job.

Apr 18 04:56:01 automate-432 crond[1218]: (*system*) RELOAD (/etc/cron.d/reaper)
May 13 05:00:01 automate-432 CROND[5283]: (root) CMD (/var/opt/delivery/reaper/run_reaper.sh)

You should be monitoring Reaper runs. Monitoring the total number of indices (this number should not change by more than 1 or 2 indices a day while the system remains in a single configuration) or reading the Reaper's logfile would also have caught that it was not running.
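
As a rough sketch of the index-count monitoring described above, the helpers below count index names read from standard input and flag a day-over-day change larger than a limit. Both function names are hypothetical. The usage comment assumes Elasticsearch answers on localhost:9200, as in the .locky command elsewhere in this article, and uses the Elasticsearch _cat/indices endpoint.

```shell
#!/bin/sh
# Count non-empty lines on stdin; each line is expected to be one index name.
count_indices() {
  awk 'NF { n++ } END { print n + 0 }'
}

# Succeed (exit 0) when the index count moved by more than LIMIT (default 2).
indices_changed_too_much() {
  prev=$1
  cur=$2
  limit=${3:-2}
  diff=$((cur - prev))
  if [ "$diff" -lt 0 ]; then
    diff=$((0 - diff))
  fi
  [ "$diff" -gt "$limit" ]
}

# Usage (assumed endpoint; yesterday's count must be stored by your own tooling):
#   today=$(curl -s 'http://localhost:9200/_cat/indices?h=index' | count_indices)
#   indices_changed_too_much "$yesterday" "$today" && echo "check the Reaper"
```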

Automate 1.8.38 and before faulty profile encoding

This issue occurs when using Automate as a profile store with the audit cookbook running on target machines. When the audit cookbook's InSpec run downloads the profile from the Automate system, the error below is raised. The CIS-RHEL7-level2-2.1.1-9 profile is known to have this problem; 2.1.1-12 does not. Please open a ticket with Support to get an updated profile until the fixed profiles ship in the next Automate release.

ERROR: Report handler Chef::Handler::AuditReport raised #<Encoding::UndefinedConversionError: "\xC2" from ASCII-8BIT to UTF-8>
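
The "\xC2" in the error is a non-ASCII byte. If you have a local copy of a profile's files, a sketch like the following can list the lines that contain such bytes. That the offending byte lives in a particular profile file is an assumption, and the function name find_non_ascii_lines is hypothetical.

```shell
#!/bin/sh
# Print "line-number:line" for every line of FILE that contains a byte
# outside printable ASCII and whitespace. LC_ALL=C makes grep work on bytes,
# and -a forces it to treat the file as text.
find_non_ascii_lines() {
  LC_ALL=C grep -a -n '[^[:print:][:space:]]' "$1"
}

# Usage (hypothetical file name inside an unpacked profile):
#   find_non_ascii_lines controls/example.rb
```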

ChefDK 2.x Runners incompatible with Automate < 0.8.14

Waiting for a worker.
Job dispatch on job runner example-dev.example.org has failed:
Exit code: 1
Details:
Executing remotely on [example-dev.example.org]...
/opt/chefdk/embedded/lib/ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- chef/rest (LoadError)
from /opt/chefdk/embedded/lib/ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/local/bin/delivery-cmd:5:in `<main>'
Connection to example-dev.example.org closed.

The problem is in /usr/local/bin/delivery-cmd on that builder, which uses the old chef/rest API that these later versions of the ChefDK no longer include. That delivery-cmd program was likely installed by an Automate server older than 0.8.14; Automate 0.8.14 and newer use chef/server_api, which the newer ChefDKs do include. Rolling back the ChefDK to something around 1.6.x is one workaround, but to install runners with a ChefDK newer than about 1.6.x you need a newer version of Automate.
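
To check whether a runner still carries the old delivery-cmd, a minimal sketch is to look for the chef/rest require in the script. The exact form of the require line is an assumption based on the backtrace above, and the function name uses_chef_rest is hypothetical.

```shell
#!/bin/sh
# Succeed (exit 0) when FILE contains a Ruby require of chef/rest.
uses_chef_rest() {
  grep -Eq "require[[:space:]]+['\"]chef/rest['\"]" "$1"
}

# Usage on the runner:
#   if uses_chef_rest /usr/local/bin/delivery-cmd; then
#     echo "delivery-cmd was installed by an older Automate"
#   fi
```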

Chef Server 413 opscode-erchef max_request_size and Automate Compliance

In all versions of Chef Server that support proxying data-collector reports to an Automate system, the following error appears in the chef-client logfile when larger chef-clients (those sending large reports) try to save their node and Compliance data reports back through the Chef Server to the Automate system.

413 "Request Entity Too Large"

To fix this, set the following in /etc/opscode/chef-server.rb on the Chef Server system:

opscode_erchef['max_request_size'] = 2000000

and then reconfigure with:

chef-server-ctl reconfigure
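
If you script this change, a small sketch like the one below avoids appending the setting twice. The function name set_max_request_size is hypothetical; the setting, value, file path, and reconfigure command are the ones shown above.

```shell
#!/bin/sh
# Append the max_request_size setting to FILE unless a line for it already exists.
set_max_request_size() {
  file=$1
  size=${2:-2000000}
  if grep -q "^opscode_erchef\['max_request_size'\]" "$file" 2>/dev/null; then
    echo "max_request_size already set in $file"
  else
    printf "opscode_erchef['max_request_size'] = %s\n" "$size" >> "$file"
  fi
}

# Usage on the Chef Server, as root:
#   set_max_request_size /etc/opscode/chef-server.rb
#   chef-server-ctl reconfigure
```

The helper does not change an existing value; if the setting is already present with a smaller number, edit it by hand.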

Zendesk 17268

Chef Server Compatibility

Due to various issues between Chef Server 12.14.0 and 12.15.5, we recommend that you only use Chef Server 12.11.1 or 12.15.8 with Automate 0.7.151 - 0.7.239.

If you are running Chef Backend HA 1.3.2, you should run Chef Server 12.15.8 only, because search does not work in Chef Server 12.11.1 when it is combined with Chef Backend HA.

Delivery Service Failure Loop for Builder Jobs, but no Push Jobs server on Chef Server

Automate 0.8.5 and earlier show the delivery service failing in a loop when it tries to dispatch a job that was previously configured for Push Jobs while the Chef Server that Automate is configured to use is not running a Push Jobs server. The current log looks like this:

 2017-05-26_21:12:35.73155 17:12:35.715 [error] CRASH REPORT Process  with 6 neighbours exited with reason: no match of right hand value {error,{ok,"404",[{"Server","openresty/1.11.2.1"},{"Date","Fri, 26 May 2017 21:12:52 GMT"},{"Content-Type","text/html"},{"Content-Length","1084"},{"Connection","keep-alive"},{"ETag","\"590e00ca-43c\""}],<<"\n\n\n\n\nChef - 404 Not Found\n  >}} in deliv_push_job:find_node/1 line 239 in gen_server:terminate/7 line 826

2017-05-26_21:12:35.74116 17:12:35.717 [error] Supervisor deliv_stage_sup had child deliv_stage started with deliv_stage:start_link({deliv_stage,{deliv_change,<<"5d095590-2144-4a37-850e-b4ce623352be">>,100,<<"initialize-delivery-p...">>,...},...}) at  exit with reason no match of right hand value {error,{ok,"404",[{"Server","openresty/1.11.2.1"},{"Date","Fri, 26 May 2017 21:12:52 GMT"},{"Content-Type","text/html"},{"Content-Length","1084"},{"Connection","keep-alive"},{"ETag","\"590e00ca-43c\""}],<<"\n\n\n\n\nChef - 404 Not Found\n  >}} in deliv_push_job:find_node/1 line 239 in context child_terminated

2017-05-26_21:12:35.74131 17:12:35.721 [info] audit_log stage_name=started; action=stage_verify; event={audit_stage_event,started,{{2017,5,26},{21,12,35}},<<"running">>,<<"5d095590-2144-4a37-850e-b4ce623352be">>,<<"New pipeline verification commit">>,<<"ns2-dev">>,<<"Chef">>,<<"master">>,<<"windows_rdp">>,verify,{{2017,5,10},{20,9,9.888081}},<<"i826342">>,undefined,undefined};
2017-05-26_21:12:35.74652 17:12:35.723 [error] chef_api:do_push_client_status got bad response: {ok,"404",[{"Server","openresty/1.11.2.1"},{"Date","Fri, 26 May 2017 21:12:52 GMT"},{"Content-Type","text/html"},{"Content-Length","1084"},{"Connection","keep-alive"},{"ETag","\"590e00ca-43c\""}],<<"\n\n\n\n\nChef - 404 Not Found\n  \n  \n\n\n

Reaper Failure 0.6.x - 0.8.5

You can fix the Reaper failure in these versions by replacing the template at /opt/delivery/embedded/cookbooks/delivery/templates/default/reaper.cron.erb with the following content and then running `automate-ctl reconfigure`.

<%=
# cron.d/run-parts will only execute the first line of a cron.d/cron entry.
# Here we're composing the cron entry in a legible way and then trimming
# newlines and extra spaces so that the rendered entry is one command on a single
# line.
<<-CRON.gsub(/\n/, ' ').gsub(/\s{2,}/, ' ')
*/15 * * * * root
PATH=/opt/delivery/embedded/bin:$PATH
CURATOR_ELASTICSEARCH_HOST='127.0.0.1'
CURATOR_ELASTICSEARCH_PORT='8080'
CURATOR_ELASTICSEARCH_PREFIX='/elasticsearch'
REAPER_WORKFLOW_API_HOST='#{node['delivery']['delivery']['listen']}'
REAPER_WORKFLOW_API_PORT='#{node['delivery']['delivery']['port']}'
ruby /opt/delivery/embedded/service/reaper/bin/reaper
--config #{node['delivery']['reaper']['conf_file']}
--log-file #{node['delivery']['reaper']['log_file']}
CRON
%>

Afterwards, you can test whether it is working with the following command. Then read the /var/log/delivery/reaper/reaper.log logfile to confirm that the run happened. If it is working, you will see messages showing that the Reaper deleted unneeded Elasticsearch indices.

PATH=/opt/delivery/embedded/bin:$PATH \
CURATOR_ELASTICSEARCH_HOST='127.0.0.1' \
CURATOR_ELASTICSEARCH_PORT='8080' \
CURATOR_ELASTICSEARCH_PREFIX='/elasticsearch' \
REAPER_WORKFLOW_API_HOST='127.0.0.1' \
REAPER_WORKFLOW_API_PORT='9611' \
ruby /opt/delivery/embedded/service/reaper/bin/reaper \
  --config /var/opt/delivery/reaper/reaper_config.json \
  --log-file /var/log/delivery/reaper/reaper.log

Reaper reconfigure failure

If you ever see this on Automate during a reconfigure:

[2017-10-29T20:45:02-05:00] INFO: Reading config from /var/opt/delivery/reaper/reaper_config.json...
[2017-10-29T20:45:02-05:00] ERROR: Elasticsearch lock index not created. Please run `automate-ctl reconfigure`

Then do the following on the Automate system command line as root. Sometimes the .locky index doesn't get created. The Reaper uses it to keep track of the snapshots it uses. After the index is created, the reconfigure should succeed.

```
curl -XPUT 'http://localhost:9200/.locky'
automate-ctl reconfigure
```
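
Before creating the index, you may want to confirm that it is really missing. The sketch below maps the HTTP status of a HEAD request on the index to an action; Elasticsearch answers 200 when an index exists and 404 when it does not. The function name locky_action is hypothetical, and the host and port are the ones used in the command above.

```shell
#!/bin/sh
# Translate the HTTP status of "HEAD /.locky" into the action to take.
locky_action() {
  case "$1" in
    200) echo "present" ;;  # index exists, nothing to create
    404) echo "create" ;;   # index missing, create it with the PUT above
    *)   echo "unknown" ;;  # Elasticsearch unreachable or unexpected status
  esac
}

# Usage on the Automate server, as root:
#   status=$(curl -s -o /dev/null -w '%{http_code}' -I 'http://localhost:9200/.locky')
#   locky_action "$status"
```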

delivery setup failure

If you see this after any delivery-cli command, but especially `delivery setup`, the issue is that your terminal doesn't support colors, but the delivery command needs to use them for highlights.

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ColorOutOfRange',
/buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libcore/result.rs:837
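
To see how many colors your terminal advertises, a minimal sketch is to ask tput, assuming it is installed. The function name term_color_count is hypothetical. Exporting a TERM value such as xterm-256color before running delivery may help when the count is low, but that workaround is an assumption and is not confirmed by this article.

```shell
#!/bin/sh
# Print the number of colors the current terminal reports, or 0 when tput is
# missing, TERM is unset, or the terminal reports no color support.
term_color_count() {
  n=$(tput colors 2>/dev/null) || n=0
  case "$n" in
    ''|*[!0-9]*) n=0 ;;  # empty or non-numeric output (for example -1) counts as 0
  esac
  echo "$n"
}

# Usage:
#   echo "TERM=$TERM reports $(term_color_count) colors"
```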

 
