If entries like the following appear continuously in '/var/log/opscode/opscode-pushy-server/current' on the active Chef Server backend, the most likely cause is that the system times on the push jobs client and the push jobs server do not match. The two clocks must be in sync for push jobs runs to work reliably.
2016-04-12_22:50:21.86752 18:50:07.232 [error] (pushy_node_state:364) <0.21274.0> Bad timestamp in message: ={"node":"vm-2628-5132.someone.net","client":"vm-2628-5132","org":"someonechef","type":"heartbeat","sequence":142,"timestamp":"Tue, 12 Apr 2016 22:38:44 GMT","incarnation_id":"223c888a-eca6-4d70-8403-7e128b5f9e0b","job_state":"idle","job_id":null}
The node 'vm-2628-5132.someone.net' has a system time about 12 minutes behind that of the opscode-pushy-server it exchanges heartbeats with and receives jobs from: the heartbeat carries a timestamp of 22:38:44 GMT, while the server logged the error at about 22:50.
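To measure the gap rather than estimate it by eye, a short script can compare the two timestamps in a "Bad timestamp" entry. The following is a minimal sketch, not part of Push Jobs itself: the function name heartbeat_skew_seconds is hypothetical, and it assumes the leading log timestamp is in UTC and that the JSON payload runs to the end of the line.

```python
import json
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Sample entry copied from the log excerpt in this article.
SAMPLE = (
    '2016-04-12_22:50:21.86752 18:50:07.232 [error] (pushy_node_state:364) '
    '<0.21274.0> Bad timestamp in message: ='
    '{"node":"vm-2628-5132.someone.net","client":"vm-2628-5132",'
    '"org":"someonechef","type":"heartbeat","sequence":142,'
    '"timestamp":"Tue, 12 Apr 2016 22:38:44 GMT",'
    '"incarnation_id":"223c888a-eca6-4d70-8403-7e128b5f9e0b",'
    '"job_state":"idle","job_id":null}'
)


def heartbeat_skew_seconds(log_line: str) -> float:
    """Seconds by which the message timestamp lags the time it was logged."""
    line = log_line.strip()
    # First field: date_time stamp written by the log service (assumed UTC).
    stamp = line.split(" ", 1)[0]
    logged_at = datetime.strptime(stamp, "%Y-%m-%d_%H:%M:%S.%f")
    logged_at = logged_at.replace(tzinfo=timezone.utc)
    # The JSON payload starts at the first "{" (assumed to run to end of line).
    payload = json.loads(line[line.index("{"):])
    # The client's timestamp is an RFC 2822 style date such as
    # "Tue, 12 Apr 2016 22:38:44 GMT".
    sent_at = parsedate_to_datetime(payload["timestamp"])
    return (logged_at - sent_at).total_seconds()


if __name__ == "__main__":
    skew = heartbeat_skew_seconds(SAMPLE)
    print(f"heartbeat lags the server log by {skew / 60:.1f} minutes")
```

A large positive value means the message timestamp is well behind the server. The number alone does not say whether the cause is a wrong clock or a delayed message, so it should be read alongside the possible causes listed in this article.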
This issue can happen in several ways:
- Push jobs server and client times not synced
- Push jobs clients separated from the server by a firewall
- A push jobs client may have the correct time but be so busy for other reasons that it cannot respond to heartbeats in a timely manner, so its heartbeat responses arrive too late and the server rejects them
- A push jobs server on a small system can receive such a flood of resigned key renewals that it falls too far behind to process them within their TTL. In the log file this looks exactly like the clock desync case. The distinguishing sign is load: the Chef Server hosting the Push Jobs Server will be extremely busy, and the Push Jobs Server will be consuming a lot of CPU time. This situation occurs with thousands of nodes.
For another way to check the health of push jobs client nodes, see https://docs.chef.io/plugin_knife_push_jobs.html#node-status. This requires a knife client configured against the desired Chef Server, with the knife-push gem installed.
The output will look something like this if the push jobs client nodes are available and ready to run jobs:
knife node status
acceptance-node-1 available
build-node-test-1 available
build-node-test-2 available
build-node-test-3 available
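With many nodes it can help to filter this output for nodes that are not ready. The following is a minimal sketch that assumes each node appears as a name followed by a single status word, as in the listing above. The helper names parse_node_status and nodes_not_available are hypothetical, and the text passed in is whatever was captured from running knife node status.

```python
def parse_node_status(output: str) -> dict:
    """Map node name to status from captured 'knife node status' text."""
    statuses = {}
    for line in output.splitlines():
        fields = line.split()
        # Keep only "name status" pairs; this skips blank lines and
        # an echoed command line such as "knife node status".
        if len(fields) == 2:
            name, status = fields
            statuses[name] = status
    return statuses


def nodes_not_available(output: str) -> list:
    """Return sorted names of nodes whose status is not 'available'."""
    statuses = parse_node_status(output)
    return sorted(name for name, status in statuses.items()
                  if status != "available")


if __name__ == "__main__":
    captured = """knife node status
acceptance-node-1 available
build-node-test-1 available
build-node-test-2 available
build-node-test-3 available
"""
    print(nodes_not_available(captured))
```

Any node this returns is a good candidate for checking clock sync and load first.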