This article describes the system tunings and checks needed to keep your Chef Automate instance performant and able to accept configuration changes. These matter both for general operability and for recovering from certain failure states.
Preparation: Size your host instance(s) according to the published minimum system requirements and our scaling documentation (Chef Automate: Deployment Planning and Performance tuning, transcribed from 'Scaling Chef Automate Beyond 100,000 nodes').
Configure monitoring and alerting with thresholds that leave ample time to request more disk space and/or rotate logs. Also confirm that there is enough local disk space for data migrations during upgrades. We recommend keeping 50% of the disk free if backups are directed to a remote share/location, or 70% free if a local backup location is part of the implementation.
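The free-space recommendation above can be checked with a small script. The following is a minimal sketch, not an official Chef tool: the function names free_pct and check_free are hypothetical, and free space is approximated as 100 minus the "Capacity" column of df -P, which can differ slightly from the true figure on filesystems with reserved blocks.

```shell
#!/bin/sh
# Sketch only: free_pct and check_free are hypothetical helper names.

# free_pct PATH
# Prints the approximate percentage of free space on the filesystem holding PATH.
free_pct() {
    # df -P prints one line per filesystem; column 5 is used capacity, e.g. "37%".
    used=$(df -P "$1" 2>/dev/null | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
    [ -n "$used" ] || return 1
    echo $((100 - used))
}

# check_free PATH MIN_FREE_PCT
# Returns 0 if at least MIN_FREE_PCT percent is free, 1 if not, 2 on error.
check_free() {
    free=$(free_pct "$1") || {
        echo "ERROR: cannot read disk usage for $1" >&2
        return 2
    }
    if [ "$free" -ge "$2" ]; then
        echo "OK: $1 has ${free}% free (minimum ${2}%)"
        return 0
    fi
    echo "WARNING: $1 has ${free}% free (minimum ${2}%)"
    return 1
}
```

For example, check_free /hab 50 would apply the remote-backup threshold, and check_free /hab 70 the local-backup threshold. The path /hab is an assumption about where Automate data lives on your host; substitute the mount point that actually holds it.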
Design: If your needs exceed what a standalone Chef Automate instance can sustain, consider a multi-node topology. Contact your account team for further information on deploying a clustered instance of Chef Automate.
Application: Most importantly, ensure that the following system tunings are in place:
# sysctl vm.max_map_count
vm.max_map_count = 262144
# sysctl vm.dirty_expire_centisecs
vm.dirty_expire_centisecs = 20000
If they are not, set them:
# sysctl -w vm.max_map_count=262144
# sysctl -w vm.dirty_expire_centisecs=20000
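And add them to /etc/sysctl.conf so that they persist through system reboots. Once added, the end of the file should show both settings:
# tail -n 2 /etc/sysctl.conf
vm.max_map_count=262144
vm.dirty_expire_centisecs=20000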
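Persisting the settings can be made repeatable, so that running the step twice does not leave duplicate lines in the file. The sketch below is one way to do that under stated assumptions: persist_sysctl is a hypothetical helper name, and it assumes each key appears on its own line in key=value form.

```shell
#!/bin/sh
# Sketch only: persist_sysctl is a hypothetical helper name.

# persist_sysctl KEY VALUE FILE
# Ensures FILE contains exactly one "KEY = VALUE" line,
# replacing an existing entry for KEY or appending a new one.
persist_sysctl() {
    key=$1
    value=$2
    file=$3
    touch "$file" || return 1
    # Escape dots so the key is matched literally in the regular expression.
    pat=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${pat}[[:space:]]*=" "$file"; then
        tmp=$(mktemp) || return 1
        if sed -E "s|^[[:space:]]*${pat}[[:space:]]*=.*|${key} = ${value}|" "$file" > "$tmp"; then
            # Write back with cat so the original file keeps its ownership and mode.
            cat "$tmp" > "$file"
            rc=$?
        else
            rc=1
        fi
        rm -f "$tmp"
        return "$rc"
    fi
    printf '%s = %s\n' "$key" "$value" >> "$file"
}
```

As root, this would be called as persist_sysctl vm.max_map_count 262144 /etc/sysctl.conf and persist_sysctl vm.dirty_expire_centisecs 20000 /etc/sysctl.conf, followed by sysctl -p to load the file without a reboot.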
Analysis: If you still see errors or cannot apply new Automate configuration, you may also need to raise the limit on file descriptors available to the service.
Remediation: Open up the file handle limit for Automate services:
# systemctl stop chef-automate
# mkdir -p /etc/systemd/system/chef-automate.service.d
# cat <<EOF >> /etc/systemd/system/chef-automate.service.d/custom.conf
[Service]
LimitNOFILE = 128000
EOF
# systemctl daemon-reload
# systemctl start chef-automate
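After the restart, it is worth confirming that the running service actually picked up the new limit. The sketch below reads the effective limit from /proc on Linux; nofile_limit and check_nofile are hypothetical helper names, and the process ID must be supplied by the caller.

```shell
#!/bin/sh
# Sketch only: nofile_limit and check_nofile are hypothetical helper names.

# nofile_limit PID
# Prints the soft "Max open files" limit of a running process (Linux /proc).
nofile_limit() {
    # Line format: "Max open files  <soft>  <hard>  files"
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits" 2>/dev/null
}

# check_nofile PID MIN
# Returns 0 if the process's soft limit is at least MIN, 1 if not, 2 on error.
check_nofile() {
    limit=$(nofile_limit "$1")
    [ -n "$limit" ] || {
        echo "ERROR: cannot read limits for PID $1" >&2
        return 2
    }
    # An "unlimited" value always satisfies the minimum.
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -ge "$2" ]
}
```

On a systemd host, the main PID can usually be obtained with systemctl show chef-automate --property=MainPID --value (the --value option assumes a reasonably recent systemd), so check_nofile "$(systemctl show chef-automate --property=MainPID --value)" 128000 would verify the setting above. The configured value can also be read directly with systemctl show chef-automate --property=LimitNOFILE.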
Further Reading: N/A