Auto Scanning and Remediation for a Telco in APAC - Can Admins Have a Good Night's Sleep?
Last year, during the first wave of the pandemic, while we were still evaluating the impact of the situation, we were given a unique opportunity: automating the OS upgrade process at a telco. We had never done such a project before, so we quickly started work.
The customer had hundreds of servers running disparate operating systems such as Windows, Linux (multiple flavours), and Sun Solaris. They had put in place a policy of updating servers to match the hardened images defined by Center for Internet Security (CIS) norms (https://www.cisecurity.org/). Updating so many live servers presented several problems:
Scanning to find which servers needed an update
Manually updating all servers
Checking whether each upgrade task succeeded
Committing the changes on success
Rolling back the changes on failure
We used Ansible playbooks to automate the complete cycle, from scanning through update and remediation to rollback if required. Because the outcome depends on many pass/fail parameters, the playbook has to be written in detail. It should cover every foreseeable scenario, since a small change (or omission) can cause the whole operation to be marked ‘failed’ once the playbook runs.
Red Hat Ansible