Advanced Ansible Usage for Enterprise Environments

Eran Goldman-Malka · November 13, 2025

Static inventory works until your infrastructure is defined somewhere else—a CMDB, a cloud provider, a service registry. At that point maintaining a hand-edited hosts file is a compliance problem as much as an operational one. Enterprise Ansible means dynamic inventory, tuned parallelism, robust error handling, and pipelines that can patch five hundred nodes without manual intervention.

Dynamic Inventory

Dynamic inventory scripts or plugins query an external source at runtime and return a JSON structure Ansible treats as a normal inventory.

AWS EC2 (using the built-in plugin):

inventory/aws_ec2.yml:

plugin: amazon.aws.aws_ec2
regions:
  - eu-west-1
  - us-east-1
filters:
  instance-state-name: running
  tag:Environment: production
keyed_groups:
  - key: tags.Role
    prefix: role
  - key: tags.OS
    prefix: os
compose:
  ansible_host: public_ip_address

Run against it directly:

ansible-inventory -i inventory/aws_ec2.yml --list
ansible-playbook -i inventory/aws_ec2.yml patch-linux.yml

Custom CMDB or API source — write a script that outputs JSON conforming to the Ansible dynamic inventory format and mark it executable. Ansible calls it automatically.

chmod +x inventory/cmdb_inventory.py
ansible-playbook -i inventory/cmdb_inventory.py patch-linux.yml
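A minimal sketch of such a script, assuming a stand-in `fetch_hosts()` in place of a real CMDB query (the group and host names are illustrative only). It implements the two-call contract Ansible expects: `--list` returns the full inventory, and `--host <name>` returns per-host variables — though with a populated `_meta` section Ansible never actually calls `--host`:

```python
#!/usr/bin/env python3
"""Minimal dynamic inventory sketch. Replace fetch_hosts() with a real
CMDB or REST API call; groups and hosts below are placeholders."""
import json
import sys


def fetch_hosts():
    # Stand-in for a CMDB/API query returning group -> hosts.
    return {
        "webservers": ["web01.example.com", "web02.example.com"],
        "dbservers": ["db01.example.com"],
    }


def build_inventory():
    groups = fetch_hosts()
    inventory = {"_meta": {"hostvars": {}}}
    for group, hosts in groups.items():
        inventory[group] = {"hosts": hosts}
        for host in hosts:
            # Per-host variables would normally come from the CMDB record.
            inventory["_meta"]["hostvars"][host] = {}
    return inventory


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "--host":
        # With _meta populated this path is never hit, but the dynamic
        # inventory contract says a script must answer --host with JSON.
        print(json.dumps({}))
    else:
        print(json.dumps(build_inventory()))
```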

Parallelism and Forks Tuning

By default Ansible runs against 5 hosts in parallel. For large environments, increase this in ansible.cfg:

[defaults]
forks = 20

Or override at runtime:

ansible-playbook patch-linux.yml -f 50

For patching specifically, use serial to control rollout batch size regardless of forks:

- name: Rolling patch across web tier
  hosts: webservers
  become: true
  serial: 10          # patch 10 hosts at a time
  # serial: "20%"     # or patch 20% of the group at a time
  roles:
    - patching

serial is your primary tool for staggered patching. It ensures that if a batch fails, the remaining hosts are untouched.
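serial pairs naturally with max_fail_percentage, which aborts the entire play if too many hosts in the current batch fail — a sketch:

```yaml
- name: Rolling patch with an abort threshold
  hosts: webservers
  become: true
  serial: 10
  # If more than 20% of the hosts in a batch fail, stop the play
  # entirely, leaving the remaining batches untouched.
  max_fail_percentage: 20
  roles:
    - patching
```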


Error Handling

ignore_errors — continue the play even if a task fails (use sparingly; failures should usually stop execution):

- name: Try optional post-patch script
  ansible.builtin.command: /opt/scripts/post-patch-hook.sh
  ignore_errors: true

failed_when — define what failure means for a task:

- name: Check if reboot is required
  ansible.builtin.command: needs-restarting -r
  register: reboot_check
  failed_when: reboot_check.rc not in [0, 1]  # 0 = no, 1 = yes, anything else is real failure
  changed_when: false
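The registered result can then drive a conditional reboot — here keyed to the `needs-restarting -r` convention above, where exit code 1 means a reboot is pending:

```yaml
- name: Reboot only when the check said one is needed
  ansible.builtin.reboot:
    reboot_timeout: 600   # seconds to wait for the host to come back
  when: reboot_check.rc == 1
```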

block / rescue / always — structured try/catch/finally:

- block:
    - name: Apply patches
      ansible.builtin.apt:
        upgrade: dist

    - name: Run post-patch validation
      ansible.builtin.command: /opt/validate.sh

  rescue:
    - name: Alert on failure
      ansible.builtin.uri:
        url: "{{ alert_webhook_url }}"   # e.g. a Slack incoming webhook, defined in group_vars
        method: POST
        body_format: json
        body:
          text: "Patch failed on {{ inventory_hostname }}"

  always:
    - name: Log result regardless of outcome
      ansible.builtin.lineinfile:
        path: /var/log/ansible-patch.log
        line: "{{ inventory_hostname }} patched at {{ ansible_date_time.iso8601 }}"
        create: true

Conditional Execution

Patch only hosts meeting specific criteria:

- name: Patch only if uptime exceeds 30 days
  ansible.builtin.apt:
    upgrade: dist
  when: (ansible_uptime_seconds | int) > (30 * 86400)

- name: Patch only if kernel is outdated
  ansible.builtin.apt:
    upgrade: dist
  when: ansible_kernel != kernel_target_version

- name: Patch only Debian-family hosts
  ansible.builtin.apt:
    upgrade: dist
  when: ansible_os_family == "Debian"
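These conditions can also be combined — a list under when is implicitly AND-ed:

```yaml
- name: Patch only long-running Debian hosts with an outdated kernel
  ansible.builtin.apt:
    upgrade: dist
  when:
    - ansible_os_family == "Debian"
    - (ansible_uptime_seconds | int) > (30 * 86400)
    - ansible_kernel != kernel_target_version
```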

Handlers and Service Restarts

Handlers run once, at the end of the play, and only if notified by a task that reported a change. This prevents unnecessary restarts:

tasks:
  - name: Apply patches
    ansible.builtin.apt:
      upgrade: dist
    notify:
      - Restart nginx
      - Restart sshd

handlers:
  - name: Restart nginx
    ansible.builtin.service:
      name: nginx
      state: restarted

  - name: Restart sshd
    ansible.builtin.service:
      name: ssh
      state: restarted

If the patch task reports changed: false (nothing to update), the handlers never fire. If the play fails mid-run, use --force-handlers to run handlers anyway during cleanup.
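If you need handlers to fire earlier — say, to validate a service in the same play after its restart — you can flush them mid-play. A sketch (the health-check URL is hypothetical):

```yaml
tasks:
  - name: Apply patches
    ansible.builtin.apt:
      upgrade: dist
    notify: Restart nginx

  # Run any pending handlers now instead of at the end of the play,
  # so the validation step below sees the restarted service.
  - name: Flush handlers mid-play
    ansible.builtin.meta: flush_handlers

  - name: Validate service after restart
    ansible.builtin.uri:
      url: http://localhost/healthz   # hypothetical health endpoint
      status_code: 200
```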


CI/CD Integration

GitHub Actions:

name: Weekly Patch Run

on:
  schedule:
    - cron: '0 2 * * 0'   # every Sunday at 02:00 UTC
  workflow_dispatch:        # allow manual trigger

jobs:
  patch:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Ansible
        run: pip install ansible

      - name: Install collections
        run: ansible-galaxy collection install -r requirements.yml

      - name: Write vault password
        run: echo "${{ secrets.VAULT_PASS }}" > /tmp/.vault && chmod 600 /tmp/.vault

      - name: Write SSH key
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.SSH_PRIVATE_KEY }}" > ~/.ssh/ansible_id
          chmod 600 ~/.ssh/ansible_id

      - name: Run patch playbook
        run: |
          ansible-playbook -i inventory/production/hosts.yml \
            playbooks/patch-linux.yml \
            --vault-password-file /tmp/.vault \
            --limit staging_first   # always run against a canary group first

      - name: Cleanup secrets
        if: always()
        run: rm -f /tmp/.vault ~/.ssh/ansible_id

GitLab CI:

patch_linux:
  stage: deploy
  image: python:3.12
  before_script:
    - pip install ansible
    - ansible-galaxy collection install -r requirements.yml
    - echo "$VAULT_PASS" > /tmp/.vault && chmod 600 /tmp/.vault
  script:
    - ansible-playbook -i inventory/production/hosts.yml playbooks/patch-linux.yml
        --vault-password-file /tmp/.vault
  after_script:
    - rm -f /tmp/.vault
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"

Previous: Ansible Architecture and Best Practices

Next in the series: Linux Server Patching with Ansible — apt vs dnf vs zypper, kernel patching, reboot handling, and rolling update workflows.
