ansible-expert

Expert Ansible automation covering playbook structure, inventory design, variable precedence, idempotency patterns, roles with dependencies, handlers, Jinja2 templating, Vault secrets, selective execution with tags, Molecule for testing, and AWX/Tower integration.

MoltbotDen
DevOps & Cloud

Ansible Expert

Ansible is infrastructure as code without a daemon — push-based, SSH-native, and readable by anyone
who can read YAML. Its power and peril are the same thing: it's easy to write Ansible that works once
but isn't idempotent. Expert Ansible means every task can run 100 times and leave the system in exactly
the same state as after the first run.

Core Mental Model

Ansible's execution model is: inventory (what hosts exist) + playbook (what to do) + variables
(how to customize). The play runs tasks against hosts, and tasks call modules. Modules do the
idempotent work. Your job is to use modules, not shell commands — modules check state before changing
it; shell commands don't. Variables have a complex precedence order (22 levels!), but in practice: role
defaults < group_vars < host_vars < playbook vars < extra-vars. Understand this or spend hours debugging
mysterious variable values.
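As a minimal sketch of that model, the play below targets the webservers group from the inventory, runs one task that calls the apt module, and reads one variable. The file name minimal.yml and the variable web_packages are hypothetical, chosen only for illustration.

```yaml
# minimal.yml: one play, one task, one module call
---
- name: Install baseline packages      # the play maps hosts to tasks
  hosts: webservers                    # group resolved from the inventory
  become: true
  vars:
    web_packages: [nginx, git]         # hypothetical play-level variable
  tasks:
    - name: Ensure packages are present
      ansible.builtin.apt:             # the module checks state before changing it
        name: "{{ web_packages }}"
        state: present
```

Assuming nothing else touches the hosts, a first run should report the task as changed and a second run as ok. Passing -e '{"web_packages": ["nginx"]}' on the command line would override the play variable, since extra vars sit at the top of the precedence order.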

Inventory Design

inventory/
├── production/
│   ├── hosts.ini           # Static hosts
│   ├── gcp.yaml            # Dynamic inventory plugin
│   ├── group_vars/
│   │   ├── all.yaml        # Variables for ALL hosts
│   │   ├── webservers.yaml # Variables for webservers group
│   │   └── databases/
│   │       ├── vars.yaml   # Non-sensitive vars
│   │       └── vault.yaml  # Encrypted secrets (ansible-vault)
│   └── host_vars/
│       └── db-prod-01.yaml # Host-specific variables
└── staging/
    ├── hosts.ini
    └── group_vars/
        └── all.yaml

Static Inventory (hosts.ini)

[webservers]
web-01.example.com ansible_host=10.0.1.10
web-02.example.com ansible_host=10.0.1.11

[databases]
db-primary.example.com ansible_host=10.0.2.10 db_role=primary
db-replica.example.com ansible_host=10.0.2.11 db_role=replica

[webservers:vars]
ansible_user=ubuntu
ansible_python_interpreter=/usr/bin/python3
nginx_version=1.24

[all:vars]
ansible_ssh_private_key_file=~/.ssh/production_key
ansible_ssh_common_args='-o StrictHostKeyChecking=accept-new'

Dynamic Inventory (GCP)

# inventory/production/gcp.yaml
plugin: google.cloud.gcp_compute
projects:
  - my-gcp-project
regions:
  - us-central1
filters:
  - status = RUNNING
hostnames:
  - name
groups:
  webservers: "'webserver' in labels"
  databases: "'database' in labels"
compose:
  ansible_host: networkInterfaces[0].networkIP
  db_tier: labels.tier
  environment: labels.environment

Variable Precedence (Low → High, 22 levels)

1.  command line values (for example -u my_user; these are not variables)
2.  role defaults (defaults/main.yml)
3.  inventory file or script group vars
4.  inventory group_vars/all
5.  playbook group_vars/all
6.  inventory group_vars/*
7.  playbook group_vars/*
8.  inventory file or script host vars
9.  inventory host_vars/*
10. playbook host_vars/*
11. host facts / cached set_facts
12. play vars
13. play vars_prompt
14. play vars_files
15. role vars (vars/main.yml)
16. block vars (only for tasks in that block)
17. task vars (only for that task)
18. include_vars
19. set_facts / registered vars
20. role (and include_role) params
21. include params
22. extra vars (command line -e) ← ALWAYS WINS

Practical rule: Use defaults/main.yml for role defaults (easily overridden). Use vars/main.yml
only for role-internal constants that inventory and play variables should not override (extra vars
still win). Use group_vars for environment config.
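The rule above can be sketched with three files, shown here as separate YAML documents. The variable nginx_conf_dir and both of its values are hypothetical. The point is only which definition wins.

```yaml
# roles/nginx/defaults/main.yml: lowest variable precedence, meant to be overridden
nginx_worker_connections: 4096
---
# roles/nginx/vars/main.yml: role-internal constant (hypothetical name)
nginx_conf_dir: /etc/nginx
---
# inventory/production/group_vars/webservers.yaml: environment config
nginx_worker_connections: 8192   # wins over the role default for this group
nginx_conf_dir: /opt/nginx       # loses: role vars outrank inventory group_vars
```

With these files, hosts in webservers would get nginx_worker_connections 8192 and nginx_conf_dir /etc/nginx. Only something higher in the list, such as -e, would change the latter.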

Playbook Structure Best Practices

# site.yml — top-level playbook
---
- import_playbook: playbooks/common.yml
- import_playbook: playbooks/webservers.yml
- import_playbook: playbooks/databases.yml

# playbooks/webservers.yml
---
- name: Configure web servers
  hosts: webservers
  become: yes                    # sudo
  gather_facts: yes              # Set to no for speed in large inventories
  
  pre_tasks:
    - name: Update apt cache (once per play)
      apt:
        update_cache: yes
        cache_valid_time: 3600   # Skip if cache is < 1 hour old
      when: ansible_os_family == "Debian"
      tags: [always]             # Run even when --tags is specified
  
  roles:
    - role: common
      tags: [common]
    - role: nginx
      tags: [nginx]
      vars:
        nginx_worker_processes: 4
  
  post_tasks:
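    - name: Verify nginx is running
      service:
        name: nginx
        state: started
      check_mode: yes            # Test without changing
      register: nginx_state
      failed_when: nginx_state is changed  # "changed" in check mode means nginx was not running
      tags: [verify]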

Idempotency Patterns

# ✅ Use modules instead of shell — modules are idempotent
- name: Install packages
  apt:
    name: [nginx, git, python3-pip]
    state: present                 # Ensures installed; doesn't reinstall if present

# ❌ Shell is NOT idempotent
- name: Install nginx
  shell: apt-get install -y nginx  # Runs every time, no idempotency check

# ✅ When you MUST use shell: add changed_when / failed_when
- name: Run custom migration script
  shell: /opt/app/migrate.sh
  args:
    creates: /opt/app/.migration_complete  # Skip if this file exists (idempotent!)
  register: migration_result
  changed_when: migration_result.rc == 0 and 'already up to date' not in migration_result.stdout
  failed_when: migration_result.rc != 0 and 'already up to date' not in migration_result.stdout

# ✅ Lineinfile for config file modification (idempotent)
- name: Set kernel parameter
  lineinfile:
    path: /etc/sysctl.conf
    regexp: '^net\.core\.somaxconn'
    line: 'net.core.somaxconn = 65535'
    state: present
  notify: Apply sysctl

# ✅ Template for full file management (idempotent, diffs on change)
- name: Deploy nginx config
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
    owner: root
    group: root
    mode: '0644'
    validate: 'nginx -t -c %s'   # Validate before deploying!
  notify: Reload nginx
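When no module fits, one common way to keep a command-based task honest is to split it into a read-only check and a conditional change. The sketch below assumes the play already runs with become enabled. The register name somaxconn_current is hypothetical.

```yaml
- name: Read current somaxconn value
  ansible.builtin.command: sysctl -n net.core.somaxconn
  register: somaxconn_current        # hypothetical variable name
  changed_when: false                # read-only, so never report a change
  check_mode: false                  # harmless, so run it even under --check

- name: Raise somaxconn only when it is too low
  ansible.builtin.command: sysctl -w net.core.somaxconn=65535
  when: somaxconn_current.stdout | int < 65535
```

The second task is skipped once the value is already high enough, so repeated runs report no change. This only sets the running value. The lineinfile task above is still what makes it persist across reboots.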

Role Structure

roles/nginx/
├── defaults/
│   └── main.yml          # Default variables (lowest priority, easily overridden)
├── vars/
│   └── main.yml          # Role-internal constants (high priority, rarely override)
├── tasks/
│   ├── main.yml          # Entry point (import other task files)
│   ├── install.yml
│   ├── configure.yml
│   └── ssl.yml
├── handlers/
│   └── main.yml          # Handlers (triggered by notify)
├── templates/
│   └── nginx.conf.j2     # Jinja2 templates
├── files/
│   └── dhparam.pem       # Static files
├── meta/
│   └── main.yml          # Role metadata and dependencies
└── molecule/             # Molecule tests
    └── default/
        ├── molecule.yml
        ├── converge.yml
        └── verify.yml

# roles/nginx/defaults/main.yml
nginx_version: "1.24"
nginx_worker_processes: "auto"
nginx_worker_connections: 4096
nginx_keepalive_timeout: 65
nginx_server_tokens: "off"
nginx_gzip_enabled: true
nginx_ssl_protocols: "TLSv1.2 TLSv1.3"

# roles/nginx/meta/main.yml
galaxy_info:
  author: platform-team
  description: Nginx web server installation and configuration
  license: MIT
  min_ansible_version: "2.12"
  platforms:
    - name: Ubuntu
      versions: ["20.04", "22.04"]

dependencies:
  - role: common              # Ensure common role runs first
  - role: certbot             # Install certbot before nginx tries to use certs
    when: nginx_ssl_enabled | default(false)

Handlers

# roles/nginx/handlers/main.yml
---
- name: Reload nginx
  service:
    name: nginx
    state: reloaded

- name: Restart nginx
  service:
    name: nginx
    state: restarted

- name: Apply sysctl
  command: sysctl -p /etc/sysctl.conf

# Handlers run ONCE per flush, even if notified multiple times. By default they are flushed
# at the end of the pre_tasks, roles/tasks, and post_tasks sections of a play.
# Tasks notify handlers like this:
# - name: Deploy config
#   template:
#     src: nginx.conf.j2
#     dest: /etc/nginx/nginx.conf
#   notify: Reload nginx
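If a later task depends on the reload having already happened, handlers can be flushed early with the meta module. A minimal sketch follows. The smoke-test task and its URL are illustrative assumptions, not part of the role above.

```yaml
tasks:
  - name: Deploy nginx config
    ansible.builtin.template:
      src: nginx.conf.j2
      dest: /etc/nginx/nginx.conf
    notify: Reload nginx

  - name: Run pending handlers now instead of waiting for the end of the section
    ansible.builtin.meta: flush_handlers

  - name: Smoke-test the reloaded server   # hypothetical follow-up check
    ansible.builtin.uri:
      url: http://localhost/
      status_code: 200
```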

Jinja2 Templating

# templates/nginx.conf.j2
worker_processes {{ nginx_worker_processes }};
worker_rlimit_nofile {{ nginx_worker_rlimit_nofile | default(65535) }};

events {
    worker_connections {{ nginx_worker_connections }};
    use epoll;
    multi_accept on;
}

http {
    # Generate upstream block for each server in the group
    upstream app_backend {
        least_conn;
        {% for host in groups['webservers'] %}
        server {{ hostvars[host]['ansible_host'] }}:{{ app_port | default(8080) }};
        {% endfor %}
        keepalive 32;
    }
    
    {% if nginx_gzip_enabled %}
    gzip on;
    gzip_comp_level {{ nginx_gzip_level | default(6) }};
    gzip_types {{ nginx_gzip_types | join(' ') }};
    {% endif %}
    
    # Conditionally include SSL config
    {% if nginx_ssl_enabled | default(false) %}
    ssl_protocols {{ nginx_ssl_protocols }};
    ssl_certificate {{ nginx_ssl_cert_path }};
    ssl_certificate_key {{ nginx_ssl_key_path }};
    {% endif %}
    
    # Template from variable map
    {% for key, value in nginx_headers.items() %}
    add_header {{ key }} "{{ value }}" always;
    {% endfor %}
}

# Useful Jinja2 filters in Ansible
{{ my_list | join(', ') }}
{{ my_string | upper | trim }}
{{ my_dict | to_json }}
{{ my_dict | to_nice_yaml }}
{{ my_path | basename }}
{{ my_path | dirname }}
{{ my_var | default('fallback') }}
{{ my_var | default(omit) }}     # Omit key entirely if undefined
{{ my_list | selectattr('enabled', 'equalto', true) | list }}
{{ my_list | map(attribute='name') | list }}
{{ my_string | regex_replace('^prefix_', '') }}
{{ (1024 * 1024) | human_readable }}  # Parentheses required: filters bind tighter than *
{{ lookup('env', 'HOME') }}      # Lookup from controller environment
{{ lookup('file', '/etc/hosts') }}  # Read file on controller
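A few of these filters are easier to judge inside real tasks. The sketch below uses hypothetical variables (app_user, app_uid, sites) purely for illustration.

```yaml
- name: Create application user
  ansible.builtin.user:
    name: "{{ app_user | default('app') }}"   # fall back to a literal value
    uid: "{{ app_uid | default(omit) }}"      # drop the uid parameter entirely if undefined

- name: Show the names of enabled sites only
  ansible.builtin.debug:
    msg: "{{ sites | selectattr('enabled') | map(attribute='name') | list }}"
  vars:
    sites:                                    # hypothetical sample data
      - { name: blog, enabled: true }
      - { name: shop, enabled: false }
```

With the sample data shown, the debug task prints a list containing only blog. The default(omit) form matters for modules where passing an empty value would behave differently from not passing the parameter at all.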

Ansible Vault

# Encrypt a variables file
ansible-vault encrypt inventory/production/group_vars/databases/vault.yaml

# Encrypt a single value (for embedding in plaintext files)
ansible-vault encrypt_string 'my-secret-password' --name 'db_password'
# Result:
# db_password: !vault |
#   $ANSIBLE_VAULT;1.1;AES256
#   38623435...

# Edit encrypted file
ansible-vault edit inventory/production/group_vars/databases/vault.yaml

# Run playbook with vault password from file
ansible-playbook site.yml --vault-password-file ~/.vault_password

# Or use environment variable
export ANSIBLE_VAULT_PASSWORD_FILE=~/.vault_password
ansible-playbook site.yml

# vault.yaml example
db_password: "production-db-password"
api_secret_key: "production-api-key"
ssl_private_key: |
  -----BEGIN RSA PRIVATE KEY-----
  ...
  -----END RSA PRIVATE KEY-----
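One widely used convention, sketched here as an assumption rather than something the layout above requires, is to prefix encrypted variables with vault_ and reference them from the plaintext vars file next to it. That keeps variable names searchable even though the values are encrypted. The destination path /etc/app/db.conf is hypothetical.

```yaml
# inventory/production/group_vars/databases/vault.yaml (encrypted with ansible-vault)
vault_db_password: "production-db-password"
---
# inventory/production/group_vars/databases/vars.yaml (plaintext, safe to grep)
db_password: "{{ vault_db_password }}"
---
# In a task file: keep the decrypted value out of logs and task output
- name: Write database credentials file
  ansible.builtin.copy:
    content: "password={{ db_password }}\n"
    dest: /etc/app/db.conf        # hypothetical path
    owner: root
    group: root
    mode: "0600"
  no_log: true
```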

Molecule Testing

# molecule/default/molecule.yml
driver:
  name: docker

platforms:
  - name: ubuntu-22
    image: ubuntu:22.04  # Assumes systemd and Python in the image; the stock image ships neither
    pre_build_image: true
    command: /lib/systemd/systemd
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:rw
    cgroupns_mode: host

provisioner:
  name: ansible
  playbooks:
    converge: converge.yml
    verify: verify.yml
  inventory:
    host_vars:
      ubuntu-22:
        nginx_ssl_enabled: false
        nginx_worker_processes: 2

verifier:
  name: ansible

# molecule/default/verify.yml
---
- name: Verify nginx role
  hosts: all
  gather_facts: false
  
  tasks:
    - name: Gather service facts
      service_facts:
    
    - name: Assert nginx is active
      assert:
        that:
          - "'nginx.service' in ansible_facts.services"
          - "ansible_facts.services['nginx.service'].state == 'running'"
          - "ansible_facts.services['nginx.service'].status == 'enabled'"
    
    - name: Check nginx port is open
      wait_for:
        port: 80
        timeout: 10
    
    - name: Verify nginx config is valid
      command: nginx -t
      changed_when: false
    
    - name: Make HTTP request to verify response
      uri:
        url: http://localhost/health
        status_code: 200
      register: health_response
    
    - name: Assert health response
      assert:
        that: health_response.status == 200
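The molecule.yml above also references converge.yml, which is the playbook that applies the role under test. A minimal sketch follows, assuming Molecule can resolve the role by the name nginx from the scenario's location inside roles/nginx/.

```yaml
# molecule/default/converge.yml
---
- name: Converge
  hosts: all
  become: true
  tasks:
    - name: Apply the nginx role
      ansible.builtin.include_role:
        name: nginx               # assumes the role resolves under this name
```

Running molecule test from the role directory executes the full sequence, including an idempotence step that fails if a second converge reports changes. During development, molecule converge followed by molecule verify gives a faster loop.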

Anti-Patterns

shell: or command: without creates: or changed_when: — breaks idempotency
ignore_errors: yes everywhere, which hides failures until they're catastrophic
Hardcoded passwords in tasks — use ansible-vault encrypted group_vars
become: yes on every task — only elevate where actually needed
gather_facts: no everywhere for speed — facts are needed for OS-conditional tasks
No handlers for service restarts — tasks that change config should notify handlers
Huge playbooks instead of roles — roles make logic reusable and testable
No molecule tests — untested roles break in production when you change the base image
--extra-vars in CI for secrets — use vault-encrypted vars with vault password from CI secret
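Two of these anti-patterns have a direct structural alternative: block/rescue instead of blanket ignore_errors, and become scoped to the block that needs it instead of the whole play. A minimal sketch follows. The failure message is illustrative.

```yaml
- name: Reload nginx with explicit failure handling
  become: true                    # elevation scoped to this block only
  block:
    - name: Validate configuration
      ansible.builtin.command: nginx -t
      changed_when: false         # validation never changes the system

    - name: Reload nginx
      ansible.builtin.service:
        name: nginx
        state: reloaded
  rescue:
    - name: Report the failure and stop this host
      ansible.builtin.fail:
        msg: "nginx validation or reload failed on {{ inventory_hostname }}"
```

Unlike ignore_errors, the rescue section runs only when something in the block fails, and the failure stays visible in the play recap.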

Quick Reference

Ansible commands:
  ansible all -m ping -i inventory/production/                → Test connectivity
  ansible-playbook site.yml -i inventory/production/ --check  → Dry run (check mode)
  ansible-playbook site.yml --tags nginx,ssl        → Run specific tags
  ansible-playbook site.yml --skip-tags common      → Skip tags
  ansible-playbook site.yml --limit webservers       → Limit to host group
  ansible-playbook site.yml --limit web-01.example.com  → Single host
  ansible-playbook site.yml -e "nginx_version=1.25"  → Override variable
  ansible-playbook site.yml --start-at-task "Task name" → Resume from task
  ansible all -a "systemctl status nginx" -i inventory/production/ → Ad-hoc command

Useful modules cheat sheet:
  apt/yum/dnf: Package management
  service:     Start/stop/enable services
  template:    Deploy Jinja2 templates
  copy:        Deploy static files
  file:        Create files/dirs/symlinks, set permissions
  lineinfile:  Manage single lines in files
  blockinfile: Manage blocks of text in files
  user:        Manage Linux users
  git:         Clone/update git repos
  uri:         Make HTTP requests
  assert:      Test conditions (use in verify.yml)
  wait_for:    Wait for port/file/condition
  debug:       Print variable values (use during development)
  set_fact:    Create/update variables
