Linux Configuration Management with Ansible

Contents

One-time Setup
Usage Examples: one-off commands
Usage Examples: Ongoing configuration management
Managing Configurations
Implementation Details
Links

One-time Setup

First, clone a working copy of the repository into your home directory. Add both the GitHub and eng_linux (NFS) remotes so you can keep both copies up to date.

$ cd
$ git clone git@github.com:eng-it/ansible.git
$ cd ansible
$ git remote add eng_linux /ad/eng/support/software/linux/etc/ansible.git

See the README.md file there for more details on setting up the environment.

Usage Examples: one-off commands

These examples show some mpc-like tasks to run commands, copy over files, etc. The lab names are the “host groups” specified in ~/ansible/hosts. There are four meta-groups: grid (all grid server machines), instruction (instruction lab computers), research (research lab workstations), and servers (ENG-IT Linux servers).

Basic arguments:

-m specifies which Ansible module to use. The default is “command”; to use shell features like pipes and redirects, specify “-m shell”.
-a specifies the arguments for the module (use quotes to group spaces)
-o will keep per-host output to one line.

Check the uptime report for each compsim system:

$ ansible compsim -a uptime -o
bme-compsim-1 | success | rc=0 | (stdout)  16:23:27 up 91 days,  1:54,  4 users,  load average: 1.00, 1.00, 1.00
bme-compsim-5 | success | rc=0 | (stdout)  16:23:27 up 13 days,  2:29,  3 users,  load average: 1.03, 1.00, 1.00
bme-compsim-4 | success | rc=0 | (stdout)  16:23:27 up 134 days,  1:07,  3 users,  load average: 1.07, 1.02, 1.00
bme-compsim-7 | success | rc=0 | (stdout)  16:23:29 up 26 days,  3:30,  5 users,  load average: 1.05, 1.04, 1.01
...

Copy a configuration file to each signals system. If an identical file is already in place, that host reports "changed": false.

$ ansible signals -m copy -a "src=~/something.conf dest=/etc/something.conf" -o

signals11 | success >> {"changed": true}
signals12 | success >> {"changed": true}
signals09 | success >> {"changed": false}
...
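
For a change that should persist, the same copy could be written as a playbook task. This is only a sketch that reuses the placeholder file names from the command above; the play layout is an assumption, not something taken from our repository.

```yaml
---
# Sketch only: the ad-hoc copy above, expressed as a one-task playbook.
- hosts: signals
  tasks:
    - name: Install something.conf (placeholder name from the example above)
      copy: src=~/something.conf dest=/etc/something.conf
```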

Usage Examples: Ongoing configuration management

These examples use ansible playbooks to track how hosts should be configured in general.

Basic arguments:

--check will do its best to figure out what would be changed if the playbook were actually run.
--diff shows diffs of any changing files
--tags restricts the run to just tasks that have that tag listed.
--skip-tags does the reverse.
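
As a rough illustration of how --tags and --skip-tags select tasks, here is a hypothetical pair of tagged tasks. The package names and the ntp tag are made up for the example and do not come from our roles.

```yaml
---
# Hypothetical tasks, shown only to illustrate tag selection.
- hosts: all
  tasks:
    - name: Install a long list of locally requested packages
      yum: name={{ item }} state=present
      with_items:
        - emacs
        - gnuplot
      tags: [ packages ]

    - name: Make sure ntpd is running
      service: name=ntpd state=started enabled=yes
      tags: [ ntp ]
```

With a playbook like this, --tags ntp would run only the second task, and --skip-tags packages would skip only the first.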

Have all compsim machines re-join AD as needed:

$ kinit jesse08-adm -c adm_ticket
Password for jesse08-adm@AD.BU.EDU:
$ ansible-playbook grid.yml -l compsim --tags ad-client

Check what would be changed to bring all bungee systems up to date for the grid settings:

$ ansible-playbook grid.yml -l bungee-nodes --tags grid-computenode --check

Apply all configuration steps defined for each of Ultra’s Linux workstations, from start to finish, but leave out the general lists of package installs. (Packages needed for specific things like AD, NFS, and SSSD are still checked and installed; only the list of locally-installed packages is skipped.) In this example there are two workstations, one of which is currently unplugged.

$ ansible-playbook research.yml -l ultra-workstations --skip-tags packages

PLAY [ultra-workstations] *****************************************************

GATHERING FACTS ***************************************************************
fatal: [ece-pho810-02] => SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh
ok: [ece-pho810-03]

TASK: [common | Ensure that all SSH keys in our list are present in /root/.ssh/authorized_keys] ***
ok: [ece-pho810-03] => (item=/ad/eng/users/j/e/jesse08/ansible/roles/common/files/ssh-keys/batista.pub)
ok: [ece-pho810-03] => (item=/ad/eng/users/j/e/jesse08/ansible/roles/common/files/ssh-keys/jaredb.key)
ok: [ece-pho810-03] => (item=/ad/eng/users/j/e/jesse08/ansible/roles/common/files/ssh-keys/jesse08.key)
ok: [ece-pho810-03] => (item=/ad/eng/users/j/e/jesse08/ansible/roles/common/files/ssh-keys/jkgoebel.key)
ok: [ece-pho810-03] => (item=/ad/eng/users/j/e/jesse08/ansible/roles/common/files/ssh-keys/mskramer.pub)

TASK: [common | Delete the annoying message for root logins] ******************
ok: [ece-pho810-03]

TASK: [common | Set SELinux to permissive mode.] ******************************
ok: [ece-pho810-03]

...


PLAY RECAP ********************************************************************
           to retry, use: --limit @/home/jesse08/research-ece-ultra.retry

ece-pho810-02              : ok=0    changed=0    unreachable=1    failed=0
ece-pho810-03              : ok=53   changed=0    unreachable=0    failed=0

Managing Configurations

Here’s a general overview of how to update and manage the configuration files themselves, and how to coordinate changes with others in ENG-IT. The descriptions here of how Ansible works just scratch the surface, but the official Ansible documentation is very thorough.

Playbooks, Roles, Tasks, and Modules

All configuration data for Ansible is stored in YAML format, with strings, lists, dictionaries, and so on stored in a compact tree structure of simple text. At the lowest level Ansible uses a set of modules called within config files to actually implement tasks. Some common modules are command, shell, copy, mount, service, and yum (which all do basically what the names imply).
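
For readers new to YAML, the fragment below shows the three building blocks mentioned above. The keys and values are invented purely for illustration.

```yaml
---
# A dictionary (mapping) at the top level; all names here are made up.
lab_name: compsim            # a string
admin_groups:                # a list
  - root
  - wheel
sshd:                        # a nested dictionary
  port: 22
  restrict_to_admins: false  # a boolean
```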

To actually use the modules, tasks give the modules arguments about what to do. For example this standalone playbook uses the “file” module to ensure that two symlinks exist:

---
- hosts: all
  tasks:
    - file: state=link src=/usr/lib64/libGL.so.1 dest=/usr/lib64/libGL.so
    - file: state=link src=/usr/lib64/libGL.so.1.2.0 dest=/usr/lib64/libGL.so.1

Playbooks can include other playbooks (see next section below) and some more parameters aside from tasks, but mainly group sets of tasks together.

The last major concept is that of roles. A role is a directory containing all the configuration data and files defining a particular role a computer is meant to play or a service it provides. Ansible uses some conventions for paths inside the role directories that make it easy to refer to other files within the role, and roles can be linked into playbooks easily. All of this means roles are good for compartmentalizing all the details for each service. For example, all the details related to eng-grid-monitor.bu.edu’s configuration, from start to finish, can be summarized in servers.yml as shown below, with five roles in total. First there are roles that apply to all servers, and separately there is an entry to apply “ganglia-server” only to eng-grid-monitor.

- hosts: servers
  roles:
    - common
    - networking
    - sshd
    - ad-client

# (... skipping ...)

- hosts: eng-grid-monitor
  roles:
    - ganglia-server

For the full description of how the roles directories are organized see the Ansible role documentation, and a brief example in the following section.

How Our Files are Organized

Start here for an overview of what’s actually inside our ansible configuration repository.

The top level directory

At the highest level there’s a site.yml file that is just a list containing all of the per-hostgroup playbooks.
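
A site.yml along these lines would do the job. This is a sketch rather than a copy of the real file: it lists only the playbooks named on this page, and it assumes the older include syntax (newer Ansible releases spell this import_playbook).

```yaml
---
# Sketch of site.yml: one include per host-group playbook.
- include: grid.yml
- include: research.yml
- include: servers.yml
```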

The inventory directory contains definitions for what hosts belong to what groups. Along with specific playbooks like research.yml this defines how specific hosts should be configured. Each role listed in a playbook corresponds to one of the sub-directories in the roles directory. If a dictionary is used for a role instead of just a role name, we can set variables that the role can use, like for the sshd role in this example.

---
- hosts: research
  roles:
    - common
    - networking
    - kace-client
    - { role: sshd, sshd_restrict_to_admins: true }
    - ad-client
    - kerberized-nfs
    - sssd
    - software-admin
    - software-scitech
    - eng-shell-modules
    - workstation
    - printing

The roles directory

For a quick reference of how a role is set up, look at roles/sshd:

$ ls -1 roles/sshd/*
roles/sshd/defaults:
main.yml

roles/sshd/handlers:
main.yml

roles/sshd/tasks:
main.yml
sshd_config.yml

roles/sshd/templates:
sshd_config

tasks/main.yml contains a block of comments describing what the role does, and a short block of include lines that pull in more specific sub-tasks. Tagging the includes with the role name makes it easy to select specific roles when running ansible-playbook with the --tags argument, as shown below. sshd_config.yml contains the tasks that actually enable and configure sshd.

- include: sshd_config.yml
  tags: [ sshd, sshd_config ]

templates/sshd_config is an sshd config file written with Ansible’s Jinja2 templating support, which can insert variables into configuration files dynamically and also supports flow control and loops. Here we just use it to decide whether or not to include “AllowGroups root wheel” based on the variable sshd_restrict_to_admins.

defaults/main.yml is a set of default variables the role will use. This way sshd_restrict_to_admins can be set to false by default, and then enabled for just those hosts where it’s needed.
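
Assuming the variable is a plain boolean, defaults/main.yml could be as small as this (a sketch, not the file’s actual contents):

```yaml
---
# roles/sshd/defaults/main.yml (sketch)
# Off by default; a playbook turns it on per host group, e.g.
#   - { role: sshd, sshd_restrict_to_admins: true }
sshd_restrict_to_admins: false
```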

handlers/main.yml defines the “handlers” that tasks can “notify” (see sshd_config.yml) when something has changed and a service should be restarted. Handlers are defined separately from the tasks so that several changes can notify the same handler, yet it runs just once after the tasks are complete.

---
- name: reload sshd
  service: name=sshd state=reloaded
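
On the task side, a notify line names the handler to trigger. The tasks below are a guess at what sshd_config.yml might contain, not a copy of it; only the handler name and the template file name come from this page.

```yaml
---
# roles/sshd/tasks/sshd_config.yml (hypothetical sketch)
- name: Install sshd_config from the role's template
  template: src=sshd_config dest=/etc/ssh/sshd_config owner=root group=root mode=0600
  notify: reload sshd   # must match the handler's name exactly

- name: Make sure sshd is enabled and running
  service: name=sshd state=started enabled=yes
```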

Coordinating Changes within ENG-IT

There are two main branches always in the git repository: master (for production use, and the default for auto-installs) and testing:

$ git branch -a
  master
* testing
  remotes/eng_linux/master
  remotes/eng_linux/testing

One strategy is to branch off from testing to add specific features, merge them back in as they’re finished, and then merge testing back into master when it’s stable. This article presents the idea nicely. With that approach we can add new roles or features in specifically-named branches, and only impact others’ use when we’re sure the code is ready and agree to merge it in.

Quick tip: before committing anything, check the YAML syntax across all playbooks included via site.yml. This won’t connect to any hosts or run anything, but it will sanity-check your YAML syntax.

$ ansible-playbook --syntax-check site.yml

A contrived example for adding an AFS client configuration

(First create a new branch off of testing for the new feature.)

$ git checkout -b afs testing
Switched to a new branch 'afs'
$ git branch -a
* afs
  master
  testing
  remotes/eng_linux/master
  remotes/eng_linux/testing

(So, the “afs” branch only exists in this copy of the repo. Now add files and commit them.)

$ git status
# On branch afs
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       roles/afs/
nothing added to commit but untracked files present (use "git add" to track)
$ git add roles/afs/
$ git status
# On branch afs
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       new file:   roles/afs/files/CellAlias
#       new file:   roles/afs/files/CellServDB
#       new file:   roles/afs/files/ThisCell
#       new file:   roles/afs/tasks/afs.yml
#       new file:   roles/afs/tasks/main.yml
#
$ git commit -m 'AFS client configuration for BU network'
[afs c32e9c4] AFS client configuration for BU network
 5 files changed, 16 insertions(+), 0 deletions(-)
 create mode 100644 roles/afs/files/CellAlias
 create mode 100644 roles/afs/files/CellServDB
 create mode 100644 roles/afs/files/ThisCell
 create mode 100644 roles/afs/tasks/afs.yml
 create mode 100644 roles/afs/tasks/main.yml
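
Following the same convention as the sshd role, roles/afs/tasks/main.yml might look something like the fragment below. It is a hypothetical sketch; the actual contents of the committed files are not shown on this page.

```yaml
---
# roles/afs/tasks/main.yml (hypothetical)
# Sets up this host as an AFS client.
- include: afs.yml
  tags: [ afs ]
```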

(Now merge that local branch into the real testing branch, and delete the local branch.)

$ git checkout testing
Switched to branch 'testing'
$ git merge --no-ff afs
$ git branch -d afs
$ git push eng_linux

Implementation Details

A self-contained Ansible installation lives on the network at /ad/eng/opt/64/ansible. It was built from a dedicated miniconda install, with ansible installed on top of that. To update it, run /ad/eng/support/software/linux/opt/64/ansible/bin/pip install --upgrade ansible (note the long path, which is the read/write mountpoint). The definitive configuration data is kept as a bare git repository in /ad/eng/support/software/linux/etc/ansible.git and at https://github.com/eng-it/ansible.