Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from ansible python api to ansible-runner #428

Closed
wants to merge 10 commits into from
Closed

Switch from ansible python api to ansible-runner #428

wants to merge 10 commits into from

Conversation

jctanner
Copy link

@jctanner jctanner commented Apr 22, 2019

Supercedes #410

fixes #401
fixes ansible/molecule#1727

There's a couple reasons for this pullrequest ...

  1. The ansible python api is unstable, and upkeep for testinfra will continue to be a pain point.
  2. ansible-runner is how Ansible Tower and AWX communicate with ansible and will be perpetually maintained
  3. testinfra is Apache licensed and imports ansible which is GPL licensed. This violates GPL for most people's interpretation of GPL https://opensource.stackexchange.com/a/1641

NOTE: Molecule can not yet use this version until an incompatibility with colorama is fixed ansible/molecule#2001

@jctanner jctanner changed the title Use runner v1 Switch from ansible python api to ansible-runner Apr 22, 2019
Copy link
Contributor

@philpep philpep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jctanner , many thanks for this ! I think we need this in testinfra asap, at least before the ansible 2.8 release for which support is currently broken.
I left some comments but nothing really blocking, thanks!



__all__ = ['AnsibleRunner', 'to_bytes']
def to_bytes(data):
'''Why!?!?'''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, IIRC this was a workaround of a weird encoding issue in ansible. I think you can drop this and also drop testinfra.backend.ansible.encode() method and drop the skip for ansible on test_backends.test_encoding and see if it pass.

testinfra/utils/ansible_runner.py Outdated Show resolved Hide resolved

# runner must have a directory based payload
data_dir = tempfile.mkdtemp(prefix='runner.data.%s.' \
% datetime.datetime.now().isoformat().replace(':', '-'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to use tempfile.TemporaryDirectory() context manager instead to avoid leaving the directory in case of Exception.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also why using a private data dir for ansible runner ? I think users expect to have all their files (envvars, groupvars, ansible.cfg, inventory, etc) availables in testinfra tests.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another architecture carryover from how runner is an evolution of the internal code for tower/awx and then isolated nodes. The idea being (if i'm doing it justice), the tower job template data gets exposed to a proot'ed/bublewrap'ed environment where all the "isolation" for the execution of ansible occurs.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll also point out that a lot of the inputs can be provided as direct inputs to the run method (obviating the need for a private data dir).

Runner does need (and will get) a mechanism for not needing a private data dir or artifacts dir.

'failed': True,
'msg': 'Skipped. You might want to try check=False',
'item': item,
# molecule inventories use lookups
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what is this envvars file about ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure ... so molecule exports a lot of environment variables which are picked up by the yaml based inventory it wants to give to ansible. Here's some examples ...

MOLECULE_DEBUG: 'False'
MOLECULE_DEPENDENCY_NAME: galaxy
MOLECULE_DRIVER_NAME: docker
MOLECULE_ENV_FILE: /home/jtanner/workspace/github/jctanner/2019-04-16-ecosystem/foorole/.env.yml
MOLECULE_EPHEMERAL_DIRECTORY: /tmp/molecule/foorole/default
MOLECULE_FILE: /home/jtanner/workspace/github/jctanner/2019-04-16-ecosystem/foorole/molecule/default/molecule.yml
MOLECULE_INSTANCE_CONFIG: /tmp/molecule/foorole/default/instance_config.yml
MOLECULE_INVENTORY_FILE: /tmp/molecule/foorole/default/inventory/ansible_inventory.yml
MOLECULE_LINT_NAME: yamllint
MOLECULE_PROJECT_DIRECTORY: /home/jtanner/workspace/github/jctanner/2019-04-16-ecosystem/foorole
MOLECULE_PROVISIONER_LINT_NAME: ansible-lint
MOLECULE_PROVISIONER_NAME: ansible
MOLECULE_SCENARIO_DIRECTORY: /home/jtanner/workspace/github/jctanner/2019-04-16-ecosystem/foorole/molecule/default
MOLECULE_SCENARIO_NAME: default
MOLECULE_VERIFIER_LINT_NAME: flake8
MOLECULE_VERIFIER_NAME: testinfra
MOLECULE_VERIFIER_TEST_DIRECTORY: /home/jtanner/workspace/github/jctanner/2019-04-16-ecosystem/foorole/molecule/default/tests

Here's a sample inventory ...

---
all:
  hosts:
    instance: &id001
      ansible_connection: docker
  vars:
    molecule_ephemeral_directory: '{{ lookup(''env'', ''MOLECULE_EPHEMERAL_DIRECTORY'')
      }}'
    molecule_file: '{{ lookup(''env'', ''MOLECULE_FILE'') }}'
    molecule_instance_config: '{{ lookup(''env'', ''MOLECULE_INSTANCE_CONFIG'') }}'
    molecule_no_log: '{{ lookup(''env'', ''MOLECULE_NO_LOG'') or not molecule_yml.provisioner.log|default(False)
      | bool }}'
    molecule_scenario_directory: '{{ lookup(''env'', ''MOLECULE_SCENARIO_DIRECTORY'')
      }}'
    molecule_yml: '{{ lookup(''file'', molecule_file) | molecule_from_yaml }}'
ungrouped:
  hosts:
    instance: *id001
  vars: {}

Ansible-runner makes heavy use of setting environment variables in the pexect spawn environment it creates, so users have to supply any custom env vars they care about. Since the vars molecule exports won't make it down through pytest and then testinfra and then runner and then pexect, I have to tell runner about those vars and their values.

testinfra/utils/ansible_runner.py Show resolved Hide resolved
__all__ = ['AnsibleRunner', 'to_bytes']
def to_bytes(data):
'''Why!?!?'''
return b'%s' % data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will most likely fail when trying to put Unicode into bytes.

cc @abadger

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

. I realize that the current version of the code has removed this function but just wanted to confirm what webknjaz said.

Any to_bytes style method has to be built around the encode method for the conversion with additional conditional logic around it to protect from trying to convert the wrong types and to set the correct encoding.


# a "significant" event is the event that has the task result
significant_event = None
if events:
Copy link
Member

@webknjaz webknjaz Apr 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is unnecessary: iterating through an empty iterable will just skip the loop automatically.

Suggested change
if events:

(filter will return either loop or an iterator: https://github.com/ansible/ansible-runner/blob/795ee7b/ansible_runner/runner.py#L317)

@jctanner
Copy link
Author

@philpep I think I've implemented all the requested changes. However, there seems to be intermittent mismatches between what values the tests are asserting come back from the alpine container(s) and what they actually have.

@philpep
Copy link
Contributor

philpep commented Apr 26, 2019

Hi @jctanner ,

I fixed the alpine test and found a quite huge bug in your PR because it drop ansible from the requirements and all ansible tests are skipped, that's probably why they seems to pass :D I fixed this in 0dc8649
I restarted tests on your PR and you should have real failures.

Then I've been going more in deep into ansible-runner (from which I'm new) and from my current point of view of ansible-runner it add more complexity than it solve the ansible interface issue, at least for our usage :/ The principal issue for me is handling of the "input directory", while I understand it can make sense for AWX I cannot figure how to use it in a ansible project with playbooks and roles and vars written as flat files. But maybe I'm missing something ?

Maybe we could consider interfacing with ansible cli directly, call ansible in a subprocess with json output, I think it can provide what we need (have a maintainable code and fix license issue).

I cannot figure actually how it can work with ansible-runner with reasonable code and without breaking existing usages (users expect testinfra to behave just like the ansible cli in their repository, loading ansible.cfg, vars, playbooks, roles, vault etc).

I known you spent time on this PR and I'm sorry to not have seen theses issues before, hope you understand, I'll have some time next week (french labor day \o/) to go more in deep into this.

@philpep philpep mentioned this pull request Apr 29, 2019
@@ -5,4 +5,4 @@ paramiko
tornado<5
salt
pywinrm
ansible>=2.4,<3
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I think mentioned elsewhere, this change needs to leave ansible in here, otherwise it's not installed. I'm seeing this trying to run this change with tox

raise AnsibleInventoryException(e)

try:
inv = json.loads(so)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is failing with python3 as check_output returns bytes, which messes up json.loads()

E           TypeError: the JSON object must be str, not 'bytes'

ianw@821bf58

@ianw
Copy link
Collaborator

ianw commented Apr 30, 2019

I also have a little time to look into this; as you say with 2.8 it's a more pressing issue.

Personally, I think using ansible-runner seems like a more stable option. I'm context switching back into this just to see what I can glean and if I can help ...

To avoid polluting this with too much intermediate debugging that gets very confusing, I've started an etherpad at https://etherpad.openstack.org/p/testinfra-ansible-runner with some notes on where I'm at with it.

https://github.com/ianw/testinfra/tree/ansible-runner-v2

As described, I'm currently seeing test_ansible_module[ansible://debian_stretch] failing; it seems to be similar to what I saw before in that there seems to be half-consumed output (http://paste.openstack.org/show/750075/). I'll continue debugging to see what's up.

Note that the openstack infra project has a test that runs our CI with the master branch of ansible and master branch of testinfra -- http://zuul.openstack.org/job/system-config-run-base-ansible-devel. This can be instructed to pull in any changes too; we can use this as an additional validation.

@ianw
Copy link
Collaborator

ianw commented Apr 30, 2019

@philpep I also have a question about a weird behaviour when running tests -- when I just run "tox" a bunch of these containers seem to actually create a new VT; my X11 session switches out to a console prompt and I have to ctl-alt-fX to get away from it; sometimes it works and sometimes my X11 session has been logged out. It particularly seems to happen with Alpine? It's so weird, I've never seen it before -- does this ring any bells? Is this starting containers with any weird options? It could certainly be a local mis-configuration ...

@jctanner
Copy link
Author

#432 is clearly a simpler approach and taught me something new about ansible.

@jctanner jctanner closed this Apr 30, 2019
@philpep
Copy link
Contributor

philpep commented May 1, 2019

@ianw

About your test, yeah it's weird, I run them on my laptop and it work fine (debian with docker 18.09). What version of docker are you using ? Maybe since we run systemd as pid 1 with --privileged (yes it's somewhat unsatisfying...), this can interract with systemd running on your machine ?

About ansible-runner, from what I see of the current implementation, it have several drawbacks:

This require to create the "input directory", copying inventory and pass environment variables through a yaml file. I think users expect all environment variables to be passed to ansible (including their own).

Also the inventory itself cannot be copied from self.host_list because it can come from ANSIBLE_INVENTORY environment variable or be referenced by ansible config (ansible.cfg). For these reasons I use "ansible-inventory" command in my implementation.

I don't see any advantage of ansible-runner over ansible --tree which directly dump the output as json (including error handling).

Also I think we should not use ansible to run all commands because it's slow (python + ansible cost). For instance when we write in a test: assert host.package("nginx").is_installed, testinfra will run several commands to detect OS, package manager, encoding informations etc. Sure there's a cache, but running all commands required by testinfra is very slow when calling ansible in a subprocess. For this reason I only run through ansible for the ansible module (host.ansible(...)) in my implementation.

I we want to use ansible as a transport to run commands, I think we should wait kind of "ansible --server" feature where we could send commands through stdin or unix socket and get result directly without having the cost of python + ansible initialization.

@matburt
Copy link

matburt commented May 1, 2019

Howdy! I'm the primary dev on ansible-runner and I just wanted to clarify a few things...

The module interface (see here: https://ansible-runner.readthedocs.io/en/latest/source/ansible_runner.html#ansible_runner.interface.run) actually allows you to pass all parameters that would normally require the input directory hierarchy as parameters to the module itself. See inventory and envvars in the last (as per your example). You can totally define ANSIBLE_INVENTORY here, or pass the inventory in.

ansible --tree is indeed really great, but I want to point out that ansible-runner puts a lot more information in the event data structure than what it is returned by --tree. It also presents it to you as a event handler and as a module property. Behind the scenes it caches the data on the filesystem so you aren't holding all of that information in memory.

ansible-runner also has built-in support for directory isolation (see: https://ansible-runner.readthedocs.io/en/latest/standalone.html#running-with-directory-isolation) and process isolation (see: https://ansible-runner.readthedocs.io/en/latest/standalone.html#running-with-process-isolation). There's also built-in fact-caching that can work intuitively between runs.

One of the upcoming roadmap features of ansible-runner is a server/service mode just as you describe. The interface with which we implement this will also be supported through the module interface shown in this PR.

I'm obviously a little biased, as I think Runner is a great tool for exactly what you are trying to do. It's also maintained by the team at Red Hat/Ansible that develops Tower/AWX (and Runner itself is used directly by Tower/AWX). I'm always happy to discuss this in detail... including adding features, fixing bugs, etc. You can always find me here, in #ansible (or #ansible-runner or #ansible-awx).

@philpep
Copy link
Contributor

philpep commented May 1, 2019

Hi @matburt , thanks for the inputs !

I tested a bit more ansible-runner and it does not work like I thought by reading this PR, seems ansible_runner.run(host_pattern=host, module=module_name, module_args=module_args, inventory=self.host_lists) actually works pretty well.

  • private_data_dir seems optional (what does mean ?)
  • environment variables are passed through ansible call (tested with module_name="debug", module_args="msg={{ lookup('env', 'MYENVVAR') }}").
  • inventory can point to a file outside of non given "private_data_dir"

One thing failed, I cannot pass the dictionary given by ansible-inventory --list as inventory here.
Example:

inventory = {'_meta': {'hostvars': {'debian_stretch': {'ansible_host': 'localhost',
                                           'ansible_port': 32902,
                                           'ansible_ssh_private_key_file': '/tmp/pytest-of-phil/pytest-156/1398040088806160/ssh_key',                                                        
                                           'ansible_user': 'root',
                                           'mygroupvar': 'qux',
                                           'myhostvar': 'bar',
                                           'myvar': 'foo'}}},
 'all': {'children': ['testgroup', 'ungrouped']},
 'testgroup': {'hosts': ['debian_stretch']},
 'ungrouped': {}}

ansible-runner does not raise but does not suceed:

 [WARNING]: Skipping unexpected key (hostvars) in group (_meta), only "vars",
"children" and "hosts" are valid                                        
 [WARNING]:  * Failed to parse /tmp/tmpus16ohbj/inventory/hosts.json with yaml
plugin: Invalid "children" entry for "all" group, requires a dictionary, found
"<class 'list'>" instead.                                                 
 [WARNING]:  * Failed to parse /tmp/tmpus16ohbj/inventory/hosts.json with ini
plugin: /tmp/tmpus16ohbj/inventory/hosts.json:1: Expected key=value host        
variable assignment, got: {hostvars:                     
 [WARNING]: Unable to parse /tmp/tmpus16ohbj/inventory/hosts.json as an
inventory source                             
 [WARNING]: No inventory was parsed, only implicit localhost is available
 [WARNING]: provided hosts list is empty, only localhost is available. Note
that the implicit localhost does not match 'all'                      
 [WARNING]: Could not match supplied host pattern, ignoring: debian_stretch

I see ansible-runner provide more usable output (ansible call output, statistics, per hosts events), but in our case we only execute a single module on a single host, so I'm not sure it's worth.

You can see my implementation for interfacing directly with ansible in #432
I think ansible runner could replace the run_module() method and maybe error handling could be simplified.
Do you have or plan APIs for working with inventories ? What I need is list host matching a host/group pattern (this parametrize automatically the test suite for each host).

I'm glad you're planning to add a performant connection wrapper for ansible ! We definitivelly need this !

@ianw ianw mentioned this pull request May 2, 2019
@ianw
Copy link
Collaborator

ianw commented May 2, 2019

I've updated this pull request with a few changes in #410 and it now passes CI. Tomorrow I will work out testing it "live" in the OpenStack CI environment for further confirmation it works. I just wanted to make sure we have enough to show that ansible-runner really does work.

I think the right people are probably involved here to decide the best way forward (between some variation of) #410 and #432. Personally, I like the idea that ansible-runner is a generic project, and so by being a wrapper around that testinfra can easily take advantage of changes, such as connection pooling, as they come.

@matburt
Copy link

matburt commented May 2, 2019

private_data_dir seems optional (what does mean ?)

The private data dir is really meant to support using Runner as an alternate entry point to wrapping ansible and ansible-playbook from something like a systemd unit or container launch entry point. It's really there to let you provide all of the inputs and get the outputs in a super stable location. It's entirely optional from the module use standpoint. Though when launching via the module it does need a place on the filesystem where results of the execution can be cached (the artifact directory).

One thing failed, I cannot pass the dictionary given by ansible-inventory --list as inventory here.

The inventory generated here doesn't look like a valid inventory that ansible would expect. This is what I would expect to see generated as an inventory script but not as an inventory input file which is the format that we'll expect there since we're going to pass it over to ansible as an inventory input json file. You'll get this error if you try to send that exact same output into ansible-inventory itself.

It seems like you are calling ansible-inventory on some inventory source? If that's the case just pass that inventory source json directly to runner than than passing it through ansible-inventory?

Right now I don't directly plan on anything for interfacing with inventory, but it does seem like something I'd like to have.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants