Switch from ansible python api to ansible-runner #428

jctanner · 2019-04-22T21:16:32Z

Supercedes #410

There's a couple reasons for this pullrequest ...

The ansible python api is unstable, and upkeep for testinfra will continue to be a pain point.
ansible-runner is how Ansible Tower and AWX communicate with ansible and will be perpetually maintained
testinfra is Apache licensed and imports ansible which is GPL licensed. This violates GPL for most people's interpretation of GPL https://opensource.stackexchange.com/a/1641

NOTE: Molecule can not yet use this version until an incompatibility with colorama is fixed ansible/molecule#2001

philpep

Hi @jctanner , many thanks for this ! I think we need this in testinfra asap, at least before the ansible 2.8 release for which support is currently broken.
I left some comments but nothing really blocking, thanks!

philpep · 2019-04-22T22:59:04Z

testinfra/utils/ansible_runner.py



-__all__ = ['AnsibleRunner', 'to_bytes']
+def to_bytes(data):
+    '''Why!?!?'''


Hmm, IIRC this was a workaround of a weird encoding issue in ansible. I think you can drop this and also drop testinfra.backend.ansible.encode() method and drop the skip for ansible on test_backends.test_encoding and see if it pass.

testinfra/utils/ansible_runner.py

philpep · 2019-04-22T23:02:52Z

testinfra/utils/ansible_runner.py

+
+        # runner must have a directory based payload
+        data_dir = tempfile.mkdtemp(prefix='runner.data.%s.' \
+            % datetime.datetime.now().isoformat().replace(':', '-'))


I'd suggest to use tempfile.TemporaryDirectory() context manager instead to avoid leaving the directory in case of Exception.

Also why using a private data dir for ansible runner ? I think users expect to have all their files (envvars, groupvars, ansible.cfg, inventory, etc) availables in testinfra tests.

This is another architecture carryover from how runner is an evolution of the internal code for tower/awx and then isolated nodes. The idea being (if i'm doing it justice), the tower job template data gets exposed to a proot'ed/bublewrap'ed environment where all the "isolation" for the execution of ansible occurs.

I'll also point out that a lot of the inputs can be provided as direct inputs to the run method (obviating the need for a private data dir).

Runner does need (and will get) a mechanism for not needing a private data dir or artifacts dir.

philpep · 2019-04-22T23:11:19Z

testinfra/utils/ansible_runner.py

-            'failed': True,
-            'msg': 'Skipped. You might want to try check=False',
-            'item': item,
+        # molecule inventories use lookups


Can you explain what is this envvars file about ?

Sure ... so molecule exports a lot of environment variables which are picked up by the yaml based inventory it wants to give to ansible. Here's some examples ...

MOLECULE_DEBUG: 'False' MOLECULE_DEPENDENCY_NAME: galaxy MOLECULE_DRIVER_NAME: docker MOLECULE_ENV_FILE: /home/jtanner/workspace/github/jctanner/2019-04-16-ecosystem/foorole/.env.yml MOLECULE_EPHEMERAL_DIRECTORY: /tmp/molecule/foorole/default MOLECULE_FILE: /home/jtanner/workspace/github/jctanner/2019-04-16-ecosystem/foorole/molecule/default/molecule.yml MOLECULE_INSTANCE_CONFIG: /tmp/molecule/foorole/default/instance_config.yml MOLECULE_INVENTORY_FILE: /tmp/molecule/foorole/default/inventory/ansible_inventory.yml MOLECULE_LINT_NAME: yamllint MOLECULE_PROJECT_DIRECTORY: /home/jtanner/workspace/github/jctanner/2019-04-16-ecosystem/foorole MOLECULE_PROVISIONER_LINT_NAME: ansible-lint MOLECULE_PROVISIONER_NAME: ansible MOLECULE_SCENARIO_DIRECTORY: /home/jtanner/workspace/github/jctanner/2019-04-16-ecosystem/foorole/molecule/default MOLECULE_SCENARIO_NAME: default MOLECULE_VERIFIER_LINT_NAME: flake8 MOLECULE_VERIFIER_NAME: testinfra MOLECULE_VERIFIER_TEST_DIRECTORY: /home/jtanner/workspace/github/jctanner/2019-04-16-ecosystem/foorole/molecule/default/tests

Here's a sample inventory ...

--- all: hosts: instance: &id001 ansible_connection: docker vars: molecule_ephemeral_directory: '{{ lookup(''env'', ''MOLECULE_EPHEMERAL_DIRECTORY'') }}' molecule_file: '{{ lookup(''env'', ''MOLECULE_FILE'') }}' molecule_instance_config: '{{ lookup(''env'', ''MOLECULE_INSTANCE_CONFIG'') }}' molecule_no_log: '{{ lookup(''env'', ''MOLECULE_NO_LOG'') or not molecule_yml.provisioner.log|default(False) | bool }}' molecule_scenario_directory: '{{ lookup(''env'', ''MOLECULE_SCENARIO_DIRECTORY'') }}' molecule_yml: '{{ lookup(''file'', molecule_file) | molecule_from_yaml }}' ungrouped: hosts: instance: *id001 vars: {}

Ansible-runner makes heavy use of setting environment variables in the pexect spawn environment it creates, so users have to supply any custom env vars they care about. Since the vars molecule exports won't make it down through pytest and then testinfra and then runner and then pexect, I have to tell runner about those vars and their values.

testinfra/utils/ansible_runner.py

webknjaz · 2019-04-23T12:49:30Z

testinfra/utils/ansible_runner.py

-__all__ = ['AnsibleRunner', 'to_bytes']
+def to_bytes(data):
+    '''Why!?!?'''
+    return b'%s' % data


This will most likely fail when trying to put Unicode into bytes.

cc @abadger

. I realize that the current version of the code has removed this function but just wanted to confirm what webknjaz said.

Any to_bytes style method has to be built around the encode method for the conversion with additional conditional logic around it to protect from trying to convert the wrong types and to set the correct encoding.

webknjaz · 2019-04-23T12:57:58Z

testinfra/utils/ansible_runner.py

+
+        # a "significant" event is the event that has the task result
+        significant_event = None
+        if events:


This check is unnecessary: iterating through an empty iterable will just skip the loop automatically.

Suggested change

if events:

(filter will return either loop or an iterator: https://github.com/ansible/ansible-runner/blob/795ee7b/ansible_runner/runner.py#L317)

jctanner · 2019-04-23T21:33:18Z

@philpep I think I've implemented all the requested changes. However, there seems to be intermittent mismatches between what values the tests are asserting come back from the alpine container(s) and what they actually have.

philpep · 2019-04-26T00:44:33Z

Hi @jctanner ,

I fixed the alpine test and found a quite huge bug in your PR because it drop ansible from the requirements and all ansible tests are skipped, that's probably why they seems to pass :D I fixed this in 0dc8649
I restarted tests on your PR and you should have real failures.

Then I've been going more in deep into ansible-runner (from which I'm new) and from my current point of view of ansible-runner it add more complexity than it solve the ansible interface issue, at least for our usage :/ The principal issue for me is handling of the "input directory", while I understand it can make sense for AWX I cannot figure how to use it in a ansible project with playbooks and roles and vars written as flat files. But maybe I'm missing something ?

Maybe we could consider interfacing with ansible cli directly, call ansible in a subprocess with json output, I think it can provide what we need (have a maintainable code and fix license issue).

I cannot figure actually how it can work with ansible-runner with reasonable code and without breaking existing usages (users expect testinfra to behave just like the ansible cli in their repository, loading ansible.cfg, vars, playbooks, roles, vault etc).

I known you spent time on this PR and I'm sorry to not have seen theses issues before, hope you understand, I'll have some time next week (french labor day \o/) to go more in deep into this.

ianw · 2019-04-30T00:38:40Z

test-requirements.txt

@@ -5,4 +5,4 @@ paramiko
 tornado<5
 salt
 pywinrm
-ansible>=2.4,<3


As I think mentioned elsewhere, this change needs to leave ansible in here, otherwise it's not installed. I'm seeing this trying to run this change with tox

ianw · 2019-04-30T06:50:30Z

testinfra/utils/ansible_runner.py

+            raise AnsibleInventoryException(e)
+
+        try:
+            inv = json.loads(so)


This is failing with python3 as check_output returns bytes, which messes up json.loads()

E TypeError: the JSON object must be str, not 'bytes'

ianw@821bf58

ianw · 2019-04-30T07:34:06Z

I also have a little time to look into this; as you say with 2.8 it's a more pressing issue.

Personally, I think using ansible-runner seems like a more stable option. I'm context switching back into this just to see what I can glean and if I can help ...

To avoid polluting this with too much intermediate debugging that gets very confusing, I've started an etherpad at https://etherpad.openstack.org/p/testinfra-ansible-runner with some notes on where I'm at with it.

https://github.com/ianw/testinfra/tree/ansible-runner-v2

As described, I'm currently seeing test_ansible_module[ansible://debian_stretch] failing; it seems to be similar to what I saw before in that there seems to be half-consumed output (http://paste.openstack.org/show/750075/). I'll continue debugging to see what's up.

Note that the openstack infra project has a test that runs our CI with the master branch of ansible and master branch of testinfra -- http://zuul.openstack.org/job/system-config-run-base-ansible-devel. This can be instructed to pull in any changes too; we can use this as an additional validation.

ianw · 2019-04-30T07:36:54Z

@philpep I also have a question about a weird behaviour when running tests -- when I just run "tox" a bunch of these containers seem to actually create a new VT; my X11 session switches out to a console prompt and I have to ctl-alt-fX to get away from it; sometimes it works and sometimes my X11 session has been logged out. It particularly seems to happen with Alpine? It's so weird, I've never seen it before -- does this ring any bells? Is this starting containers with any weird options? It could certainly be a local mis-configuration ...

jctanner · 2019-04-30T12:16:33Z

#432 is clearly a simpler approach and taught me something new about ansible.

philpep · 2019-05-01T01:27:14Z

@ianw

About your test, yeah it's weird, I run them on my laptop and it work fine (debian with docker 18.09). What version of docker are you using ? Maybe since we run systemd as pid 1 with --privileged (yes it's somewhat unsatisfying...), this can interract with systemd running on your machine ?

About ansible-runner, from what I see of the current implementation, it have several drawbacks:

This require to create the "input directory", copying inventory and pass environment variables through a yaml file. I think users expect all environment variables to be passed to ansible (including their own).

Also the inventory itself cannot be copied from self.host_list because it can come from ANSIBLE_INVENTORY environment variable or be referenced by ansible config (ansible.cfg). For these reasons I use "ansible-inventory" command in my implementation.

I don't see any advantage of ansible-runner over ansible --tree which directly dump the output as json (including error handling).

Also I think we should not use ansible to run all commands because it's slow (python + ansible cost). For instance when we write in a test: assert host.package("nginx").is_installed, testinfra will run several commands to detect OS, package manager, encoding informations etc. Sure there's a cache, but running all commands required by testinfra is very slow when calling ansible in a subprocess. For this reason I only run through ansible for the ansible module (host.ansible(...)) in my implementation.

I we want to use ansible as a transport to run commands, I think we should wait kind of "ansible --server" feature where we could send commands through stdin or unix socket and get result directly without having the cost of python + ansible initialization.

matburt · 2019-05-01T02:00:12Z

Howdy! I'm the primary dev on ansible-runner and I just wanted to clarify a few things...

The module interface (see here: https://ansible-runner.readthedocs.io/en/latest/source/ansible_runner.html#ansible_runner.interface.run) actually allows you to pass all parameters that would normally require the input directory hierarchy as parameters to the module itself. See inventory and envvars in the last (as per your example). You can totally define ANSIBLE_INVENTORY here, or pass the inventory in.

ansible --tree is indeed really great, but I want to point out that ansible-runner puts a lot more information in the event data structure than what it is returned by --tree. It also presents it to you as a event handler and as a module property. Behind the scenes it caches the data on the filesystem so you aren't holding all of that information in memory.

ansible-runner also has built-in support for directory isolation (see: https://ansible-runner.readthedocs.io/en/latest/standalone.html#running-with-directory-isolation) and process isolation (see: https://ansible-runner.readthedocs.io/en/latest/standalone.html#running-with-process-isolation). There's also built-in fact-caching that can work intuitively between runs.

One of the upcoming roadmap features of ansible-runner is a server/service mode just as you describe. The interface with which we implement this will also be supported through the module interface shown in this PR.

I'm obviously a little biased, as I think Runner is a great tool for exactly what you are trying to do. It's also maintained by the team at Red Hat/Ansible that develops Tower/AWX (and Runner itself is used directly by Tower/AWX). I'm always happy to discuss this in detail... including adding features, fixing bugs, etc. You can always find me here, in #ansible (or #ansible-runner or #ansible-awx).

philpep · 2019-05-01T21:25:57Z

Hi @matburt , thanks for the inputs !

I tested a bit more ansible-runner and it does not work like I thought by reading this PR, seems ansible_runner.run(host_pattern=host, module=module_name, module_args=module_args, inventory=self.host_lists) actually works pretty well.

private_data_dir seems optional (what does mean ?)
environment variables are passed through ansible call (tested with module_name="debug", module_args="msg={{ lookup('env', 'MYENVVAR') }}").
inventory can point to a file outside of non given "private_data_dir"

One thing failed, I cannot pass the dictionary given by ansible-inventory --list as inventory here.
Example:

inventory = {'_meta': {'hostvars': {'debian_stretch': {'ansible_host': 'localhost',
                                           'ansible_port': 32902,
                                           'ansible_ssh_private_key_file': '/tmp/pytest-of-phil/pytest-156/1398040088806160/ssh_key',                                                        
                                           'ansible_user': 'root',
                                           'mygroupvar': 'qux',
                                           'myhostvar': 'bar',
                                           'myvar': 'foo'}}},
 'all': {'children': ['testgroup', 'ungrouped']},
 'testgroup': {'hosts': ['debian_stretch']},
 'ungrouped': {}}

ansible-runner does not raise but does not suceed:

 [WARNING]: Skipping unexpected key (hostvars) in group (_meta), only "vars",
"children" and "hosts" are valid                                        
 [WARNING]:  * Failed to parse /tmp/tmpus16ohbj/inventory/hosts.json with yaml
plugin: Invalid "children" entry for "all" group, requires a dictionary, found
"<class 'list'>" instead.                                                 
 [WARNING]:  * Failed to parse /tmp/tmpus16ohbj/inventory/hosts.json with ini
plugin: /tmp/tmpus16ohbj/inventory/hosts.json:1: Expected key=value host        
variable assignment, got: {hostvars:                     
 [WARNING]: Unable to parse /tmp/tmpus16ohbj/inventory/hosts.json as an
inventory source                             
 [WARNING]: No inventory was parsed, only implicit localhost is available
 [WARNING]: provided hosts list is empty, only localhost is available. Note
that the implicit localhost does not match 'all'                      
 [WARNING]: Could not match supplied host pattern, ignoring: debian_stretch

I see ansible-runner provide more usable output (ansible call output, statistics, per hosts events), but in our case we only execute a single module on a single host, so I'm not sure it's worth.

You can see my implementation for interfacing directly with ansible in #432
I think ansible runner could replace the run_module() method and maybe error handling could be simplified.
Do you have or plan APIs for working with inventories ? What I need is list host matching a host/group pattern (this parametrize automatically the test suite for each host).

I'm glad you're planning to add a performant connection wrapper for ansible ! We definitivelly need this !

ianw · 2019-05-02T06:39:50Z

I've updated this pull request with a few changes in #410 and it now passes CI. Tomorrow I will work out testing it "live" in the OpenStack CI environment for further confirmation it works. I just wanted to make sure we have enough to show that ansible-runner really does work.

I think the right people are probably involved here to decide the best way forward (between some variation of) #410 and #432. Personally, I like the idea that ansible-runner is a generic project, and so by being a wrapper around that testinfra can easily take advantage of changes, such as connection pooling, as they come.

matburt · 2019-05-02T14:33:06Z

private_data_dir seems optional (what does mean ?)

The private data dir is really meant to support using Runner as an alternate entry point to wrapping ansible and ansible-playbook from something like a systemd unit or container launch entry point. It's really there to let you provide all of the inputs and get the outputs in a super stable location. It's entirely optional from the module use standpoint. Though when launching via the module it does need a place on the filesystem where results of the execution can be cached (the artifact directory).

One thing failed, I cannot pass the dictionary given by ansible-inventory --list as inventory here.

The inventory generated here doesn't look like a valid inventory that ansible would expect. This is what I would expect to see generated as an inventory script but not as an inventory input file which is the format that we'll expect there since we're going to pass it over to ansible as an inventory input json file. You'll get this error if you try to send that exact same output into ansible-inventory itself.

It seems like you are calling ansible-inventory on some inventory source? If that's the case just pass that inventory source json directly to runner than than passing it through ansible-inventory?

Right now I don't directly plan on anything for interfacing with inventory, but it does seem like something I'd like to have.

jctanner added 2 commits April 22, 2019 17:18

Use upstream ansible-runner instead of ansible python api

cf0fbe5

Add runner to the test requirements

f47c4b9

jctanner changed the title ~~Use runner v1~~ Switch from ansible python api to ansible-runner Apr 22, 2019

jctanner added 2 commits April 22, 2019 18:24

remove unused import

3dc52d3

fix pylint assertions

aab1d67

philpep requested changes Apr 22, 2019

View reviewed changes

jctanner added 3 commits April 22, 2019 20:07

fix flake8 errors

5470603

Use a contextmanager for the datadir

09301d7

fix flake8 and unused import

b7b59db

jctanner mentioned this pull request Apr 23, 2019

Move colorama init to a location outside the logger. ansible/molecule#2001

Merged

switch from Popen to check_output

79b8e9d

webknjaz reviewed Apr 23, 2019

View reviewed changes

jctanner added 2 commits April 23, 2019 11:11

Remove uncecessary check

9498154

remove to_bytes function

db99198

philpep mentioned this pull request Apr 29, 2019

Ansible runner issues #431

Closed

ianw reviewed Apr 30, 2019

View reviewed changes

jctanner closed this Apr 30, 2019

ianw mentioned this pull request May 2, 2019

Switch to ansible-runner #410

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Switch from ansible python api to ansible-runner #428

Switch from ansible python api to ansible-runner #428

jctanner commented Apr 22, 2019 •

edited

Loading

philpep left a comment

philpep Apr 22, 2019

philpep Apr 22, 2019

philpep Apr 22, 2019

jctanner Apr 23, 2019

matburt Apr 23, 2019

philpep Apr 22, 2019

jctanner Apr 23, 2019

webknjaz Apr 23, 2019

abadger Apr 24, 2019

webknjaz Apr 23, 2019 •

edited

Loading

jctanner commented Apr 23, 2019

philpep commented Apr 26, 2019

ianw Apr 30, 2019

ianw Apr 30, 2019

ianw commented Apr 30, 2019

ianw commented Apr 30, 2019

jctanner commented Apr 30, 2019

philpep commented May 1, 2019

matburt commented May 1, 2019

philpep commented May 1, 2019

ianw commented May 2, 2019

matburt commented May 2, 2019

Switch from ansible python api to ansible-runner #428

Switch from ansible python api to ansible-runner #428

Conversation

jctanner commented Apr 22, 2019 • edited Loading

philpep left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

webknjaz Apr 23, 2019 • edited Loading

Choose a reason for hiding this comment

jctanner commented Apr 23, 2019

philpep commented Apr 26, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ianw commented Apr 30, 2019

ianw commented Apr 30, 2019

jctanner commented Apr 30, 2019

philpep commented May 1, 2019

matburt commented May 1, 2019

philpep commented May 1, 2019

ianw commented May 2, 2019

matburt commented May 2, 2019

jctanner commented Apr 22, 2019 •

edited

Loading

webknjaz Apr 23, 2019 •

edited

Loading