Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Embedded Ansible role worker #13551

Merged
merged 3 commits into from
Jan 31, 2017
Merged

Conversation

jrafanie
Copy link
Member

@jrafanie jrafanie commented Jan 17, 2017

This PR runs a "monitoring" worker thread in the server process that monitors another process on the appliance. It implements the following:

  • a new role embedded_ansible limited to 1 per region
  • a new "worker" using our existing worker mechanics but living in a Thread running on the server process

What works:

  • the worker and thread do not run unless you enable the role, which is not enabled by default
  • when the role is enabled, we instantiate a worker and run the monitoring thread using the existing worker mechanics but without fork/spawn process creation since we can do the work in a thread.

What needs to be done:

  • configure nginx
  • enable supervisord service
  • unconfiguring(if applicable) and disabling/stopping supervisord when the role is removed
  • check supervisord's status
  • restart supervisord if it's not in a good state
  • handle stop/restart/kill in the normal worker way
  • add tests

@carbonin @Fryguy @gtanzillo Please review/throw 🍅


def supervisord_ok?
# fake method, we'll think it's not ok 5% of the time
rand(20) > 0
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just so I can run it locally and sometimes have it return false.

@@ -336,7 +336,7 @@ def self.close_pg_sockets_inherited_from_parent
end
end

def start
def start_runner
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i extracted the fork/process creation to a method so the subclass can override and NOT create a fork/process

@@ -1188,6 +1188,8 @@
:poll_method: :normal
:restart_interval: 0.hours
:starting_timeout: 10.minutes
:embedded_ansible_worker:
:poll: 30.seconds
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the do_work is really only monitoring supervisord, I have it doing it every 30 seconds

@@ -2,6 +2,7 @@ name,description,max_concurrent,external_failover,role_scope
automate,Automation Engine,0,false,region
database_operations,Database Operations,0,false,region
database_owner,Database Owner,1,false,database
embedded_ansible,Embedded Ansible,1,false,zone
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the role scope be region initially?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we were thinking of having an ansible zone in the region as the first step. Limiting it to the region might box us into a corner when we want to deal with ansible clustering and multiple independent ansibles. Maybe?

Copy link
Member

@Fryguy Fryguy Jan 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 💭 Is the plural of ansible, ansibles?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seriously, this is what you ask about... 😒

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What color is the 🚲 ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also thought we can only support 1 per region, initially.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct @chessbyte we can only support 1 active ansible at a time in a region. So this needs to be scoped to region

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the only downside I can see to using an explicit region role is that it's different than all of the other provider type roles. So, while we're trying to treat it as a provider, it's not quite one. I can't think of any other problems with that though, migrating to zonal role should be straight forward.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that keeping this zone will introduce more tech debt (in the form of preventing the role from starting in multiple zones) than the trouble of possibly having the role fail over into a different zone.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good 👍

@@ -0,0 +1,20 @@
class ManageIQ::Providers::AnsibleTower::EmbeddedAnsibleWorker < MiqWorker
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about namespacing this under the provider.

Maybe directly under the model would be better?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically it should live with the "new" provider right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fryguy I was thinking it would be more of a separate thing because it's related to running the service on the appliance rather than with talking to the provider's API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha, I can see how convinced all of us are to where it belongs. This is because namespacing it is part of naming it. And we know how easy naming is...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll change this to:

class ManageIQ::Providers::EmbeddedAnsible::MonitoringWorker < MiqWorker

in manageiq/providers/embedded_ansible/monitoring_worker.rb or something.

Until, we decide to move it again... The important part is it will be in the namespace of a separate provider specific to the embedded ansible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would resort to ManageIQ::Providers::EmbeddedAnsible only as a last option, because it adds a new Provider.

Why not make this a ManageIQ::Providers::AnsibleTower::TowerWorker which monitors a tower? First is embedded, next could be external?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@durandom Running ansible locally on the server (embedded) and talking to it through an endpoint are fundamentally different and, in my opinion, not actually related at all. They will share no common functionality.

I think what we need to do is come up with a new template for what we call a thing that actually is responsible for running a service locally (because to me, that's not a provider at all).

@@ -60,6 +62,18 @@ def url=(new_url)
default_endpoint.url = uri.to_s
end

def self.seed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about seeding via the rest api? as a post install script. Or maybe something with the appliance console? This could be re-used to seed other providers too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@durandom sorry, didn't see this comment. Maybe? This is just a placeholder for now. I'll be integrating with NickC's work and we'll be addressing how we find/seed the default tower per zone.

Copy link
Member

@carbonin carbonin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -60,6 +62,18 @@ def url=(new_url)
default_endpoint.url = uri.to_s
end

def self.seed
embedded_in_zone ||
create_with(:url => "https://localhost:8443", :zone => Zone.default_zone)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need a reliable way to determine how all the other appliances will be able to access this one.
Maybe something like MiqRegion#remote_ws_url will be required?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think this zone should by MiqServer.my_server.my_zone. That way the provider's work will follow the server that is running the role.

.find_or_create_by(:name=> DEFAULT_EMBEDDED_NAME)
# do we need to create an endpoint/authentication here?
# We probably need to tie this provider to a miq_servers row
end
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove this seeding and do this when the worker starts.

@jrafanie jrafanie force-pushed the ansible_role_worker branch 2 times, most recently from 25d62f9 to 91995f5 Compare January 27, 2017 21:37

def do_work
if EmbeddedAnsible.running?
_log.info("#{log_prefix} supervisord is ok!")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE, drop this line before merging... it's helpful during testing...

@jrafanie jrafanie force-pushed the ansible_role_worker branch 2 times, most recently from ad19008 to 40c90bb Compare January 30, 2017 19:21
# we need to intercept SystemExit and exit our thread,
# not the main server thread!
def do_exit(*args)
# ensure this doesn't fail or that we can still get to the super call
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this work?

ensure
  super
end


def stop
super
# stop supervisord service
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's drop this method since do_exit handles this for us

end

def kill
# Does the base class's kill -9 work on the supervisord process as we want?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and setting the right worker pid is not handled in this PR. For now, the server's pid is stored in the worker row but we have disabled monitoring of this pid's memory.

@jrafanie jrafanie force-pushed the ansible_role_worker branch 2 times, most recently from f18ba22 to f670204 Compare January 30, 2017 20:34
Right now, this is a worker that doesn't have a separate process.
Instead, it's just a thread running in the server that monitors the
supervisord process and handles restarting it when it goes down.

https://www.pivotaltracker.com/story/show/135450841
@miq-bot
Copy link
Member

miq-bot commented Jan 30, 2017

Some comments on commits jrafanie/manageiq@1773711~...23af660

spec/models/embedded_ansible_worker/runner_spec.rb

  • ⚠️ - 12 - Detected allow_any_instance_of. This RSpec method is highly discouraged, please only use when absolutely necessary.

@miq-bot
Copy link
Member

miq-bot commented Jan 30, 2017

Checked commits jrafanie/manageiq@1773711~...23af660 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
6 files checked, 12 offenses detected

app/models/embedded_ansible_worker.rb

app/models/embedded_ansible_worker/runner.rb

spec/models/embedded_ansible_worker/runner_spec.rb

@jrafanie jrafanie changed the title [WIP] Embedded Ansible role worker Embedded Ansible role worker Jan 31, 2017
@gtanzillo gtanzillo added this to the Sprint 53 Ending Jan 30, 2017 milestone Jan 31, 2017
@gtanzillo gtanzillo self-assigned this Jan 31, 2017
Copy link
Member

@gtanzillo gtanzillo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@jrafanie jrafanie removed the wip label Jan 31, 2017
@gtanzillo gtanzillo merged commit 51ea8b3 into ManageIQ:master Jan 31, 2017
@jrafanie jrafanie deleted the ansible_role_worker branch February 6, 2017 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants