Embedded Ansible role worker #13551
Conversation
def supervisord_ok?
  # fake method, we'll think it's not ok 5% of the time
  rand(20) > 0
This is just so I can run it locally and sometimes have it return false.
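As a standalone illustration, the stub amounts to the following (a test-only stand-in, not production code): `rand(20)` returns 0 with probability 1/20, so the check "fails" roughly 5% of the time.

```ruby
# Hypothetical test stub mimicking a supervisord health check that
# reports "not ok" about 5% of the time (rand(20) == 0 is a 1-in-20 event).
def supervisord_ok?
  rand(20) > 0
end
```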
@@ -336,7 +336,7 @@ def self.close_pg_sockets_inherited_from_parent
    end
  end

-  def start
+  def start_runner
I extracted the fork/process creation into a method so the subclass can override it and NOT create a fork/process.
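The shape of that extraction can be sketched like this (class names here are illustrative, not the real MiqWorker hierarchy): the base class's start delegates process creation to start_runner, so a subclass can override only that method to run in a thread instead of forking.

```ruby
# Sketch of the refactoring described above: #start stays the same,
# while #start_runner becomes the overridable process-creation step.
class SketchWorker
  attr_reader :runner

  def start
    @runner = start_runner
  end

  # Default behavior: spawn a separate process.
  def start_runner
    fork { run }
  end

  def run; end
end

class SketchThreadWorker < SketchWorker
  # Override: no fork/process, just a thread inside the server process.
  def start_runner
    Thread.new { run }
  end
end
```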
@@ -1188,6 +1188,8 @@
  :poll_method: :normal
  :restart_interval: 0.hours
  :starting_timeout: 10.minutes
  :embedded_ansible_worker:
    :poll: 30.seconds
Since do_work really only monitors supervisord, I have it run every 30 seconds.
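The polling behavior driven by a :poll setting can be sketched as follows (a simplified stand-in for the real worker heartbeat loop; the method name and parameters are illustrative):

```ruby
# Illustrative worker loop: do one unit of work, then sleep for the
# configured poll interval (30 seconds in the settings above; use a
# much smaller value when exercising this locally).
def run_poll_loop(poll:, iterations:, &do_work)
  iterations.times do
    do_work.call
    sleep(poll)
  end
end
```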
@@ -2,6 +2,7 @@ name,description,max_concurrent,external_failover,role_scope
automate,Automation Engine,0,false,region
database_operations,Database Operations,0,false,region
database_owner,Database Owner,1,false,database
embedded_ansible,Embedded Ansible,1,false,zone
Shouldn't the role scope be region initially?
I thought we were thinking of having an ansible zone in the region as the first step. Limiting it to the region might box us into a corner when we want to deal with ansible clustering and multiple independent ansibles. Maybe?
🤔 💭 Is the plural of ansible, ansibles?
Seriously, this is what you ask about... 😒
What color is the 🚲 ?
I also thought we can only support 1 per region, initially.
Correct @chessbyte, we can only support 1 active ansible at a time in a region, so this needs to be scoped to region.
Yeah, the only downside I can see to using an explicit region role is that it's different from all of the other provider-type roles. So, while we're trying to treat it as a provider, it's not quite one. I can't think of any other problems with that though; migrating to a zonal role should be straightforward.
I think that keeping this role zone-scoped will introduce more tech debt (in the form of preventing the role from starting in multiple zones) than the trouble of possibly having the role fail over into a different zone.
Sounds good 👍
@@ -0,0 +1,20 @@
class ManageIQ::Providers::AnsibleTower::EmbeddedAnsibleWorker < MiqWorker
I'm not sure about namespacing this under the provider.
Maybe directly under the model would be better?
Technically it should live with the "new" provider right?
@Fryguy I was thinking it would be more of a separate thing because it's related to running the service on the appliance rather than with talking to the provider's API.
haha, I can see how convinced all of us are about where it belongs. This is because namespacing it is part of naming it. And we know how easy naming is...
Ok, I'll change this to:
class ManageIQ::Providers::EmbeddedAnsible::MonitoringWorker < MiqWorker
in manageiq/providers/embedded_ansible/monitoring_worker.rb or something.
Until we decide to move it again... The important part is that it will be in the namespace of a separate provider specific to the embedded ansible.
I would resort to ManageIQ::Providers::EmbeddedAnsible only as a last option, because it adds a new Provider. Why not make this a ManageIQ::Providers::AnsibleTower::TowerWorker which monitors a tower? First is embedded, next could be external?
@durandom Running ansible locally on the server (embedded) and talking to it through an endpoint are fundamentally different and, in my opinion, not actually related at all. They will share no common functionality.
I think what we need to do is come up with a new template for what we call a thing that actually is responsible for running a service locally (because to me, that's not a provider at all).
@@ -60,6 +62,18 @@ def url=(new_url)
  default_endpoint.url = uri.to_s
end

def self.seed
How about seeding via the REST API, as a post-install script? Or maybe something with the appliance console? This could be reused to seed other providers too.
@durandom sorry, didn't see this comment. Maybe? This is just a placeholder for now. I'll be integrating with NickC's work and we'll be addressing how we find/seed the default tower per zone.
I think we will also need to do something in here: https://github.com/ManageIQ/manageiq/blob/master/app/models/miq_server/role_management.rb#L4
@@ -60,6 +62,18 @@ def url=(new_url)
  default_endpoint.url = uri.to_s
end

def self.seed
  embedded_in_zone ||
    create_with(:url => "https://localhost:8443", :zone => Zone.default_zone)
This will need a reliable way to determine how all the other appliances will be able to access this one. Maybe something like MiqRegion#remote_ws_url will be required?
So I think this zone should be MiqServer.my_server.my_zone. That way the provider's work will follow the server that is running the role.
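The seeding pattern being discussed (create the default provider once, return the existing row thereafter) can be illustrated in plain Ruby; this is not the real create_with(...).find_or_create_by(...) ActiveRecord call, just a sketch of its idempotence with an in-memory collection:

```ruby
# Illustration only: an idempotent seed that creates the default
# provider record once and returns the existing one on later calls.
SeedProvider = Struct.new(:name, :url, :zone)
SEEDED = []

def seed_embedded_ansible(zone)
  SEEDED.find { |p| p.name == "Embedded Ansible" } ||
    SEEDED.push(SeedProvider.new("Embedded Ansible", "https://localhost:8443", zone)).last
end
```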
.find_or_create_by(:name => DEFAULT_EMBEDDED_NAME)
# do we need to create an endpoint/authentication here?
# We probably need to tie this provider to a miq_servers row
end
I'll remove this seeding and do this when the worker starts.
def do_work
  if EmbeddedAnsible.running?
    _log.info("#{log_prefix} supervisord is ok!")
NOTE, drop this line before merging... it's helpful during testing...
# we need to intercept SystemExit and exit our thread,
# not the main server thread!
def do_exit(*args)
  # ensure this doesn't fail or that we can still get to the super call
Does this work?
ensure
  super
end
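The concern here can be demonstrated in isolation: Kernel#exit raises SystemExit, so rescuing it (or guaranteeing cleanup with ensure) inside the worker thread keeps the exit from unwinding the main server thread. A minimal sketch, with an illustrative method name:

```ruby
# Hedged illustration: call exit inside a thread and intercept the
# resulting SystemExit so only that thread unwinds, not the main thread.
def exit_in_thread
  t = Thread.new do
    begin
      exit 1 # Kernel#exit raises SystemExit in this thread only
    rescue SystemExit
      :intercepted
    end
  end
  t.value
end
```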
def stop
  super
  # stop supervisord service
Let's drop this method since do_exit handles this for us.
end

def kill
  # Does the base class's kill -9 work on the supervisord process as we want?
This and setting the right worker pid are not handled in this PR. For now, the server's pid is stored in the worker row, but we have disabled monitoring of this pid's memory.
Right now, this is a worker that doesn't have a separate process. Instead, it's just a thread running in the server that monitors the supervisord process and handles restarting it when it goes down. https://www.pivotaltracker.com/story/show/135450841
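The monitoring behavior described above (check the supervised process; restart it when it is down) can be sketched like this. Class, method, and service names are illustrative, not the actual ManageIQ code:

```ruby
# Sketch of a monitor that checks whether a supervised service is
# running and counts a "restart" each time it finds it down.
class ServiceMonitorSketch
  attr_reader :restarts

  def initialize(running_check)
    @running_check = running_check # callable returning true/false
    @restarts = 0
  end

  def check
    return if @running_check.call # service is up, nothing to do
    @restarts += 1                # service is down: "restart" it
  end
end
```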
Some comments on commits jrafanie/manageiq@1773711~...23af660:
- spec/models/embedded_ansible_worker/runner_spec.rb

Checked commits jrafanie/manageiq@1773711~...23af660 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0:
- app/models/embedded_ansible_worker.rb
- app/models/embedded_ansible_worker/runner.rb
- spec/models/embedded_ansible_worker/runner_spec.rb
👍
This PR runs a "monitoring" worker thread in the server process that monitors another process on the appliance. It implements the following:
- embedded_ansible role, limited to 1 per region

What works:

What needs to be done:
@carbonin @Fryguy @gtanzillo Please review/throw 🍅