
Configure apache balancer with up to 10 members at startup #14007

Merged

Conversation

jrafanie
Member

https://bugzilla.redhat.com/show_bug.cgi?id=1422988

Start UI, Web Service, Web Socket, etc. puma workers bound to a port
from STARTING_PORT to the maximum worker count port (3000 to 3009 if the
max worker count is 10). Configure apache at boot with these ports as
balancer members.
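
A rough sketch of how that port range works out (illustrative Ruby only, not the actual ManageIQ code; the names follow the description above):

STARTING_PORT         = 3000
maximum_workers_count = 10
ports = (STARTING_PORT...(STARTING_PORT + maximum_workers_count)).to_a
# => [3000, 3001, 3002, 3003, 3004, 3005, 3006, 3007, 3008, 3009]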

Fixes a failure after we start new puma workers and try to gracefully
restart apache. The next request will fail since apache is waiting for
active connections to close before restarting. The subsequent request will
then succeed since the failure causes the websocket connections to
close, allowing apache to restart fully.

Previously, we would add and remove members in the balancer configuration
when starting or stopping puma workers. We would then gracefully restart
apache since the new workers wouldn't be used until apache reloaded the
configuration. Note, we didn't do anything after removing members from
the balancer configuration because apache's mod_proxy_balancer gracefully
handles dead members by marking them as in Error and not retrying them for
60 seconds by default. Therefore, it's not necessary to restart apache to
"remove" members.

The problem arose when we would add balancer members to the
configuration and gracefully restart apache. It turns out our web
socket workers maintain active connections to apache, so apache wouldn't
restart until those connections were closed.

Now, we lean on the mod_proxy_balancer behavior mentioned above, which
keeps track of which members are alive or in error, by configuring up to
maximum_workers_count (10) members at server startup. We can then
start and stop workers and let apache route traffic to the members that
are alive. We no longer have to update the apache configuration and
restart it when a worker starts or stops.

Note, apache has a graceful reload option that could allow us to
maintain an accurate list of balancer members as workers start and stop
and tell apache workers to gracefully reload the configuration. This
option was buggy until fixed in [1]. It also required us to keep
touching the balancer configuration, which we probably shouldn't have been
doing in the first place.

[1] https://bz.apache.org/bugzilla/show_bug.cgi?id=44736

MiqUiWorker.install_apache_proxy_config
MiqWebServiceWorker.install_apache_proxy_config
MiqWebsocketWorker.install_apache_proxy_config
MiqApache::Control.restart
jrafanie (Member Author)

Restart apache after configuring the workers/balancers.

@@ -89,8 +93,6 @@ def sync_workers
        end
      end

-      modify_apache_ports(ports_hash, self::PROTOCOL) if MiqEnvironment::Command.supports_apache?

jrafanie (Member Author)
We no longer modify the configuration after adding/removing Ui/Web Service/Web Socket workers...

abellotti (Member)

Is the selectable number of workers (for each type) limited to 10 in the UI?

jrafanie (Member Author)

@abellotti good question. The UI shows up to 9. Although, with advanced settings, you can choose more. This is why the maximum_workers_count is set to 10, so even if you choose more, the system won't let you go beyond that max value.
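
A hypothetical sketch of that cap (illustrative Ruby; the variable names are not the exact ManageIQ settings):

maximum_workers_count = 10
requested_count       = 12   # e.g. raised via advanced settings
effective_count       = [requested_count, maximum_workers_count].min
# => 10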

@jrafanie (Member Author)

@bdunne @carbonin @Fryguy @gtanzillo please review.

@jrafanie jrafanie force-pushed the set_apache_balancer_once_at_startup branch from 7041b0e to bac1ce0 Compare February 21, 2017 21:10
@bdunne (Member) commented Feb 21, 2017

@jrafanie Can we get rid of even more of this code if we just ship a static file containing the balancer member list?

@jrafanie (Member Author)

@jrafanie Can we get rid of even more of this code if we just ship a static file containing the balancer member list?

@bdunne So, it looks like this:

<Proxy balancer://evmcluster_ui/ lbmethod=byrequests>
BalancerMember http://0.0.0.0:3000
BalancerMember http://0.0.0.0:3001
BalancerMember http://0.0.0.0:3002
BalancerMember http://0.0.0.0:3003
BalancerMember http://0.0.0.0:3004
BalancerMember http://0.0.0.0:3005
BalancerMember http://0.0.0.0:3006
BalancerMember http://0.0.0.0:3007
BalancerMember http://0.0.0.0:3008
BalancerMember http://0.0.0.0:3009
</Proxy>

evmcluster_ui is dynamic; it doesn't need to be.
lbmethod=byrequests is dynamic (by configuration).
The port value is dynamic based on the type of worker; it doesn't need to be.

We could ship the file if we dropped the dynamic nature of those values.
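
A minimal Ruby sketch of how those dynamic values feed into the generated block (hypothetical, for illustration only):

worker_type = "ui"          # dynamic: evmcluster_ui, evmcluster_ws, ...
lb_method   = "byrequests"  # dynamic: comes from configuration
ports       = 3000..3009    # dynamic: STARTING_PORT up to the max worker count port

config = "<Proxy balancer://evmcluster_#{worker_type}/ lbmethod=#{lb_method}>\n"
ports.each { |port| config << "BalancerMember http://0.0.0.0:#{port}\n" }
config << "</Proxy>\n"
puts config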

@jrafanie jrafanie force-pushed the set_apache_balancer_once_at_startup branch 3 times, most recently from 8fb8266 to e9f583e Compare February 23, 2017 19:38
@jrafanie jrafanie force-pushed the set_apache_balancer_once_at_startup branch from e9f583e to da9523e Compare February 23, 2017 19:46
@miq-bot commented Feb 23, 2017

Checked commit jrafanie@da9523e with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
3 files checked, 0 offenses detected
Everything looks good. 🍪

@jrafanie (Member Author) commented Mar 8, 2017

@skateman Can you review this too?

@skateman (Member) commented Mar 9, 2017

@jrafanie looks like websockets are working with any possible number of workers. Not sure what else I should test... 👍

@jrafanie (Member Author)

@gtanzillo @carbonin I think this is ready to go, what do you think?

@carbonin (Member) left a comment

I'm for it! 👍

@gtanzillo (Member) left a comment

I'm good with this 👍

@gtanzillo gtanzillo added this to the Sprint 56 Ending Mar 13, 2017 milestone Mar 10, 2017
@gtanzillo gtanzillo merged commit 8518e63 into ManageIQ:master Mar 10, 2017
@jrafanie jrafanie deleted the set_apache_balancer_once_at_startup branch March 12, 2017 20:42
jrafanie added a commit to jrafanie/manageiq that referenced this pull request Mar 13, 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1422988

Fixes a regression in ManageIQ#14007 that affected the initial start of the
appliance and caused a 503 error when trying to access the UI.

Because adding balancer members validates the configuration files, and
these files try to load the redirect files among others, we need to add
the balancer members after all configuration files have been written by
install_apache_proxy_config.
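
In other words, the startup ordering has to look roughly like this (sketch only; add_balancer_members below is a placeholder name for the step that writes the balancer member entries):

# Write all proxy/redirect configuration files first...
MiqUiWorker.install_apache_proxy_config
MiqWebServiceWorker.install_apache_proxy_config
MiqWebsocketWorker.install_apache_proxy_config
# ...then add the balancer members, so the validation performed while adding
# them can find the redirect files, and finally restart apache.
add_balancer_members  # placeholder, not the actual method name
MiqApache::Control.restart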
simaishi pushed a commit that referenced this pull request Mar 15, 2017
…tartup

Configure apache balancer with up to 10 members at startup
(cherry picked from commit 8518e63)

https://bugzilla.redhat.com/show_bug.cgi?id=1432463
@simaishi (Contributor)

Euwe backport details:

$ git log -1
commit 44c8abf7d0e22f167b2b61976f34f6d8d39eec7a
Author: Gregg Tanzillo <[email protected]>
Date:   Fri Mar 10 17:11:25 2017 -0500

    Merge pull request #14007 from jrafanie/set_apache_balancer_once_at_startup
    
    Configure apache balancer with up to 10 members at startup
    (cherry picked from commit 8518e63699d4f7223d16a5461315270e0143abdf)
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1432463

simaishi pushed a commit that referenced this pull request Mar 16, 2017
In Docker image, need to leave log/apache dir and delete log/*.log only
(cherry picked from commit 769e4df)

Change was needed due to behavior change after #14007
simaishi pushed a commit to ManageIQ/manageiq-pods that referenced this pull request Mar 16, 2017
Need to leave log/apache dir, delete log/*.log only
(cherry picked from commit 7293dd4)

Change was needed due to behavior change after
ManageIQ/manageiq#14007
carbonin pushed a commit that referenced this pull request May 17, 2017
(which is the only caller of restart_apache)
#14007