Writing up Evacuation Pattern and Detected Failure doc
Updating evacuating pattern docs.
Showing 1 changed file with 23 additions and 12 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,31 +1,42 @@ | ||
# Masakari's Evacuation Patterns | ||
|
||
Masakari starts to evacuate VM instances based on instance, process and/or | ||
host failures monitored by each process. This document describes patterns about | ||
the evacuations executed by masakari-controller. | ||
host failures monitored by each process. This document describes evacuation | ||
patterns executed by masakari-controller. | ||
|
||
Each monitor notifies individual failures to the controller based on what they | ||
monitor. You can choose which failures of instances are rescued by Masakari, | ||
deploying which monitoring process or not. | ||
Each monitoring process notifies individual failures to the controller based on | ||
what they monitor. | ||
You can choose which kinds of failure are rescued by deploying each monitoring | ||
process or not. | ||
|
||
## Evacuation patterns | ||
|
||
The section shows events that trigger Masakari to call which nova API. | ||
The section shows events that trigger Masakari to call the nova API and evacuation patterns | ||
based on the events. | ||
The conditions under which the monitoring processes send an event are listed in the Detected Failures section. | ||
|
||
| Events | Monitored by | Previous instance's status | Rescue steps | Post instance's status | | ||
| Event Types | Monitored by | Previous instance's status | Rescue steps | Post instance's status | | ||
| :--- | :--- | :--- | :--- | :--- | | ||
| instance down | instancemonitor | active | nova.stop -> nova.start | active | | ||
| instance down | instancemonitor | stopped | nova.reset('stopped') | stopped | | ||
| instance down | instancemonitor | stopped *1 | nova.reset('stopped') | stopped | | ||
| instance down | instancemonitor | resized | nova.reset('error') -> nova.stop -> nova.start | active | | ||
| process down | processmonitor | - *2 | nova.service-disable | - | | ||
| host down | hostmonitor | active | nova.evacuate | active | | ||
| host down | hostmonitor | stopped | nova.evacuate | stopped | | ||
| host down | hostmonitor | resized | nova.reset('error') -> nova.evacuate | active | | ||
|
||
TBD: The table doesn't show all patterns for the evacuation now. | ||
Feel free to fill it out :-) | ||
*1 Ideally speaking, stopped instances don't exist on hosts, so instancemonitor can't send it. | ||
However, it could happen in some race conditions, so Masakari implements the pattern. | ||
|
||
*2 When process down occurs, Masakari doesn't change instance status. | ||
It only disables the nova-compute service so that new instances are not scheduled onto the host. | ||
|
||
## Detected Failures | ||
|
||
The section describes what events are monitored by Masakari's monitoring processes. | ||
The section describes which failures Masakari's monitoring processes detect. | ||
|
||
TBD | ||
| Monitoring Processes | Failures | Related Event Types | | ||
| :--- | :--- | :--- | | ||
| instancemonitor | VM instance process crash or killing a VM instance | instance down | | ||
| processmonitor | a monitored process goes down and cannot be restarted | process down | | ||
| hostmonitor | host status in pacemaker is changed to OFFLINE or RemoteOFFLINE | host down | |
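
The evacuation patterns table in this change can be read as a lookup from an event type and the instance's previous status to a list of rescue steps and a resulting status. The following is a minimal Python sketch of that reading, not Masakari's actual implementation: the names `RescuePlan`, `RESCUE_PATTERNS` and `plan_rescue` are hypothetical, and the step strings are labels copied from the table rather than real nova client calls.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class RescuePlan(NamedTuple):
    """Hypothetical container for one row of the evacuation patterns table."""
    steps: Tuple[str, ...]        # rescue steps, in the order the table lists them
    post_status: Optional[str]    # None when the instance status is left untouched


# (event type, previous instance status) -> plan, transcribed from the table.
# The step strings are labels from the document, not callable nova APIs.
RESCUE_PATTERNS: Dict[Tuple[str, str], RescuePlan] = {
    ("instance down", "active"): RescuePlan(("nova.stop", "nova.start"), "active"),
    ("instance down", "stopped"): RescuePlan(("nova.reset('stopped')",), "stopped"),
    ("instance down", "resized"): RescuePlan(
        ("nova.reset('error')", "nova.stop", "nova.start"), "active"),
    ("host down", "active"): RescuePlan(("nova.evacuate",), "active"),
    ("host down", "stopped"): RescuePlan(("nova.evacuate",), "stopped"),
    ("host down", "resized"): RescuePlan(
        ("nova.reset('error')", "nova.evacuate"), "active"),
}


def plan_rescue(event_type: str, previous_status: Optional[str] = None) -> RescuePlan:
    """Return the rescue plan the table lists for an event (illustrative only)."""
    if event_type == "process down":
        # Footnote *2: instance status is not changed; only nova-compute is disabled.
        return RescuePlan(("nova.service-disable",), None)
    try:
        return RESCUE_PATTERNS[(event_type, previous_status)]
    except KeyError:
        raise ValueError(
            f"no pattern listed for {event_type!r} with status {previous_status!r}")


if __name__ == "__main__":
    plan = plan_rescue("host down", "resized")
    print(" -> ".join(plan.steps), "=>", plan.post_status)
```

Keeping the patterns in a plain dictionary mirrors the table one row per entry, so a combination the table does not list raises an error instead of silently picking a default.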