Merge pull request #32 from ntt-sic/doc-update
Doc update for 1.1.0 release
sampathP committed Apr 14, 2016
2 parents 5c3e3c7 + be06299 commit 2b32c39
Showing 2 changed files with 26 additions and 13 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -24,7 +24,9 @@ Try [masakari-deploy](https://github.com/ntt-sic/masakari-deploy) for all-in-one
- Deploy OpenStack Compute with a shared file system

* pacemaker
- Setup stonith resources external/ipmi
- Set up stonith resources so that a failed host is guaranteed to be shut down after an error
- If the host is still powered on after evacuation, volumes could end up mounted twice because of
the Nova evacuate API's specification.

* packages
- python-daemon: apt-get install python-daemon
35 changes: 23 additions & 12 deletions docs/evacuation_patterns.md
@@ -1,31 +1,42 @@
# Masakari's Evacuation Patterns

Masakari starts to evacuate VM instances based on instance, process and/or
host failures monitored by each process. This document describes patterns about
the evacuations executed by masakari-controller.
host failures monitored by each process. This document describes evacuation
patterns executed by masakari-controller.

Each monitor notifies individual failures to controller based on what they
monitor. You can choose which failures of instances are rescued by Masakari,
deploying which monitoring process or not.
Each monitoring process notifies the controller of individual failures based on
what it monitors.
You can choose which kinds of failure are rescued by choosing which monitoring
processes to deploy.

## Evacuation patterns

The section shows events that trigger Masakari to call which nova API.
This section shows the events that trigger Masakari to call the nova API and the evacuation patterns
based on those events.
The conditions under which the monitoring processes send an event are listed in the Detected Failures section.

| Events | Monitored by | Previous instance's status | Rescue steps | Post instance's status |
| Event Types | Monitored by | Previous instance's status | Rescue steps | Post instance's status |
| :--- | :--- | :--- | :--- | :--- |
| instance down | instancemonitor | active | nova.stop -> nova.start | active |
| instance down | instancemonitor | stopped | nova.reset('stopped') | stopped |
| instance down | instancemonitor | stopped *1 | nova.reset('stopped') | stopped |
| instance down | instancemonitor | resized | nova.reset('error') -> nova.stop -> nova.start | active |
| process down | processmonitor | - *2 | nova.service-disable | - |
| host down | hostmonitor | active | nova.evacuate | active |
| host down | hostmonitor | stopped | nova.evacuate | stopped |
| host down | hostmonitor | resized | nova.reset('error') -> nova.evacuate | active |

TBD: The table doesn't show all patterns for the evacuation now.
Feel free to fill it out :-)
*1 Ideally, stopped instances don't exist on hosts, so instancemonitor should never send this event for them.
However, it can happen under some race conditions, so Masakari implements the pattern.

*2 When a process down event occurs, Masakari doesn't change the instance status.
It only sets the nova-compute service to disabled so that new instances are not scheduled onto the host.
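
The table above can be read as a lookup from (event type, previous instance status) to an ordered list of rescue steps. The following is only a minimal sketch of that reading, not masakari-controller's actual code: `RESCUE_PATTERNS` and `rescue_plan` are hypothetical names, and the step strings are the labels used in the table rather than real nova client calls.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical lookup table transcribed from the evacuation patterns table.
# Key: (event type, previous instance status)
# Value: (ordered rescue steps, post instance status)
# "process down" does not depend on instance status, so its status is None.
Key = Tuple[str, Optional[str]]
RESCUE_PATTERNS: Dict[Key, Tuple[List[str], Optional[str]]] = {
    ("instance down", "active"): (["nova.stop", "nova.start"], "active"),
    ("instance down", "stopped"): (["nova.reset('stopped')"], "stopped"),
    ("instance down", "resized"): (
        ["nova.reset('error')", "nova.stop", "nova.start"],
        "active",
    ),
    ("process down", None): (["nova.service-disable"], None),
    ("host down", "active"): (["nova.evacuate"], "active"),
    ("host down", "stopped"): (["nova.evacuate"], "stopped"),
    ("host down", "resized"): (["nova.reset('error')", "nova.evacuate"], "active"),
}


def rescue_plan(
    event_type: str, previous_status: Optional[str] = None
) -> Tuple[List[str], Optional[str]]:
    """Return (rescue steps, post instance status) for a documented pattern."""
    # Process failures are handled per host, so instance status is ignored.
    if event_type == "process down":
        previous_status = None
    try:
        steps, post_status = RESCUE_PATTERNS[(event_type, previous_status)]
    except KeyError:
        raise ValueError(
            f"no documented pattern for {event_type!r} "
            f"with status {previous_status!r}"
        ) from None
    # Return a copy so callers cannot mutate the shared table.
    return list(steps), post_status
```

For example, `rescue_plan("host down", "resized")` returns `(["nova.reset('error')", "nova.evacuate"], "active")`, which matches the last row of the table. Combinations the table does not list raise `ValueError` instead of guessing a pattern.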

## Detected Failures

The section describes what events are monitored by Masakari's monitoring processes.
This section describes which failures each of Masakari's monitoring processes detects.

TBD
| Monitoring Processes | Failures | Related Event Types |
| :--- | :--- | :--- |
| instancemonitor | VM instance process crashes or a VM instance is killed | instance down |
| processmonitor | monitored process goes down and cannot be restarted | process down |
| hostmonitor | host status in pacemaker changes to OFFLINE or RemoteOFFLINE | host down |
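
Because each monitoring process reports exactly one event type, the set of deployed monitors determines which kinds of failure Masakari can react to. A small sketch of that relationship, assuming the mapping in the table above; `MONITOR_EVENT_TYPES` and `covered_event_types` are hypothetical names used only for illustration.

```python
from typing import Iterable, Set

# Hypothetical mapping transcribed from the Detected Failures table:
# monitoring process -> event type it reports to the controller.
MONITOR_EVENT_TYPES = {
    "instancemonitor": "instance down",
    "processmonitor": "process down",
    "hostmonitor": "host down",
}


def covered_event_types(deployed_monitors: Iterable[str]) -> Set[str]:
    """Return the event types that can be reported by the deployed monitors."""
    deployed = set(deployed_monitors)
    unknown = deployed - set(MONITOR_EVENT_TYPES)
    if unknown:
        raise ValueError(f"unknown monitoring process(es): {sorted(unknown)}")
    return {MONITOR_EVENT_TYPES[name] for name in deployed}
```

Deploying only `hostmonitor`, for instance, gives `covered_event_types(["hostmonitor"]) == {"host down"}`, so instance and process failures would go unreported in that setup.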
