diff --git a/README.md b/README.md
index 65427009..fc078b3b 100644
--- a/README.md
+++ b/README.md
@@ -37,6 +37,91 @@ Then, run `operator-sdk run bundle quay.io/medik8s/fence-agents-remediation-oper
 * Follow OLM's [instructions](https://sdk.operatorframework.io/docs/building-operators/golang/tutorial/#configure-the-operators-image-registry) on how to configure the operator's image reistry (build and push the operator container).
 * Run FAR using one the [suggested options from OLM](https://sdk.operatorframework.io/docs/building-operators/golang/tutorial/#run-the-operator) to run it locally, in the cluster, and in the cluster using bundle container (similar to the [above installation](#deploy-the-latest-version)).
 
+## Usage
+
+FAR is best used together with NHC to form a complete solution for unhealthy nodes: NHC detects unhealthy nodes and creates an external remediation CR for them, e.g., FAR's CR.
+This automated flow is preferable because it makes NHC responsible for creating and deleting FAR CRs. FAR can also act as a standalone remediator, but then the administrator has to create and delete the CRs manually.
+
+Either way, the user must be familiar with the fence agent to be used: its parameters and any other requirements on the cluster (e.g., fence_ipmilan needs machines that support IPMI).
+
+### FAR with NHC
+
+* Install FAR using one of the above options ([Installation](#installation)).
+
+* Load the YAML manifest of the FAR template (see below).
+
+* Modify the NodeHealthCheck CR to use FAR as its remediator.
+This is a specific use case of NodeHealthCheck's External Remediation.
+To set it up, make sure that Node Health Check is running and the FAR controller exists, then create the necessary CRs (*FenceAgentsRemediationTemplate* and then *NodeHealthCheck*).
+
+#### Example CRs
+
+The FAR template CR, `FenceAgentsRemediationTemplate`, is created by the administrator and is used by NHC as a template for creating the FAR CR that represents a request for a node to be recovered.
+See the following example of a FAR template object (also available as the [sample FAR template](https://github.com/medik8s/fence-agents-remediation/blob/main/config/samples/fence-agents-remediation_v1alpha1_fenceagentsremediationtemplate.yaml)):
+
+```yaml
+apiVersion: fence-agents-remediation.medik8s.io/v1alpha1
+kind: FenceAgentsRemediationTemplate
+metadata:
+  name: fenceagentsremediationtemplate-default
+spec:
+  template: {}
+```
+
+> *Note*: The FenceAgentsRemediationTemplate CR must be created in the same namespace in which the FAR operator has been installed.
+
+The following NodeHealthCheck CR is configured to use the example `fenceagentsremediationtemplate-default` template above:
+
+```yaml
+apiVersion: remediation.medik8s.io/v1alpha1
+kind: NodeHealthCheck
+metadata:
+  name: nodehealthcheck-sample
+spec:
+  remediationTemplate:
+    apiVersion: fence-agents-remediation.medik8s.io/v1alpha1
+    kind: FenceAgentsRemediationTemplate
+    name: fenceagentsremediationtemplate-default
+    namespace: default
+```
+
+NHC creates a FAR CR from the FAR template after it detects an unhealthy node (according to NHC's unhealthy conditions).
+NHC deletes the FAR CR once it sees that the node is healthy again.
+
+### Standalone FAR
+
+* Install FAR using one of the above options ([Installation](#installation)).
+
+* Create a FAR CR using the name of the node to be remediated and the fence-agent parameters.
+
+#### Example CR
+
+The FAR CR, `FenceAgentsRemediation`, is created by the admin and is used to trigger the fence agent on a specific node. The CR includes an *agent* field for the fence-agent name, a *sharedparameters* field with the parameters shared by all nodes, and a *nodeparameters* field with the parameters specific to each node.
+See the following example of a FAR CR for node `worker-1` (also available as the [sample FAR](https://github.com/medik8s/fence-agents-remediation/blob/main/config/samples/fence-agents-remediation_v1alpha1_fenceagentsremediation.yaml)):
+
+```yaml
+apiVersion: fence-agents-remediation.medik8s.io/v1alpha1
+kind: FenceAgentsRemediation
+metadata:
+  name: worker-1
+spec:
+  agent: fence_ipmilan
+  sharedparameters:
+    --username: "admin"
+    --password: "password"
+    --lanplus: ""
+    --action: "reboot"
+    --ip: "192.168.111.1"
+  nodeparameters:
+    --ipport:
+      master-0: "6230"
+      master-1: "6231"
+      master-2: "6232"
+      worker-0: "6233"
+      worker-1: "6234"
+      worker-2: "6235"
+```
+
 ## Tests
 
 ### Run code checks and unit tests