From 30b2bd5d9ff1e10a9c399c754640bdfbee98e44b Mon Sep 17 00:00:00 2001
From: Dharma Bellamkonda
Date: Fri, 24 Jul 2020 10:55:20 -0600
Subject: [PATCH] Add NPD+CA autohealing use case to user guide

---
 docs/user-guide.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/docs/user-guide.md b/docs/user-guide.md
index 335983eee..d554b20f3 100644
--- a/docs/user-guide.md
+++ b/docs/user-guide.md
@@ -88,3 +88,12 @@ strategies:
      params:
         maxPodLifeTimeSeconds: 604800 # pods run for a maximum of 7 days
 ```
+
+### Autoheal Node Problems
+Descheduler's `RemovePodsViolatingNodeTaints` strategy can be combined with
+[Node Problem Detector](https://github.com/kubernetes/node-problem-detector/) and
+[Cluster Autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) to automatically remove
+Nodes that have problems. Node Problem Detector can detect specific Node problems and taint any Nodes that have those
+problems. The Descheduler will then deschedule workloads from those Nodes. Finally, if the affected Node's resource
+allocation falls below the Cluster Autoscaler's scale-down threshold, the Node will become a scale-down candidate
+and can be removed by Cluster Autoscaler. These three components form an autohealing cycle for Node problems.
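
For reference, the strategy named in the added section could be enabled with a policy along the following lines. This is a minimal sketch, not part of the patch: it assumes the `descheduler/v1alpha1` policy format that the guide's existing `strategies:` examples appear to use, and the `apiVersion` and `kind` values are not shown in this hunk. Per the Descheduler README, this strategy acts on `NoSchedule` taints, so the taint applied to a problem Node would need that effect for its pods to be evicted.

```yaml
# Sketch only: apiVersion and kind are assumed from the v1alpha1 policy format.
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  # Evicts pods that do not tolerate a NoSchedule taint on their Node.
  "RemovePodsViolatingNodeTaints":
    enabled: true
```

Node Problem Detector and Cluster Autoscaler are configured separately; this fragment covers only the Descheduler side of the cycle.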