mirror of https://github.com/kubernetes-sigs/descheduler.git synced 2026-01-26 21:31:18 +01:00
Commit Graph

152 Commits

Author SHA1 Message Date
googs1025
7a76d9f0d3 fix(RemovePodsViolatingNodeTaints): list only active pod
Signed-off-by: googs1025 <googs1025@gmail.com>
2025-05-12 21:45:57 +08:00
Kubernetes Prow Robot
71746262b1 Merge pull request #1684 from googs1025/refactor_topology
chore: move namespaces filtering logic to New()
2025-05-11 12:29:14 -07:00
googs1025
0a691debfb feature: sort pods by restarts count in RemovePodsHavingTooManyRestarts plugin 2025-05-09 13:15:38 +08:00
googs1025
fbc875fac1 chore: move namespaces filtering logic to New() 2025-05-07 19:47:30 +08:00
Ricardo Maraschini
35a7178df6 feat: introduce strict eviction policy
with the strict eviction policy the descheduler only evicts a pod if the
pod requests the resource a threshold is set for. for example, when using a
threshold for an extended resource called `example.com/gpu`, only pods
that request that resource will be evicted.
2025-04-09 16:48:32 +02:00
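The strict eviction rule described in the commit above can be illustrated with a minimal sketch. This is not the descheduler's implementation: the `Pod` type, `requestsAny`, and `strictCandidates` are hypothetical names, and resource quantities are reduced to plain integers.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical stand-in for a Kubernetes pod.
// It only records the resource quantities the pod requests.
type Pod struct {
	Name     string
	Requests map[string]int64
}

// requestsAny reports whether the pod requests a non-zero amount of at
// least one of the named resources.
func requestsAny(p Pod, resources []string) bool {
	for _, r := range resources {
		if p.Requests[r] > 0 {
			return true
		}
	}
	return false
}

// strictCandidates keeps only the pods that request one of the resources
// a threshold was configured for. All other pods are never evicted.
func strictCandidates(pods []Pod, thresholdResources []string) []Pod {
	var out []Pod
	for _, p := range pods {
		if requestsAny(p, thresholdResources) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pods := []Pod{
		{Name: "trainer", Requests: map[string]int64{"example.com/gpu": 1}},
		{Name: "web", Requests: map[string]int64{"cpu": 500}},
	}
	for _, p := range strictCandidates(pods, []string{"example.com/gpu"}) {
		fmt.Println(p.Name)
	}
}
```

With a threshold on `example.com/gpu` only, the `web` pod is filtered out because it requests no GPU.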
Kubernetes Prow Robot
17b90969cf Merge pull request #1629 from googs1025/fix_sortDomains
fix removepodsviolatingtopologyspreadconstraint plugin sort logic
2025-03-28 09:06:44 -07:00
googs1025
b4b203cc60 chore: add unit test for sortDomains func 2025-03-28 20:58:54 +08:00
googs1025
59d1d5d1b9 making isEvictable and hasSelectorOrAffinity invoked only once
Signed-off-by: googs1025 <googs1025@gmail.com>
2025-03-28 20:57:02 +08:00
Ricardo Maraschini
98e6ed6587 chore: log average and assessed thresholds
when calculating the average and applying the deviations it would be nice
to also see the assessed values.

this commit makes the descheduler log these values when using level 3.
2025-03-28 10:16:20 +01:00
googs1025
4548723dea fix(removepodsviolatingtopologyspreadconstraint): fix removepodsviolatingtopologyspreadconstraint plugin sort logic
Signed-off-by: googs1025 <googs1025@gmail.com>
2025-03-27 23:49:45 +08:00
Kubernetes Prow Robot
9b9ae9a3be Merge pull request #1650 from ingvagabund/nodeutilization-deviation-skip-nodes-without-extended-resource
[nodeutilization]: skip nodes without extended resource when computing the average utilization
2025-03-20 05:58:31 -07:00
Jan Chaloupka
c22d773200 [nodeutilization] test nodes without extended resource when computing the average utilization 2025-03-20 13:22:28 +01:00
Ricardo Maraschini
95a631f6a5 feat: move classifier to its own package
move the classifier to its own package. introduces a generic way of
classifying usages against thresholds.
2025-03-20 11:02:47 +01:00
Ricardo Maraschini
54d0a22ad1 chore: comment the code and simplify some things
this commit comments the low and high utilization plugins. this also
simplifies a little bit where it was possible without affecting too
much.
2025-03-20 10:06:45 +01:00
Ricardo Maraschini
87ba84b2ad feat: refactoring thresholds and usage assessment
this commit refactors the thresholds and usage assessment for the node
utilization plugins. both high and low plugins are affected by this
change.
2025-03-19 17:38:41 +01:00
Jan Chaloupka
04ebdbee32 [nodeutilization]: produce node utilization of resources that are listed in the list of resources 2025-03-19 12:35:19 +01:00
Jan Chaloupka
e283c31030 [nodeutilization]: prometheus usage client with prometheus metrics 2025-03-17 16:25:17 +01:00
Kubernetes Prow Robot
be4abe1727 Merge pull request #1614 from ingvagabund/nodeutilization-metrics-source
[nodeutilization]: allow to set a metrics source as a string so it can be later extended for exclusive configuration
2025-03-14 01:57:47 -07:00
Jan Chaloupka
e14b86eb8c [nodeutilization]: allow to set a metrics source as a string so it can be later extended for exclusive configuration 2025-03-13 18:00:27 +01:00
Kubernetes Prow Robot
a4d6119bcd Merge pull request #1645 from ingvagabund/nodeutilization-refactoring
nodeutilization: make the node classification more generic
2025-03-13 03:21:46 -07:00
Jan Chaloupka
57bb31de78 nodeutilization: make the classification more generic 2025-03-12 15:02:22 +01:00
Jan Chaloupka
b935c7d82c nodeutilization: invoke ValidateLowNodeUtilizationArgs instead of validateLowNodeUtilizationThresholds to make the test more generic 2025-03-11 10:04:39 +01:00
Jan Chaloupka
5bf11813e6 lownodeutilization: evictionLimits to limit the evictions per plugin
In some cases it might be useful to limit how many evictions per
domain can be performed, to avoid burning the whole per-cycle
descheduling budget. Limiting the number of evictions per node is a
prerequisite for evicting pods whose usage can't be easily subtracted
from overall node resource usage to predict the final usage. E.g. when a
pod is evicted due to high PSI pressure, which takes into account many
factors that can't be fully captured by the current predictive resource
model.
2025-03-07 15:31:02 +01:00
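A per-node eviction limit of the kind this commit describes can be sketched as a small counter. The type and method names below (`nodeEvictionLimiter`, `tryEvict`) are hypothetical and do not come from the descheduler code base; the sketch only shows the budgeting idea.

```go
package main

import "fmt"

// nodeEvictionLimiter is a hypothetical helper, not a descheduler type.
// It allows at most `limit` evictions per node within one cycle.
type nodeEvictionLimiter struct {
	limit   uint
	evicted map[string]uint
}

func newNodeEvictionLimiter(limit uint) *nodeEvictionLimiter {
	return &nodeEvictionLimiter{limit: limit, evicted: map[string]uint{}}
}

// tryEvict counts one eviction against the node and reports whether the
// node still had budget left. A refused eviction is not counted.
func (l *nodeEvictionLimiter) tryEvict(node string) bool {
	if l.evicted[node] >= l.limit {
		return false
	}
	l.evicted[node]++
	return true
}

func main() {
	limiter := newNodeEvictionLimiter(2)
	for _, pod := range []string{"a", "b", "c"} {
		fmt.Printf("pod %s on node-1 evicted: %v\n", pod, limiter.tryEvict("node-1"))
	}
	// Each node has its own budget, so node-2 is unaffected by node-1.
	fmt.Println("pod d on node-2 evicted:", limiter.tryEvict("node-2"))
}
```

Keeping the counter per node means one busy node cannot consume the budget meant for the whole descheduling cycle.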
Jan Chaloupka
50dd3b8971 ReferencedResourceList: alias for map[v1.ResourceName]*resource.Quantity to avoid the type definition duplication 2025-03-07 13:13:00 +01:00
Ricardo Maraschini
57a04aae9f chore: add descheduler plugin example
This commit adds a sample plugin implementation as follows:

This directory provides an example plugin for the Kubernetes Descheduler,
demonstrating how to evict pods based on custom criteria. The plugin targets
pods based on:

* **Name Regex:** Pods matching a specified regular expression.
* **Age:** Pods older than a defined duration.
* **Namespace:** Pods within or outside a given list of namespaces (inclusion
  or exclusion).

To incorporate this plugin into your Descheduler build, you must register it
within the Descheduler's plugin registry. Follow these steps:

1.  **Register the Plugin:**
    * Modify the `pkg/descheduler/setupplugins.go` file.
    * Add the following registration line to the end of the
      `RegisterDefaultPlugins()` function:

    ```go
    pluginregistry.Register(
      example.PluginName,
      example.New,
      &example.Example{},
      &example.ExampleArgs{},
      example.ValidateExampleArgs,
      example.SetDefaults_Example,
      registry,
    )
    ```

2.  **Generate Code:**
    * If you modify the plugin's code, execute `make gen` before rebuilding the
      Descheduler. This ensures generated code is up-to-date.

3.  **Rebuild the Descheduler:**
    * Build the descheduler with your changes.

Configure the plugin's behavior using the Descheduler's policy configuration.
Here's an example:

```yaml
apiVersion: descheduler/v1alpha2
kind: DeschedulerPolicy
profiles:
- name: LifecycleAndUtilization
  plugins:
    deschedule:
      enabled:
        - Example
  pluginConfig:
  - name: Example
    args:
      regex: ^descheduler-test.*$
      maxAge: 3m
      namespaces:
        include:
        - default
```

- `regex: ^descheduler-test.*$`: Evicts pods whose names match the regular
  expression `^descheduler-test.*$`.
- `maxAge: 3m`: Evicts pods older than 3 minutes.
- `namespaces.include: - default`: Evicts pods within the default namespace.

This configuration will cause the plugin to evict pods that meet all three
criteria: matching the `regex`, exceeding the `maxAge`, and residing in the
specified namespace.
2025-02-24 18:36:13 +01:00
googs1025
3440abfa41 chore: add ignorePvcPods flag in default evictor filter unit test 2025-01-07 15:27:28 +08:00
Amir Alavi
48aede9fde update license to year 2025
Signed-off-by: Amir Alavi <amiralavi7@gmail.com>
2025-01-02 13:36:59 -05:00
Jan Chaloupka
6567f01e86 [nodeutilization]: actual usage client through kubernetes metrics 2024-11-20 14:30:46 +01:00
Kubernetes Prow Robot
a4c09bf560 Merge pull request #1466 from ingvagabund/eviction-in-background-code
Introduce RequestEviction feature for evicting pods in background (KEP-1397)
2024-11-19 14:54:54 +00:00
Jan Chaloupka
3a1a3ff9d8 Introduce RequestEviction feature for evicting pods in background
When the feature is enabled, each pod with the descheduler.alpha.kubernetes.io/request-evict-only
annotation will have the eviction API error examined for a specific
error code/reason and message. If they match, the eviction of such a pod will be interpreted
as the initiation of an eviction in background.
2024-11-19 15:28:37 +01:00
Jan Chaloupka
d1c64c48cd nodeutilization: separate code responsible for requested resource extraction into a dedicated usage client
Turning a usage client into an interface makes it possible to implement
other kinds of usage clients, like actual usage or prometheus based resource
collection.
2024-11-15 11:23:49 +01:00
Jan Chaloupka
9950b8a55d nodeutilization: usage2KeysAndValues for constructing a key:value list for InfoS printing resource usage 2024-11-14 14:15:26 +01:00
Kubernetes Prow Robot
af8a7445a4 Merge pull request #1544 from ingvagabund/node-utilization-refactoring-II
nodeutilization: evictPodsFromSourceNodes: iterate through existing resources
2024-11-13 22:00:47 +00:00
Kubernetes Prow Robot
5ba11e09c7 Merge pull request #1543 from ingvagabund/node-utilization-refactoring-I
nodeutilization: NodeUtilization: make pod utilization extraction configurable
2024-11-13 21:34:47 +00:00
Jan Chaloupka
67d3d52de8 sortNodesByUsage: drop extended resources as they are already counted in 2024-11-13 21:31:02 +01:00
Jan Chaloupka
e9f43856a9 nodeutilization: iterate through existing resources 2024-11-13 15:31:48 +01:00
Jan Chaloupka
e655a7eb27 nodeutilization: NodeUtilization: make pod utilization extraction configurable 2024-11-13 14:21:32 +01:00
Jan Chaloupka
7eeb07d96a Update nodes sorting function to respect available resources 2024-11-11 16:26:56 +01:00
Simon Scharf
ef0c2c1c47 add ignorePodsWithoutPDB option (#1529)
* add ignoreNonPDBPods option

* take2

* add test

* poddisruptionbudgets are now used by defaultevictor plugin

* add poddisruptionbudgets to rbac

* review comments

* don't use GetPodPodDisruptionBudgets

* review comment, don't hide error
2024-10-15 21:21:04 +01:00
Jan Chaloupka
89bd188a35 hnu: move static code from Balance under plugin constructor 2024-10-11 16:49:23 +02:00
Jan Chaloupka
e3c41d6ea6 lnu: move static code from Balance under plugin constructor 2024-10-11 16:37:53 +02:00
Jan Chaloupka
e0ff750fa7 Move default LNU threshold setting under setDefaultForLNUThresholds 2024-10-11 16:31:37 +02:00
Kubernetes Prow Robot
0f1890e5cd Merge pull request #1480 from ingvagabund/omitempty-for-plugin-args
Plugin args: tag arguments with omitempty to reduce the marshalled json size
2024-09-02 12:00:56 +01:00
Jan Chaloupka
cbade38d23 [tests] de-duplicate framework handle initialization 2024-08-12 17:05:30 +02:00
Jan Chaloupka
cb0c1b660d Plugin args: tag arguments with omitempty to reduce the marshalled json size 2024-08-06 15:20:18 +02:00
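The effect of the `omitempty` tag mentioned in this commit can be seen with Go's standard `encoding/json` package. The two structs below are hypothetical and are not the descheduler's plugin argument types; they differ only in the tag.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// verboseArgs and compactArgs are hypothetical argument structs that
// differ only in whether their JSON tags carry omitempty.
type verboseArgs struct {
	Namespaces []string `json:"namespaces"`
	MaxAge     string   `json:"maxAge"`
}

type compactArgs struct {
	Namespaces []string `json:"namespaces,omitempty"`
	MaxAge     string   `json:"maxAge,omitempty"`
}

// marshal encodes v as JSON and returns it as a string.
func marshal(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(marshal(verboseArgs{MaxAge: "3m"})) // {"namespaces":null,"maxAge":"3m"}
	fmt.Println(marshal(compactArgs{MaxAge: "3m"})) // {"maxAge":"3m"}
}
```

Unset fields disappear from the encoded output, which is what shrinks the marshalled arguments.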
Victor Gonzalez
55a0812ae6 skip eviction when pod creation time is below minPodAge threshold setting (#1475)
* skip eviction when pod creation time is below minPodAge threshold setting

In the default initialization phase of the descheduler, add a new
constraint to not evict pods whose age is below the minPodAge
threshold.

Added value:

- Avoid crazy pod movement when the autoscaler scales up and down.

- Avoid evicting pods when they are warming up.

- Decreases the overall cost of eviction as no pod will be evicted
  before doing a significant amount of work.

- Guard against scheduling/descheduling loops in situations where
  the descheduler has a different node fit logic from the scheduler,
  like not considering topology spread constraints.

* Use *time.Duration instead of uint for MinPodAge type

* Remove '(in minutes)' from default evictor configuration table

* make fmt

* Add explicit name for Duration field

* Use Duration.String()
2024-07-26 05:59:21 -07:00
Adam Malcontenti-Wilson
f23967a88e feat: add init and ephemeral container checks to PodLifeTime 2024-07-17 14:36:35 +10:00
Emin Aktas
f8e128d862 refactor: replace k8s.io/utils/pointer with k8s.io/utils/ptr
Signed-off-by: Emin Aktas <eminaktas34@gmail.com>
2024-07-11 11:36:34 +03:00
zhifei92
e60f525ec6 feat: support MaxNoOfPodsToEvictTotal 2024-07-09 14:00:27 +08:00
Jan Chaloupka
18d0e4a540 PodEvictor: turn an exceeded limit into an error
When checking whether the node limit is exceeded, the pod eviction
never fails. As a result, the metric reporting is skipped when a pod fails
to be evicted due to node limit constraints.

The error also allows plugins to react to other limits getting
exceeded. E.g. the limit on the number of pods evicted per namespace.
2024-07-06 20:14:43 +02:00