mirror of https://github.com/kubernetes-sigs/descheduler.git synced 2026-01-26 05:14:13 +01:00
Commit Graph

642 Commits

Author SHA1 Message Date
Ricardo Maraschini
95a631f6a5 feat: move classifier to its own package
move the classifier to its own package. introduces a generic way of
classifying usages against thresholds.
2025-03-20 11:02:47 +01:00
Ricardo Maraschini
54d0a22ad1 chore: comment the code and simplify some things
this commit comments the low and high utilization plugins. this also
simplifies a little bit where it was possible without affecting too
much.
2025-03-20 10:06:45 +01:00
Ricardo Maraschini
87ba84b2ad feat: refactoring thresholds and usage assessment
this commit refactors the thresholds and usage assessment for the node
utilization plugins. both high and low plugins are affected by this
change.
2025-03-19 17:38:41 +01:00
Jan Chaloupka
04ebdbee32 [nodeutilization]: produce node utilization of resources that are listed in the list of resources 2025-03-19 12:35:19 +01:00
Jan Chaloupka
e283c31030 [nodeutilization]: prometheus usage client with prometheus metrics 2025-03-17 16:25:17 +01:00
Kubernetes Prow Robot
be4abe1727 Merge pull request #1614 from ingvagabund/nodeutilization-metrics-source
[nodeutilization]: allow to set a metrics source as a string so it can be later extended for exclusive configuration
2025-03-14 01:57:47 -07:00
Jan Chaloupka
e14b86eb8c [nodeutilization]: allow to set a metrics source as a string so it can be later extended for exclusive configuration 2025-03-13 18:00:27 +01:00
Kubernetes Prow Robot
a4d6119bcd Merge pull request #1645 from ingvagabund/nodeutilization-refactoring
nodeutilization: make the node classification more generic
2025-03-13 03:21:46 -07:00
Jan Chaloupka
57bb31de78 nodeutilization: make the classification more generic 2025-03-12 15:02:22 +01:00
Jan Chaloupka
b935c7d82c nodeutilization: invoke ValidateLowNodeUtilizationArgs instead of validateLowNodeUtilizationThresholds to make the test more generic 2025-03-11 10:04:39 +01:00
Jan Chaloupka
5bf11813e6 lownodeutilization: evictionLimits to limit the evictions per plugin
In some cases it might be useful to limit how many evictions per
domain can be performed, to avoid burning the whole per-descheduling-cycle
budget. Limiting the number of evictions per node is a
prerequisite for evicting pods whose usage can't be easily subtracted
from overall node resource usage to predict the final usage, e.g. when a
pod is evicted due to high PSI pressure, which takes into account many
factors that can't be fully captured by the current predictive resource
model.
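
The idea of capping evictions per node can be sketched as a small counter. This is only an illustration under stated assumptions: `nodeEvictionLimiter` and its methods are hypothetical names and are not the descheduler's actual types or API.

```go
package main

import "fmt"

// nodeEvictionLimiter is a hypothetical helper that caps the number of
// evictions allowed per node within one descheduling cycle.
type nodeEvictionLimiter struct {
	limit  uint
	counts map[string]uint
}

func newNodeEvictionLimiter(limit uint) *nodeEvictionLimiter {
	return &nodeEvictionLimiter{limit: limit, counts: make(map[string]uint)}
}

// allow returns true and records one eviction if the node still has budget;
// it returns false once the per-node limit has been reached.
func (l *nodeEvictionLimiter) allow(node string) bool {
	if l.counts[node] >= l.limit {
		return false
	}
	l.counts[node]++
	return true
}

func main() {
	limiter := newNodeEvictionLimiter(2)
	for i := 0; i < 3; i++ {
		fmt.Println("node-a:", limiter.allow("node-a"))
	}
	// A different node has its own, untouched budget.
	fmt.Println("node-b:", limiter.allow("node-b"))
}
```

Keeping the counter keyed by node name means one heavily loaded node cannot consume the budget of the others.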
2025-03-07 15:31:02 +01:00
Jan Chaloupka
50dd3b8971 ReferencedResourceList: alias for map[v1.ResourceName]*resource.Quantity to avoid the type definition duplication 2025-03-07 13:13:00 +01:00
googs1025
655ab516c7 chore: add error handle for setDefaultEvictor func 2025-02-26 22:34:57 +08:00
googs1025
0d5301ead2 chore: add continue for GetNodeWeightGivenPodPreferredAffinity func 2025-02-25 09:26:02 +08:00
Ricardo Maraschini
57a04aae9f chore: add descheduler plugin example
This commit adds a sample plugin implementation as follows:

This directory provides an example plugin for the Kubernetes Descheduler,
demonstrating how to evict pods based on custom criteria. The plugin targets
pods based on:

* **Name Regex:** Pods matching a specified regular expression.
* **Age:** Pods older than a defined duration.
* **Namespace:** Pods within or outside a given list of namespaces (inclusion
  or exclusion).

To incorporate this plugin into your Descheduler build, you must register it
within the Descheduler's plugin registry. Follow these steps:

1.  **Register the Plugin:**
    * Modify the `pkg/descheduler/setupplugins.go` file.
    * Add the following registration line to the end of the
      `RegisterDefaultPlugins()` function:

    ```go
    pluginregistry.Register(
      example.PluginName,
      example.New,
      &example.Example{},
      &example.ExampleArgs{},
      example.ValidateExampleArgs,
      example.SetDefaults_Example,
      registry,
    )
    ```

2.  **Generate Code:**
    * If you modify the plugin's code, execute `make gen` before rebuilding the
      Descheduler. This ensures generated code is up-to-date.

3.  **Rebuild the Descheduler:**
    * Build the descheduler with your changes.

Configure the plugin's behavior using the Descheduler's policy configuration.
Here's an example:

```yaml
apiVersion: descheduler/v1alpha2
kind: DeschedulerPolicy
profiles:
- name: LifecycleAndUtilization
  plugins:
    deschedule:
      enabled:
        - Example
  pluginConfig:
  - name: Example
    args:
      regex: ^descheduler-test.*$
      maxAge: 3m
      namespaces:
        include:
        - default
```

- `regex: ^descheduler-test.*$`: Evicts pods whose names match the regular
  expression `^descheduler-test.*$`.
- `maxAge: 3m`: Evicts pods older than 3 minutes.
- `namespaces.include: - default`: Evicts pods within the default namespace.

This configuration will cause the plugin to evict pods that meet all three
criteria: matching the `regex`, exceeding the `maxAge`, and residing in the
specified namespace.
2025-02-24 18:36:13 +01:00
Jandai
88af72b907 PodMatchNodeSelector: Replaced PodMatchNodeSelector implementation with k8s.io/component-helpers to reduce code size and optimize 2025-02-16 16:48:21 +08:00
Luke Carrier
d5b609b34a tracing: bump otel semconv to 1.26
Fix:

    E0122 20:09:35.824338  267288 tracing.go:130] "failed to create traceable resource" err=<
            1 errors occurred detecting resource:
                    * conflicting Schema URL: https://opentelemetry.io/schemas/1.24.0 and https://opentelemetry.io/schemas/1.26.0
     >
    E0122 20:09:35.824366  267288 server.go:108] "failed to create tracer provider" err=<
            1 errors occurred detecting resource:
                    * conflicting Schema URL: https://opentelemetry.io/schemas/1.24.0 and https://opentelemetry.io/schemas/1.26.0
     >
2025-01-22 21:14:04 +00:00
Luke Carrier
9c6604fc51 tracing: test for semconv/SDK version conflicts 2025-01-22 21:12:09 +00:00
Kubernetes Prow Robot
335c698b38 Merge pull request #1538 from googs1025/feature/grace_period_seconds
feature(descheduler): add grace_period_seconds for DeschedulerPolicy
2025-01-07 10:42:30 +01:00
googs1025
3440abfa41 chore: add ignorePvcPods flag in default evictor filter unit test 2025-01-07 15:27:28 +08:00
googs1025
e6d0caa1bc feature(descheduler): add grace_period_seconds for DeschedulerPolicy 2025-01-07 10:16:47 +08:00
googs1025
03246d6843 chore: update README.md for DeschedulerPolicy 2025-01-06 20:08:35 +08:00
googs1025
bbffb830b9 feature(eviction): add event when EvictPod failed 2024-12-07 19:38:20 +08:00
Jan Chaloupka
6567f01e86 [nodeutilization]: actual usage client through kubernetes metrics 2024-11-20 14:30:46 +01:00
Kubernetes Prow Robot
a4c09bf560 Merge pull request #1466 from ingvagabund/eviction-in-background-code
Introduce RequestEviction feature for evicting pods in background (KEP-1397)
2024-11-19 14:54:54 +00:00
Jan Chaloupka
3a1a3ff9d8 Introduce RequestEviction feature for evicting pods in background
When the feature is enabled, each pod with the descheduler.alpha.kubernetes.io/request-evict-only
annotation will have the eviction API error examined for a specific
error code/reason and message. If matched, eviction of such a pod will be interpreted
as initiation of an eviction in background.
2024-11-19 15:28:37 +01:00
Jan Chaloupka
d1c64c48cd nodeutilization: separate code responsible for requested resource extraction into a dedicated usage client
Turning a usage client into an interface allows other kinds of usage
clients to be implemented, such as actual usage or Prometheus-based
resource collection.
2024-11-15 11:23:49 +01:00
Jan Chaloupka
9950b8a55d nodeutilization: usage2KeysAndValues for constructing a key:value list for InfoS printing resource usage 2024-11-14 14:15:26 +01:00
Jan Chaloupka
f115e780d8 Define EvictionsInBackground feature gate 2024-11-14 13:29:59 +01:00
Kubernetes Prow Robot
af8a7445a4 Merge pull request #1544 from ingvagabund/node-utilization-refactoring-II
nodeutilization: evictPodsFromSourceNodes: iterate through existing resources
2024-11-13 22:00:47 +00:00
Kubernetes Prow Robot
5ba11e09c7 Merge pull request #1543 from ingvagabund/node-utilization-refactoring-I
nodeutilization: NodeUtilization: make pod utilization extraction configurable
2024-11-13 21:34:47 +00:00
Jan Chaloupka
67d3d52de8 sortNodesByUsage: drop extended resources as they are already counted in 2024-11-13 21:31:02 +01:00
Jan Chaloupka
e9f43856a9 nodeutilization: iterate through existing resources 2024-11-13 15:31:48 +01:00
Jan Chaloupka
e655a7eb27 nodeutilization: NodeUtilization: make pod utilization extraction configurable 2024-11-13 14:21:32 +01:00
Jan Chaloupka
7eeb07d96a Update nodes sorting function to respect available resources 2024-11-11 16:26:56 +01:00
Simon Scharf
ef0c2c1c47 add ignorePodsWithoutPDB option (#1529)
* add ignoreNonPDBPods option

* take2

* add test

* poddisruptionbudgets are now used by defaultevictor plugin

* add poddisruptionbudgets to rbac

* review comments

* don't use GetPodPodDisruptionBudgets

* review comment, don't hide error
2024-10-15 21:21:04 +01:00
Jan Chaloupka
89bd188a35 hnu: move static code from Balance under plugin constructor 2024-10-11 16:49:23 +02:00
Jan Chaloupka
e3c41d6ea6 lnu: move static code from Balance under plugin constructor 2024-10-11 16:37:53 +02:00
Jan Chaloupka
e0ff750fa7 Move default LNU threshold setting under setDefaultForLNUThresholds 2024-10-11 16:31:37 +02:00
Simon Scharf
22d9230a67 Make sure dry runs sees all the resources a normal run would do (#1526)
* generic resource handling, so that dry run has all the expected resource types and objects

* simpler code and better names

* fix imports
2024-10-04 12:20:28 +01:00
Kubernetes Prow Robot
0f1890e5cd Merge pull request #1480 from ingvagabund/omitempty-for-plugin-args
Plugin args: tag arguments with omitempty to reduce the marshalled json size
2024-09-02 12:00:56 +01:00
Jan Chaloupka
29c0a90998 TestPodEvictorReset: replace duplicates strategy with node taints to simplify the testing 2024-08-14 11:00:20 +02:00
Kubernetes Prow Robot
640b675e86 Merge pull request #1484 from ingvagabund/test-descheduling-limits
[unit test]: test descheduling limits
2024-08-14 01:53:04 -07:00
Kubernetes Prow Robot
c0c26e762b Merge pull request #1483 from ingvagabund/dedup-framework-init
tests: de-duplicate framework handle initialization
2024-08-14 01:20:26 -07:00
Jan Chaloupka
91e5e06b5f [unit test]: test descheduling limits 2024-08-14 10:15:58 +02:00
Jan Chaloupka
cbade38d23 [tests] de-duplicate framework handle initialization 2024-08-12 17:05:30 +02:00
Jan Chaloupka
1e0b1a9840 Remove descheduler/v1alpha1 type 2024-08-09 09:49:59 +02:00
Jan Chaloupka
cb0c1b660d Plugin args: tag arguments with omitempty to reduce the marshalled json size 2024-08-06 15:20:18 +02:00
Amir Alavi
a3146a1705 fix: minor version parsing in version compatibility check
Signed-off-by: Amir Alavi <amiralavi7@gmail.com>
2024-07-28 11:44:12 -04:00
Victor Gonzalez
55a0812ae6 skip eviction when pod creation time is below minPodAge threshold setting (#1475)
* skip eviction when pod creation time is below minPodAge threshold setting

In the default initialization phase of the descheduler, add a new
constraint to not evict pods whose creation time is below the minPodAge
threshold.

Added value:

- Avoid crazy pod movement when the autoscaler scales up and down.

- Avoid evicting pods when they are warming up.

- Decreases the overall cost of eviction as no pod will be evicted
  before doing a significant amount of work.

- Guard against scheduling/descheduling loops in situations where
  the descheduler has a different node fit logic from the scheduler,
  like not considering topology spread constraints.

* Use *time.Duration instead of uint for MinPodAge type

* Remove '(in minutes)' from default evictor configuration table

* make fmt

* Add explicit name for Duration field

* Use Duration.String()
2024-07-26 05:59:21 -07:00