1
0
mirror of https://github.com/kubernetes-sigs/descheduler.git synced 2026-01-26 13:29:11 +01:00
Commit Graph

661 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
9f918371a2 Merge pull request #1694 from ingvagabund/plugins-new-context
Extend plugin's New with a context.Context
2025-05-19 05:31:14 -07:00
Jan Chaloupka
1974c12e0f Extend plugin's New with a context.Context
The new context.Context can be later used for passing a contextualized
logger. Or, other initialization steps that require the context.
2025-05-19 12:23:44 +02:00
googs1025
b3aeca73db refactor: separate eviction constraints to constraints.go
Signed-off-by: googs1025 <googs1025@gmail.com>
2025-05-19 18:02:41 +08:00
Kubernetes Prow Robot
d34848086c Merge pull request #1691 from googs1025/fix/listall
fix(example): list only active pod
2025-05-12 10:55:14 -07:00
Kubernetes Prow Robot
9aa6d79c21 Merge pull request #1688 from ingvagabund/plugins-taints-do-not-list-all-pods
RemovePodsViolatingNodeTaints: list only pods that are not failed/suceeded
2025-05-12 06:55:18 -07:00
googs1025
7a76d9f0d3 fix(RemovePodsViolatingNodeTaints): list only active pod
Signed-off-by: googs1025 <googs1025@gmail.com>
2025-05-12 21:45:57 +08:00
Kubernetes Prow Robot
71746262b1 Merge pull request #1684 from googs1025/refactor_topology
chore: move namespaces filtering logic to New()
2025-05-11 12:29:14 -07:00
Jan Chaloupka
6691720da5 RemovePodsViolatingNodeTaints: list only pods that are not failed/suceeded
Listing pods was incorrectly changed to listing all pods during code
refactoring.
2025-05-10 21:12:06 +02:00
googs1025
0a691debfb feature: sort pods by restarts count in RemovePodsHavingTooManyRestarts plugin 2025-05-09 13:15:38 +08:00
googs1025
fbc875fac1 chore: move namespaces filtering logic to New() 2025-05-07 19:47:30 +08:00
googs1025
957c5bc8e0 optimize: NodeFit function by reordering checks for performance 2025-05-05 21:09:14 +08:00
Ricardo Maraschini
35a7178df6 feat: introduce strict eviction policy
with strict eviction policy the descheduler only evict pods if the pod
contains a request for the given threshold. for example, if using a
threshold for an extended resource called `example.com/gpu` only pods
who request such a resource will be evicted.
2025-04-09 16:48:32 +02:00
Kubernetes Prow Robot
17b90969cf Merge pull request #1629 from googs1025/fix_sortDomains
fix removepodsviolatingtopologyspreadconstraint plugin sort logic
2025-03-28 09:06:44 -07:00
googs1025
b4b203cc60 chore: add unit test for sortDomains func 2025-03-28 20:58:54 +08:00
googs1025
59d1d5d1b9 making isEvictable and hasSelectorOrAffinity invoked only once
Signed-off-by: googs1025 <googs1025@gmail.com>
2025-03-28 20:57:02 +08:00
Ricardo Maraschini
98e6ed6587 chore: log average and assessed thresholds
when calculating the average an applying the deviations it would be nice
to also see the assessed values.

this commit makes the descheduler logs these values when using level 3.
2025-03-28 10:16:20 +01:00
googs1025
4548723dea fix(removepodsviolatingtopologyspreadconstraint): fix removepodsviolatingtopologyspreadconstraint plugin sort logic
Signed-off-by: googs1025 <googs1025@gmail.com>
2025-03-27 23:49:45 +08:00
Kubernetes Prow Robot
9b9ae9a3be Merge pull request #1650 from ingvagabund/nodeutilization-deviation-skip-nodes-without-extended-resource
[nodeutilization]: skip nodes without extended resource when computing the average utilization
2025-03-20 05:58:31 -07:00
Jan Chaloupka
c22d773200 [nodeutilization] test nodes without extended resource when computing the average utilization 2025-03-20 13:22:28 +01:00
Ricardo Maraschini
95a631f6a5 feat: move classifier to its own package
move the classifier to its own package. introduces a generic way of
classifying usages against thresholds.
2025-03-20 11:02:47 +01:00
Ricardo Maraschini
54d0a22ad1 chore: comment the code and simplify some things
this commit comments the low and high utilization plugins. this also
simplifies a little bit where it was possible without affecting too
much.
2025-03-20 10:06:45 +01:00
Ricardo Maraschini
87ba84b2ad feat: refactoring thresholds and usage assessment
this commit refactors the thresholds and usage assessment for the node
utilization plugins. both high and low plugins are affected by this
change.
2025-03-19 17:38:41 +01:00
Jan Chaloupka
04ebdbee32 [nodeutilization]: produce node utilization of resources that are listed in the list of resources 2025-03-19 12:35:19 +01:00
Jan Chaloupka
e283c31030 [nodeutilization]: prometheus usage client with prometheus metrics 2025-03-17 16:25:17 +01:00
Kubernetes Prow Robot
be4abe1727 Merge pull request #1614 from ingvagabund/nodeutilization-metrics-source
[nodeutilization]: allow to set a metrics source as a string so it can be later extended for exclusive configuration
2025-03-14 01:57:47 -07:00
Jan Chaloupka
e14b86eb8c [nodeutilization]: allow to set a metrics source as a string so it can be later extended for exclusive configuration 2025-03-13 18:00:27 +01:00
Kubernetes Prow Robot
a4d6119bcd Merge pull request #1645 from ingvagabund/nodeutilization-refactoring
nodeutilization: make the node classification more generic
2025-03-13 03:21:46 -07:00
Jan Chaloupka
57bb31de78 nodeutilization: make the classification more generic 2025-03-12 15:02:22 +01:00
Jan Chaloupka
b935c7d82c nodeutilization: invoke ValidateLowNodeUtilizationArgs instead of validateLowNodeUtilizationThresholds to make the test more generic 2025-03-11 10:04:39 +01:00
Jan Chaloupka
5bf11813e6 lownodeutilization: evictionLimits to limit the evictions per plugin
In some cases it might be usefull to limit how many evictions per a
domain can be performed. To avoid burning the whole per descheduling
cycle budget. Limiting the number of evictions per node is a
prerequisite for evicting pods whose usage can't be easily subtracted
from overall node resource usage to predict the final usage. E.g. when a
pod is evicted due to high PSI pressure which takes into account many
factors which can be fully captured by the current predictive resource
model.
2025-03-07 15:31:02 +01:00
Jan Chaloupka
50dd3b8971 ReferencedResourceList: alias for map[v1.ResourceName]*resource.Quantity to avoid the type definition duplication 2025-03-07 13:13:00 +01:00
googs1025
655ab516c7 chore: add error handle for setDefaultEvictor func 2025-02-26 22:34:57 +08:00
googs1025
0d5301ead2 chore: add continue for GetNodeWeightGivenPodPreferredAffinity func 2025-02-25 09:26:02 +08:00
Ricardo Maraschini
57a04aae9f chore: add descheduler plugin example
This commit adds a sample plugin implementation as follow:

This directory provides an example plugin for the Kubernetes Descheduler,
demonstrating how to evict pods based on custom criteria. The plugin targets
pods based on:

* **Name Regex:** Pods matching a specified regular expression.
* **Age:** Pods older than a defined duration.
* **Namespace:** Pods within or outside a given list of namespaces (inclusion
  or exclusion).

To incorporate this plugin into your Descheduler build, you must register it
within the Descheduler's plugin registry. Follow these steps:

1.  **Register the Plugin:**
    * Modify the `pkg/descheduler/setupplugins.go` file.
    * Add the following registration line to the end of the
      `RegisterDefaultPlugins()` function:

    ```go
    pluginregistry.Register(
      example.PluginName,
      example.New,
      &example.Example{},
      &example.ExampleArgs{},
      example.ValidateExampleArgs,
      example.SetDefaults_Example,
      registry,
    )
    ```

2.  **Generate Code:**
    * If you modify the plugin's code, execute `make gen` before rebuilding the
      Descheduler. This ensures generated code is up-to-date.

3.  **Rebuild the Descheduler:**
    * Build the descheduler with your changes.

Configure the plugin's behavior using the Descheduler's policy configuration.
Here's an example:

```yaml
apiVersion: descheduler/v1alpha2
kind: DeschedulerPolicy
profiles:
- name: LifecycleAndUtilization
  plugins:
    deschedule:
      enabled:
        - Example
  pluginConfig:
  - name: Example
    args:
      regex: ^descheduler-test.*$
      maxAge: 3m
      namespaces:
        include:
        - default
```

- `regex: ^descheduler-test.*$`: Evicts pods whose names match the regular
  expression `^descheduler-test.*$`.
- `maxAge: 3m`: Evicts pods older than 3 minutes.
- `namespaces.include: - default`: Evicts pods within the default namespace.

This configuration will cause the plugin to evict pods that meet all three
criteria: matching the `regex`, exceeding the `maxAge`, and residing in the
specified namespace.
2025-02-24 18:36:13 +01:00
Jandai
88af72b907 PodMatchNodeSelector: Replaced PodMatchNodeSelector implementation with k8s.io/component-helpers to reduce code size and optimize 2025-02-16 16:48:21 +08:00
Luke Carrier
d5b609b34a tracing: bump otel semconv to 1.26
Fix:

    E0122 20:09:35.824338  267288 tracing.go:130] "failed to create traceable resource" err=<
            1 errors occurred detecting resource:
                    * conflicting Schema URL: https://opentelemetry.io/schemas/1.24.0 and https://opentelemetry.io/schemas/1.26.0
     >
    E0122 20:09:35.824366  267288 server.go:108] "failed to create tracer provider" err=<
            1 errors occurred detecting resource:
                    * conflicting Schema URL: https://opentelemetry.io/schemas/1.24.0 and https://opentelemetry.io/schemas/1.26.0
     >
2025-01-22 21:14:04 +00:00
Luke Carrier
9c6604fc51 tracing: test for semconv/SDK version conflicts 2025-01-22 21:12:09 +00:00
Kubernetes Prow Robot
335c698b38 Merge pull request #1538 from googs1025/feature/grace_period_seconds
feature(descheduler): add grace_period_seconds for DeschedulerPolicy
2025-01-07 10:42:30 +01:00
googs1025
3440abfa41 chore: add ignorePvcPods flag in default evictor filter unit test 2025-01-07 15:27:28 +08:00
googs1025
e6d0caa1bc feature(descheduler): add grace_period_seconds for DeschedulerPolicy 2025-01-07 10:16:47 +08:00
googs1025
03246d6843 chore: update README.md for DeschedulerPolicy 2025-01-06 20:08:35 +08:00
googs1025
bbffb830b9 feature(eviction): add event when EvictPod failed 2024-12-07 19:38:20 +08:00
Jan Chaloupka
6567f01e86 [nodeutilization]: actual usage client through kubernetes metrics 2024-11-20 14:30:46 +01:00
Kubernetes Prow Robot
a4c09bf560 Merge pull request #1466 from ingvagabund/eviction-in-background-code
Introduce RequestEviction feature for evicting pods in background (KEP-1397)
2024-11-19 14:54:54 +00:00
Jan Chaloupka
3a1a3ff9d8 Introduce RequestEviction feature for evicting pods in background
When the feature is enabled each pod with descheduler.alpha.kubernetes.io/request-evict-only
annotation will have the eviction API error examined for a specific
error code/reason and message. If matched eviction of such a pod will be interpreted
as initiation of an eviction in background.
2024-11-19 15:28:37 +01:00
Jan Chaloupka
d1c64c48cd nodeutilization: separate code responsible for requested resource extraction into a dedicated usage client
Turning a usage client into an interface allows to implement other kinds
of usage clients like actual usage or prometheus based resource
collection.
2024-11-15 11:23:49 +01:00
Jan Chaloupka
9950b8a55d nodeutilization: usage2KeysAndValues for constructing a key:value list for InfoS printing resource usage 2024-11-14 14:15:26 +01:00
Jan Chaloupka
f115e780d8 Define EvictionsInBackground feature gate 2024-11-14 13:29:59 +01:00
Kubernetes Prow Robot
af8a7445a4 Merge pull request #1544 from ingvagabund/node-utilization-refactoring-II
nodeutilization: evictPodsFromSourceNodes: iterate through existing resources
2024-11-13 22:00:47 +00:00
Kubernetes Prow Robot
5ba11e09c7 Merge pull request #1543 from ingvagabund/node-utilization-refactoring-I
nodeutilization: NodeUtilization: make pod utilization extraction configurable
2024-11-13 21:34:47 +00:00