Releases: koordinator-sh/koordinator

v0.6.1

05 Aug 10:11

Changelog

  • 54ed9a5 Add pod uid to pod meta when failover (#344)
  • 1328009 Use the structure as the key of the map instead of string. (#349)
  • f81c89c [koord-runtime-proxy]: fix panic when no hook registered (#355)
  • 42d695f add PodMigrationJob CRD proposal (#358)
  • d1fb8c5 add descheduler framework proposal (#371)
  • 7d46fad add fine-grained device scheduling proposal (#322)
  • 82dc2ac add koord-descheduler (#425)
  • 37a3aec add logs for proxy server (#329)
  • 05a8c11 add pod annotations and labels to container request and cache (#362)
  • 827bd6b add reservation plugin (#353)
  • 78a4ebb add schedule gang md (#333)
  • 993fc21 add scheduling framework extender (#365)
  • 1cf37d0 add xiaohongshu as koordinator adopter (#424)
  • c9cf1a4 api: add PodMigrationJob API (#375)
  • 91cacc4 api: add device crd in scheduling group (#376)
  • dab5a92 api: add device info into NodeMetric CRD (#378)
  • 47e7189 api: update PodMigrationJob and Reservation CRD (#399)
  • 74de8bd api: update reservation api (#384)
  • bb3065a apis: add Gang api definition (#409)
  • 0faf65e bugfix: always need to reset cpuset when cpu supress (#403)
  • f0daee1 bugfix: avoid pod terminating in docker (#445)
  • 1c44a0a bugfix: skip when pod sandbox not found (#444)
  • fbf4d97 change qos func name for old format adaption reason (#418)
  • 5b1ce9d clear cpuset of BE container to avoid conflict with kubelet static policy, using the value of besteffort dir (#412)
  • 6e0d88f cri-runtime-proxy: fix containerErr error when failOver pods and containers (#414)
  • 6918290 feat(deps): bump github.com/stretchr/testify from 1.7.5 to 1.8.0 (#326)
  • 3fce836 feat(deps): bump google.golang.org/protobuf from 1.28.0 to 1.28.1 (#419)
  • d763879 feat(deps): bump gorm.io/driver/sqlite from 1.3.4 to 1.3.6 (#347)
  • f32a0ba feat(deps): bump gorm.io/gorm from 1.23.6 to 1.23.8 (#351)
  • bed2191 feat(deps): bump sigs.k8s.io/yaml from 1.2.0 to 1.3.0 (#427)
  • 5b320c0 feat: add gpu metrics to crd (#397)
  • 4301cc9 feat: collect gpu metrics (#361)
  • 488f8d5 feature: report pod alloc of Guaranteed pod and cpu manager policy (#386)
  • b54bb0c fix auditor test in MacOS (#379)
  • 5bcb7a7 fix koord-descheduler initialize profile error (#432)
  • ecead7c fix reservation on mutil-scheduler (#431)
  • 9e8fc01 fix reservation on pod patch failed (#428)
  • b2fcc22 fix the loss of new updated resources from UpdateContainerResources request (#363)
  • 0523d60 fix: consider lse/lsr when cpu suppress (#234) (#372)
  • bf308ed fix: remove inline tag for corev1.ResourceList to fix #390 (#391)
  • 6ac04d4 improve koordlet log verbosity (#338)
  • a89cd98 koord-descheduler: implement PodMigrationJob controller (#404)
  • 78afa0a koord-descheduler: implement descheduling configuration (#422)
  • 49fa42c koord-descheduler: implement descheduling framework (#423)
  • 3ed131c koord-descheduler: release Reservation when PodMigrationJob completes or is deleted (#438)
  • 9eb7b7d koord-scheduler: compatible with Pods using kubelet static CPU manager policy (#433)
  • c9ad604 koord-scheduler: improve reservation validation (#442)
  • b78243b koord-scheduler: support CPU exclusive policy (#359)
  • 8179245 koord-scheduler: support Node CPU orchestration API (#360)
  • 1ab5c99 koord-scheduler: support default preferredCPUBindPolicy for LSE/LSR Pod if not specified (#354)
  • 1e77f1f koord-scheduler: support kubelet cpu manager policy (#434)
  • 171ad3e koordlet: define GPU metric struct (#343)
  • 7442bc5 koordlet: fix build error on macOS caused by GPU (#413)
  • 779ac80 koordlet: introduce Accelerators feature gate for GPU related features (#393)
  • 91d2a4b koordlet: optimize auditor UT with httptest.Server (#382)
  • 283c883 koordlet: refine initJiffies with default value (#367)
  • 7510a3a make slo configmap name configurable (#415)
  • b8dd567 rename resourceQoS to resourceQOS (#339)
  • 0d9d9d4 style: unify the command parameter style of koordlet (#348)
  • d0194b2 turn on pleg (#394)

v0.6.0

04 Aug 11:02

What's Changed

  • add logs for proxy server by @zwzhang0107 in #329
  • chore: remove useless feature-gates by @saintube in #336
  • ci: enable CGO when GoReleaser compiles binaries by @jasonliu747 in #334
  • rename resourceQoS to resourceQOS by @zwzhang0107 in #339
  • improve koordlet log verbosity by @saintube in #338
  • Add pod uid to pod meta when failover by @cheimu in #344
  • cleanup: Use the structure as the key of the map instead of string by @novahe in #349
  • koordlet: define GPU metric struct by @jasonliu747 in #343
  • koord-scheduler: support default preferredCPUBindPolicy for LSE/LSR P… by @eahydra in #354
  • style: unify the command parameter style of koordlet by @jasonliu747 in #348
  • add fine-grained device scheduling proposal by @buptcozy in #322
  • [koord-runtime-proxy]: fix panic when no hook registered by @cheimu in #355
  • koord-scheduler: support CPU exclusive policy by @eahydra in #359
  • [koord-runtime-proxy] Add pod annotations and labels to container request and cache by @cheimu in #362
  • [koord-runtime-proxy] fix the loss of new updated resources from UpdateContainerResources request by @cheimu in #363
  • add scheduling framework extender by @saintube in #365
  • koordlet: refine initJiffies with default value by @jasonliu747 in #367
  • add PodMigrationJob CRD proposal by @eahydra in #358
  • add proposal for gang scheduling by @buptcozy in #333
  • Support node cpu orchestration api by @eahydra in #360
  • chore: update dockerfile for each module by @jasonliu747 in #364
  • feat(deps): bump github.com/stretchr/testify from 1.7.5 to 1.8.0 by @dependabot in #326
  • feat(deps): bump gorm.io/driver/sqlite from 1.3.4 to 1.3.6 by @dependabot in #347
  • chore: supply UT for pkg/util and pkg/util/system by @ZiMengSheng in #374
  • api: add PodMigrationJob API by @eahydra in #375
  • docs: remove redundant field in Device CRD by @jasonliu747 in #377
  • api: add device CRD in scheduling group by @jasonliu747 in #376
  • fix auditor test in MacOS by @hormes in #379
  • koordlet: optimize auditor UT with httptest.Server by @ZiMengSheng in #382
  • docs: add chinese version readme.md by @ZiMengSheng in #380
  • fix: consider lse/lsr when cpu suppress (#234) by @ZYecho in #372
  • api: add device info into NodeMetric CRD by @jasonliu747 in #378
  • koordlet: support collecting GPU metrics from node/pod/container by @LambdaHJ in #361
  • chore: cleanup resmanager by @saintube in #383
  • api: update reservation api by @saintube in #384
  • add descheduler framework proposal by @eahydra in #371
  • feat(deps): bump gorm.io/gorm from 1.23.6 to 1.23.8 by @dependabot in #351
  • fix: remove inline tag for corev1.ResourceList to fix #390 by @jasonliu747 in #391
  • koordlet: Turn on pleg by @cheimu in #394
  • feat: update GPU metrics in NodeMetric CRD by @LambdaHJ in #397
  • bugfix: always need to reset cpuset when cpu supress by @ZYecho in #403
  • feature: report pod alloc of Guaranteed pod and cpu manager policy by @ZYecho in #386
  • api: update PodMigrationJob and Reservation CRD by @eahydra in #399
  • koordlet: introduce Accelerators feature gate for GPU related features by @jasonliu747 in #393
  • koordlet: fix build error caused by GPU by @eahydra in #413
  • cri-runtime-proxy: fix containerErr error when failOver pods and cont… by @lx1036 in #414
  • make slo configmap name configurable by @zwzhang0107 in #415
  • clear cpuset of BE container to avoid conflict with kubelet static po… by @zwzhang0107 in #412
  • change qos func name for old format adaption reason by @zwzhang0107 in #418
  • docs: add ADOPTERS.md of Koordinator by @jasonliu747 in #392
  • koord-descheduler: implement descheduling configuration by @eahydra in #422
  • chore: execute staticcheck instead of github action by running golang… by @eahydra in #421
  • koord-scheduler: add reservation plugin by @saintube in #353
  • koord-descheduler: implement descheduling framework by @eahydra in #423
  • [adopter] add xiaohongshu as koordinator adopter by @cheimu in #424
  • add koord-descheduler by @eahydra in #425
  • fix reservation on pod patch failed by @saintube in #428
  • koord-descheduler: implement PodMigrationJob controller by @eahydra in #404
  • fix reservation on mutil-scheduler by @saintube in #431
  • fix koord-descheduler initialize profile error by @eahydra in #432
  • api: add Gang api by @Wenshiqi222 in #409
  • koord-scheduler: compatible with Pods using kubelet static CPU manager policy by @eahydra in #433
  • koord-scheduler: support kubelet cpu manager policy by @eahydra in #434
  • docs: add maturity level in adopters.md by @jasonliu747 in #426
  • feat(deps): bump google.golang.org/protobuf from 1.28.0 to 1.28.1 by @dependabot in #419
  • feat(deps): bump sigs.k8s.io/yaml from 1.2.0 to 1.3.0 by @dependabot in #427
  • koord-descheduler: release Reservation when PodMigrationJob completes or is deleted by @eahydra in #438
  • koord-scheduler: improve reservation validation by @saintube in #442

New Contributors

Full Changelog: v0.5.0...v0.6.0

v0.5.0

30 Jun 03:48
1fa6ec8

Changelog

New Contributors

Full Changelog: v0.4.1...v0.5.0

v0.4.1

17 Jun 10:08
e6d2498

What's Changed

New Contributors

Full Changelog: v0.4.0...v0.4.1

v0.4.0

31 May 12:57
49850ce

✨ Features and improvements:

🐛 Fixed bugs:

⏫ Merged pull requests:

🎉 New Contributors:

Full Changelog: v0.3.1...v0.4.0

v0.3.1

09 May 08:21
54d52c7

🐛 Fixed bugs:

  • Activate rdt res_ctrl in resmanager by @cheimu in #127

🎉 New Contributors

Full Changelog: v0.3.0...v0.3.1

v0.3.0

07 May 07:10
5b00789

Full Changelog

✨ Features and improvements:

  • Support CPU burst strategy #52
  • Support Memory QoS strategy #55
  • Support LLC and MBA isolation strategy #56
  • Protocol design between runtime-manager and hook server #62
  • Improve overall code coverage from 39% to 56% #69

🐛 Fixed bugs:

  • when deploy on ACK 1.18.1 koord-manager pod always crash #49
  • Handle unexpected CPU info in case of koordlet panic #90

⏫ Merged pull requests:

🎉 New Contributors

v0.2.0

20 Apr 10:09

Isolate resources for best-effort workloads

In Koordinator v0.2.0, we refined the ability to isolate resources for best-effort workloads.

koordlet sets the cgroup parameters according to the resources described in the Pod Spec. It currently supports setting the CPU Request/Limit and the Memory Limit.

For CPU resources, only the request == limit case is supported; request <= limit will be supported in the next version.
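
The cgroup translation follows the standard kubelet/CFS convention. A minimal sketch of that conversion, assuming the kubelet formulas (the function names are hypothetical, not koordlet's actual API):

```go
package main

import "fmt"

// Standard kubelet/CFS constants.
const (
	sharesPerCPU = 1024   // cpu.shares granted per full core of request
	cfsPeriodUs  = 100000 // default cfs_period_us
	minShares    = 2      // kernel minimum for cpu.shares
)

// milliCPUToShares converts a CPU request (millicores) to a cpu.shares value.
func milliCPUToShares(milliCPU int64) int64 {
	if milliCPU == 0 {
		return minShares
	}
	shares := milliCPU * sharesPerCPU / 1000
	if shares < minShares {
		return minShares
	}
	return shares
}

// milliCPUToQuota converts a CPU limit (millicores) to a cfs_quota_us value.
func milliCPUToQuota(milliCPU int64) int64 {
	if milliCPU == 0 {
		return -1 // no limit
	}
	return milliCPU * cfsPeriodUs / 1000
}

func main() {
	// A best-effort pod with request == limit == 500m.
	fmt.Println(milliCPUToShares(500)) // 512
	fmt.Println(milliCPUToQuota(500))  // 50000
}
```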

Active eviction mechanism based on memory safety thresholds

When latency-sensitive applications are serving, memory usage may increase due to bursty traffic. Similar scenarios also exist for best-effort workloads, for example when the current computing load exceeds the expected resource Request/Limit.

These scenarios increase the overall memory usage of the node, which has an unpredictable impact on node-side runtime stability: for example, it can degrade the quality of service of latency-sensitive applications or even make them unavailable. This is especially challenging in a co-location environment.

We implemented an active eviction mechanism based on memory safety thresholds in Koordinator.

koordlet regularly checks the recent memory usage of the node and its Pods against the safety threshold. If the threshold is exceeded, it evicts some best-effort Pods to release memory. This mechanism better ensures the stability of the node and its latency-sensitive applications.

koordlet currently only evicts best-effort Pods, sorted according to the Priority specified in the Pod Spec: the lower the priority, the earlier a Pod is evicted; Pods with the same priority are sorted by memory usage (RSS), and the higher the usage, the earlier the eviction. This selection algorithm is not static: more dimensions will be considered in the future, and more refined implementations will cover more scenarios to achieve more reasonable evictions.
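
The eviction ordering described above can be sketched as a sort comparator. The types and names here are hypothetical, for illustration only; koordlet's real implementation differs.

```go
package main

import (
	"fmt"
	"sort"
)

// podInfo is an illustrative stand-in for the data koordlet considers.
type podInfo struct {
	Name     string
	Priority int32  // Priority from the Pod Spec
	RSS      uint64 // recent memory usage in bytes
}

// sortForEviction orders best-effort pods so the first entry is evicted first:
// lower priority first; within equal priority, higher RSS first.
func sortForEviction(pods []podInfo) {
	sort.SliceStable(pods, func(i, j int) bool {
		if pods[i].Priority != pods[j].Priority {
			return pods[i].Priority < pods[j].Priority
		}
		return pods[i].RSS > pods[j].RSS
	})
}

func main() {
	pods := []podInfo{
		{"a", 2000, 100 << 20},
		{"b", 1000, 50 << 20},
		{"c", 1000, 200 << 20},
	}
	sortForEviction(pods)
	for _, p := range pods {
		fmt.Println(p.Name)
	}
	// "c" is first: lowest priority tier, and higher RSS than "b".
}
```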

The current memory utilization safety threshold defaults to 70%. You can modify memoryEvictThresholdPercent in the ConfigMap slo-controller-config according to the actual situation:

apiVersion: v1
kind: ConfigMap
metadata:
  name: slo-controller-config
  namespace: koordinator-system
data:
  colocation-config: |
    {
      "enable": true
    }
  resource-threshold-config: |
    {
      "clusterStrategy": {
        "enable": true,
        "memoryEvictThresholdPercent": 70
      }
    }

v0.1.0

02 Apr 07:51
ffb8196

Node Metrics

Koordinator defines the NodeMetric CRD, which is used to record the resource utilization of a single node and of all Pods on the node. koordlet regularly reports and updates NodeMetric. You can view it with the following command.

$ kubectl get nodemetrics node-1 -o yaml
apiVersion: slo.koordinator.sh/v1alpha1
kind: NodeMetric
metadata:
  creationTimestamp: "2022-03-30T11:50:17Z"
  generation: 1
  name: node-1
  resourceVersion: "2687986"
  uid: 1567bb4b-87a7-4273-a8fd-f44125c62b80
spec: {}
status:
  nodeMetric:
    nodeUsage:
      resources:
        cpu: 138m
        memory: "1815637738"
  podsMetric:
  - name: storage-service-6c7c59f868-k72r5
    namespace: default
    podUsage:
      resources:
        cpu: "300m"
        memory: 17828Ki

Colocation Resources

After Koordinator is deployed in a K8s cluster, it calculates, from the NodeMetric data, the CPU and Memory resources that have been allocated but are not in use. These resources are updated on the Node in the form of extended resources.

  • koordinator.sh/batch-cpu represents the CPU resources for Best Effort workloads.
  • koordinator.sh/batch-memory represents the Memory resources for Best Effort workloads.

You can view these resources with the following commands.

$ kubectl describe node node-1
Name:               node-1
....
Capacity:
  cpu:                          8
  ephemeral-storage:            103080204Ki
  koordinator.sh/batch-cpu:     4541
  koordinator.sh/batch-memory:  17236565027
  memory:                       32611012Ki
  pods:                         64
Allocatable:
  cpu:                          7800m
  ephemeral-storage:            94998715850
  koordinator.sh/batch-cpu:     4541
  koordinator.sh/batch-memory:  17236565027
  memory:                       28629700Ki
  pods:                         64
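
A Best Effort workload consumes these extended resources by requesting them explicitly. A minimal sketch, with placeholder name and image (batch-cpu is expressed in milli-cores, matching the node values above; request and limit are kept equal):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: be-demo            # placeholder name
  labels:
    koordinator.sh/qosClass: BE
spec:
  schedulerName: koord-scheduler
  priorityClassName: koord-batch
  containers:
  - name: worker
    image: busybox         # placeholder image
    resources:
      requests:
        koordinator.sh/batch-cpu: "1000"    # 1 core, in milli-cores
        koordinator.sh/batch-memory: 1Gi
      limits:
        koordinator.sh/batch-cpu: "1000"
        koordinator.sh/batch-memory: 1Gi
```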

Cluster-level Colocation Profile

In order to make it easier to use Koordinator to co-locate different workloads, we defined the ClusterColocationProfile CRD to help workloads adopt co-located resources gradually (gray release). A ClusterColocationProfile looks like the one below; please edit each parameter to fit your own use case.

apiVersion: config.koordinator.sh/v1alpha1
kind: ClusterColocationProfile
metadata:
  name: colocation-profile-example
spec:
  namespaceSelector:
    matchLabels:
      koordinator.sh/enable-colocation: "true"
  selector:
    matchLabels:
      sparkoperator.k8s.io/launched-by-spark-operator: "true"
  qosClass: BE
  priorityClassName: koord-batch
  koordinatorPriority: 1000
  schedulerName: koord-scheduler
  labels:
    koordinator.sh/mutated: "true"
  annotations: 
    koordinator.sh/intercepted: "true"
  patch:
    spec:
      terminationGracePeriodSeconds: 30

Various Koordinator components ensure scheduling and runtime quality through the labels koordinator.sh/qosClass and koordinator.sh/priority, together with the Kubernetes native priority.

With the webhook mutating mechanism provided by Kubernetes, koord-manager modifies the Pod's resource requirements to use co-located resources and injects the Koordinator-defined QoS class and Priority into the Pod.

Taking the above Profile as an example: when the Spark Operator creates a new Pod in a namespace labeled koordinator.sh/enable-colocation=true, the Koordinator QoS label koordinator.sh/qosClass is injected into the Pod, and the Pod's PriorityClassName and the corresponding Priority value are modified according to the priorityClassName defined in the Profile. Users can also set a Koordinator Priority for more fine-grained priority management, so the label koordinator.sh/priority is injected into the Pod as well. Because Koordinator provides the enhanced scheduler koord-scheduler, the Profile also changes the Pod's schedulerName to koord-scheduler.
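
The effect of the mutation can be sketched as a before/after fragment (abbreviated; container details omitted):

```yaml
# Before: as created by the Spark Operator, in a namespace labeled
# koordinator.sh/enable-colocation=true
apiVersion: v1
kind: Pod
metadata:
  labels:
    sparkoperator.k8s.io/launched-by-spark-operator: "true"
spec:
  containers: []           # omitted
---
# After mutation by koord-manager, per the example Profile above
apiVersion: v1
kind: Pod
metadata:
  labels:
    sparkoperator.k8s.io/launched-by-spark-operator: "true"
    koordinator.sh/qosClass: BE
    koordinator.sh/priority: "1000"
    koordinator.sh/mutated: "true"
  annotations:
    koordinator.sh/intercepted: "true"
spec:
  schedulerName: koord-scheduler
  priorityClassName: koord-batch
  terminationGracePeriodSeconds: 30
  containers: []           # omitted
```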

If you expect to integrate Koordinator into your own system, please learn more about the core concepts.

CPU Suppress

In order to ensure the runtime quality of different workloads in co-located scenarios, Koordinator uses the CPU Suppress mechanism provided by koordlet on the node side: it suppresses Best Effort workloads when the load increases, and increases their resource quota when the load decreases.

When installing through the Helm chart, the ConfigMap slo-controller-config is created in the koordinator-system namespace, and the CPU Suppress mechanism is enabled by default. If you need to disable it, modify the resource-threshold-config section as in the configuration below.

apiVersion: v1
kind: ConfigMap
metadata:
  name: slo-controller-config
  namespace: {{ .Values.installation.namespace }}
data:
  ...
  resource-threshold-config: |
    {
      "clusterStrategy": {
        "enable": false
      }
    }

Colocation Resources Balance

Koordinator currently adopts a strategy for node co-location resource scheduling, which prioritizes scheduling to machines with more resources remaining in co-location to avoid Best Effort workloads crowding together. More rich scheduling capabilities are on the way.