DRS: A Deep Reinforcement Learning enhanced Kubernetes Scheduler for
Microservice-based System
Abstract
As the most popular container orchestration framework, Kubernetes is
widely used to manage and schedule the resources of microservices in
cloud-native distributed applications. However, the native Kubernetes
scheduler, called Kube-scheduler, preferentially places microservices on
nodes with abundant and balanced CPU and memory resources, considering
each node in isolation, which may cause resource fragmentation and
decrease resource utilization. In this paper, we propose a deep
reinforcement learning enhanced Kubernetes scheduler named DRS. To
improve resource utilization and reduce load imbalance, we first
formulate the Kubernetes scheduling problem as a Markov decision process
and carefully design its state, action, and reward.
Then, we design and implement the DRS monitor, which collects six
resource-utilization metrics to construct a comprehensive global
resource view.
Finally, DRS can automatically learn the scheduling policy through
interaction with the Kubernetes cluster, without relying on expert
knowledge about workload and cluster status. We implement a prototype of
DRS in a Kubernetes cluster with five nodes and evaluate its
performance. Experimental results show that DRS overcomes the
shortcomings of Kube-scheduler and achieves the expected scheduling
targets under three workloads. Compared with Kube-scheduler, DRS
improves resource utilization by 27.29% and reduces load imbalance by
2.90× on average, with only 3.27% CPU overhead and 0.648% additional
communication latency.
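To make the MDP formulation concrete, the sketch below models pod scheduling as a sequential decision process: the state is the per-node CPU and memory utilization, the action is the index of the node chosen for the next pod, and the reward trades off high average utilization against load imbalance. The abstract does not give DRS's exact state, action, or reward definitions, so the class name, metric choices, and reward shape here are illustrative assumptions, not the paper's design.

```python
import statistics

class SchedulingMDP:
    """Toy MDP for pod scheduling over a fixed set of nodes.

    Illustrative only: the real DRS state uses six monitored metrics
    and a learned policy; this sketch keeps just CPU and memory.
    """

    def __init__(self, node_capacities):
        # node_capacities: list of (cpu_total, mem_total) per node
        self.capacities = node_capacities
        self.used = [(0.0, 0.0)] * len(node_capacities)

    def state(self):
        # State: per-node (cpu, mem) utilization ratios in [0, 1].
        return [(cu / cc, mu / mc)
                for (cu, mu), (cc, mc) in zip(self.used, self.capacities)]

    def step(self, action, pod):
        # Action: index of the node on which to place the pod.
        cpu_req, mem_req = pod
        cu, mu = self.used[action]
        self.used[action] = (cu + cpu_req, mu + mem_req)
        # Reward (assumed shape): mean utilization minus its spread
        # across nodes, nudging the agent toward high but balanced usage.
        utils = [(c + m) / 2 for c, m in self.state()]
        reward = statistics.mean(utils) - statistics.pstdev(utils)
        return self.state(), reward

# Two identical nodes with 4 CPUs and 8 GiB of memory each.
env = SchedulingMDP([(4.0, 8.0), (4.0, 8.0)])
_, r1 = env.step(0, (1.0, 2.0))  # first pod: 1 CPU, 2 GiB
_, r2 = env.step(1, (1.0, 2.0))  # balancing the second pod raises reward
```

In this toy setting, placing the second pod on the other node yields a higher reward than stacking both on one node would, which is exactly the high-utilization, low-imbalance behavior the reward is meant to encourage.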