Deep Learning Research and Development Platform: Characterizing and Scheduling with QoS Guarantees on GPU Clusters
Published in TPDS, 2019
Recommended citation: Zhaoyun Chen, Wei Quan, Mei Wen, Jianbin Fang, Jie Yu, Chunyuan Zhang, Lei Luo. "Deep Learning Research and Development Platform: Characterizing and Scheduling with QoS Guarantees on GPU Clusters." TPDS. 2019. http://jianbinfang.github.io/files/2019-07-29-tpds.pdf
This paper proposes GENIE, a QoS-aware dynamic scheduling framework for shared GPU clusters that meets users' QoS guarantees while keeping system utilization high.