Deep Learning Research and Development Platform: Characterizing and Scheduling with QoS Guarantees on GPU Clusters

Published in TPDS, 2019

Recommended citation: Zhaoyun Chen, Wei Quan, Mei Wen, Jianbin Fang, Jie Yu, Chunyuan Zhang, Lei Luo. "Deep Learning Research and Development Platform: Characterizing and Scheduling with QoS Guarantees on GPU Clusters." TPDS. 2019. http://jianbinfang.github.io/files/2019-07-29-tpds.pdf

This paper proposes GENIE, a QoS-aware dynamic scheduling framework for a shared GPU cluster, which achieves QoS guarantees for users and high system utilization.

Download paper here
