1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs

Frank Seide, Hao Fu, Jasha Droppo, Gang Li, Dong Yu. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs. In Haizhou Li, Helen M. Meng, Bin Ma, Eng Siong Chng, Lei Xie, editors, INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014, pages 1058-1062. ISCA, 2014.


