Minjia Zhang
Microsoft Research
Memcached design on high performance RDMA capable interconnects
J Jose, H Subramoni, M Luo, M Zhang, J Huang, M Wasi-ur-Rahman, ...
2011 International Conference on Parallel Processing, 743-752, 2011
Learning intrinsic sparse structures within long short-term memory
W Wen, Y He, S Rajbhandari, M Zhang, W Wang, F Liu, B Hu, Y Chen, ...
arXiv preprint arXiv:1709.05027, 2017
DeepCPU: Serving RNN-based deep learning models 10x faster
M Zhang, S Rajbhandari, W Wang, Y He
2018 USENIX Annual Technical Conference (USENIX ATC 18), 951-965, 2018
ZeRO-Offload: Democratizing Billion-Scale Model Training
J Ren, S Rajbhandari, RY Aminabadi, O Ruwase, S Yang, M Zhang, D Li, ...
USENIX Annual Technical Conference, 551-564, 2021
Valor: Efficient, software-only region conflict exceptions
S Biswas, M Zhang, MD Bond, B Lucia
ACM SIGPLAN Notices 50 (10), 241-259, 2015
BLOOM: A 176B-parameter open-access multilingual language model
TL Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
arXiv preprint arXiv:2211.05100, 2022
Octet: Capturing and controlling cross-thread dependences efficiently
MD Bond, M Kulkarni, M Cao, M Zhang, M Fathi Salmi, S Biswas, ...
ACM SIGPLAN Notices 48 (10), 693-712, 2013
DeepSpeed-MoE: Advancing mixture-of-experts inference and training to power next-generation AI scale
S Rajbhandari, C Li, Z Yao, M Zhang, RY Aminabadi, AA Awan, J Rasley, ...
International Conference on Machine Learning, 18332-18346, 2022
Hybrid static–dynamic analysis for statically bounded region serializability
A Sengupta, S Biswas, M Zhang, MD Bond, M Kulkarni
ACM SIGPLAN Notices 50 (4), 561-575, 2015
Accelerating training of transformer-based language models with progressive layer dropping
M Zhang, Y He
Advances in Neural Information Processing Systems 33, 14011-14023, 2020
Navigating with graph representations for fast and scalable decoding of neural language models
M Zhang, W Wang, X Liu, J Gao, Y He
Advances in Neural Information Processing Systems 31, 2018
VirtCFT: A transparent VM-level fault-tolerant system for virtual clusters
M Zhang, H Jin, X Shi, S Wu
2010 IEEE 16th International Conference on Parallel and Distributed Systems …, 2010
Low-overhead software transactional memory with progress guarantees and strong semantics
M Zhang, J Huang, M Cao, MD Bond
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of …, 2015
Lightweight data race detection for production runs
S Biswas, M Cao, M Zhang, MD Bond, BP Wood
Proceedings of the 26th International Conference on Compiler Construction, 11-21, 2017
HM-ANN: Efficient billion-point nearest neighbor search on heterogeneous memory
J Ren, M Zhang, D Li
Advances in Neural Information Processing Systems 33, 10672-10684, 2020
Improving approximate nearest neighbor search through learned adaptive early termination
C Li, M Zhang, DG Andersen, Y He
Proceedings of the 2020 ACM SIGMOD International Conference on Management of …, 2020
Sentinel: Efficient tensor migration and allocation on heterogeneous memory systems for deep learning
J Ren, J Luo, K Wu, M Zhang, H Jeon, D Li
2021 IEEE International Symposium on High-Performance Computer Architecture …, 2021
GRIP: Multi-store capacity-optimized high-performance nearest neighbor search for vector search engine
M Zhang, Y He
Proceedings of the 28th ACM International Conference on Information and …, 2019
LSTM-Sharp: An adaptable, energy-efficient hardware accelerator for long short-term memory
R Yazdani, O Ruwase, M Zhang, Y He, JM Arnau, A González
arXiv preprint arXiv:1911.01258, 2019
Drinking from both glasses: Combining pessimistic and optimistic tracking of cross-thread dependences
M Cao, M Zhang, A Sengupta, MD Bond
ACM SIGPLAN Notices 51 (8), 1-13, 2016