Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups G Hinton, L Deng, D Yu, GE Dahl, A Mohamed, N Jaitly, A Senior, ... IEEE Signal processing magazine 29 (6), 82-97, 2012 | 13936 | 2012 |
Gemini: a family of highly capable multimodal models Gemini Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ... arXiv preprint arXiv:2312.11805, 2023 | 2144 | 2023 |
Deep convolutional neural networks for large-scale speech tasks TN Sainath, B Kingsbury, G Saon, H Soltau, A Mohamed, G Dahl, ... Neural networks 64, 39-48, 2015 | 2056 | 2015 |
Convolutional, long short-term memory, fully connected deep neural networks TN Sainath, O Vinyals, A Senior, H Sak 2015 IEEE international conference on acoustics, speech and signal …, 2015 | 2050 | 2015 |
Improving deep neural networks for LVCSR using rectified linear units and dropout GE Dahl, TN Sainath, GE Hinton 2013 IEEE international conference on acoustics, speech and signal …, 2013 | 1921 | 2013 |
State-of-the-art speech recognition with sequence-to-sequence models CC Chiu, TN Sainath, Y Wu, R Prabhavalkar, P Nguyen, Z Chen, ... 2018 IEEE international conference on acoustics, speech and signal …, 2018 | 1472 | 2018 |
Deep learning for audio signal processing H Purwins, B Li, T Virtanen, J Schlüter, SY Chang, T Sainath IEEE Journal of Selected Topics in Signal Processing 13 (2), 206-219, 2019 | 893 | 2019 |
Low-rank matrix factorization for deep neural network training with high-dimensional output targets TN Sainath, B Kingsbury, V Sindhwani, E Arisoy, B Ramabhadran 2013 IEEE international conference on acoustics, speech and signal …, 2013 | 809 | 2013 |
Streaming end-to-end speech recognition for mobile devices Y He, TN Sainath, R Prabhavalkar, I McGraw, R Alvarez, D Zhao, ... ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 747 | 2019 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context Gemini Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ... arXiv preprint arXiv:2403.05530, 2024 | 671 | 2024 |
Convolutional neural networks for small-footprint keyword spotting. TN Sainath, C Parada Interspeech, 1478-1482, 2015 | 654 | 2015 |
Learning the speech front-end with raw waveform CLDNNs. TN Sainath, RJ Weiss, AW Senior, KW Wilson, O Vinyals Interspeech, 1-5, 2015 | 623 | 2015 |
Deep belief networks using discriminative features for phone recognition A Mohamed, TN Sainath, G Dahl, B Ramabhadran, GE Hinton, ... 2011 IEEE international conference on acoustics, speech and signal …, 2011 | 410 | 2011 |
A comparison of sequence-to-sequence models for speech recognition. R Prabhavalkar, K Rao, TN Sainath, B Li, L Johnson, N Jaitly Interspeech, 939-943, 2017 | 398 | 2017 |
Self-supervised speech representation learning: A review A Mohamed, H Lee, L Borgholt, JD Havtorn, J Edin, C Igel, K Kirchhoff, ... IEEE Journal of Selected Topics in Signal Processing 16 (6), 1179-1210, 2022 | 374 | 2022 |
Improvements to deep convolutional neural networks for LVCSR TN Sainath, B Kingsbury, A Mohamed, GE Dahl, G Saon, H Soltau, ... 2013 IEEE workshop on automatic speech recognition and understanding, 315-320, 2013 | 316 | 2013 |
Deep neural network language models E Arisoy, TN Sainath, B Kingsbury, B Ramabhadran Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the …, 2012 | 313 | 2012 |
Multilingual speech recognition with a single end-to-end model S Toshniwal, TN Sainath, RJ Weiss, B Li, P Moreno, E Weinstein, K Rao 2018 IEEE international conference on acoustics, speech and signal …, 2018 | 297 | 2018 |
Structured transforms for small-footprint deep learning V Sindhwani, T Sainath, S Kumar Advances in Neural Information Processing Systems 28, 2015 | 289 | 2015 |
An analysis of incorporating an external language model into a sequence-to-sequence model A Kannan, Y Wu, P Nguyen, TN Sainath, Z Chen, R Prabhavalkar 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 288 | 2018 |