Hierarchical transformer-based large-context end-to-end ASR with large-context knowledge distillation. R Masumura, N Makishima, M Ihori, A Takashima, T Tanaka, S Orihashi. ICASSP 2021: IEEE International Conference on Acoustics, Speech and …, 2021. Cited by 34.
Sequence-level consistency training for semi-supervised end-to-end automatic speech recognition. R Masumura, M Ihori, A Takashima, T Moriya, A Ando, Y Shinohara. ICASSP 2020: IEEE International Conference on Acoustics, Speech and …, 2020. Cited by 21.
Large-context pointer-generator networks for spoken-to-written style conversion. M Ihori, A Takashima, R Masumura. ICASSP 2020: IEEE International Conference on Acoustics, Speech and …, 2020. Cited by 13.
Phoneme-to-grapheme conversion based large-scale pre-training for end-to-end automatic speech recognition. R Masumura, N Makishima, M Ihori, A Takashima, T Tanaka, S Orihashi. INTERSPEECH, 2822–2826, 2020. Cited by 9.
Cross-modal transformer-based neural correction models for automatic speech recognition. T Tanaka, R Masumura, M Ihori, A Takashima, T Moriya, T Ashihara, … arXiv preprint arXiv:2107.01569, 2021. Cited by 8.
Audio-visual speech separation using cross-modal correspondence loss. N Makishima, M Ihori, A Takashima, T Tanaka, S Orihashi, R Masumura. ICASSP 2021: IEEE International Conference on Acoustics, Speech and …, 2021. Cited by 8.
Strategies to improve robustness of target speech extraction to enrollment variations. H Sato, T Ochiai, M Delcroix, K Kinoshita, T Moriya, N Makishima, M Ihori, … arXiv preprint arXiv:2206.08174, 2022. Cited by 7.
Enrollment-less training for personalized voice activity detection. N Makishima, M Ihori, T Tanaka, A Takashima, S Orihashi, R Masumura. arXiv preprint arXiv:2106.12132, 2021. Cited by 7.
End-to-end automatic speech recognition with deep mutual learning. R Masumura, M Ihori, A Takashima, T Tanaka, T Ashihara. 2020 Asia-Pacific Signal and Information Processing Association Annual …, 2020. Cited by 7.
Parallel corpus for Japanese spoken-to-written style conversion. M Ihori, A Takashima, R Masumura. Proceedings of the Twelfth Language Resources and Evaluation Conference …, 2020. Cited by 7.
Improving speech-based end-of-turn detection via cross-modal representation learning with punctuated text data. R Masumura, M Ihori, T Tanaka, A Ando, R Ishii, T Oba, R Higashinaka. 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019. Cited by 7.
Unified autoregressive modeling for joint end-to-end multi-talker overlapped speech recognition and speaker attribute estimation. R Masumura, D Okamura, N Makishima, M Ihori, A Takashima, T Tanaka, … arXiv preprint arXiv:2107.01549, 2021. Cited by 5.
Generalized large-context language models based on forward-backward hierarchical recurrent encoder-decoder models. R Masumura, M Ihori, T Tanaka, I Saito, K Nishida, T Oba. 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019. Cited by 5.
End-to-end rich transcription-style automatic speech recognition with semi-supervised learning. T Tanaka, R Masumura, M Ihori, A Takashima, S Orihashi, N Makishima. arXiv preprint arXiv:2107.05382, 2021. Cited by 4.
MAPGN: masked pointer-generator network for sequence-to-sequence pre-training. M Ihori, N Makishima, T Tanaka, A Takashima, S Orihashi, R Masumura. ICASSP 2021: IEEE International Conference on Acoustics, Speech and …, 2021. Cited by 4.
Memory attentive fusion: external language model integration for transformer-based sequence-to-sequence model. M Ihori, R Masumura, N Makishima, T Tanaka, A Takashima, S Orihashi. arXiv preprint arXiv:2010.15437, 2020. Cited by 4.
Zero-shot joint modeling of multiple spoken-text-style conversion tasks using switching tokens. M Ihori, N Makishima, T Tanaka, A Takashima, S Orihashi, R Masumura. arXiv preprint arXiv:2106.12131, 2021. Cited by 3.
Unsupervised domain adaptation for dialogue sequence labeling based on hierarchical adversarial training. S Orihashi, M Ihori, T Tanaka, R Masumura. INTERSPEECH, 1575–1579, 2020. Cited by 3.
Downstream task agnostic speech enhancement with self-supervised representation loss. H Sato, R Masumura, T Ochiai, M Delcroix, T Moriya, T Ashihara, … arXiv preprint arXiv:2305.14723, 2023. Cited by 2.
Transcribing speech as spoken and written dual text using an autoregressive model. M Ihori, H Sato, T Tanaka, R Masumura, S Mizuno, N Hojo. Proc. INTERSPEECH 2023, 461–465, 2023. Cited by 2.