Tom Ko
ByteDance AI Lab Hong Kong
Verified email at bytedance.com
Title
Cited by
Year
Audio augmentation for speech recognition.
T Ko, V Peddinti, D Povey, S Khudanpur
Interspeech 2015, 3586, 2015
Cited by 1463 · 2015
A study on data augmentation of reverberant speech for robust speech recognition
T Ko, V Peddinti, D Povey, ML Seltzer, S Khudanpur
2017 IEEE international conference on acoustics, speech and signal …, 2017
Cited by 1173 · 2017
Self-attentive speaker embeddings for text-independent speaker verification.
Y Zhu, T Ko, D Snyder, B Mak, D Povey
Interspeech 2018, 3573-3577, 2018
Cited by 311 · 2018
SpeechT5: Unified-modal encoder-decoder pre-training for spoken language processing
J Ao, R Wang, L Zhou, C Wang, S Ren, Y Wu, S Liu, T Ko, Q Li, Y Zhang, ...
arXiv preprint arXiv:2110.07205, 2021
Cited by 246 · 2021
WavCaps: A ChatGPT-assisted weakly-labelled audio captioning dataset for audio-language multimodal research
X Mei, C Meng, H Liu, Q Kong, T Ko, C Zhao, MD Plumbley, Y Zou, ...
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Cited by 146 · 2024
JHU ASpIRE system: Robust LVCSR with TDNNs, iVector adaptation and RNN-LMs
V Peddinti, G Chen, V Manohar, T Ko, D Povey, S Khudanpur
2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU …, 2015
Cited by 138 · 2015
An empirical exploration of CTC acoustic models
Y Miao, M Gowayyed, X Na, T Ko, F Metze, A Waibel
2016 IEEE international conference on acoustics, speech and signal …, 2016
Cited by 107 · 2016
LightHuBERT: Lightweight and configurable speech representation learning with once-for-all hidden-unit BERT
R Wang, Q Bai, J Ao, L Zhou, Z Xiong, Z Wei, Y Zhang, T Ko, H Li
arXiv preprint arXiv:2203.15610, 2022
Cited by 60 · 2022
An encoder-decoder based audio captioning system with transfer and reinforcement learning
X Mei, Q Huang, X Liu, G Chen, J Wu, Y Wu, J Zhao, S Li, T Ko, HL Tang, ...
arXiv preprint arXiv:2108.02752, 2021
Cited by 53 · 2021
M3ST: Mix at Three Levels for Speech Translation
X Cheng, Q Dong, F Yue, T Ko, M Wang, Y Zou
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by 52 · 2023
Findings of the IWSLT 2023 evaluation campaign
M Agarwal, S Agarwal, A Anastasopoulos, L Bentivogli, O Bojar, C Borg, ...
Association for Computational Linguistics, 2023
Cited by 52 · 2023
Multi-view self-attention based transformer for speaker recognition
R Wang, J Ao, L Zhou, S Liu, Z Wei, T Ko, Q Li, Y Zhang
ICASSP 2022-2022 IEEE international conference on acoustics, speech and …, 2022
Cited by 50 · 2022
Mixup Learning Strategies for Text-Independent Speaker Verification.
Y Zhu, T Ko, B Mak
Interspeech, 4345-4349, 2019
Cited by 38 · 2019
CL4AC: A contrastive loss for audio captioning
X Liu, Q Huang, X Mei, T Ko, HL Tang, MD Plumbley, W Wang
arXiv preprint arXiv:2107.09990, 2021
Cited by 34 · 2021
An investigation of few-shot learning in spoken term classification
Y Chen, T Ko, L Shang, X Chen, X Jiang, Q Li
arXiv preprint arXiv:1812.10233, 2018
Cited by 32* · 2018
Token-level supervised contrastive learning for punctuation restoration
Q Huang, T Ko, HL Tang, X Liu, B Wu
arXiv preprint arXiv:2107.09099, 2021
Cited by 28 · 2021
PolyVoice: Language models for speech to speech translation
Q Dong, Z Huang, Q Tian, C Xu, T Ko, Y Zhao, S Feng, T Li, K Wang, ...
arXiv preprint arXiv:2306.02982, 2023
Cited by 23 · 2023
DUB: Discrete unit back-translation for speech translation
D Zhang, R Ye, T Ko, M Wang, Y Zhou
arXiv preprint arXiv:2305.11411, 2023
Cited by 23 · 2023
Prototypical networks for small footprint text-independent speaker verification
T Ko, Y Chen, Q Li
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by 23 · 2020
RepCodec: A speech representation codec for speech tokenization
Z Huang, C Meng, T Ko
arXiv preprint arXiv:2309.00169, 2023
Cited by 21 · 2023