Investigating local and global information for automated audio captioning with transfer learning X Xu, H Dinkel, M Wu, Z Xie, K Yu ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 57 | 2021 |
Can audio captions be evaluated with image caption metrics? Z Zhou, Z Zhang, X Xu, Z Xie, M Wu, KQ Zhu ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 33 | 2022 |
The SJTU system for DCASE2022 challenge task 6: Audio captioning with audio-text retrieval pre-training X Xu, Z Xie, M Wu, K Yu DCASE 2022 Challenge, Tech. Rep., 2022 | 29 | 2022 |
The SJTU system for DCASE2021 challenge task 6: Audio captioning based on encoder pre-training and reinforcement learning X Xu, Z Xie, M Wu, K Yu DCASE2021 Challenge, Tech. Rep., 2021 | 16 | 2021 |
BLAT: Bootstrapping language-audio pre-training based on AudioSet tag-guided synthetic data X Xu, Z Zhang, Z Zhou, P Zhang, Z Xie, M Wu, KQ Zhu Proceedings of the 31st ACM International Conference on Multimedia, 2756-2764, 2023 | 8 | 2023 |
Enhance temporal relations in audio captioning with sound event detection Z Xie, X Xu, M Wu, K Yu arXiv preprint arXiv:2306.01533, 2023 | 7 | 2023 |
The X-LANCE system for DCASE2023 challenge task 7: Foley sound synthesis track B Z Xie, X Xu, B Li, M Wu, K Yu Tech. Rep., June, 2023 | 1 | 2023 |
A Detailed Audio-Text Data Simulation Pipeline Using Single-Event Sounds X Xu, X Xu, Z Xie, P Zhang, M Wu, K Yu ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | | 2024 |
Enhancing Audio Generation Diversity with Visual Information Z Xie, B Li, X Xu, M Wu, K Yu ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | | 2024 |
Phonetic and Lexical Discovery of a Canine Language using HuBERT X Li, S Wang, Z Xie, M Wu, KQ Zhu arXiv preprint arXiv:2402.15985, 2024 | | 2024 |
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning X Xu, Z Xie, M Wu, K Yu IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023 | | 2023 |
Improving Audio Caption Fluency with Automatic Error Correction H Zhang, Z Xie, X Xu, M Wu, K Yu arXiv preprint arXiv:2306.10090, 2023 | | 2023 |