Yupan Huang

Cited by

	All	Since 2019
Citations	924	924
h-index	10	10
i10-index	10	10

Citations per year: 2020: 13, 2021: 50, 2022: 177, 2023: 399, 2024: 282

Public access

5 articles

1 article

available

not available

Based on funding mandates

Co-authors

Bei Liu, Microsoft Research (verified email at microsoft.com)
Furu Wei, Partner Research Manager, Microsoft Research (verified email at microsoft.com)
Lei Cui, Microsoft Research Asia (verified email at microsoft.com)
Jianlong Fu, Microsoft Research (verified email at microsoft.com)
Qi Dai, Microsoft Research (verified email at microsoft.com)
Nigel Collier, Professor of Natural Language Processing, University of Cambridge (verified email at cam.ac.uk)


Yupan Huang

Microsoft Research

Verified email at microsoft.com

Research interests: Multimodal AI, Computer Vision, Natural Language Processing


Title	Cited by	Year
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking. Y Huang, T Lv, L Cui, Y Lu, F Wei. Proceedings of the 30th ACM International Conference on Multimedia, 2022	333	2022
Seeing out of the box: End-to-end pre-training for vision-language representation learning. Z Huang, Z Zeng, Y Huang*, B Liu, D Fu, J Fu. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021	269	2021
Probing inter-modality: Visual parsing with self-attention for vision-and-language pre-training. H Xue, Y Huang, B Liu, H Peng, J Fu, H Li, J Luo. Advances in Neural Information Processing Systems 34, 4514-4528, 2021	82	2021
Decoupling localization and classification in single shot temporal action detection. Y Huang, Q Dai, Y Lu. 2019 IEEE International Conference on Multimedia and Expo (ICME), 1288-1293, 2019	57	2019
Unifying multimodal transformer for bi-directional image and text generation. Y Huang, H Xue, B Liu, Y Lu. Proceedings of the 29th ACM International Conference on Multimedia, 1138-1147, 2021	56	2021
TextDiffuser: Diffusion models as text painters. J Chen, Y Huang, T Lv, L Cui, Q Chen, F Wei. Advances in Neural Information Processing Systems 36, 2024	37	2024
Kosmos-2.5: A Multimodal Literate Model. T Lv, Y Huang, J Chen, L Cui, S Ma, Y Chang, S Huang, W Wang, ... arXiv preprint arXiv:2309.11419, 2023	23	2023
Reinforced short-length hashing. X Liu, X Nie, Q Dai, Y Huang, L Lian, Y Yin. IEEE Transactions on Circuits and Systems for Video Technology 31 (9), 3655-3668, 2020	23	2020
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering. J Chen, Y Huang, T Lv, L Cui, Q Chen, F Wei. arXiv preprint arXiv:2311.16465, 2023	16	2023
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models. Y Huang, Z Meng, F Liu, Y Su, N Collier, Y Lu. arXiv preprint arXiv:2308.16463, 2023	15	2023
A picture is worth a thousand words: A unified system for diverse captions and rich images generation. Y Huang, B Liu, J Fu, Y Lu. Proceedings of the 29th ACM International Conference on Multimedia, 2792-2794, 2021	7	2021
Be specific, be clear: Bridging machine and human captions by scene-guided transformer. Y Huang, Z Zeng, Y Lu. Proceedings of the 2021 Workshop on Multi-Modal Pre-Training for Multimedia …, 2021	6	2021
