Zhengyuan Yang

Cited by

	All	Since 2019
Citations	4844	4830
h-index	29	29
i10-index	36	36

Citations per year
Year	2019	2020	2021	2022	2023	2024
Citations	61	130	303	629	1710	1991

Public access

16 articles available
0 articles not available

Based on funding mandates

Co-authors

Lijuan Wang, Microsoft GenAI. Verified email at microsoft.com
Jianfeng Wang, Microsoft. Verified email at microsoft.com
Zicheng Liu, Microsoft. Verified email at microsoft.com
Linjie (Lindsey) Li, Senior Researcher, Microsoft. Verified email at microsoft.com
Jiebo Luo, Albert Arendt Hopeman Professor of Engineering, University of Rochester. Verified email at cs.rochester.edu
Kevin Lin, Microsoft. Verified email at microsoft.com
Zhe Gan, Research Scientist, Apple. Verified email at apple.com
Ce Liu, AI Research Scientist Director, Meta GenAI; IEEE Fellow. Verified email at meta.com
Liwei Wang, Assistant Professor at The Chinese University of Hong Kong. Verified email at cse.cuhk.edu.hk
Jinsong Su, Xiamen University. Verified email at xmu.edu.cn
Jianwei Yang, Principal Researcher, Microsoft Research, Redmond. Verified email at microsoft.com
Jiajun Deng (邓家俊), University of Adelaide, Australian Institute for Machine Learning. Verified email at adelaide.edu.au
Yuncheng Li, Google. Verified email at google.com
Chenglei Si, Stanford University. Verified email at stanford.edu
Boqing Gong, Research Scientist, Google. Verified email at google.com


Zhengyuan Yang

Researcher, Microsoft

Verified email at microsoft.com

Computer Vision, Multimedia, Vision + Language, Multimodal


Title	Authors	Venue	Cited by	Year
GIT: A generative image-to-text transformer for vision and language	J Wang, Z Yang, X Hu, L Li, K Lin, Z Gan, Z Liu, C Liu, L Wang	Transactions on Machine Learning Research (TMLR), 2022	419	2022
A fast and accurate one-stage approach to visual grounding	Z Yang, B Gong, L Wang, W Huang, D Yu, J Luo	IEEE International Conference on Computer Vision (ICCV), 4683-4693, 2019	345	2019
An empirical study of GPT-3 for few-shot knowledge-based VQA	Z Yang, Z Gan, J Wang, X Hu, Y Lu, Z Liu, L Wang	Proceedings of the AAAI Conference on Artificial Intelligence 36 (3), 3081-3089, 2022	335	2022
The dawn of LMMs: Preliminary explorations with GPT-4V(ision)	Z Yang, L Li, K Lin, J Wang, CC Lin, Z Liu, L Wang	arXiv preprint arXiv:2309.17421 9 (1), 1, 2023	332	2023
TransVG: End-to-End Visual Grounding with Transformers	J Deng, Z Yang, T Chen, W Zhou, H Li	IEEE International Conference on Computer Vision (ICCV), 2021	281	2021
Scaling up vision-language pre-training for image captioning	X Hu, Z Gan, J Wang, Z Yang, Z Liu, Y Lu, L Wang	Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022	242	2022
MM-REACT: Prompting ChatGPT for multimodal reasoning and action	Z Yang, L Li, J Wang, K Lin, E Azarnasab, F Ahmed, Z Liu, C Liu, M Zeng, ...	arXiv preprint arXiv:2303.11381, 2023	240	2023
Improving One-stage Visual Grounding by Recursive Sub-query Construction	Z Yang, T Chen, L Wang, J Luo	European Conference on Computer Vision (ECCV), 2020	211	2020
MM-Vet: Evaluating large multimodal models for integrated capabilities	W Yu, Z Yang, L Li, J Wang, K Lin, Z Liu, X Wang, L Wang	The 41st International Conference on Machine Learning (ICML), 2024	208	2024
Prompting GPT-3 to be reliable	C Si, Z Gan, Z Yang, S Wang, J Wang, J Boyd-Graber, L Wang	International Conference on Learning Representations (ICLR 23), 2022	186	2022
End-to-end multi-modal multi-task vehicle control for self-driving cars with visual perceptions	Z Yang, Y Zhang, J Yu, J Cai, J Luo	2018 24th International Conference on Pattern Recognition (ICPR), 2289-2294, 2018	186	2018
Action recognition with spatio–temporal visual attention on skeleton image sequences	Z Yang, Y Li, J Yang, J Luo	IEEE Transactions on Circuits and Systems for Video Technology 29 (8), 2405-2415, 2018	184	2018
Attentive relational networks for mapping images to scene graphs	M Qi, W Li, Z Yang, Y Wang, J Luo	IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 3957-3966, 2019	170	2019
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption	Z Yang, Y Lu, J Wang, X Yin, D Florencio, L Wang, C Zhang, L Zhang, ...	IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021	155	2021
A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation	Y Yin, F Meng, J Su, C Zhou, Z Yang, J Zhou, J Luo	Annual Meeting of the Association for Computational Linguistics (ACL), 2020	140	2020
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling	Z Yang, Z Gan, J Wang, X Hu, F Ahmed, Z Liu, Y Lu, L Wang	European Conference on Computer Vision (ECCV), 521-539, 2022	130*	2022
Multimodal foundation models: From specialists to general-purpose assistants	C Li, Z Gan, Z Yang, J Yang, L Li, L Wang, J Gao	Foundations and Trends® in Computer Graphics and Vision 16 (1-2), 1-214, 2024	106	2024
PromptCap: Prompt-guided image captioning for VQA with GPT-3	Y Hu, H Hua, Z Yang, W Shi, NA Smith, J Luo	Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023	90*	2023
SAT: 2D Semantics Assisted Training for 3D Visual Grounding	Z Yang, S Zhang, L Wang, J Luo	IEEE International Conference on Computer Vision (ICCV), 2021	89	2021
ReCo: Region-Controlled Text-to-Image Generation	Z Yang, J Wang, Z Gan, L Li, K Lin, C Wu, N Duan, Z Liu, C Liu, M Zeng, ...	IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023	82	2023
