Visualbert: A simple and performant baseline for vision and language LH Li, M Yatskar, D Yin, CJ Hsieh, KW Chang arXiv preprint arXiv:1908.03557, 2019 | 2037 | 2019 |
Grounded language-image pre-training LH Li, P Zhang, H Zhang, J Yang, C Li, Y Zhong, L Wang, L Yuan, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 1063 | 2022 |
Regionclip: Region-based language-image pretraining Y Zhong, J Yang, P Zhang, C Li, N Codella, LH Li, L Zhou, X Dai, L Yuan, ... Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022 | 540 | 2022 |
How much can clip benefit vision-and-language tasks? S Shen, LH Li, H Tan, M Bansal, A Rohrbach, KW Chang, Z Yao, ... arXiv preprint arXiv:2107.06383, 2021 | 435 | 2021 |
Glipv2: Unifying localization and vision-language understanding H Zhang, P Zhang, X Hu, YC Chen, L Li, X Dai, L Wang, L Yuan, ... Advances in Neural Information Processing Systems 35, 36067-36080, 2022 | 291 | 2022 |
What Does BERT with Vision Look At? LH Li, M Yatskar, D Yin, CJ Hsieh, KW Chang | 157* | |
Elevater: A benchmark and toolkit for evaluating language-augmented visual models C Li, H Liu, L Li, P Zhang, J Aneja, J Yang, P Jin, H Hu, Z Liu, YJ Lee, ... Advances in Neural Information Processing Systems 35, 9287-9301, 2022 | 137 | 2022 |
On the paradox of learning to reason from data H Zhang, LH Li, T Meng, KW Chang, GV Broeck arXiv preprint arXiv:2205.11502, 2022 | 105 | 2022 |
Symbolic chain-of-thought distillation: Small models can also "think" step-by-step LH Li, J Hessel, Y Yu, X Ren, KW Chang, Y Choi arXiv preprint arXiv:2306.14050, 2023 | 102 | 2023 |
Unsupervised vision-and-language pre-training without parallel images and captions LH Li, H You, Z Wang, A Zareian, SF Chang, KW Chang arXiv preprint arXiv:2010.12831, 2020 | 77* | 2020 |
Geomlama: Geo-diverse commonsense probing on multilingual pre-trained language models D Yin, H Bansal, M Monajatipoor, LH Li, KW Chang arXiv preprint arXiv:2205.12247, 2022 | 50 | 2022 |
Broaden the vision: Geo-diverse visual commonsense reasoning D Yin, LH Li, Z Hu, N Peng, KW Chang arXiv preprint arXiv:2109.06860, 2021 | 48 | 2021 |
Point precisely: Towards ensuring the precision of data in generated texts using delayed copy mechanism L Li, X Wan Proceedings of the 27th International Conference on Computational …, 2018 | 30 | 2018 |
Berthop: An effective vision-and-language model for chest x-ray disease diagnosis M Monajatipoor, M Rouhsedaghat, LH Li, CC Jay Kuo, A Chien, ... International Conference on Medical Image Computing and Computer-Assisted …, 2022 | 23 | 2022 |
SGEITL: Scene graph enhanced image-text learning for visual commonsense reasoning Z Wang, H You, LH Li, A Zareian, S Park, Y Liang, KW Chang, SF Chang Proceedings of the AAAI conference on artificial intelligence 36 (5), 5914-5922, 2022 | 22 | 2022 |
Desco: Learning object recognition with rich language descriptions L Li, ZY Dou, N Peng, KW Chang Advances in Neural Information Processing Systems 36, 2024 | 19 | 2024 |
Metavl: Transferring in-context learning ability from language models to vision-language models M Monajatipoor, LH Li, M Rouhsedaghat, LF Yang, KW Chang arXiv preprint arXiv:2306.01311, 2023 | 16 | 2023 |
Berthop: An effective vision-and-language model for chest x-ray disease diagnosis M Monajatipoor, M Rouhsedaghat, LH Li, A Chien, CCJ Kuo, F Scalzo, ... arXiv preprint arXiv:2108.04938, 2021 | 12 | 2021 |
Efficient contextual representation learning with continuous outputs LH Li, PH Chen, CJ Hsieh, KW Chang Transactions of the Association for Computational Linguistics 7, 611-624, 2019 | 10* | 2019 |
Matryoshka Query Transformer for Large Vision-Language Models W Hu, ZY Dou, LH Li, A Kamath, N Peng, KW Chang arXiv preprint arXiv:2405.19315, 2024 | 4 | 2024 |