Xiao Wang
Verified email at fudan.edu.cn - Homepage
Title | Cited by | Year
The rise and potential of large language model based agents: A survey
Z Xi, W Chen, X Guo, W He, Y Ding, B Hong, M Zhang, J Wang, S Jin, ...
arXiv preprint arXiv:2309.07864, 2023
232 | 2023
Textflint: Unified multilingual robustness evaluation toolkit for natural language processing
X Wang, Q Liu, T Gui, Q Zhang, Y Zou, X Zhou, J Ye, Y Zhang, R Zheng, ...
Proceedings of the 59th Annual Meeting of the Association for Computational …, 2021
111* | 2021
Shadow alignment: The ease of subverting safely-aligned language models
X Yang*, X Wang*, Q Zhang, L Petzold, WY Wang, X Zhao, D Lin
arXiv preprint arXiv:2310.02949, 2023
40 | 2023
MINER: Improving out-of-vocabulary named entity recognition from an information theoretic perspective
X Wang, S Dou, L Xiong, Y Zou, Q Zhang, T Gui, L Qiao, Z Cheng, ...
Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022
23 | 2022
Orthogonal Subspace Learning for Language Model Continual Learning
X Wang, T Chen, Q Ge, H Xia, R Bao, R Zheng, Q Zhang, T Gui, X Huang
EMNLP 2023 findings, 2023
10 | 2023
Secrets of rlhf in large language models part ii: Reward modeling
B Wang, R Zheng, L Chen, Y Liu, S Dou, C Huang, W Shen, S Jin, E Zhou, ...
arXiv preprint arXiv:2401.06080, 2024
9 | 2024
InstructUIE: multi-task instruction tuning for unified information extraction
X Wang, W Zhou, C Zu, H Xia, T Chen, Y Zhang, R Zheng, J Ye, Q Zhang, ...
arXiv preprint arXiv:2304.08085, 2023
9 | 2023
LoRAMoE: Revolutionizing mixture of experts for maintaining world knowledge in language model alignment
S Dou, E Zhou, Y Liu, S Gao, J Zhao, W Shen, Y Zhou, Z Xi, X Wang, ...
arXiv preprint arXiv:2312.09979, 2023
6 | 2023
Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model
X Wang, W Zhou, Q Zhang, J Zhou, S Gao, J Wang, M Zhang, X Gao, ...
ACL 2023 findings, 2023
3 | 2023
Navigating the OverKill in Large Language Models
C Shi*, X Wang*, Q Ge, S Gao, X Yang, T Gui, Q Zhang, X Huang, X Zhao, ...
arXiv preprint arXiv:2401.17633, 2024
2 | 2024
Improving generalization of alignment with human preferences through group invariant learning
R Zheng, W Shen, Y Hua, W Lai, S Dou, Y Zhou, Z Xi, X Wang, H Huang, ...
ICLR 2024, 2023
2 | 2023
DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization
S Gao, S Dou, Y Liu, X Wang, Q Zhang, Z Wei, J Ma, Y Shan
ACL 2023, 2023
2 | 2023
CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models
H Lv*, X Wang*, Y Zhang, C Huang, S Dou, J Ye, T Gui, Q Zhang, ...
arXiv preprint arXiv:2402.16717, 2024
1 | 2024
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Z Xi, W Chen, B Hong, S Jin, R Zheng, W He, Y Ding, S Liu, X Guo, ...
arXiv preprint arXiv:2402.05808, 2024
1 | 2024
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
S Dou, Y Liu, H Jia, L Xiong, E Zhou, J Shan, C Huang, X Wang, W Shen, ...
arXiv preprint arXiv:2402.01391, 2024
1 | 2024
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
S Gao, Q Ge, W Shen, S Dou, J Ye, X Wang, R Zheng, Y Zou, Z Chen, ...
arXiv preprint arXiv:2401.11458, 2024
1 | 2024
TRACE: A comprehensive benchmark for continual learning in large language models
X Wang, Y Zhang, T Chen, S Gao, S Jin, X Yang, Z Xi, R Zheng, Y Zou, ...
arXiv preprint arXiv:2310.06762, 2023
1 | 2023
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models
W Zhou*, X Wang*, L Xiong, H Xia, Y Gu, M Chai, F Zhu, C Huang, S Dou, ...
arXiv preprint arXiv:2403.12171, 2024
— | 2024
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions
Y Zhang*, X Wang*, Z Xi, H Xia, T Gui, Q Zhang, X Huang
COLING 2024, 2024
— | 2024
A Confidence-based Partial Label Learning Model for Crowd-Annotated Named Entity Recognition
L Xiong, J Zhou, Q Zhu, X Wang, Y Wu, Q Zhang, T Gui, X Huang, J Ma, ...
ACL 2023 findings, 2023
— | 2023
Articles 1–20