Artificial Intelligence Security Laboratory · Shanghai Jiao Tong University
Conference
Rethinking the Fragility and Robustness of Fingerprints of Deep Neural Networks
Fingerprints characterize deep neural networks that are deployed as black-boxes. To achieve copyright tracing and integrity …
李方圻, 杨磊, 王士林
Gracefully Filtering Backdoor Samples for Generative Large Language Models without Retraining
Backdoor attacks remain significant security threats to generative large language models (LLMs). Since generative LLMs output sequences …
吴宗儒, 程彭洲, 方岭永, 张倬胜, 刘功申
ALIS: Aligned LLM Instruction Security Strategy for Unsafe Input Prompt
In large language models, existing instruction tuning methods may fail to balance the performance with robustness against attacks from …
宋鑫浩, 段苏峰, 刘功申
Acquiring Clean Language Models from Backdoor Poisoned Datasets by Downscaling Frequency Space
Despite the notable success of language models (LMs) in various natural language processing (NLP) tasks, the reliability of LMs is …
吴宗儒, 张倬胜, 程彭洲, 刘功申
Personatalk: Preserving Personalized Dynamic Speech Style in Talking Face Generation
Recent visual speaker authentication methods claimed their effectiveness against deepfake attacks. However, the success is attributed …
陆千禧, 何怡, 王士林
How Large Language Models Encode Context Knowledge? A Layer-Wise Probing Study
Previous work has showcased the intriguing capability of large language models (LLMs) in retrieving facts and processing context …
鞠天杰, 杜巍, 刘功申
Backdoor NLP Models via AI-Generated Text
Backdoor attacks pose a critical security threat to natural language processing (NLP) models by establishing covert associations …
杜巍, 鞠天杰, 刘功申
Multi-Grained Multimodal Interaction Network for Sentiment Analysis
Multimodal sentiment analysis aims to utilize different modalities including language, visual, and audio to identify human emotions in …
方岭永, 刘功申
Speaker-Adaptive Lipreading via Spatio-Temporal Information Learning
Lipreading has been rapidly developed recently with the help of large-scale datasets and big models. Despite the significant progress …
何怡, 杨磊, 王晗亦, 王士林
Data-Free Watermark for Deep Neural Networks by Truncated Adversarial Distillation
Model watermarking secures ownership verification and copyright protection of deep neural networks. In the black-box scenario, …
闫超博, 李方圻, 王士林