Membership Encoding for Black-Box Neural Network Watermarking

May 2025

Abstract

Deep neural network watermarking is an emerging technique for protecting the copyright of models. Most existing black-box watermarking methods rely on backdoors, which makes them inherently vulnerable to backdoor removal attacks. In this paper, we propose a novel watermark removal attack, Misleading Fine-tuning, which effectively eliminates backdoor-based watermarks using only limited data. To counter this threat, we present a novel black-box watermarking method based on membership encoding. The method overfits the protected model on a subset of the training data that serves as triggers, which makes the watermark resistant to backdoor removal attacks. Extensive experiments demonstrate its fidelity and its robustness against adversarial modifications of either the model or the inputs.
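
The abstract does not describe the verification procedure, so the following is only a rough illustration of how a membership-style watermark might be checked with black-box queries. It assumes that samples the model was overfitted on yield a noticeably lower loss than held-out samples, so that each candidate sample decodes to one bit. The function names, the threshold, the secret key, and the example predictions are all hypothetical and are not taken from the paper.

```python
import math

def cross_entropy(probs, label):
    """Loss of one black-box prediction (a probability vector) for the true label."""
    return -math.log(max(probs[label], 1e-12))

def decode_bits(predictions, labels, threshold):
    """Guess one bit per candidate sample: 1 if the loss is below the
    threshold (the sample looks memorized), otherwise 0.
    Assumption: overfitted trigger samples have lower loss than non-members."""
    return [1 if cross_entropy(p, y) < threshold else 0
            for p, y in zip(predictions, labels)]

def match_rate(decoded, key):
    """Fraction of decoded bits that agree with the owner's secret key."""
    return sum(d == k for d, k in zip(decoded, key)) / len(key)

# Hypothetical softmax outputs of a suspect model on four candidate samples.
predictions = [
    [0.98, 0.01, 0.01],  # confident and correct: looks like a member
    [0.40, 0.35, 0.25],  # uncertain: looks like a non-member
    [0.02, 0.97, 0.01],  # confident and correct: looks like a member
    [0.30, 0.30, 0.40],  # uncertain: looks like a non-member
]
labels = [0, 0, 1, 2]   # true labels of the candidate samples
key = [1, 0, 1, 0]      # hypothetical secret key: 1 = used as trigger

decoded = decode_bits(predictions, labels, threshold=0.1)
rate = match_rate(decoded, key)
print(decoded, rate)
```

In this sketch, a high match rate between the decoded bits and the key would count as evidence of ownership. A real scheme would need a statistically justified threshold and far more candidate samples than the four used here.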

Type

Conference

Publication

In IEEE International Conference on Acoustics, Speech and Signal Processing 2025

章杭炜

Master's student

李方圻

PhD student

王士林

Professor