Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models

Abstract

Membership Inference Attacks (MIAs) attempt to determine whether specific data, such as copyrighted material, was used to train an LLM by analysing the model's characteristic reaction to it. Research so far has mostly reported negative results, finding barely any statistically significant signal. In this paper, we show that meaningful signals emerge only at scale: not at the level of sentences or paragraphs, but at the level of documents and beyond.
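To illustrate why aggregation scale matters, here is a toy simulation (not the paper's actual attack or data): per-sentence membership scores carry only a weak signal, but averaging them over a whole document makes members and non-members far easier to separate. The effect size, score distribution, and mean-aggregation scheme are illustrative assumptions.

```python
import random
import statistics

random.seed(0)

def auc(pos, neg):
    """Rank-based AUC: probability that a random positive outscores a random negative."""
    ranked = sorted([(v, 1) for v in pos] + [(v, 0) for v in neg])
    rank_sum = sum(i + 1 for i, (_, label) in enumerate(ranked) if label == 1)
    n_p, n_n = len(pos), len(neg)
    return (rank_sum - n_p * (n_p + 1) / 2) / (n_p * n_n)

N_DOCS, SENTS = 200, 100
EFFECT = 0.1  # assumed tiny per-sentence shift in score for training members

# Simulated per-sentence MIA scores (e.g. negative loss under the model):
# member sentences score slightly higher on average, buried in noise.
member_docs = [[random.gauss(EFFECT, 1.0) for _ in range(SENTS)]
               for _ in range(N_DOCS)]
nonmember_docs = [[random.gauss(0.0, 1.0) for _ in range(SENTS)]
                  for _ in range(N_DOCS)]

# Sentence-level attack: score each sentence independently.
sent_auc = auc([s for d in member_docs for s in d],
               [s for d in nonmember_docs for s in d])

# Document-level attack: average scores across each document first.
doc_auc = auc([statistics.fmean(d) for d in member_docs],
              [statistics.fmean(d) for d in nonmember_docs])

print(f"sentence-level AUC: {sent_auc:.3f}")  # close to the 0.5 chance level
print(f"document-level AUC: {doc_auc:.3f}")   # clearly above chance
```

Averaging over many sentences shrinks the noise while the per-sentence bias accumulates, which is one intuition for why document-scale signals can be detectable when sentence-scale ones are not.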

Publication
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2025)