The good, the bad and the ugly: watermarks, transferable attacks and adversarial defenses
Proceedings of the ICLR Workshop on GenAI Watermarking, 2025
Grzegorz Głuch, Berkant Turan, Sai Ganesh Nagarajan, Sebastian Pokutta
Interpretability Guarantees with Merlin-Arthur Classifiers
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, pp. 1963-1971, Vol. 238, PMLR, 2024
Stephan Wäldchen, Kartikey Sharma, Berkant Turan, Max Zimmer, Sebastian Pokutta