Meta’s AI safety system defeated by the space bar

July 29, 2024 at 05:09PM Meta’s machine-learning model designed to detect prompt injection attacks, known as Prompt-Guard-86M, has ironically been found vulnerable to such attacks. This model, introduced by Meta in conjunction with its Llama 3.1 generative model, aims to catch problematic inputs for AI models. However, a recent discovery by bug hunter Aman Priyanshu … Read more