Meta’s AI safety system defeated by the space bar

July 29, 2024 at 05:09PM Meta’s machine-learning model designed to detect prompt injection attacks, known as Prompt-Guard-86M, has ironically been found vulnerable to such attacks. This model, introduced by Meta in conjunction with its Llama 3.1 generative model, aims to catch problematic inputs for AI models. However, a recent discovery by bug hunter Aman Priyanshu … Read more

New Mindset Needed for Large Language Models

May 23, 2024 at 10:08AM The commentary highlights the growing use of large language models (LLMs) and the associated security risks. An incident involving a compromised chatbot raises concerns about the potential exploitation of LLMs for extracting sensitive data. The author provides best practices for securing LLMs, emphasizing the need for proactive monitoring, hardened prompts, … Read more