Microsoft Bets $10,000 on Prompt Injection Protections of LLM Email Client

December 10, 2024 at 08:27AM Microsoft has launched the LLMail-Inject hacking challenge, offering $10,000 in prizes for breaking defenses in a simulated email client using an instruction-tuned large language model. The challenge runs until January 20, 2025, featuring 40 unique scenarios. Participants can form teams of up to five and must register via GitHub. ### … Read more

Researchers Uncover Prompt Injection Vulnerabilities in DeepSeek and Claude AI

December 9, 2024 at 07:07AM A patched security flaw in DeepSeek AI allows prompt injection attacks, enabling account takeover via cross-site scripting (XSS). Researcher Johann Rehberger demonstrated this vulnerability, revealing similar risks in other AI tools. Techniques like ZombAIs and Terminal DiLLMa exploit these weaknesses, raising concerns about security in generative AI applications. ### Meeting … Read more

Microsoft dangles $10K for hackers to hijack LLM email service

December 9, 2024 at 06:08AM Microsoft has launched the LLMail-Inject challenge, inviting teams to exploit a simulated email client integrated with a large language model. Participants aim to bypass defenses and carry out prompt injection attacks for prizes totaling $10,000. The competition runs from December 9 to January 20, 2024. ### Meeting Takeaways: 1. **Challenge … Read more

AI About-Face: ‘Mantis’ Turns LLM Attackers Into Prey

November 19, 2024 at 06:35AM A new defensive system, Mantis, has been developed to counter cyberattacks by large-language models (LLMs). It uses deceptive techniques to mislead attackers, embedding prompt-injection commands within responses. Mantis has shown a success rate exceeding 95% in redirecting and thwarting LLM-based exploits using active and passive defense strategies. ### Meeting Takeaways … Read more

ChatGPT Exposes Its Instructions, Knowledge & OS Files

November 15, 2024 at 05:24PM ChatGPT’s architecture may expose sensitive data and internal instructions, raising security concerns. Despite OpenAI’s claim of intentional design, experts warn this could enable malicious users to reverse-engineer vulnerabilities and access confidential information stored in custom GPTs. Users are cautioned to avoid uploading sensitive data due to potential leaks. ### Meeting … Read more

Mozilla: ChatGPT Can Be Manipulated Using Hex Code

October 28, 2024 at 03:58PM A new prompt-injection technique demonstrates vulnerabilities in OpenAI’s GPT-4o, allowing users to bypass its safety guardrails. By encoding malicious instructions in unconventional formats, bad actors can manipulate the model to create exploit code. The model’s inability to analyze context and prevent harmful outputs raises concerns about security in AI development. … Read more

AI Chatbots Ditch Guardrails After ‘Deceptive Delight’ Cocktail

October 24, 2024 at 11:44AM Palo Alto Networks revealed a method called “Deceptive Delight” that combines benign and malicious queries, successfully bypassing AI guardrails in chatbots 65% of the time. This advanced “multiturn” jailbreak exploits the limited attention span of language models, prompting recommendations for organizations to enhance security measures against prompt injection attacks. ### … Read more

From Misuse to Abuse: AI Risks and Attacks

October 16, 2024 at 07:45AM Cybercriminals are increasingly using AI to enhance their capabilities, although much of the hype surrounding AI in cybercrime lacks substance. Currently, AI is mainly applied to simple tasks like phishing and code generation. However, security risks exist, particularly with custom AI tools, raising concerns over sensitive data exposure. ### Meeting … Read more

From Copilot to Copirate: How data thieves could hijack Microsoft’s chatbot

August 28, 2024 at 09:08AM Microsoft fixed flaws in Copilot that allowed attackers to steal users’ emails and personal data through a series of LLM-specific attacks, including prompt injection. Red teamer Johann Rehberger disclosed the exploit, prompting Microsoft to make changes for customer protection. The exploit used prompt injection, automatic tool invocation, and ASCII smuggling … Read more

Slack Patches AI Bug That Let Attackers Steal Data From Private Channels

August 22, 2024 at 11:47AM Salesforce’s Slack AI has patched a flaw identified by security firm PromptArmor, which could have allowed attackers to steal data from private Slack channels or engage in secondary phishing within the platform. The flaw is related to the use of a language model that did not recognize malicious instructions, enabling … Read more