AI About-Face: ‘Mantis’ Turns LLM Attackers Into Prey

November 19, 2024 at 06:35AM A new defensive system, Mantis, has been developed to counter cyberattacks by large-language models (LLMs). It uses deceptive techniques to mislead attackers, embedding prompt-injection commands within responses. Mantis has shown a success rate exceeding 95% in redirecting and thwarting LLM-based exploits using active and passive defense strategies. ### Meeting Takeaways … Read more

ChatGPT Exposes Its Instructions, Knowledge & OS Files

November 15, 2024 at 05:24PM ChatGPT’s architecture may expose sensitive data and internal instructions, raising security concerns. Despite OpenAI’s claim of intentional design, experts warn this could enable malicious users to reverse-engineer vulnerabilities and access confidential information stored in custom GPTs. Users are cautioned to avoid uploading sensitive data due to potential leaks. ### Meeting … Read more

Mozilla: ChatGPT Can Be Manipulated Using Hex Code

October 28, 2024 at 03:58PM A new prompt-injection technique demonstrates vulnerabilities in OpenAI’s GPT-4o, allowing users to bypass its safety guardrails. By encoding malicious instructions in unconventional formats, bad actors can manipulate the model to create exploit code. The model’s inability to analyze context and prevent harmful outputs raises concerns about security in AI development. … Read more

AI Chatbots Ditch Guardrails After ‘Deceptive Delight’ Cocktail

October 24, 2024 at 11:44AM Palo Alto Networks revealed a method called “Deceptive Delight” that combines benign and malicious queries, successfully bypassing AI guardrails in chatbots 65% of the time. This advanced “multiturn” jailbreak exploits the limited attention span of language models, prompting recommendations for organizations to enhance security measures against prompt injection attacks. ### … Read more

From Misuse to Abuse: AI Risks and Attacks

October 16, 2024 at 07:45AM Cybercriminals are increasingly using AI to enhance their capabilities, although much of the hype surrounding AI in cybercrime lacks substance. Currently, AI is mainly applied to simple tasks like phishing and code generation. However, security risks exist, particularly with custom AI tools, raising concerns over sensitive data exposure. ### Meeting … Read more

From Copilot to Copirate: How data thieves could hijack Microsoft’s chatbot

August 28, 2024 at 09:08AM Microsoft fixed flaws in Copilot that allowed attackers to steal users’ emails and personal data through a series of LLM-specific attacks, including prompt injection. Red teamer Johann Rehberger disclosed the exploit, prompting Microsoft to make changes for customer protection. The exploit used prompt injection, automatic tool invocation, and ASCII smuggling … Read more

Slack Patches AI Bug That Let Attackers Steal Data From Private Channels

August 22, 2024 at 11:47AM Salesforce’s Slack AI has patched a flaw identified by security firm PromptArmor, which could have allowed attackers to steal data from private Slack channels or engage in secondary phishing within the platform. The flaw is related to the use of a language model that did not recognize malicious instructions, enabling … Read more

Who uses LLM prompt injection attacks IRL? Mostly unscrupulous job seekers, jokesters and trolls

August 13, 2024 at 06:51AM Various attempts at prompt injection into large language models (LLMs) have been identified, with the majority coming from job seekers seeking to manipulate automated HR screening systems. Kaspersky’s research found instances of direct and indirect prompt injections, often aiming to influence HR processes or as a form of protest against … Read more

How to Weaponize Microsoft Copilot for Cyberattackers

August 8, 2024 at 02:56PM Enterprises are rapidly adopting Microsoft’s Copilot AI-based chatbots to enhance employee productivity, but security researcher Michael Bargury demonstrated at Black Hat USA how attackers could exploit Copilot for data theft and social engineering. He also released an offensive toolset for Copilot and emphasized the need for better detection of “promptware” … Read more

Meta’s AI safety system defeated by the space bar

July 29, 2024 at 05:09PM Meta’s machine-learning model designed to detect prompt injection attacks, known as Prompt-Guard-86M, has ironically been found vulnerable to such attacks. This model, introduced by Meta in conjunction with its Llama 3.1 generative model, aims to catch problematic inputs for AI models. However, a recent discovery by bug hunter Aman Priyanshu … Read more