August 13, 2024 at 06:51AM
Various attempts at prompt injection into large language models (LLMs) have been identified, with the majority coming from job seekers seeking to manipulate automated HR screening systems. Kaspersky’s research found instances of direct and indirect prompt injections, often aiming to influence HR processes or as a form of protest against generative AI.
Based on the meeting notes, it appears that Kaspersky has been conducting research on prompt injection attacks targeting large language models (LLMs). These attacks involve feeding the model with specific input to make it ignore its prior instructions and perform actions it’s not supposed to do.
The research identified various categories of prompt injections, including HR-related injections aimed at manipulating automated systems to favor job candidates, attempts to influence product reviews or search results, injection as a form of protest, and humorous prompts instructing LLMs to perform non-serious tasks.
Kaspersky emphasized that the prompt injection attacks they found did not result in serious destructive actions and did not uncover any instances of malicious uses such as spam emails or scam web pages.
The research also highlighted efforts to hide prompt injections from human detection using tactics like small text, matching text color with the background, and negative coordinates to make the text invisible to human readers while influencing the LLM’s recommendations.
Furthermore, the research revealed that prompt injection attacks were found in human resources and job recruiting contexts, where there is a strong incentive to manipulate automated systems for job placement. The attacks aimed to influence resume-scanning software by including text to favorably position job candidates.
In addition to HR-related injections, product websites were also identified using similar tactics to influence automated systems in providing positive reviews or synopses. Moreover, individuals were found to include instructions on their websites and social media profiles as a form of protest against generative AI-related concerns.
While the research highlighted potential malicious uses of prompt injections, such as in spear phishing campaigns or data exfiltration, it concluded that the current threat is largely theoretical due to the limited capabilities of existing LLM systems.
Overall, Kaspersky’s research provides valuable insights into the various forms and potential impacts of prompt injection attacks on LLMs, underscoring the need for continued vigilance and security measures to mitigate such threats.