December 4, 2023 at 04:55PM
Lasso researchers found over 1,500 exposed API tokens on GitHub and Hugging Face. The tokens could give an attacker full access to large language model repositories belonging to major tech companies, including Meta, Google, and Microsoft. This exposure could permit data poisoning, model theft, and other malicious activity, putting millions of users at risk.
Meeting Takeaways:
1. Researchers at AI security startup Lasso identified significant supply chain risks by gaining full read and write access to the Meta Llama 2, BigScience Bloom, and EleutherAI Pythia large language model (LLM) repositories.
2. The access granted by exposed tokens would allow potential adversaries to manipulate training data, steal models and datasets, and execute various malicious activities impacting millions of users.
3. Unsecured API tokens were found on GitHub and the Hugging Face platform, leading to access to repositories of 722 organizations, including major firms like Google, Microsoft, and VMware.
4. In total, 1,976 tokens were found across GitHub and Hugging Face; 1,681 of these were still valid and usable, and 655 of the valid tokens had write permissions on the Hugging Face platform.
5. Researchers warn that developers and organizations using platforms like Hugging Face must take responsibility for securing their own tokens, as the platforms may not provide adequate safeguards against accidental exposure.
6. The finding highlights the need for security training when teams integrate generative AI and LLM tools, so that developers are aware of risks such as hard-coded credentials.
7. Hugging Face, widely used in LLM projects, hosts a vast collection of AI models and datasets but has been shown to have security issues in its token management.
8. The Lasso team was surprised by how easy the tokens were to find and by how much control they provided over the repositories of organizations that are expected to have strong security.
9. Among the exposed tokens, several allowed write access to the Meta Llama, Bloom, and Pythia repositories, while tokens from Microsoft and VMware had read-only access but still allowed visibility into private data.
10. Lasso reached out to all impacted entities advising them to revoke and remove the exposed tokens, with many organizations taking swift action to mitigate the risk.
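One common way for developers to act on takeaway 5 is to keep the token out of source code and read it from the environment at run time. The following is a minimal sketch of that idea, not a method described by Lasso. The helper name load_hf_token is hypothetical, and the variable name HF_TOKEN is an assumption that can be changed to match local conventions.

```python
import os


def load_hf_token(env_var="HF_TOKEN"):
    """Read an access token from the environment (hypothetical helper).

    Raises instead of falling back to a hard-coded value, so a token
    never needs to be committed to a repository.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"{env_var} is not set; refusing to fall back to a hard-coded token"
        )
    return token
```

Because the token lives only in the process environment or a secrets manager, a public repository containing this code exposes nothing that needs to be revoked.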
Action Items:
– Organizations must review and secure their API tokens on platforms like GitHub and Hugging Face.
– Increased security training and protocols should be introduced when working with generative AI and LLM tools.
– The token-handling mechanisms of Hugging Face and other LLM resource platforms should be investigated further.
– Following Lasso’s recommendations, all exposed access tokens should be revoked, and every public copy of these tokens should be removed from the affected repositories.
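The review step in the action items can be partly automated. The sketch below scans files for strings that look like Hugging Face user access tokens. It is illustrative only and is not the tooling Lasso used. The pattern assumes tokens start with "hf_" followed by 34 letters or digits, which is an assumption about the token format rather than a documented guarantee, and the function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed token shape: "hf_" followed by 34 letters or digits.
# Illustrative pattern only, not an official specification.
HF_TOKEN_PATTERN = re.compile(r"\bhf_[A-Za-z0-9]{34}\b")


def find_token_candidates(text):
    """Return (line_number, redacted_match) pairs for token-like strings."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in HF_TOKEN_PATTERN.finditer(line):
            token = match.group(0)
            # Keep only a short prefix so the report does not leak the secret again.
            hits.append((line_number, token[:6] + "..."))
    return hits


def scan_tree(root):
    """Scan every readable file under root and return {path: hits}."""
    report = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # Unreadable file: skip it rather than abort the scan.
        hits = find_token_candidates(text)
        if hits:
            report[str(path)] = hits
    return report
```

Calling scan_tree(".") from the root of a checkout returns a dictionary of file paths and redacted matches. Any match should be treated as compromised: revoke the token first, then remove it from the code and from the repository history, since deleting the line alone leaves the token in earlier commits.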