Blog

Understanding OWASP Top 10 for generative AI

Wednesday, May 8, 2024

·

5 min read

owasp top 10 for genAI

Ruchir Patwa

Co-founder, SydeLabs

The Open Web Application Security Project (OWASP) has recognized the burgeoning impact of Large Language Models (LLMs) on the tech landscape and their inherent vulnerabilities. With the advent of LLMs in business and consumer applications, new security concerns have surfaced, demanding a tailored approach to safeguard these technologies. This necessity led to the creation of the OWASP Top 10 for LLM Applications, a comprehensive guide aimed at developers, data scientists, and security experts to navigate the nuanced terrain of LLM application security.

The Genesis of OWASP's LLM Security Initiative

The widespread adoption of LLMs post-2022 highlighted a critical gap in security practices. Businesses, in their rush to integrate LLMs into their systems, often overlooked robust security measures, leaving their applications exposed to potential threats. Recognizing this, OWASP mobilized an international team of nearly 500 experts, drawing from various sectors such as AI, security, academia, and cloud technologies, to consolidate a list of the ten most critical vulnerabilities specific to LLM applications.

Deciphering LLM Vulnerabilities

The OWASP Top 10 for LLM Applications delves deep into unique vulnerabilities that LLMs face, distinguishing them from general application security threats. The list is not a mere reiteration of existing vulnerabilities but a specialized exploration of how these issues manifest uniquely in LLM-integrated applications. It serves as a bridge, connecting general application security principles with the specific challenges posed by LLMs. Given the rate at which the LLM landscape is evolving the OWASP top 10 for LLMs have also been evolving with the space. At the time this blog was written version 1.1.0 was the latest.

Top 10 LLM Vulnerabilities

Prompt Injection

A critical vulnerability where attackers manipulate LLMs through crafted inputs, causing unintended actions or disclosures. This can be direct, affecting system prompts, or indirect, through manipulated external inputs, leading to data exfiltration or social engineering attacks. This is the most prevalent form of attack on LLMs at the moment. It draws a lot of parallels from various traditional security attacks like sql injections (over all premise of injecting untrusted input to take control over a trusted system), social engineering (convincing using natural language), XSS (indirect prompt injections via external data sources etc. Prompt injections can be either direct or indirect. In direct prompt injections the user interacting with the LLM is the adversary that is trying to craft a malicious input prompt. Indirect prompt injections are when the adversary is trying to impact a genuine user via approaches such as RAG (Retrieval Augmented Generation) and poisoning the data source where the data is being fetched by the AI system.

Learn more in our blog about Prompt Injections.

Insecure Output Handling:

This occurs when outputs from LLMs are not adequately validated, posing risks like XSS, CSRF, SSRF, privilege escalation, or remote code execution. A lot of enterprises are starting to rely on LLMs to convert unstructured data to structured data (like JSON) that can be fed into systems downstream. There is also a lot more control being given to LLMs to act as AI Agents that have ability to take action. When LLMs control the input into privileged systems or take actions on such systems their output becomes extremely important. Any form of hallucination or poisoning of the output can have drastic consequences.

Training Data Poisoning:

The integrity of LLMs can be compromised by tampering with their training data, introducing vulnerabilities, biases, or backdoors that could affect the model's security and ethical behavior. Training LLMs is expensive and time consuming and there is little to no visibility into the data that went into the model after the model has been trained. So if there was any sort of data poisoning it is extremely hard to detect post-facto and can lead turn into a security disaster waiting to happen at any point.


Model Denial of Service (DoS):

Attackers can exploit LLMs to consume excessive resources, leading to service degradation or incurring high operational costs. Hosting LLM models requires quite expensive resources like GPUs and even with high end GPUs you can only server a limited number of simultaneous requests. This makes it much easier to perform Denial of Service attacks on LLM applications and models. There is another aspect to DoS, known as Denial of Wallet (DoW) attacks. For applications that rely on third party hosted models or closed models, repeated request from attackers can quickly pile up their infrastructure bills or deplete their wallets on third party providers.


Supply Chain Vulnerabilities:

The lifecycle of LLM applications is susceptible to attacks via vulnerable components, third-party datasets, or plugins, leading to security breaches. There are 3 popular modes of using LLMs right now:

  1. Your own foundational model,

  2. Closed models (example OpenAI’s GPT4, Google’s Gemini etc)

  3. OpenSource models (example Llama2, Gemma etc).

All of these approaches have a different risk of supply chain attacks. Unlike traditional systems, open source LLMs are not easy to audit, even when you have access to all the information. Hence the risk of unknown threats is much higher. The recent xz utils backdoor highlighted the risk of open source software. Imagine using an open source LLM that has such backdoors which can later lead to security breaches.

Sensitive Information Disclosure:

LLMs might inadvertently reveal confidential data in their responses, posing significant privacy and security risks. The data could be something that inadvertently made it to its training data set, fine tuning data set or something the system has access through Retrieval Augmented Generation(RAG). You can look at an LLM deployed in your infrastructure as an insider with highly privileged access. If the system has access to confidential or sensitive data there is a risk of leaking that data through its responses. A recent attack was when GPT was asked to repeat a word an infinite number of times and after repeating it a few times it started leaking PII data of individuals. During our testing of an LLM system our red teaming agent was also able to extract financial transactions that were used to fine-tune a model.

An important aspect to protect in your LLM based application is the system instructions (or system prompt). Learn more from our blog on why system prompt leaks are bad.

Insecure Plugin Design:

Plugins are used to allow LLMs to either take action, interact with other systems or get data from other sources. For example getting the current weather from the internet to answer a users query about the weather. These plugins can have insecure inputs and lack robust access control, making applications more susceptible to attacks like remote code execution. Given the plugins can allow both read and write actions, they should can pose a big risk to LLM applications. Imagine a plugin that is used to access data from your email to summarise it, but instead it also ends up sending your confidential data to an untrusted server.

Excessive Agency

LLM-based systems might undertake unintended actions due to excessive functionality, permissions, or autonomy granted to them, leading to undesirable outcomes. There may be various reasons including hallucinations, prompt injections and jailbreaks why a LLM’s output may not be trusted. If this LLM based system is allowed to take actions these untrusted outputs can cause a lot of harm to an enterprise.

Overreliance:

Systems or individuals relying too heavily on LLMs without proper oversight might face misinformation, miscommunication, and security vulnerabilities due to incorrect or inappropriate content generated by LLMs. Apart from humans relying on inaccurate information that a LLM may provide a larger concern is when LLMs are used to directly influence critical systems. For example using a LLM to write code and the code is vulnerable. Such over-reliance can cause longer term security issues.

Model Theft:

Unauthorized access, copying, or exfiltration of proprietary LLM models can lead to significant economic losses, compromised competitive advantage, and unauthorized access to sensitive information.

Bridging the Gap

The OWASP Top 10 for LLM Applications serves as a critical resource for professionals involved in the design, development, and security of LLM-integrated systems. By addressing the unique challenges posed by LLMs, the list aims to foster safer adoption of this transformative technology, ensuring that as LLMs continue to evolve and integrate into various aspects of business and society, they do so with security at the forefront.

SydeLabs offers a comprehensive protection from such threats. With SydeLabs you can identify the inherent risks of your genAI system by conducting AI red teaming on your LLM endpoints. Once you have arrived at an optimum harmlessness of your model, deploy your feature with confidence while protecting it with our intent-based firewall. Do reach out to us at hello@sydelabs.ai to book a demo.

owasp top 10 for genAI

Ruchir Patwa

Co-founder, SydeLabs

San Francisco, California

Protect your generative AI applications from the ever-expanding threat landscape of LLM systems.

San Francisco, California

Protect your generative AI applications from the ever-expanding threat landscape of LLM systems.

San Francisco, California

Protect your generative AI applications from the ever-expanding threat landscape of LLM systems.