The Dark Side of Generative AI: Part 2 — Mitigating the Security and Privacy Risks

To secure your project, you need a thorough assessment and mitigation of every possible risk. Here is how to approach this mission.

Note: This is part 2 of a 3-part series exploring the pitfalls that developers of generative AI-based applications need to know about and try to mitigate.

Part 1 — Understanding the Security and Privacy Risks
Part 2 — Uncovering pitfalls and mitigations that affect the Security and Privacy Risks
Part 3 — Navigating the Murky Waters of Generative AI Regulation Landscape (Coming)

Introduction

Have you ever wondered how risk assessment for generative AI applications differs from the risk assessment performed for other applications? Generative AI applications have unique attributes related to their architecture, usage, and user interaction that pose risks that are not necessarily present in other applications. It is essential to recognize these applications’ specific risks and take steps to mitigate them in order to ensure their security.

A chef has plenty of room for creativity when choosing the ingredients for a meal. Developers securing an application have no such freedom: they must address every potential risk. Attackers only need to find one vulnerability to exploit the application, whereas defenders cannot afford a single mistake. The same principle applies when assessing the risks that a generative AI application, or any other application, is exposed to.

In this article we will provide a guide for identifying and mitigating the key dangers, along with a few interesting examples of what can go wrong and what already has. We will also give examples of the tools that can be used to mitigate the risks.

Main pitfalls in each risk category

The previous blog series installment introduced five risk categories: Security, Privacy, Misuse, Regulation, and Service Quality. This installment will delve into each category’s dangers and potential downfalls.

Security

Security risk is one of the toughest challenges that application developers face. Ignoring these risks can lead to severe consequences, such as the loss or theft of sensitive information, financial losses, reputational damage, and legal implications. Therefore, it is crucial to prioritize security measures and ensure correct implementation to avoid any negative impact on the application or its users.

Security risk refers to the potential exploitability of a system. Developers must prioritize implementing robust security measures such as encryption, access controls, and regular vulnerability assessments to mitigate these risks and protect against cyber threats.

Here are a few examples of security pitfalls that demonstrate some of the security risks:

Denial of Service attacks might overload the backend API used by the application and prevent it from functioning correctly. This risk is not unique to generative AI-augmented applications. However, as such applications become a critical part of enterprise operations, protection against such attacks must be prioritized accordingly. One distinctive aspect is that generative AI applications rely on Large Language Models with high resource utilization, so they can be overloaded more easily than conventional services, resulting in denial of service.
Input Data Validation is important for any application, but it is even more critical for generative AI applications because they accept large amounts of free-form input. When applications don’t properly validate user input, it can lead to various security issues. One such issue is data overflow, where an attacker submits more data than the application expects, causing it to crash or behave unexpectedly. Attackers can also exploit the lack of validation to execute arbitrary code or hijack the application. Another primary concern is prompt injection, where attackers inject malicious prompts into an application’s input fields. This allows them to gain unauthorized access to sensitive data or to negatively affect the output produced by the Large Language Model. For example, an attacker could inject a prompt that asks the LLM to bypass application guardrails and give away sensitive information.
Vulnerabilities that may exist in an application’s libraries or hosting infrastructure could allow attackers to make the application behave in unexpected ways or take complete control over the application or its environment. A generative AI application is built from a multitude of components, much like any other complex system. However, given that generative AI applications operate on potentially sensitive data, vulnerabilities in those components can pose a greater risk than in other cases.
Generative AI models require massive amounts of training data, often stored in third-party storage. This poses a potential security risk if the data is not adequately protected. It’s important to note that the quality of the training data directly affects the output of the generative AI application. Therefore, if an attacker can manipulate the training data, they can cause the application to behave in unintended ways.
Misconfigurations are nasty and unfortunately quite common. An attacker could gain unauthorized access to or control over the application infrastructure by exploiting a system misconfiguration. With the greater autonomy and capabilities of generative AI applications, the impact of a misconfiguration could be much higher.
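
As an illustration of the first two pitfalls, here is a minimal sketch of a gate placed in front of an LLM endpoint. It rejects oversized prompts and applies a per-client token-bucket rate limit. The class names, limits, and messages are hypothetical and chosen only for this example; a production system would usually rely on an API gateway or a similar managed component instead.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucket:
    """Permits bursts of up to `capacity` requests, refilled at a steady rate."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = capacity
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class RequestGate:
    """Hypothetical front door for an LLM endpoint (illustrative only)."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 max_prompt_chars: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.max_prompt_chars = max_prompt_chars
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def check(self, client_id: str, prompt: str) -> Tuple[bool, str]:
        # Reject oversized input before spending any model resources on it.
        if len(prompt) > self.max_prompt_chars:
            return False, "prompt too long"
        if client_id not in self.buckets:
            self.buckets[client_id] = TokenBucket(
                self.capacity, self.refill_per_sec, self.clock)
        if not self.buckets[client_id].allow():
            return False, "rate limit exceeded"
        return True, "ok"
```

Each client gets its own bucket, so one noisy client cannot exhaust the allowance of the others. The clock is injectable so the behavior can be tested without waiting for real time to pass.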

Recent research shows interesting examples of potential attacks on OpenAI GPT-4 (full paper) through three of its public APIs.

The fine-tuning API used with just 15 harmful examples (out of 100) causes removal of the guardrails, opening a potential for system misuse. The research demonstrated engaging the application in harmful behaviors: misinformation, leaking private email addresses, and inserting malicious URLs into code generation.
Using the function call API, the research demonstrated that the application can divulge the function call schema and execute arbitrary functions with non-sanitized input.
Knowledge retrieval functionality can be hijacked by injecting instructions into retrieved documents, demonstrating prompt injection attacks. When asked to summarize a document that contains a maliciously injected instruction, the model obeys that instruction instead of summarizing the document.

This research provides an example of the security risks for a generative AI application like OpenAI ChatGPT, but most risks are relevant to a broader set of applications.

Many more attacks are already known, like the exfiltration of sensitive files and the insertion of a backdoor into OpenAI Code Interpreter. Attacks can also target other modalities, as in the case of the GPT-4 Vision prompt injection.

This document does not claim to provide a complete list of potential threats but instead emphasizes the need for thorough risk assessment and threat modeling.

Privacy

Keeping data under tight control, protecting it from unauthorized access, auditing its usage, etc., is a challenging task the industry has dealt with for a long time. Failing to do so leads to many risks, including privacy violations. The task becomes even more complex with the sophistication and volume of the data typically being managed for training and operating generative AI applications.

It’s important to know that the data used to train machine learning models may unintentionally include personal information that should not be exposed. Due to the large volume and variety of data used during the training process, ensuring that this personal data is excluded is difficult. Furthermore, Large Language Models offer little transparency into, or control over, which pieces of information they draw on when generating output, making it challenging to prevent the leakage of private data during the ongoing use of the application.
Most generative AI applications accept user input in unstructured form. In such situations, users can easily overlook the sensitive nature of what they feed those tools and unintentionally submit sensitive information, which then becomes subject to the leakage risk described above. Such information might include sensitive company data, customer data, trade secrets, classified information, and intellectual property.
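
One way to reduce the chance of sensitive user input reaching the model or its logs is to redact obvious identifiers before the text leaves the application. The sketch below is a rough illustration that uses two simplistic regular expressions (email addresses and card-like digit runs). The patterns and placeholder labels are assumptions made for this example; they will miss many real-world formats and are no substitute for a dedicated PII detection tool.

```python
import re
from typing import Dict, Tuple

# Deliberately simple patterns, for illustration only.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or dashes.
    "CARD": re.compile(r"\b(?:\d[ -]?){12,15}\d\b"),
}


def redact(text: str) -> Tuple[str, Dict[str, int]]:
    """Replace matches with a placeholder and report how many were found."""
    counts: Dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        text, found = pattern.subn(f"[{label}]", text)
        counts[label] = found
    return text, counts
```

Returning the counts alongside the redacted text makes it possible to warn the user, or to log that a redaction happened, without storing the sensitive value itself.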

The following research demonstrates an attack carried out by a white-hat security researcher on the OpenAI ChatGPT application: ChatGPT Vulnerability Allows Training Data to be Accessed by Telling the Chatbot to Endlessly Repeat a Word.

Misuse

Many capabilities of generative AI applications can be used for illegal or unethical activities, and application vendors need to do their best to prevent this in order to reduce the risk of being held liable for assisting bad actors.

A bad actor might use an application that helps produce high-quality emails to carry out phishing campaigns.
Image, video, and audio generation technology allows the creation of high-quality ‘Deep Fakes’: fake media that is very difficult to identify as fictitious. For example, such media, created to sway public opinion and released to the public, can influence the masses and cause political unrest.

Cybercriminals can leverage generative AI to develop sophisticated malware that can evade traditional cybersecurity measures — for example, generating a large number of variations of their malicious software, with the intention that some of them will avoid detection.

An example of the above risks is WormGPT, an AI tool that allows cybercriminals to launch sophisticated phishing and business email compromise campaigns. It’s speculated that it uses the open-source GPT-J language model developed by EleutherAI. Another example is FraudGPT, an AI tool tailored for sophisticated attacks like crafting spear phishing emails, creating cracking tools, carding, etc.

Regulation

Regulatory and legal risks can pose a significant threat to businesses, potentially resulting in major losses, bankruptcy, or even personal legal liability for company executives, as demonstrated by the following examples.

During the training of machine learning models, it’s important to be mindful of the data sources used. While data is often collected from various sources on the internet, some of this data might have loosely defined usage rights. The New York Times filed a lawsuit against Microsoft and OpenAI, citing concerns over copyright infringement. The newspaper has alleged that the companies have caused significant financial losses, which could potentially amount to billions of dollars.
Governments have different approaches when it comes to solving issues related to data usage rights. Some are making usage rights more restrictive, while others take the opposite approach. The Japanese government has recently reaffirmed its position on copyright enforcement in relation to data used for AI training. According to this policy, AI is permitted to use any data for training without any restrictions related to the source or purpose of the data and without any limitations on the reproduction or origin of the content.
Various countries have begun implementing regulations that specify which technologies can be publicly available as open source and which ones are considered potentially dangerous and require strict limitations on their use. Non-compliance with these regulations can result in legal consequences and harm to one’s reputation. Consequently, application vendors must prioritize staying up to date with the latest regulations to ensure that they are in compliance.
It is important to be aware that generative AI applications are susceptible to producing inaccurate and misleading data, which can result in material damage to the users. In such cases, the affected users may seek compensation and engage in legal proceedings against the application vendors who failed to warn or prevent the production of misleading data that caused the losses.
Regulations such as GDPR or HIPAA require that personal data be closely monitored and audited. However, the black-box nature of technologies like Large Language Models can make it challenging to achieve this level of control. Therefore, developers of applications that integrate regulated data need to consider the risk of non-compliance.
The emergence of generative AI applications has created numerous opportunities for businesses to generate new data. However, to take full advantage of these new data, vendors must establish usage rights for the generated data to prevent copyright infringement. This will ensure that the application remains open and usable while also safeguarding the revenue stream that comes from the generated content. By considering this aspect early on in the development process, businesses can avoid future losses and maximize their revenue potential.

Service Quality

A generative AI application’s quality of service depends on various aspects, such as efficient system maintenance and prompt resolution of any technical issues that may arise. The user experience, functionality, and overall performance of the application are also crucial factors that can impact service quality.

To ensure optimal application performance and response quality, it is crucial for vendors to consistently improve their applications based on user feedback and monitor for any irregularities in feedback patterns. This is particularly important for generative AI applications, as their performance can deteriorate over time due to the unpredictable nature of model training and fine-tuning.
Generative AI models rely heavily on the quality of data they are trained on. If the data is insufficient or of poor quality, it can impact the accuracy and effectiveness of the generated content, which can ultimately lead to a decline in service quality.
Data quality also includes the diversity of the training data. If the training data is not diverse enough, it can result in a model that is biased toward certain types of content or language. This can lead to inaccuracies in the generated content and negatively impact the overall service quality.

Mitigation

Application developers should not feel helpless when faced with the various pitfalls that can pose a great risk to the success of their applications. Instead, they should systematically apply mitigation measures based on a thorough risk assessment of their application before releasing it. Here are some of the main mitigations application developers should consider.

Risk Assessment

Risk Assessment and Risk Management are among the first steps needed, as the assessment helps uncover and prioritize the mitigations described below. There are several industry frameworks that can help with this step, some general and some more specific to generative AI applications:

FAIR (Factor Analysis of Information Risk) is a general, international quantitative model for information security and operational risk. Its unique quantitative approach allows for an easier balance between risk and user value.
Gartner AI TRiSM is an AI-specific set of solutions to identify and mitigate risks proactively. It covers four main pillars: Monitoring and Explainability, Operations, Application Security, and Privacy.
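
To make the quantitative idea concrete, the sketch below ranks risk scenarios by a single point estimate: estimated loss event frequency multiplied by estimated loss magnitude. This is a heavily simplified illustration in the spirit of FAIR, not the full model, which works with ranges and distributions rather than single numbers. The scenario names and figures are invented placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RiskScenario:
    name: str
    events_per_year: float   # estimated loss event frequency
    loss_per_event: float    # estimated loss magnitude, in currency units

    @property
    def annual_loss_exposure(self) -> float:
        # Point estimate: frequency multiplied by magnitude.
        return self.events_per_year * self.loss_per_event


def rank(scenarios: Iterable[RiskScenario]) -> List[RiskScenario]:
    """Order scenarios from highest to lowest estimated annual loss."""
    return sorted(scenarios, key=lambda s: s.annual_loss_exposure, reverse=True)


# Invented placeholder figures, for illustration only.
scenarios = [
    RiskScenario("prompt injection leaks data", events_per_year=2.0, loss_per_event=50_000),
    RiskScenario("training data exposure", events_per_year=0.5, loss_per_event=400_000),
    RiskScenario("denial of service outage", events_per_year=4.0, loss_per_event=10_000),
]

for s in rank(scenarios):
    print(f"{s.name}: {s.annual_loss_exposure:,.0f} per year")
```

Even a crude ranking like this shows why a rare but expensive scenario can deserve more attention than a frequent but cheap one.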

Governance

Governance programs help organizations ensure that they are operating in a transparent, ethical, and compliant manner. These programs establish policies and procedures that guide decision-making, risk management, and overall operations. By implementing effective governance programs, organizations can minimize the risk of legal or reputational harm while also improving efficiency and accountability.

AWS Generative AI Security Scoping Matrix is a tool that helps prioritize security disciplines based on selected solution scope. It is most appropriate for customers running their applications on AWS infrastructure.
NIST AI RMF (Risk Management Framework) is a comprehensive framework developed in collaboration between private and public sectors to manage risks associated with artificial intelligence (AI) and is intended for voluntary use to improve the development, use, and evaluation of AI products, services, and systems.

Policy

Usage policies for generative AI applications are essential to ensure responsible and ethical use of this technology. Such policies should provide a legal framework that outlines the limitations of use, potential risks, and liability for any harm caused by the AI-generated content. By establishing clear guidelines, we can prevent the misuse of generative AI and promote its safe and beneficial use for all.

Opt-out

Most generative AI application vendors want to retain user data for future model training. However, it’s essential to allow users to opt out of having their data saved and used to improve the model, in order to prevent leakage of users’ sensitive data. This also demonstrates transparency and contributes to user trust in the application.
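
In practice, honoring an opt-out means filtering stored interactions before they ever reach a training pipeline. The following minimal sketch assumes a hypothetical `Interaction` record and a set of opted-out user IDs; the names are illustrative and not taken from any particular product.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Interaction:
    user_id: str
    prompt: str
    response: str


def training_candidates(interactions: Iterable[Interaction],
                        opted_out_users: Set[str]) -> List[Interaction]:
    """Keep only interactions from users who have not opted out."""
    return [i for i in interactions if i.user_id not in opted_out_users]
```

Applying the filter at the point where training data is assembled, rather than only in the user interface, helps ensure that a later change to a user's preference is respected for every future training run.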

ML Operation

Proper implementation of MLOps (Machine Learning Operations) industry best practices is essential for dealing with data drift and for the continuous monitoring, retraining, and tuning of the model to ensure that it continues to perform accurately over time. There are many mature products on the market that can help build state-of-the-art MLOps infrastructure, so most organizations do not need to develop it in-house.
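
As a small illustration of continuous monitoring, the sketch below flags drift when the mean of a recent window of some quality metric (for example, user feedback ratings) moves too far from a baseline. The function name, the metric, and the threshold of three standard errors are assumptions made for this example; MLOps products typically offer more robust drift tests.

```python
import statistics
from typing import Sequence


def mean_drifted(baseline: Sequence[float], recent: Sequence[float],
                 z_threshold: float = 3.0) -> bool:
    """Return True when the recent mean deviates strongly from the baseline mean."""
    mu = statistics.fmean(baseline)
    sigma = statistics.stdev(baseline)  # needs at least two baseline points
    recent_mean = statistics.fmean(recent)
    if sigma == 0:
        # A perfectly constant baseline: any change counts as drift.
        return recent_mean != mu
    # Standard error of the mean for a window of this size.
    standard_error = sigma / len(recent) ** 0.5
    return abs(recent_mean - mu) / standard_error > z_threshold
```

A check like this could run on a schedule and trigger an alert or a retraining review when it returns True.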

Guardrails

There are several types of guardrail mechanisms that can be implemented to mitigate different kinds of risks.

Input Guardrails can help validate that the user entered text that is aligned with the application’s intended usage and reduce the risk of prompt injection.
Output Guardrails can help check the validity and accuracy of the generated content and reduce the risk of hallucinations, to which almost all LLMs are prone.
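
The two guardrail types can be sketched as a pair of simple checks. The input check below looks for a few well-known injection phrases, and the output check verifies that every URL in the generated text points to an allowed domain. Both the phrase list and the allow-list approach are assumptions chosen for illustration; pattern matching of this kind is easy to bypass and would normally be only one layer among several.

```python
import re
from typing import Set
from urllib.parse import urlparse

# A tiny, illustrative deny-list; real attacks are far more varied.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"(reveal|show|print) (your |the )?system prompt", re.I),
]

URL_RE = re.compile(r"https?://[^\s)>\]]+")


def input_guardrail(user_text: str) -> bool:
    """Return True if no known injection phrase is found in the input."""
    return not any(p.search(user_text) for p in INJECTION_PATTERNS)


def output_guardrail(model_text: str, allowed_domains: Set[str]) -> bool:
    """Return True if every URL in the output points to an allowed domain."""
    for url in URL_RE.findall(model_text):
        # Drop sentence punctuation that may trail the URL.
        host = urlparse(url.rstrip(".,;:!?")).hostname or ""
        if host not in allowed_domains:
            return False
    return True
```

When either check fails, the application can refuse the request, ask the user to rephrase, or route the interaction to a human for review.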

Resource Utilization

Generative AI applications require a lot of computation power, typically in the form of GPUs, to train models or perform inference at scale. It is essential to have proper controls and thresholds on the compute power consumed, to ensure that one application function does not starve other functions of resources. Additionally, consumption should not exceed what the company can afford.
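
A per-function quota is one simple way to enforce such thresholds. The sketch below tracks GPU-seconds consumed by each application function and refuses work that would exceed that function's quota. The class, the function names, and the unit are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, Mapping


class ComputeBudget:
    """Tracks GPU-seconds per application function against fixed quotas."""

    def __init__(self, quotas: Mapping[str, float]) -> None:
        self.quotas: Dict[str, float] = dict(quotas)
        self.used: Dict[str, float] = {name: 0.0 for name in self.quotas}

    def try_consume(self, function: str, gpu_seconds: float) -> bool:
        """Record the usage and return True, or return False if over quota."""
        if function not in self.quotas:
            raise KeyError(f"no quota configured for {function!r}")
        if self.used[function] + gpu_seconds > self.quotas[function]:
            return False
        self.used[function] += gpu_seconds
        return True

    def remaining(self, function: str) -> float:
        return self.quotas[function] - self.used[function]
```

Because each function draws from its own quota, a burst of batch jobs cannot consume the capacity reserved for interactive requests, and the sum of all quotas gives an upper bound on spend for the period.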

User Education

As generative AI applications have some unique characteristics, such as the uncertain accuracy of generated content and data privacy concerns, it is essential to educate end users (enterprise employees or consumers) about the risks associated with generative AI platforms and provide guidelines for the appropriate use of these tools.

Security

Like any other application, a generative AI-based application can be vulnerable to various security attacks. To mitigate this risk, essential security disciplines like identity and access management, data protection, privacy and compliance, application security, and threat modeling must be implemented during both the development and operation phases. As the industry has been tackling this issue for many years, there are numerous mature frameworks and tools available to help. Recently, some of these frameworks have even released versions specifically designed for generative AI applications and their potential threats.

MITRE | ATLAS™ is a framework modeled on the older MITRE ATT&CK framework and built specifically for AI systems, including generative AI applications. It enumerates adversary tactics and techniques based on red-team demonstrations and real-world attacks.
OWASP Top 10 for LLM Applications is, like the older and well-known OWASP Top 10, an educational resource for developers, designers, architects, managers, and organizations about the potential security risks of deploying and managing Large Language Models.
Red Teaming is a well-established practice in which defenders apply offensive methods themselves, in a controlled environment that avoids any real negative impact, to identify security issues before attackers do. Recently, it has been extended with techniques such as prompt injection specifically for validating generative AI applications.

Compliance

Ensuring compliance with regulations in the relevant location is critical. As of today, regulation might be quite vague in some areas, but we definitely see major progress in regulatory clarity across many countries. Where regulation is not yet clear enough, it’s essential to take a proactive approach and engage in conversation with policymakers and industry experts to understand where it is heading and to minimize the risk of future non-compliance.

Call to Action

In conclusion, as the adoption of generative AI continues to grow, it is crucial to address the security concerns associated with this technology. From data loss and unauthorized access to the creation of deepfakes and fake news, the risks are real. However, by implementing essential security measures, assessing risks, and implementing effective governance strategies, organizations can mitigate these threats.

At Atchai, we specialize in generative AI security solutions that combat unique AI system threats. If you’re embarking on a new generative AI project and want to make sure you develop it responsibly and protect your users’ data, we will be happy to discuss how we can help.

Let’s build products together that are both capable and conscientious.

Michael