ChatGPT and other AI/machine learning (ML) platforms have recently captured our attention and imagination, amazing even luddites with recent leaps forward and the potential applications of this technology.
However, the focus is now turning to the cybersecurity implications of this new technology. Specifically, what are some of the key security considerations that organizations need to consider before they explore how to utilize new AI/ML solutions?
The security implications of AI/ML
ChatGPT is designed to generate responses based on the data it has been trained on, which could include sensitive or confidential information. Enterprises need to ensure that they have appropriate measures in place to protect the privacy and confidentiality of the data they are passing through the API.
The accuracy and reliability of ChatGPT’s responses may not be 100% guaranteed, and the technology may generate incorrect or biased responses based on the training data it has been exposed to. Organizations should ensure that they carefully evaluate the accuracy and reliability of the responses generated by ChatGPT before using them in any critical business processes.
As ChatGPT is a third-party service, organizations may become dependent on its reliability and availability. If there are any issues with ChatGPT’s service or if the service is discontinued, it can affect the enterprise’s ability to use the API.
If the API is not properly secured, it can be vulnerable to misuse and abuse by attackers who can use the API to launch attacks against the enterprise’s systems or to harvest sensitive data. Organizations should ensure that they have appropriate measures in place to protect the API from misuse and abuse.
Finally, depending on the industry and data involved, enterprises may need to comply with specific regulations, such as GDPR or HIPAA. Enterprises need to ensure that they have appropriate policies and procedures in place to comply with these regulations when using ChatGPT as an API.
OpenAI’s recent security breach
While any security breach is concerning, OpenAI’s response to its recent security incident has been prompt and transparent. The company has been open about the breach, its cause, and the steps it has taken to address the issue.
OpenAI’s quick action to secure its systems and notify the affected parties demonstrates their commitment to security and privacy.
That said, it’s worth taking a look at OpenAI’s security protections if you plan to use the technology:
- Access control: OpenAI implements strict access controls to ensure that only authorized personnel have access to its systems and data.
- Encryption: All data transmitted between OpenAI’s systems and its customers is encrypted using industry-standard encryption protocols.
- Network security: OpenAI has implemented various network security measures, such as firewalls and intrusion detection and prevention systems, to protect its systems from external threats.
- Regular security assessments: OpenAI regularly conducts security assessments to identify and mitigate potential security risks.
- Data protection: OpenAI uses various measures, such as data masking and access controls, to protect the confidentiality, integrity, and availability of customer data.
- Incident response: OpenAI has a well-defined incident response process to quickly respond to any security incidents and minimize their impact.
- Compliance with industry standards: OpenAI follows industry-standard security best practices and complies with various regulations, such as GDPR and HIPAA, to ensure the security and privacy of its customers’ data.
AI safety recommendations
There are several tools and recommendations to consider when it comes to safety using AI/ML solutions.
OpenAI has a free-to-use Moderation API that can help reduce the frequency of unsafe content in completions. Alternatively, you may wish to develop a custom content filtration system tailored to specific use cases.
It is recommended “red-teaming” applications to ensure they are robust to adversarial input. Test products over a wide range of inputs and user behaviors, both a representative set and those reflective of someone trying to ‘break’ the application. Does it wander off topic? Can someone easily redirect the feature via prompt injections, e.g. “ignore the previous instructions and do this instead”?
Wherever possible, it is recommended to have a human review outputs before they are used in practice. This is especially critical in high-stakes domains, and for code generation. Humans should be aware of the limitations of the system, and have access to any information needed to verify the outputs. For example, if the application summarizes notes, a human should have easy access to the original notes to refer back.
“Prompt engineering” can help constrain the topic and tone of output text. This reduces the chance of producing undesired content, even if a user tries to produce it. Providing additional context to the model (such as by giving a few high-quality examples of desired behavior prior to the new input) can make it easier to steer model outputs in desired directions.
Users generally should be required to register and log-in to access services. Linking this service to an existing account, such as a Gmail, LinkedIn, or Facebook log-in, may help, though may not be appropriate for all use-cases. Requiring a credit card or ID card reduces risk further. Limiting the amount of text a user can input into the prompt helps avoid prompt injection. Limiting the number of output tokens helps reduce the chance of misuse. Narrowing the ranges of inputs or outputs, especially drawn from trusted sources, reduces the extent of misuse possible within an application.
Allowing user inputs through validated dropdown fields (e.g., a list of movies on Wikipedia) can be more secure than allowing open-ended text inputs. Returning outputs from a validated set of materials on the backend, where possible, can be safer than returning novel generated content (for instance, routing a customer query to the best-matching existing customer support article, rather than attempting to answer the query from-scratch).
Users should generally have an easily available method for reporting improper functionality or other concerns about application behavior (listed email address, ticket submission method, etc.). This method should be monitored by a human and responded to as appropriate.
Keep an eye on: DIY (do-it-yourself) AI
Finally, a trend being closely monitored at Delinea Labs is the advent of democratized AI.
It’s already incredibly easy to replicate and re-train your privately owned Chat-GPT like model. For example, Stanford’s Alpaca is surprisingly good.
The claim here is the training can be done in 5 hours on a single RTX 4090. This can mitigate the risk of privacy, confidentiality, compliance, and third-party reliance but other risks remain. Staying ahead of this trend and AI/ML security generally can help ensure organizations can safely leverage these new solutions safely and effectively.