8. AI Harms & Risks
Choosing what to include, or exclude, took some time to figure out. I think what we have here is a great starting point for anyone involved in the assessment process of harms and risks that come with generative AI.
Here are the core areas to focus on, spread across the 3-layer framework for AI safety and security:
Data confidentiality and harmful language
The first and most obvious challenge to account for with generative AI is how to manage data interactions. Although defenses and mitigations are in place to prevent an LLM from producing content such as profanity, hate, and violence, some generated content may still cause the user to question the outcome. Providing feedback through the user interface is a good way to capture these instances and review what new kinds of harms are being generated.
Data confidentiality is an existing risk that many companies are already dealing with (or not), and AI brings new attention to the issue. While these projects can take years to mature, identifying and labeling sensitive information can proceed side by side with the maturing of the definitions and policies that control the information. Start with the highest-risk users and content, apply detection and protection along with the rollout of generative AI, and in time you will have better visibility and control.
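To make the idea of identifying and labeling sensitive information a little more concrete, here is a minimal sketch in Python. It assumes a tiny, hypothetical set of regular-expression patterns (the label names `email` and `us_ssn`, and the functions `label_sensitive` and `redact`, are made up for illustration). Real data classification tools are far more thorough, so treat this as a starting point for discussion, not a production control.

```python
import re

# Hypothetical patterns for illustration only; a real classifier would cover
# many more data types and handle far more edge cases.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def label_sensitive(text):
    """Return the sorted labels of every pattern found in the text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


def redact(text):
    """Replace each detected value with a placeholder before it reaches the AI."""
    for name, pattern in PATTERNS.items():
        text = pattern.sub(f"[{name.upper()}]", text)
    return text


prompt = "Contact jane@example.com, SSN 123-45-6789"
print(label_sensitive(prompt))
print(redact(prompt))
```

Labeling tells you which content needs a policy decision, while redaction is one possible protection to apply before a prompt is sent to a generative AI service. Starting with a few high-risk data types mirrors the advice above to begin with the highest-risk users and content.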
Overreliance and excessive agency
To reduce the impact of overreliance, we must keep the human in the decision making and the creative process, and preserve the skills needed to function well without the enhancement of AI. Employee skilling, company policies, and operational procedures should continue to plan for the possibility that our AI companion is not available. Today we already rely heavily on our computers being connected to the internet anytime, anywhere. How will the loss of AI access impact us in the future?
Allowing the AI to do too much of our own work could lead to a level of agency that blurs the line between what we do and what the computer is doing for us. The real problem arises when we rely on the accuracy and completeness of a response, only to find out with additional research that the information provided by our AI companion was incomplete or misrepresented some facts. Don't just summarize; be critical about the results provided and ensure you fully understand the context. Introduce critical thinking into the human aspects of working with AI systems.
AI supply chain and secure plugins
Likely the most critical component to consider in any AI system is the level of trust given to any individual component, the dependency on services, and the reliability of our software partners to secure their own solutions.
Key components of the NIST SSDF help guide the development of secure software by providing four key focus areas of the SDL:
Safeguards and jailbreaks
Take a look at the information shared about the technique called Crescendo to gain some idea about how a company like Microsoft discovers and mitigates evolving attacks against AI guardrails. Some interesting ways to attempt an AI Jailbreak include:
Here is my favorite quote from this chapter:
There are plenty more harms and risks to explore. The book is available now on Amazon - Guardians of AI: Building innovation with safety and security.
In the next newsletter we will explore some of the key insights from Chapter 9: AI System Attacks.