8. AI Harms & Risks
Guardians of AI, by Richard Diver

8. AI Harms & Risks

Choosing what to include, or exclude, took some time to figure out. I think what we have here is a great starting point for anyone involved in the assessment process of harms and risks that come with generative AI.

Here are the core areas to focus on, spread across the 3-layer framework for AI safety and security:


A diagram of the three layers of an AI system with details of which AI risks & harms map to each layer
AI risks & harms, mapped to the AI system framework


Data confidentiality and harmful language

The most obvious and first challenges to account for with generative AI is how to manage the data interactions. While the defenses and mitigations are in place to prevent an LLM from producing content such as profanity, hate, and violence, it is still possible that some content generated may cause the user to question the outcome - providing feedback through the user interface is a good way to capture these instances and review what new kinds of harms are being generated.

For data confidentiality, this is an existing risk that many companies are already dealing with (or not) and AI brings new attention to the issues. While these projects can take years to mature, identifying and labeling sensitive information can be achieved side by side with the maturity of the definitions and policies to control the information. Start with the highest risk users and content, apply detection and protection along with the rollout of generative AI, and in time you will have better visibility and control.

Overreliance and excessive agency

To reduce the impact of overreliance, we must keep the human in the decision making, the creative process, and the skilled ability to function well without the enhancement of AI. Employee skilling, company policies, and operational procedures, should continue to plan for the potential that our AI companion is not available. Today we already rely so much on our computers being connected to the internet anytime, anywhere - how will the loss of AI access impact us in future?

Allowing the AI to do too much of our own work could lead to a level agency that means it blurs the line between what we do vs what the computer is doing for us. The real problem is when we rely on the accuracy and completeness of the response, only to find out with additional research that information provided by our AI companion was incomplete or misrepresented some facts. Don't just summarize, be critical about the results provided and ensure you fully understand the context. Introduce critical thinking into the human aspects of working with AI systems.

AI supply chain and secure plugins

Likely the most critical component to consider in any AI system is the level of trust given to any individual component, the dependency on services, and the reliability of our software partners to secure their own solutions.


An image showing the four pillars of the NIST secure software development framework, including prepare the organization, protect the software, produce well-secured software, and respond to vulnerabilities.
Overview of the NIST Secure Software Development Framework (SSDF)


Key components of the NIST SSDF help guide the development of secure software by providing four key focus areas of the SDL:

  • Prepare the organization: People, process, and technology used to produce and maintain software.
  • Protect the software: Protecting the development environment includes securing identities, code, developer workstations and applications.
  • Produce well-secured software: The main priority is to reduce the potential for vulnerabilities in code deployed (secure by design and shift-left).
  • Respond to vulnerabilities: Active engagement with internal and external security researchers.


Safeguards and jailbreaks

Take a look at the information shared about the technique called Crescendo to gain some idea about how a company like Microsoft discovers and mitigates evolving attacks against AI guardrails. Some interesting ways to attempt an AI Jailbreak include:

  • Use of emotion: Both negative and positive emotions work, convincing the LLM to provide the kind of outcome that would make you happy.
  • False authority: With advanced prompt engineering, it is possible to build a scenario that convinces the LLM you have every right to carry out the request, and the LLM should obey your needs.
  • Roll-play: Providing detailed instructions about how the LLM should behave in it processing and response, can lead it to act against its programming.


Here is my favorite quote from this chapter:


Quote by Richard Diver "AI can not yet replace the unique capacity of a human that has intuition, gut instinct, that tingly feeling we get when something we see or hear just isn't quite right"
Quote by Richard Diver


There are plenty more harms and risks to explore. The book is available now on Amazon - Guardians of AI: Building innovation with safety and security.

In the next newsletter we will explore some of the key insights from Chapter 9: AI System Attacks.

Jason Stevens

Digital Marketer | Interactive Copywriter | Writer | Content Strategist | Journalist | Brand Consultant

6mo

👍

To view or add a comment, sign in

More articles by Richard Diver

  • Be passionate, not passive

    Be passionate, not passive

    Yesterday I had the opportunity to share one of my hidden "talents" at a company event. It was well received, so I am…

    12 Comments
  • 11. Threat Modeling

    11. Threat Modeling

    Today, threat modeling has been a specialized capability used in software development and system engineering. Very deep…

    2 Comments
  • 10. AI System Defense

    10. AI System Defense

    Throughout all the studying, conversations, and experiences of the last year, it is clear that defense is going to be a…

    5 Comments
  • 9. AI System Attacks

    9. AI System Attacks

    In any sports setting there is a constant shift in the game between attack and defense. While cybersecurity is not a…

  • 7. Existing Risk

    7. Existing Risk

    In the world of business and technology, risk management is a well-defined and practiced profession that has evolved in…

  • 6. AI Governance

    6. AI Governance

    AI harms and threats to the safe use of AI will not only occur because of malicious actors’ intent on causing damage or…

    2 Comments
  • 5. Ethical Framework

    5. Ethical Framework

    Considerations for the safety and security of AI systems goes beyond the traditional cybersecurity focus of defending…

  • 4. AI Application Architecture

    4. AI Application Architecture

    Understanding how an AI application works is the first step in assessing the ability to secure it. The 3-layer diagram…

  • 3. Types of AI Systems

    3. Types of AI Systems

    Artificial Intelligence (AI) is a group of technologies that, when combined, provide advanced computing capabilities…

  • 2. Cybersecurity in the AI World

    2. Cybersecurity in the AI World

    Will AI cause more headaches, or will it solve scenarios cybersecurity issues? Most likely both. From the attacker…

Insights from the community

Others also viewed

Explore topics