Adversarial Testing for Salesforce Agentforce: Laying the Foundation
#AI #Security #Salesforce #Agentforce
Authors: Matt Evans, Matthew Morris, Andy Forbes
The opinions in this article are those of the authors and do not necessarily reflect the opinions of their employer.
As AI-driven functionalities become more integrated into the Salesforce ecosystem, developers are quickly discovering that rigorous testing goes far beyond the usual unit tests and apex test classes. With Agentforce, teams can merge Salesforce records directly into LLM (Large Language Model) prompts, creating powerful experiences that streamline customer service, automate ticket handling, and support decision-making. However, this newfound efficiency raises a fresh concern: What happens when an attacker, a mischievous end-user, or even an unintentional text entry manages to insert malicious instructions into those prompts?
This is where adversarial testing comes into play. By deliberately attempting to “trick” the AI into ignoring system-level instructions or revealing private data, Salesforce developers can stay one step ahead of real-world exploits. Much like traditional security testing, adversarial testing involves crafting edge-case or downright sneaky inputs—often known as prompt injection attempts—and seeing if the AI bites. If it does, you’ve uncovered a vulnerability early, at a point where you can quickly apply fixes or tighten guardrails. Ultimately, adversarial testing is not about pessimism; it’s about proactively identifying weaknesses so you can bolster your Agentforce implementation, making the entire system more robust and trustworthy.
Below is a sampling of adversarial or “injection-style” phrases you can use to test your system’s resilience. The idea is to see whether your guardrails and content filters catch these attempts, rather than letting them override your system instructions or reveal hidden data. You can insert these into user prompts and merged Salesforce fields to probe for vulnerabilities.
Direct Override Attempts
Reveal Internal Instructions
Implicit Override Requests
Policy Bypass Techniques
Recommended by LinkedIn
Societal or Emotional Manipulation
Technical or Coding Overwrites
Evasive or Tricky Syntax
Nested Instructions
Challenge/Exploit Knowledge
Combining Malicious Steps
How to Use These Phrases in Testing
By incorporating these adversarial phrases into your testing routine, you’ll get a clearer picture of where your Agentforce setup might be vulnerable. If any of these attempts successfully bypasses your guardrails—whether in user prompts, merged fields, or elsewhere—you’ll know that it’s time to refine your input sanitization, field validation, or final moderation steps. Regular, proactive testing is key to staying a step ahead of potential injection attempts and keeping your application safe and stable.
Capgemini America Salesforce Core CTO - Coauthor of "ChatGPT for Accelerating Salesforce Development"
2wMatt Evans Matthew Morris Time for us to build an Agentforce agent that automates adversarial testing?