Red Teaming’s attempt to go mainstream with generative AI

When you think about foundational security, red teaming is often simply not on the list. According to the Ponemon Institute in 2023, only 64% of organizations use red teaming, and most testing is done through tabletop exercises rather than live-environment attacks. Look at the AWS Well-Architected Framework’s Security Pillar: security simulations appear in the “Incident response” section towards the end, as the second-to-last control, and you’re expected to “consider” doing them regularly. Hardly a ringing endorsement. But if you have been watching the generative AI security publications over the past month, a lot of red ink has been spilled on the importance of red teaming in the age of agentic AI. Here are just a few published in May:

To me, this raises the question: is red teaming now ready for the mainstream because of generative AI and agentic architectures?

Threat Modeling: A science and an art, with not enough practitioners

Threat modeling is, in essence, building a map of all the information and resources inside a system, then applying an adversarial lens to identify how each information flow, component, or interaction could be abused, bypassed, or disrupted. Check out the Threat Modeling Manifesto for a more complete (and more accurate) definition. While threat modeling techniques like STRIDE, which dates back to 1999, have been around for a long time, the practice is still only minimally applied in most organizations. And even for teams versed in traditional models, those models are likely too limited to be sufficient for generative AI, especially agentic systems. In agentic systems, threat models need to extend into vulnerabilities around Model Context Protocol (MCP) and Agent2Agent (A2A) assumptions, the impact of distributed autonomy and maintaining context across agents, and non-deterministic behavior. Because of this, new frameworks like MAESTRO (Multi-Agent Environment, Security, Threat, Risk, and Outcome) from the Cloud Security Alliance have been developed. MAESTRO maps threats across architectural layers (foundation models, data operations, agent frameworks, etc.) using the OWASP Agentic Security Initiative (ASI) threat taxonomy (e.g., Tool Misuse, Privilege Compromise, Inter-agent Communication Poisoning).
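As a concrete (if simplified) illustration, here is a minimal sketch of how a layered threat map in this style could be captured in code. The layer names and threat names echo MAESTRO and the OWASP ASI taxonomy mentioned above, but the specific layer-to-threat mapping and the component names are hypothetical examples, not the official framework.

```python
from dataclasses import dataclass, field

# Illustrative only: the layer names follow the MAESTRO layers mentioned above,
# and the threat names reuse OWASP ASI taxonomy entries cited in the text, but
# this particular mapping and these components are invented for the example.

@dataclass
class Threat:
    name: str                         # taxonomy name, e.g. "Tool Misuse"
    description: str                  # how the threat manifests in this system
    affected_components: list[str] = field(default_factory=list)

@dataclass
class LayerThreatModel:
    layer: str                        # architectural layer
    threats: list[Threat] = field(default_factory=list)

threat_model = [
    LayerThreatModel(
        layer="Agent Frameworks",
        threats=[
            Threat("Tool Misuse",
                   "Agent invokes a tool outside its intended scope",
                   ["order-lookup-tool", "refund-tool"]),
            Threat("Inter-agent Communication Poisoning",
                   "Upstream agent output is trusted verbatim by a downstream agent",
                   ["planner-agent", "executor-agent"]),
        ],
    ),
    LayerThreatModel(
        layer="Data Operations",
        threats=[
            Threat("Privilege Compromise",
                   "Retrieval layer returns documents the calling user should not see",
                   ["vector-store"]),
        ],
    ),
]

def threats_for(component: str):
    """Return every (layer, threat) pair that touches a given component."""
    return [
        (ltm.layer, t.name)
        for ltm in threat_model
        for t in ltm.threats
        if component in t.affected_components
    ]

if __name__ == "__main__":
    print(threats_for("executor-agent"))
```

Even a toy structure like this makes the adversarial review concrete: every component gets asked "which layers and which taxonomy threats touch me?", which is exactly the question agentic systems make hard to answer by inspection alone.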

But if we haven’t done great threat modeling in the past, why will anything change now? The advent of agentic AI will increase not just the need for threat modelers, but also our capacity to perform threat modeling. Let’s acknowledge that agentic AI systems are more complex and less predictable, and that mapping them onto finite, well-defined paths may not be entirely possible. In that case, stochastic modeling of both the system itself and the threats that can be introduced is greatly helped by an AI informing a security analyst. LLMs can also assist in creating threat models themselves: they can be powerful assistants to security teams in drafting initial threat models or suggesting mitigations. Because of all of this, Google believes defenders have the advantage in this scenario. People and partners who can bring real, industry-specific threat modeling to your generative AI application will be able to move the needle on security far more than legacy guardrails.
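To make the “LLM as drafting assistant” idea concrete, here is a minimal sketch assuming an OpenAI-compatible Python client; the model name, the architecture description, and the prompt wording are all placeholders to adapt to your own stack, and the output is a starting point for an analyst, not a finished threat model.

```python
from openai import OpenAI  # assumes an OpenAI-compatible endpoint; any LLM client works

client = OpenAI()

SYSTEM_ARCHITECTURE = """
Customer-support agent (LLM) -> planner agent -> tools:
  - CRM lookup (reads PII), refund API (writes), email sender (external)
Agents communicate over A2A; tools are exposed via MCP servers.
"""  # replace with your real data-flow description

prompt = (
    "You are assisting a security analyst with a first-pass threat model. "
    "For the architecture below, list threats per component using STRIDE plus "
    "agentic-specific categories (tool misuse, privilege compromise, "
    "inter-agent communication poisoning). For each threat, propose one "
    "mitigation and flag any assumption you had to make.\n\n" + SYSTEM_ARCHITECTURE
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)

# A draft for the security analyst to review, prune, and extend.
print(response.choices[0].message.content)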

Red (or Purple) Teaming: Putting threat models into practice

Let’s say you have put together a great threat modeling team or set of tools: how do you validate that you’ve identified the right risks or made the right investments? This is where simulations like red teaming (or purple teaming, if you have the staff) come in. Where internal tooling can provide basic protection against generic attacks like jailbreaking or prompt injection, far more threats will be found through a holistic assessment of the agentic workflow that brings your industry’s particulars into the thinking. Examples include multi-turn attacks that gradually steer one agent into tricking a downstream agent into performing a malicious action or revealing sensitive information, misuse of agent tools in agentic workflows to poison downstream data or prompts, and probing the robustness of agents’ permission models for ways to escalate privileges or take unauthorized control. As we’ve talked about before, and as noted multiple times in the publications cited above, shifting between black-box, grey-box, and white-box models of how much testers know (system prompts included) will produce different outcomes. Depending on your adversary model, you can’t assume your backend architecture will be kept secret from a persistent attacker.
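Here is a minimal sketch of what an automated multi-turn probe of that first kind might look like: an “attacker” model steers the conversation toward a goal turn by turn while a simple detector checks each response. The `target_agent` and `attacker_llm` callables, the leak markers, and the escalation prompt are illustrative placeholders, not a production harness.

```python
# Hypothetical multi-turn red-team probe. `target_agent` wraps your agent
# endpoint (str -> str); `attacker_llm` wraps a red-team model (str -> str).

SECRET_MARKERS = ["BEGIN PRIVATE KEY", "api_key=", "Authorization: Bearer"]

def detect_leak(text: str) -> bool:
    """Crude check for leaked credentials; a real harness would use richer judges."""
    return any(marker.lower() in text.lower() for marker in SECRET_MARKERS)

def multi_turn_probe(target_agent, attacker_llm, goal: str, max_turns: int = 8):
    """Gradually escalate toward the goal across turns, recording the transcript."""
    transcript = []
    attack_msg = attacker_llm(f"Plan an opening message toward this goal: {goal}")
    for turn in range(max_turns):
        reply = target_agent(attack_msg)
        transcript.append({"turn": turn, "attack": attack_msg, "reply": reply})
        if detect_leak(reply):
            return {"success": True, "turns": turn + 1, "transcript": transcript}
        # Let the attacker model adapt its next message based on the reply so far.
        attack_msg = attacker_llm(
            f"Goal: {goal}\nTarget replied: {reply}\n"
            "Write the next message, escalating gradually rather than asking directly."
        )
    return {"success": False, "turns": max_turns, "transcript": transcript}
```

The point of the sketch is the shape of the test, not the detector: the value comes from letting the probe adapt across turns the way a patient human tester would, instead of firing single-shot jailbreak strings.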

Once again, automation and using LLMs to your benefit both play a key role in putting these protections in place in a continuous manner. Threats are constantly evolving, and agentic models increase the risk of something like another Log4Shell, since downstream agents have access to different resources. Having tools and partners that integrate continuous evaluation into your security posture, while keeping threat models updated as new threats are discovered in real time, will be invaluable to staying secure.
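As one way to picture “continuous” in practice, here is a minimal sketch of a recurring evaluation loop that re-runs registered probe scenarios and flags regressions against a stored baseline. The probe callables, the baseline file name, and the alerting hook are assumptions to replace with your own harness, ticketing integration, and scheduler (CI job, cron, etc.).

```python
import json
import time
from pathlib import Path

BASELINE = Path("redteam_baseline.json")  # hypothetical location for last-known results

def run_suite(probes: dict) -> dict:
    """`probes` maps scenario name -> callable returning True if the attack succeeded."""
    return {name: probe() for name, probe in probes.items()}

def evaluate(probes: dict, alert=print) -> None:
    """Re-run every scenario, alert on new successes, and persist the baseline."""
    results = run_suite(probes)
    previous = json.loads(BASELINE.read_text()) if BASELINE.exists() else {}
    for name, attack_succeeded in results.items():
        if attack_succeeded and not previous.get(name, False):
            alert(f"Regression: scenario '{name}' now bypasses defenses")
    BASELINE.write_text(json.dumps(results, indent=2))

if __name__ == "__main__":
    # Each probe would wrap something like multi_turn_probe() from the earlier sketch.
    demo_probes = {"prompt-injection-via-tool-output": lambda: False}
    while True:
        evaluate(demo_probes)
        time.sleep(24 * 60 * 60)  # daily; swap in your scheduler of choice
```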

The Path Forward: Using gen AI for gen AI security

Securing agentic AI is not a one-time task but an ongoing discipline. It requires:

  • Specialized Tools and Techniques: Continuously developing and adopting new methods for red teaming and security testing.
  • Community Collaboration: Initiatives like the LVE Project, which aims to create an open repository of LLM vulnerabilities, and joint efforts by organizations like CSA and OWASP, are vital for sharing knowledge and best practices.
  • Adaptive Defenses: Building systems that can monitor agent behavior, detect anomalies, and enforce security policies dynamically. This includes both traditional deterministic controls and newer reasoning-based defenses.
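To ground that last point, here is a minimal sketch of a dynamic policy check wrapped around agent tool calls, combining a deterministic allow-list with a crude behavioral anomaly signal. The agent names, tool names, and thresholds are invented for illustration and are not from any particular framework.

```python
from collections import Counter, deque
from dataclasses import dataclass

@dataclass
class ToolCall:
    agent: str
    tool: str
    argument_summary: str

# Deterministic control: which tools each agent is allowed to invoke (hypothetical).
ALLOWED_TOOLS = {
    "support-agent": {"crm_lookup", "send_email"},
    "refund-agent": {"crm_lookup", "issue_refund"},
}

class PolicyEnforcer:
    def __init__(self, window: int = 50):
        self.history: dict[str, deque] = {}
        self.window = window

    def check(self, call: ToolCall) -> bool:
        # 1) Hard policy: block tools outside the agent's allow-list.
        if call.tool not in ALLOWED_TOOLS.get(call.agent, set()):
            return False
        # 2) Soft signal: flag when one tool suddenly dominates an agent's recent calls,
        #    a rough stand-in for richer behavioral anomaly detection.
        hist = self.history.setdefault(call.agent, deque(maxlen=self.window))
        hist.append(call.tool)
        counts = Counter(hist)
        if len(hist) == self.window and counts[call.tool] / self.window > 0.8:
            print(f"anomaly: {call.agent} is issuing mostly '{call.tool}' calls")
        return True

enforcer = PolicyEnforcer()
print(enforcer.check(ToolCall("support-agent", "issue_refund", "order 123")))  # False
```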

So automated red teaming, powered by LLMs but guided by humans, has a clear opportunity to fill these gaps. Whether there’s room in the security budget to prioritize this will depend on whether real value and threat reduction can be shown. In the Ponemon Institute report, 56% of survey participants said they wanted to increase investment in red teaming, so the opportunity is there. Whether we can tie our value to the value generative AI is bringing to the organization will make the difference. As always, if you want to discuss this more or connect with us about helping you achieve this transformation, please reach out to questions@generativesecurity.ai.

About the author

Michael Wasielewski is the founder and lead of Generative Security. With 20+ years of experience in networking, security, cloud, and enterprise architecture, Michael brings a unique perspective to new technologies. Having worked on generative AI security for the past two years, Michael connects the dots between the organizational, the technical, and the business impacts of generative AI security. Michael looks forward to spending more time golfing, swimming in the ocean, and skydiving… someday.