OWASP Top 10 for LLM Applications Newsletter - August '24 Edition

Red Teaming LLM Agents!!

Greetings Gen AI Security Enthusiasts and OWASP Community Members!

โœ๐Ÿฝ Generative Insights : Reflections from the Editorial Desk โœ๐Ÿฝ

In this edition, we're diving into two intriguing topics: LLM Agents and AI Red Teaming. While they are separate and distinct for now, the idea of Red Teaming LLM Agents is not far-fetched - it might very well become one of the definitive use cases for LLM Agents, one that could refine how we think about AI security and robustness.

Back from the Future - A Quick Heads-Up: This newsletter turned out a bit longer than anticipated, but that's because I wanted to give you a complete look at our (security) view of LLM Agents. It's important to have a solid understanding of these intriguing (and sometimes pesky) Autonomous LLM Agents (ALAs) so you can form your own informed opinions.

I promise, it's worth sticking with it to the end - though I know, for some of you, this might feel like your cue to stop reading. Hang in there, and apologies in advance if I've tested your patience!

What are agents, actually? You might ask. You should, because I found a very workable definition here!

AI Agents: An AI Agent is a program that uses one or more Large Language Models (LLMs) or Foundation Models (FMs) as its backbone, enabling it to operate autonomously. By decomposing queries, planning, and creating a sequence of events, the AI Agent effectively addresses and solves complex problems.
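The definition above boils down to a loop: decompose the query into a plan, then execute the steps in sequence. Here is a minimal sketch in Python, assuming a hypothetical `call_llm` helper (stubbed with canned output so the control flow is runnable - not any particular framework's API):

```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM call - stubbed with canned output for illustration."""
    if prompt.startswith("Plan:"):
        return "1. search flights\n2. compare prices\n3. book cheapest"
    return f"done: {prompt}"

def run_agent(query: str) -> list[str]:
    # Decompose the query: ask the model for a numbered plan.
    plan = call_llm(f"Plan: break '{query}' into numbered steps")
    steps = [line.split(". ", 1)[1] for line in plan.splitlines()]
    # Execute the sequence of events, one step at a time.
    return [call_llm(step) for step in steps]

print(run_agent("book the cheapest flight to Lisbon"))
```

In a real agent the stub would be an actual LLM API call, and each step's result would typically be fed back into the planner - which is exactly where the security questions below start.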

Exec Summary

  • LLM Agents: Our focus is, of course, OWASP T10 for LLM Applications v2 and LLM Agents (along with a few interesting tangents).

    • We've two candidate proposals in the pipeline for v2, a dedicated Slack channel buzzing with ideas, and some exciting developments to share.

    • Leading the charge is John Sotiropoulos, the head of this work stream, who, along with a few of our members, will be sharing insights.

    • IBM has an interesting roadmap w.r.t. LLM Agents! Of course, IBM has always had, and always will have, interesting roadmaps. They've consistently set the standard, and this time is no different.

    • Speaking of standards, Mr. Sotiropoulos is our standard-bearer, representing us in NIST, CoSAI, and beyond ... very collaborative, knowledgeable, and meticulous ... in short, someone whom you'd want on your team! BTW, John has a new book out, Adversarial AI Attacks, Mitigations, and Defense Strategies. Amazon says it is #1 on the topic of Computer Viruses! Most of us might debate that topic placement - but who are we to quibble with the all-knowing Amazon Algorithm?

  • AI Red Teaming: A Rising Tide

    • AI Red Teaming has been bubbling up all over the place ever since NIST coined the term. Then there is also GRT (Generative AI Red Teaming), popularized by AI Village.

      As Jason Ross aptly puts it, "From my pov, I think defining very clearly what we mean by 'red team' is critical: that term is overloaded in the AI/ML space and means different things to different people." Well said, Jason - clarity is key ...

      We have started an initiative on this very topic (check out the current charter here - we'd love your feedback and comments).

      The proposal is moving through our lightweight process - the core team member voting is nearly complete. When I last spoke with Scott Clinton (who has done an excellent job shaping our lightweight process for initiatives - among many other contributions), all the votes were in favor, with just a couple of requested changes.

One more double-click ...

LLM Agents

The two main candidate proposals are Vulnerable Autonomous Agents and Agent Autonomy Escalation. (Remember, our focus here is on the risks and mitigations, not on agent architectures, frameworks, and a host of other essential topics. The IBM roadmap (below) gives a glimpse of the latter.)

Background:

John has written a good summary of the domains we are looking at:

IBM has a good set of introductions here. I like IBM's roadmap, viz.:

๐Ÿฎ๐Ÿฌ๐Ÿฎ๐Ÿด: ๐——๐—ฒ๐˜ƒ๐—ฒ๐—น๐—ผ๐—ฝ ๐—ฎ๐˜‚๐˜๐—ผ๐—ป๐—ผ๐—บ๐—ผ๐˜‚๐˜€ ๐—ฎ๐—ป๐—ฑ ๐—ฏ๐—ฟ๐—ผ๐—ฎ๐—ฑ๐—น๐˜† ๐—ถ๐—ป๐˜๐—ฒ๐—น๐—น๐—ถ๐—ด๐—ฒ๐—ป๐˜ ๐—ฎ๐—ด๐—ฒ๐—ป๐˜๐˜€

We will build autonomous AI that learns reliably and efficiently from its environment and responds to previously unseen situations through broad generalizations.

IBM Roadmap

I asked our Slack channel #team-llm_v2_agents_discussion about the relevance of agents. A couple of interesting replies:

  • Is the concept of LLM agents a fad?

    • NO, but it is not a proven business model...yet. Agents powered by language models will continue to grow in popularity.

    • Andrew Ng, in his talk "What's Next for AI Agentic Workflows," mentioned that agent frameworks could significantly enhance performance. He pointed out that GPT-3.5 with agents outperforms a single instance of GPT-4. A similar study by Mirac Suzgun et al., Meta-Prompting, supports this view.

  • What are some quick wins?

    • Generating any kind of art, writing, and so on. Create a multi-agent system with three roles: idea generator, writer, and editor/critic. Browse the internet with Playwright.

    • Code generation systems like GPT Engineer can quickly produce an initial codebase for a microservice or a simple project with common login functionality. My teammates use it often.

      • However, due to the current limitations in model context, once the code is generated, it's more effective to use a single-agent tool like GitHub Copilot for code editing rather than relying on GPT Engineer's editing features.

  • What are some esoteric use cases?

    • I know people are going to (and have already) put these into physical robots running local language models at the edge, independent of an API.

    • There are two advanced prompting techniques involving role-playing: Solo-Performance Prompting (SPP) and Meta-Prompting. While these aren't exactly agents, the way prompts are structured enables a multi-agent-like generation process.
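The three-role setup mentioned in the quick wins above (idea generator, writer, editor/critic) can be sketched as a simple pipeline. `ask` here is a hypothetical role-conditioned LLM call, stubbed for illustration so the flow is runnable:

```python
def ask(role: str, prompt: str) -> str:
    """Hypothetical role-conditioned LLM call - stubbed for illustration."""
    return f"[{role}] {prompt}"

def creative_pipeline(topic: str) -> str:
    idea = ask("idea generator", f"pitch an angle on {topic}")     # role 1
    draft = ask("writer", f"draft a short piece from: {idea}")     # role 2
    return ask("editor/critic", f"critique and tighten: {draft}")  # role 3

print(creative_pipeline("LLM agent security"))
```

Even this toy version shows why the roles matter: each agent's output becomes the next agent's input, which is both the power of the pattern and (as discussed below) its attack surface.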

T10 v2 Candidate Proposals

1. The Vulnerable Autonomous Agents proposal [here] deals with the risks associated with Autonomous LLM-based Agents (ALAs).

2. Agent Autonomy Escalation [here] occurs when a deployed LLM-based agent (such as a virtual assistant or automated workflow manager) gains unintended levels of control or decision-making capability. This can result from misconfigurations, unexpected interactions between different agents, or exploitation by malicious actors.

  • Risks: Harmful or Unauthorized Actions, Operational Chaos (disrupted workflows, data corruption, or improperly triggered automated processes), Security Risks (agents performing harmful actions), and Loss of Control.

  • Mitigations: Emmanuel Guilherme Junior, the author, lists five mitigation strategies, viz. the Principle of Least Privilege, Access Controls, Fail-Safe Mechanisms, Separation of Duties, and finally Behavioral Monitoring.
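To make two of those mitigations concrete, here is a minimal sketch of the Principle of Least Privilege plus Behavioral Monitoring as a gate in front of an agent's tool calls. The roles, tool names, and allow-lists are illustrative assumptions, not taken from the proposal:

```python
# Least privilege: each agent role gets an explicit tool allow-list.
ALLOWED_TOOLS = {
    "virtual-assistant": {"search", "summarize"},
    "workflow-manager": {"search", "create_ticket"},
}

# Behavioral monitoring: every attempt, allowed or denied, is logged.
audit_log: list[str] = []

def invoke_tool(agent_role: str, tool: str, payload: str) -> str:
    if tool not in ALLOWED_TOOLS.get(agent_role, set()):
        audit_log.append(f"DENIED {agent_role} -> {tool}")
        raise PermissionError(f"{agent_role} may not call {tool}")
    audit_log.append(f"OK {agent_role} -> {tool}")
    return f"{tool}({payload})"

invoke_tool("virtual-assistant", "search", "quarterly report")
try:
    invoke_tool("virtual-assistant", "delete_records", "all")  # escalation attempt
except PermissionError as e:
    print("blocked:", e)
```

The point of the audit log is that autonomy escalation often shows up first as a pattern of denied (or unusually frequent) tool calls - exactly what behavioral monitoring is meant to surface.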

Current Status:

John has succinctly and thoughtfully listed three options for v2 w.r.t. agents. A good read, not only for understanding the OWASP T10 list but also for insight into the state of agents.

The Challenge and Complexity Unveiled
The real challenge - and the complexity - emerges from the intersection of LLM07: Insecure Plugin Design, LLM08: Excessive Agency, and the new possibility LLM07x: MultiAgent Autonomy Escalation. These areas are rapidly evolving, with countless permutations and scenarios unfolding. Honestly, none of us can predict exactly how these dynamics will play out or how quickly they'll change.

That said, I encourage you to dive in - it's a fascinating glimpse into the evolving landscape of Autonomous LLM Agents (ALAs). In my humble opinion, it's time well spent.

There is no single clear-cut way to capture agentic threats and concerns. Depending on what we want to highlight, we could:

[Option #1] - Accommodate by updating the existing entries (LLM08: Excessive Agency and LLM07: Insecure Plugin Design) to reflect the new landscape, including new Llama tools and moving beyond OpenAI plugins

Or

[Option #2] - Merge LLM08: Excessive Agency and LLM07: Insecure Plugin Design into one entry for Agents

Or

[Option #3] - Update LLM08: Excessive Agency to cover excessive agency in agentic workflows and related agents; replace LLM07: Insecure Plugin Design with a new entry (LLM07: MultiAgent Autonomy Escalation) on conversational multi-agents and the newer attack vectors they bring (local memory, full autonomy with adaptation, environment)

What do you think? Before you make up your mind, take a moment to read on - we'll dive into the pros and cons of these options.

 

If you are bullish on LLM Agents, you'd definitely choose Option #3. OTOH, if you lean towards a more pragmatic approach, Option #1 could be the way to go. The last time I spoke with John, he was leaning towards Option #1.

Next Steps:

  1. Seek Feedback on options

  2. Make a recommendation

  3. Create some proof-of-concept merged entries, test, and finalise

Regardless of how we cover this in the Top 10, we will propose and start an Autonomous Agents Initiative and Working Group to continue tracking developments in agents in detail.

AI Red Teaming initiative

Shifting focus, letโ€™s take a closer look at the current state of the Generative AI Red Teaming (GRT) initiative proposal here. We will update the proposal once all the core-team comments are in.

  • Initially we had a much more ambitious proposal here - one that was broader in scope. We received practical and valuable advice from Scott and Steve, and we've now homed in on a more focused proposal:

    • Scott - "... this proposes a lot of deliverables - 5x what we as a team have delivered so far - probably need to prioritize, perhaps a phased approach"

    • As I am writing this newsletter, Steve's comments came along: "keep the first version tight, and focused. These things can spin out of control if not tightly managed. As a group, create a schedule and really reduce the scope as much as possible. Once you get something out, you can always expand. Just don't let it turn into a runaway project that is super late to market ..."

As this edition of the newsletter has grown much longer than I initially planned, I've decided to save the in-depth AI Red Team discussion for the next edition. I know some of you might be disappointed, and for that, I apologize.

In the meantime, I encourage you to share your insights and comments on the AI Red Teaming proposal here. Your input is invaluable and will help us steer this initiative in the right direction.

🔗 OWASP LLM Documents & Links 🔗

  • Previous newsletters (to catch up after marveling at our eloquence 🙃) [Here]

  • OWASP Top 10 LLM main site [Here]

  • OWASP Top 10 for LLM Applications v1.1 [Here]

  • LLM AI Security & Governance Checklist v1.1 [Here]

📱 Stay Connected 📱

Aubrey King has expanded our social media reach across various platforms including LinkedIn, YouTube, Twitter (agree with Will, not going to say X), and soon even more! Follow, and don't be shy to re-share any content we post! BTW, if you're presenting anything related to our project externally, do let us know at [email protected] so we can review it and blast it out to our social media followers!

💭 Closing Thoughts 💭

We look forward to seeing you in our Slack channels and at our upcoming meetings!

Till next time ... Stay Secure & Stay Sharp ...

Krishna Sankar
LinkedIn | Medium | Github