Date

Mar 10, 2026

# Multiple Lines of Defense: How We Actually Prevent AI Jailbreaks

Author

Elana Feldman

The headlines are full of stories of AI agents going rogue: using language they shouldn't, talking like a pirate, offering inappropriate discounts, and more. As AI has become more widespread in customer service, the number of people trying to break it has grown. Bad actors phrase requests in unusual ways, try to confuse the system, and see whether they can trick an AI agent into doing something outside its scope.

Just last week, a client’s engineering team asked the question that’s on everyone’s mind right now: “How do you make sure someone can’t trick your AI into doing something it shouldn’t?” This is the most important question of the moment as applied AI moves from experimental to indispensable. The good news is that there is a real answer, one that ignores the AI hype and centers on the fact that applied AI wins on skilled execution.

When our clients hire human customer service agents, those agents are trained on clear rules: Here's what you can say. Here's what you can't say. Here's when you escalate to your manager. Nobody wants new agents to suddenly answer questions about payroll or invent policies on the fly. At Pypestream, we see little difference between the expectations we place on humans and those we place on automated agents. Our AI practitioners build AI-enabled solutions that perform specific, valuable tasks while following the established business rules that maintain compliance and mitigate risk.

In a call center, managers provide the supervision that keeps agents within their training boundaries. At Pypestream, we achieve this for our systems of AI agents with Supervisor Agent, our intelligent orchestration layer. Think of it as the operator at the front desk who routes calls to the right team. When a user starts a conversation, Supervisor Agent decides which route to take by comparing the user’s inquiry against the specific tools and tasks it is able to execute. If what the user wants isn't on that list, Supervisor Agent doesn't try to figure it out or get creative. It admits its limitations and escalates to a human. It only routes to what it was built for.
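
As a rough illustration of that routing rule, here is a minimal Python sketch. The names (`ALLOWED_TASKS`, `classify_intent`, `route`) and the keyword matching are assumptions invented for this example, not Pypestream's actual implementation; a real supervisor would rely on a far more capable intent analysis. The only point is the shape of the decision: anything that does not match the allowlist escalates to a human.

```python
from typing import Optional

# Hypothetical allowlist: the only tasks this agent system was built to handle.
ALLOWED_TASKS = {
    "order_status": ("where is my order", "order status", "track my order"),
    "schedule_appointment": ("schedule appointment", "book appointment"),
}

ESCALATE = "escalate_to_human"


def classify_intent(message: str) -> Optional[str]:
    """Toy stand-in for intent analysis: match known phrases only."""
    text = message.lower()
    for task, phrases in ALLOWED_TASKS.items():
        if any(phrase in text for phrase in phrases):
            return task
    return None


def route(message: str) -> str:
    """Return an allowlisted task, or escalate when nothing matches."""
    task = classify_intent(message)
    return task if task is not None else ESCALATE
```

Under these assumptions, `route("Where is my order?")` returns `"order_status"`, while a request about payroll or a prompt asking the agent to ignore its instructions returns `"escalate_to_human"`. The router never falls back to a best guess.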

Question answered and problem solved, right? Not quite. If security were just about having one really good checkpoint, this would not be the question of the moment. Unfortunately, that's not how real systems work, especially when you're dealing with AI that needs to be flexible to be helpful. We need multiple layers, and they all have to work together.

As a next layer of security, all workflows in a Pypestream AI-enabled solution are not just routed by Supervisor Agent; they’re also processed through an action observability layer. With this step, the Pypestream platform is not just logging what happened, it is tracking why actions happened. So, when something unusual or suspicious starts showing up (like a burst of odd requests coming from the same place, or someone systematically probing for data they shouldn't have access to), we see it as it happens, along with what resulted. Complete observability means we can understand, respond, and improve as needed, in real time. Through our observability layer, our clients have comprehensive logs of what their AI is doing. Our teams can run A/B tests on different approaches and make real-time updates based on what we’re seeing.
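
To make the idea of logging the "why" alongside the "what" concrete, here is a small Python sketch. Everything in it (`ActionEvent`, `ActionLog`, the `threshold` value, the session identifiers) is hypothetical and chosen for illustration only; it is not a description of Pypestream's observability layer. It records a reason with every action and flags any source that keeps sending out-of-scope requests.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ActionEvent:
    source: str     # e.g. a session or client identifier (hypothetical)
    action: str     # what the agent did
    reason: str     # why it did it
    in_scope: bool  # whether the request matched an allowed task


@dataclass
class ActionLog:
    threshold: int = 3  # illustrative cutoff, not a recommended value
    events: List[ActionEvent] = field(default_factory=list)

    def record(self, event: ActionEvent) -> None:
        self.events.append(event)

    def suspicious_sources(self) -> List[str]:
        """Sources with at least `threshold` out-of-scope requests."""
        counts = Counter(e.source for e in self.events if not e.in_scope)
        return sorted(s for s, n in counts.items() if n >= self.threshold)


# Usage: three out-of-scope requests from one session, one normal request.
log = ActionLog()
for _ in range(3):
    log.record(ActionEvent("session-42", "escalate", "request not on task list", False))
log.record(ActionEvent("session-7", "order_status", "matched order lookup", True))
print(log.suspicious_sources())  # prints ['session-42']
```

A production system would likely add time windows and richer signals, but even this simple count shows why storing the reason and the scope decision with each event makes probing behavior visible.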

When we explain this to clients, some of them expect something more complicated than a few layers of security. And sure, the implementation has deep complexity. But the philosophy? It's pretty straightforward. We set explicit, clear boundaries. We enforce them at multiple levels. We have visibility into all of it. That's what actually works.

One of our engineers put it best when they remarked, "We want to make sure the platform is in the middle, and that we can control what we're giving out to whom and when, so that even if something does slip through the first line of defense, there’s a second line to bounce off as well."

Let's say, hypothetically, a bad actor finds a way to manipulate one of our AI agents. Maybe they are trying to get a full refund on a recent purchase. Even if they manage to get past Supervisor Agent and into a refunds workflow, when that AI agent tries to actually fulfill the request, it hits the second layer of protection. The platform checks: Is this action part of the expected workflow? Is the agent allowed to give this level of refund to this customer? Does the customer meet all the business rules we agreed on? If the answer to any of these is no, the request is rejected and the conversation is routed back to what the agent is allowed to handle.
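
The refund scenario can be sketched as a rule check that runs outside the AI agent. The rule values below (`MAX_REFUND_FRACTION`, `REFUND_WINDOW_DAYS`) and the `RefundRequest` fields are assumptions made up for this example; real limits would come from the business rules agreed with each client.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RefundRequest:
    workflow: str             # workflow the conversation was routed into
    amount: float             # refund the agent is trying to issue
    order_total: float        # what the customer actually paid
    days_since_purchase: int  # age of the purchase in days


# Hypothetical business rules; real limits would come from the client.
MAX_REFUND_FRACTION = 0.5
REFUND_WINDOW_DAYS = 30


def check_refund(req: RefundRequest) -> Tuple[bool, str]:
    """Second-layer check, run by the platform rather than the AI agent."""
    if req.workflow != "refunds":
        return False, "action is not part of the expected workflow"
    if req.amount <= 0 or req.amount > req.order_total * MAX_REFUND_FRACTION:
        return False, "refund amount is outside the allowed range"
    if req.days_since_purchase > REFUND_WINDOW_DAYS:
        return False, "purchase is outside the refund window"
    return True, "approved"
```

With these illustrative rules, a manipulated agent asking for a full refund of 100.0 on a 100.0 order is rejected with "refund amount is outside the allowed range", no matter how the conversation reached the refunds workflow. The check depends only on the request data, not on anything the user said.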

Our clients expect to know exactly which data is going where and when. The Pypestream platform, with features like Supervisor Agent and the observability layer, in the hands of our AI practitioners, gives them that. The platform architecture ensures that a user can neither bypass Supervisor Agent to reach sensitive operations nor avoid observability. Our clients are not just trusting that security is handled; we can show them how it is working.
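
One way to picture a platform that sits "in the middle" is a single entry point that both gates and records every action. The `Platform` class below is a hypothetical sketch under that assumption, not Pypestream's architecture: operations are reachable only through `execute`, and `execute` writes an audit entry whether the action runs or is rejected.

```python
from typing import Callable, Dict, List, Tuple


class Platform:
    """Single entry point: every action is checked and logged."""

    def __init__(self, allowed_actions: Dict[str, Callable[[dict], str]]) -> None:
        self._actions = dict(allowed_actions)       # registered operations only
        self.audit_log: List[Tuple[str, str]] = []  # (action name, outcome)

    def execute(self, action: str, payload: dict) -> str:
        handler = self._actions.get(action)
        if handler is None:
            # Unregistered actions are never attempted, but still recorded.
            self.audit_log.append((action, "rejected"))
            return "escalate_to_human"
        result = handler(payload)
        self.audit_log.append((action, "executed"))
        return result


# Usage with one made-up registered action and one unregistered request.
platform = Platform({"order_status": lambda p: f"Order {p['order_id']} has shipped"})
first = platform.execute("order_status", {"order_id": "A1"})
second = platform.execute("export_all_customers", {})
```

Here `first` is the order message, `second` is `"escalate_to_human"`, and `platform.audit_log` holds an entry for both calls. Because gating and logging live in the same code path, skipping one would mean skipping the other.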

In more than a decade in this business, we have learned that the best defense against jailbreaks isn't just technical sophistication. It's a platform built on the fundamental design principle that it does nothing except what it has been explicitly assigned to do. No surprises. Just systems of AI agents that do exactly what they’re supposed to do, and nothing more.


