
AI Breakdown

arXiv paper - PANDAS: Improving Many-shot Jailbreaking via Positive Affirmation, Negative Demonstration, and Adaptive Sampling

2025-02-27

In this episode, we discuss PANDAS: Improving Many-shot Jailbreaking via Positive Affirmation, Negative Demonstration, and Adaptive Sampling by Avery Ma, Yangchen Pan, and Amir-massoud Farahmand. The paper introduces PANDAS, a hybrid technique that enhances many-shot jailbreaking by modifying the fabricated dialogues with positive affirmations and negative demonstrations, and by using an optimized adaptive sampling method tailored to the target prompt. Experimental results on AdvBench and HarmBench using advanced large language models show that PANDAS significantly outperforms baseline methods in long-context scenarios. Additionally, an attention analysis shows how PANDAS exploits long-context vulnerabilities, offering insight into the mechanics of many-shot jailbreaking.
