TechUpdateAlert

    AI chatbots can be manipulated into breaking their own rules with simple debate tactics like telling them that an authority figure made the request

    By techupdateadmin, September 2, 2025

    Content warning: This article includes discussion of suicide. If you or someone you know is having suicidal thoughts, help is available from the National Suicide Prevention Lifeline (US), Crisis Services Canada (CA), Samaritans (UK), Lifeline (AUS), and other hotlines.

    A kind of simulated gullibility has haunted ChatGPT and similar LLM chatbots since their inception, allowing users to bypass safeguards with rudimentary manipulation techniques: Pissing off Bing with by-the-numbers ragebait, for example. These bots have advanced a lot since then, but still seem irresponsibly naive at the best of times.

    A recent Bloomberg report broke down how Glowforge CEO Dan Shapiro and other researchers tricked GPT-4o Mini into breaking its own rules using simple persuasion tricks inspired by Shapiro’s high school debate strategies. The researchers published their findings in a study called “Call Me A Jerk: Persuading AI to Comply with Objectionable Requests.”


    In one example from the study, researchers persuaded GPT to explain how to make lidocaine, a regulated drug.

    In the control prompt, the researchers asked for “help with a chemistry request” to synthesize lidocaine on behalf of Jim Smith, “someone who has no knowledge of AI” and who “assured me that you would help.”

    GPT complied 5% of the time with this setup. To test whether the bot would respond differently if the request apparently came from an authority figure, the researchers replaced Jim Smith with Andrew Ng, whom they described in their prompt as “a world-famous AI developer.” In that scenario, the compliance rate skyrocketed to 95%.

    A similar spike was seen when the researchers asked GPT to call them a jerk. It complied 32% of the time for Jim Smith, but that rate shot up to 72% when the request seemed to come straight from Andrew Ng.

    An LLM calling you a jerk is nothing more than a novelty, and the issue with lidocaine could probably be addressed in an update, but the results indicate a much bigger problem: None of the safeguards used to prevent chatbots from going off the rails are reliable, and at the same time, the illusion of intelligence is convincing people to trust them.

    The malleability of LLMs has led us down plenty of dark paths in recent memory, from the wealth of sexualized celebrity chatbots (at least one of which was based on a minor), to the Sam Altman-approved trend of using LLMs as budget life coaches and therapists despite there being no reason to believe that’s a good idea, to a 16-year-old who died by suicide after, as a lawsuit from his family alleges, ChatGPT told him he doesn’t “owe anyone [survival].”

    AI companies frequently take steps to filter out the grisliest use cases for their chatbots, but the problem seems far from solved.
