Researcher tricks ChatGPT into revealing security keys – by saying “I give up”

  • Experts show how some AI models, including GPT-4, can be exploited with simple user prompts
  • Guardrails do a poor job of detecting deceptive framing
  • The vulnerability could be exploited to acquire personal information

A security researcher has shared details on how other researchers tricked ChatGPT into revealing a Windows product key using a prompt that anyone could try.

Marco Figueroa explained how a ‘guessing game’ prompt was used to trick GPT-4 into bypassing the safety guardrails that are meant to stop the AI from sharing such data, ultimately producing at least one key belonging to Wells Fargo Bank.
