It turns out that all you need to get past an AI chatbot’s guardrails is a little creativity. In a study published by Icaro Lab titled “Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models,” researchers bypassed various LLMs’ safety mechanisms by phrasing their prompts as poetry.
According to the study, the “poetic form operates as a general-purpose jailbreak operator,” with results showing an overall 62 percent success rate in eliciting prohibited material, including anything related to making nuclear weapons, child…
→ Continue reading at Engadget