
Researchers: ‘Adversarial attacks’ capable of producing harmful AI responses


A study by Amazon Web Services researchers has revealed critical security vulnerabilities in speech-language models (SLMs), large language models that understand and respond to speech, showing they can be manipulated into generating harmful responses through sophisticated audio attacks, according to VentureBeat.


The study found that, despite safety training, SLMs are highly susceptible to "adversarial attacks": slight, imperceptible perturbations of the audio input that can drastically alter a model's behavior. In the researchers' experiments, these attacks achieved an average success rate of 90% in eliciting toxic outputs.
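To make the idea concrete, here is a minimal sketch of a gradient-based audio perturbation in the style of the fast gradient sign method (FGSM). The study's actual attack is not detailed in the article, so `model`, `loss_fn`, and the epsilon value are all illustrative assumptions, not the researchers' method.

```python
# Illustrative FGSM-style adversarial audio perturbation (hypothetical;
# not the attack from the AWS study). Assumes a differentiable `model`
# that maps a waveform tensor in [-1, 1] to some output, and a `loss_fn`
# scoring that output against the attacker's desired target.
import torch

def fgsm_audio_attack(model, waveform, target, loss_fn, epsilon=0.002):
    """Nudge a waveform by a tiny signed-gradient step toward an
    attacker-chosen target output, keeping the change near-inaudible."""
    waveform = waveform.clone().detach().requires_grad_(True)

    # Forward pass: score the model's output against the attack target.
    output = model(waveform)
    loss = loss_fn(output, target)

    # Backward pass: gradient of the loss w.r.t. the raw audio samples.
    loss.backward()

    # One signed-gradient step, bounded by epsilon so the perturbation
    # stays imperceptible to a human listener.
    perturbed = waveform + epsilon * waveform.grad.sign()
    return perturbed.clamp(-1.0, 1.0).detach()  # keep a valid audio range
```

In practice, attacks of this kind are usually iterated over many small steps rather than applied once; the single-step version above is only the simplest instance of the technique.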

Moreover, the study demonstrated that audio attacks crafted against one SLM could transfer to other models, achieving a 10% success rate even without direct access to the target model. This transferability suggests a fundamental flaw in how these systems are currently trained for safety.
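The transfer setting works roughly as follows: the attacker optimizes the audio against an open "surrogate" model they control, then replays the exact same clip against a closed target. A brief sketch, reusing the hypothetical `fgsm_audio_attack` above (all names are assumptions):

```python
# Hypothetical transfer test: craft on a surrogate, replay on a target.
adv_audio = fgsm_audio_attack(surrogate_model, waveform, target, loss_fn)

# No gradients or internals of the target are needed; how often this
# succeeds is what the study's reported 10% transfer rate measures.
target_response = target_model(adv_audio)
```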

The implications are significant: adversarial attacks of this kind could be misused for fraud, espionage, or physical harm.

The researchers proposed countermeasures such as adding random noise to audio inputs, which reduced the attack success rate, though they acknowledged this is not a complete solution.
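A minimal sketch of what such a noise-based defense might look like, assuming a PyTorch-style `model` callable and a waveform tensor in [-1, 1]; the `sigma` value is illustrative, not a figure from the study:

```python
# Illustrative noise-injection defense (hypothetical parameters).
# Small random perturbations can disrupt a finely tuned adversarial
# signal while leaving ordinary speech largely intelligible.
import torch

def noisy_inference(model, waveform, sigma=0.01):
    """Run inference on a copy of the input with Gaussian noise added."""
    noise = sigma * torch.randn_like(waveform)
    return model((waveform + noise).clamp(-1.0, 1.0))
```

The trade-off is inherent: enough noise to wash out an adversarial perturbation also degrades accuracy on benign audio, which is one reason the researchers described it as only a partial mitigation.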
