Large language models still lack reliable vulnerability research (VR) and exploit development (ED) capabilities, according to a report from Forescout's Vedere Labs covered by Infosecurity Magazine. Of the models tested, 48% failed the first VR task and 55% failed the second, while 66% failed the first ED task and 93% failed the second. Most of the LLMs showed instability, and those that completed ED tasks needed user guidance, including error interpretation and output debugging. Open-source LLMs available on HuggingFace were the weakest at both VR and ED tasks. However, VR and ED performance improved significantly over the three-month testing period. "These results suggest that generative AI hasn't yet transformed how vulnerabilities are discovered and exploited by threat actors, but that may be about to change. The age of 'vibe hacking' is approaching, and defenders should start preparing now," said the researchers, who expect AI-based exploits to increase in prevalence but not in sophistication.
AI/ML, Vulnerability Management
Vulnerability research, exploitation capabilities of LLMs still lagging
