Underground Demand for Malicious LLMs Is Robust

Summary: Researchers from Indiana University Bloomington have uncovered a thriving underground market for malicious large language models (LLMs), dubbed “Mallas,” which are being used for various cybercriminal activities. These models, often uncensored or jailbroken, pose significant risks as they can generate phishing emails and malware at scale.

Threat Actor: Cybercriminals | cybercriminals
Victim: Tech companies | tech companies

Key Point :

  • The underground market has identified 212 malicious LLMs, with one model, WormGPT, generating $28,000 in just two months.
  • Malicious LLMs can produce effective phishing emails and malware, with 41.5% capable of generating phishing content.
  • Researchers have made available a dataset of prompts used for creating malware to help combat these threats.
  • Tech companies are urged to implement stronger safeguards and restrict access to uncensored models to mitigate risks.

Artificial Intelligence & Machine Learning
,
Next-Generation Technologies & Secure Development

So-Called Mallas Are Easily Bought or Rented

Underground Demand for Malicious LLMs Is Robust
Image: Shutterstock

The underground market for illicit large language models is a lucrative one, said academic researchers who called for better safeguards against artificial intelligence misuse.

See Also: Mitigating Identity Risks, Lateral Movement and Privilege Escalation

Academics at the Indiana University Bloomington said they identified 212 malicious LLMs on underground marketplaces from April through September. The financial haul for the threat actor behind one of them, WormGPT, is calculated at $28,000 over just two months, which underscores the allure for bad agents to break artificial intelligence guardrails and also the raw demand propelling them to do so.

Several illicit LLMs on sale were uncensored and built on open-source standards, and some were jailbroken commercial models. Academics behind the paper call the malicious LLMs “Mallas.”

Hackers can maliciously use Mallas to write targeted phishing emails at scale at a fraction of the cost, develop malware and automatically scope and exploit zero-days.

Tech giants developing artificial intelligence models have mechanisms in place to prevent jailbreaking and working on methods to automate detection of jailbreaking prompts. But hackers have also discovered methods to bypass the guardrails.

Microsoft recently detailed hackers using a “skeleton key” to force OpenAI, Meta, Google and Anthropic’s LLMs to respond to illicit requests and reveal harmful information. Researchers from Robust Intelligence and Yale University also identified an automated method for jailbreaking OpenAI, Meta and Google LLMs that doesn’t require specialized knowledge, such as the model parameters.

University of Indiana researchers found two uncensored LLMs: DarkGPT, sold for 78 cents for every 50 messages, and Escape GPT, a subscription service that costs $64.98 a month. Both models produced accurate, malicious code that went undetected by antivirus tools about two-thirds of the time. WolfGPT, available for a $150 flat fee, allowed users to write phishing emails that could evade a majority of spam detectors.

Nearly all of the malicious LLMs the researchers examined were capable of generating malware, and 41.5% could produce phishing emails.

The malicious products and services were primarily built on OpenAI’s GPT-3.5 and GPT-4, Pygmalion-13B, Claude Instant and Claude-2-100k. OpenAI is the LLM vendor that the malicious GPT builders targeted most frequently.

To help prevent and defend against attacks the researchers discovered, they made available to other researchers the dataset of prompts used to create malware through the uncensored LLMs and to bypass the safety features of public LLM APIs. They also urged AI companies to default to releasing models with censorship settings in place and allow access to uncensored models only to the scientific community, with safety protocols in place. Hosting platforms such as FlowGPT and Poe should do more to ensure that Mallas aren’t available through them, they said, adding, “This laissez-faire approach essentially provides a fertile ground for miscreants to misuse the LLMs.”

Source: https://www.bankinfosecurity.com/underground-demand-for-malicious-llms-robust-a-26223