Anthropic launches $15K jailbreak bounty program for its unreleased next-gen AI

Synthetic intelligence agency Anthropic introduced the launch of an expanded bug bounty program on Aug.8, with rewards as excessive as $15,000 for individuals who can “jailbreak” the corporate’s unreleased, “subsequent era” AI mannequin.

Anthropic’s flagship AI mannequin, Claude-3, is a generative AI system just like OpenAI’s ChatGPT and Google’s Gemini. As a part of the corporate’s efforts to make sure that Claude and its different fashions are able to working safely, it conducts what’s referred to as “pink teaming.”

Crimson teaming

Crimson teaming is mainly simply making an attempt to interrupt one thing on function. In Claude’s case, the purpose of pink teaming is to attempt to work out the entire ways in which it could possibly be prompted, compelled, or in any other case perturbed into producing undesirable outputs.

Throughout pink teaming efforts, engineers would possibly rephrase questions or reframe a question as a way to trick the AI into outputting info it’s been programmed to keep away from.

For instance, an AI system educated on knowledge gathered from the web is prone to comprise personally identifiable info on quite a few folks. As a part of its security coverage, Anthropic has put guardrails in place to stop Claude and its different fashions from outputting that info.

As AI fashions turn out to be extra strong and able to imitating human communication, the duty of making an attempt to determine each attainable undesirable output turns into exponentially difficult.

Bug bounty

Anthropic has carried out a number of novel security interventions in its fashions, together with its “Constitutional AI” paradigm, however it’s at all times good to get recent eyes on a long-standing concern.

In accordance with an organization weblog submit, it’s newest initiative will expand on present bug bounty packages to concentrate on common jailbreak assaults:

“These are exploits that would enable constant bypassing of AI security guardrails throughout a variety of areas. By concentrating on common jailbreaks, we goal to handle a few of the most vital vulnerabilities in essential, high-risk domains akin to CBRN (chemical, organic, radiological, and nuclear) and cybersecurity.”

The corporate is simply accepting a restricted variety of individuals and encourages AI researchers with expertise and those that “have demonstrated experience in figuring out jailbreaks in language fashions” to use by Friday, Aug. 16.

Not everybody who applies might be chosen, however the firm plans to “broaden this initiative extra broadly sooner or later.”

Those that are chosen will obtain early entry to an unreleased “subsequent era” AI mannequin for red-teaming functions.

Associated: Tech firms pen letter to EU requesting more time to comply with AI Act

Anthropic launches $15K jailbreak bounty program for its unreleased next-gen AI

admin

Recommended

10 Altcoins had a Significant Price Increase

Will The Halving Send Bitcoin Price To $100,000? Analytics Platform Reveals What You Should Expect

Popular News

Protocol-Owned Liquidity: A Sustainable Path for DeFi

Cryptocurrency for College: Exploring DeFi Scholarship Models

What are rebase tokens, and how do they work?

What is Velodrome Finance (VELO): why it’s a next-gen AMM

$10 XRP Price Envisioned By Fund Manager As Ripple Mounts Trillion-Dollar Payment Markets ⋆ ZyCrypto

Latest

I hid 4 Bluetooth trackers (including AirTags) to test their reliability – here’s how Android rivals compared

I stopped using my iPhone’s hotspot after testing this 5G router – and that won’t change

Categories

Follow us

Recommended