Google just launched its first dedicated AI vulnerability reward program, offering up to $30,000 for researchers who uncover security flaws that could turn AI systems into weapons. The move comes as prompt injection attacks threaten to weaponize everything from smart home devices to enterprise email systems, with bug hunters already earning over $430,000 in AI-related bounties over the past two years.
Google is putting serious money behind AI security. The tech giant launched its first dedicated AI vulnerability reward program Monday, offering researchers up to $30,000 for uncovering security flaws that could weaponize artificial intelligence systems. The timing isn't coincidental - as AI systems become more integrated into daily workflows, the attack surface is expanding rapidly.
The program's qualifying vulnerabilities read like a cybersecurity nightmare playbook. Researchers can earn top-tier payouts for discovering "rogue actions" - AI exploits that cause real-world harm. Think prompt injections that trick Google Home into unlocking doors, or malicious prompts that summarize someone's entire email inbox and forward it to an attacker's account.
This isn't Google's first rodeo with AI security bounties. Bug hunters have already pulled in over $430,000 during the two years since the company officially started inviting AI researchers to probe its AI features. But Monday's announcement formalizes what constitutes an AI vulnerability, creating clear categories for researchers who previously navigated murky waters.
The program breaks down AI bugs as issues that exploit large language models or generative AI systems to cause harm or bypass security measures. At the top of the threat hierarchy are "rogue actions" - modifications to user accounts or data that compromise security or trigger unwanted behaviors. One particularly creative example from previous research showed how a poisoned Google Calendar event could manipulate smart home devices, opening shutters and killing lights.
But Google's drawing clear lines about what won't earn bounties. Simply getting Gemini to hallucinate or generate problematic content doesn't qualify. "Issues related to content produced by AI products - such as generating hate speech or copyright-infringing content - should be reported to the feedback channel within the product itself," . The company wants its AI safety teams to "diagnose the model's behavior and implement the necessary long-term, model-wide safety training" rather than patch individual content issues.