You can now raise an alarm when the AI misbehaves

by OmarAli 4 days ago

Writing AI Lab every week, I regularly encounter AI models that behave poorly and bizarrely. There’s usually nothing I can do about it other than share these stories with you. But that could soon change.

A group of AI researchers has created a crowdsourcing website called Flaw Reporting for AI (FLARE-AI) to report and track AI harms. For example, if a chatbot generates malware or a bomb-making recipe, reveals personal information, or causes users to become delusional, FLARE-AI could be used to raise the alarm. The open-source code behind the system allows others to review a problem and forward reports to model developers, as well as to organizations such as MITRE, a nonprofit that tracks problems with engineering systems. It’s a bit like Downdetector, which collects real-time user reports about global service outages affecting things like apps and websites.

The website is another step in the group’s ongoing work on AI flaw reporting, which I first wrote about last year. Members of the group also discussed a congressional bill announced in June that would give the U.S. government a central role in prosecuting this type of AI misbehavior.

“Right now, there is no centralized, accountable way to report errors in AI systems,” says Avijit Ghosh, an artificial intelligence policy researcher at Hugging Face, who led the development of FLARE-AI with computer scientists Elaine Zhu and Shayne Longpre.

The alarm system was developed in collaboration with 49 AI experts from 32 different organizations. In an article outlining the work, the researchers argue that their initiative could prove crucial as AI becomes more widespread and agentic systems become more powerful. They believe that the lack of a consistent method for reporting AI errors is a significant problem.

“I think it’s a really good initiative,” says Jessica Ji, a researcher at the Center for Security and Emerging Technology think tank. Ji says researchers are right to note that existing reporting mechanisms are fragmented and that AI models are black boxes. “I support anything that makes AI more transparent,” she says.

Although bugs and cybersecurity issues are getting a lot of attention – especially recently – Ghosh tells me that problems with AI systems include issues such as psychological harm, discrimination or bias, and misinformation. He adds that different companies have different standards on such issues, causing some issues to go undetected. “As there is no coordinated disclosure system, there are no external mechanisms to enforce transparency,” says Ghosh.

A number of recent incidents involving popular AI tools show how easily the technology can break.

This week, a company called LayerX revealed a way to make AI-powered web browsers, including OpenAI’s Atlas and Perplexity’s Comet, overstep their limits. For example, convincing the AI model behind the browser that it is playing a game could cause the browser to go rogue and attempt to hack a website. (The companies responsible for the affected browsers have fixed the problem, LayerX says.) And in April of this year, Johann Rehberger, a security researcher, discovered a way to trick Claude into revealing personal information using images generated by ChatGPT.

AI also leads to bizarre new problems. Last year, OpenAI was forced to update its models after discovering that they were overly sycophantic, which sometimes seemed to encourage delusional thinking.

According to Rumman Chowdhury, CEO and founder of Humane Intelligence PBC, FLARE-AI could give many AI developers a useful way to let people report problems with their tools. However, she adds that such initiatives often come with significant challenges.

https://www.wired.com/story/flare-website-ai-flaw-reporting-safety/
