Ride the Lightning
Cybersecurity and Future of Law Practice Blog
by Sharon D. Nelson Esq., President of Sensei Enterprises, Inc.
Red-Teaming AI Models Coming to DEF CON 31
May 9, 2023
Cyberscoop reported on May 4 that a group of leading artificial intelligence companies in the U.S. have committed to open their models to red-teaming at this year’s DEF CON hacking conference as part of a White House initiative to confront the security risks posed by AI.
Attendees at the premier hacking conference held annually in Las Vegas in August will be able to attack models from Anthropic, Google, Hugging Face, Microsoft, NVIDIA, OpenAI and Stability AI to find vulnerabilities. The event is expected to draw thousands of security researchers.
A senior administration official speaking to reporters on condition of anonymity ahead of the announcement said the red-teaming event is the first public assessment of large language models. “Red-teaming has been really helpful and very successful in cybersecurity for identifying vulnerabilities,” the official said. “That’s what we’re now working to adapt for large language models.”
This isn’t the first time Washington has looked to the ethical hacking community at DEF CON to help find weaknesses in critical and emerging technologies. The U.S. Air Force has held capture-the-flag contests there for hackers to test the security of satellite systems and the Pentagon’s Defense Advanced Program Research Agency brought a new technology to the conference that could be used for more secure voting.
Very fast advances in machine learning have resulted in many product launches featuring generative AI tools. But in the rush to launch these models, many AI experts are worried that companies are moving too quickly to ship new products to market without thoroughly addressing safety and security concerns.
Historically, advances in machine learning have occurred in academic communities and open research teams, but AI companies are increasingly closing off their models to the public, making it more difficult for independent researchers to examine potential shortcomings.
“Traditionally, companies have solved this problem with specialized red teams. However this work has largely happened in private,” AI Village founder Sven Cattell said in a statement. “The diverse issues with these models will not be resolved until more people know how to red team and assess them.”
Among the risks posed by these models are using them to create and spread disinformation; to write malware; to create phishing emails; to provide harmful knowledge not widely available to the public, such as instructions on how to create toxins; biases that are difficult to test for; the emergence of unexpected model properties and what industry researchers refer to as “hallucinations” — when an AI model gives a confident response to a query that isn’t grounded in reality.
Trust me, I’ve seen a fair number of hallucinations – and find them worrisome.
The DEF CON event will rely on an evaluation platform developed by Scale AI, a California company that produces training for AI applications. Participants will be given laptops to use to attack the models. Any bugs discovered will be disclosed using industry-standard responsible disclosure practices.
The announcement of the red teaming coincided with a set of White House initiatives aimed at improving the safety and security of AI models, including $140 million in funding for the National Science Foundation to launch seven new national AI institutes. The Biden administration also announced that the Office of Management and Budget will release guidelines for public comment this summer for how federal agencies should deploy AI.
I predict this will be a DEF CON to remember. And it is sorely needed!
Sharon D. Nelson, Esq., President, Sensei Enterprises, Inc.
3975 University Drive, Suite 225, Fairfax, VA 22030
Email: Phone: 703-359-0700
Digital Forensics/Cybersecurity/Information Technology
https://senseient.com
https://twitter.com/sharonnelsonesq
https://www.linkedin.com/in/sharondnelson
https://amazon.com/author/sharonnelson