Looking inside the black box

Looking into the code.
DPA via Reuters
One of the biggest challenges facing artificial intelligence companies is that they don’t know everything about their algorithms. This so-called black box problem is exacerbated by the fact that deep learning models do precisely that — they learn. And when they learn they change. They take in enormous troves of data, detect patterns, and spit something out: How a sentence should read, what an image should look like, how a voice should sound.

But now researchers at Anthropic, the AI startup that makes the chatbot Claude, claim they’ve had a breakthrough in understanding their own model. In a blog post, Anthropic researchers disclosed that they’ve found 10 million “features” of their Claude 3 Sonnet language model, each a pattern that activates when a user inputs something the model recognizes. They’ve been able to map features that are close to one another: The feature for the Golden Gate Bridge, for example, is close to those for Alcatraz Island, the Golden State Warriors, California Governor Gavin Newsom, and the Alfred Hitchcock film Vertigo, which is set in San Francisco. Knowing about these features allows Anthropic to turn them on or off, manipulating the model to break out of its typical mold.

This development offers hope that the companies behind powerful generative AI models will soon have much more control over their creations, as MIT professor Jacob Andreas told the New York Times. “In the same way that understanding basic things about how people work has helped us cure diseases,” Andreas said, “understanding how these models work will both let us recognize when things are about to go wrong and let us build better tools for controlling them.”
