That robot sounds just like you

First, OpenAI tackled text with ChatGPT, then images with DALL-E. Next, it announced Sora, its text-to-video platform. But perhaps the most pernicious technology is what might come next: text-to-voice. Not just audio — but specific voices.

A group of OpenAI clients is reportedly testing a new tool called Voice Engine, which can mimic a person’s voice based on a 15-second recording, according to the New York Times. And from there it can translate the voice into any language.

The report outlined a series of potential abuses: spreading disinformation, allowing criminals to impersonate people online or over phone calls, or even breaking voice-based authenticators used by banks.

In a blog post on its own site, OpenAI seems all too aware of the potential for misuse. Its usage policies mandate that anyone using Voice Engine obtain consent before impersonating someone else and disclose that the voices are AI-generated, and OpenAI says it’s watermarking all audio so third parties can detect it and trace it back to the original maker.

But the company is also using this opportunity to warn everyone else that this technology is coming, including urging financial institutions to phase out voice-based authentication.

AI voices have already wreaked havoc in American politics. In January, thousands of New Hampshire residents received a robocall from a voice pretending to be President Joe Biden, urging them not to vote in the Democratic primary election. It was generated using simple AI tools and paid for by an ally of Biden's primary challenger Dean Phillips, who has since dropped out of the race.

In response, the Federal Communications Commission clarified that AI-generated robocalls are illegal, and New Hampshire’s legislature passed a law on March 28 that requires disclosures for any political ads using AI.

So, what makes this so much more dangerous than any other AI-generated media? The imitations are convincing. The Voice Engine demonstrations so far shared with the public sound indistinguishable from the human-uttered originals — even in foreign languages. But even the Biden robocall, which its maker admitted was made for only $150 with tech from the company ElevenLabs, was a good enough imitation.

But the real danger lies in the absence of other indicators that the audio is fake. With every other form of AI-generated media, there are clues for the discerning viewer or reader. AI text can feel clumsily written, hyper-organized, and chronically unsure of itself, often refusing to give real recommendations. AI images often have a cartoonish or sci-fi sheen, depending on their maker, and are notorious for getting human features wrong: extra teeth, extra fingers, and ears without lobes. AI video, still relatively primitive, is infinitely glitchy.

It’s conceivable that each of these applications of generative AI will improve to the point where it’s indistinguishable from the real thing, but for now, AI voices are the only iteration that feels like it could become utterly undetectable without proper safeguards. And even if OpenAI, often the first to market, is responsible, that doesn’t mean all actors will be.

Voice Engine doesn’t have a set release date. As such, its announcement feels less like a product launch and more like a warning shot.
