
AI-Driven News Radio: DW's Experimental Podcast Creator

Is there a way to create a five-minute news show in five minutes? Well, not quite, but we're certainly getting there. Andy Giefer, HLT/NLProc aficionado and our team's tech lead, has put together a piece of software that aims to experiment with the automation of web radio – and already shows very promising results. In this post, we take a look at how the prototype application works, and what kind of benefits and risks it brings.

So what exactly is the Podcast Creator? Described in the most concise way, it's an easy-to-use, AI-driven, mash-up authoring tool (only available for macOS at the moment) that allows journalists to create a professional news podcast in no time, based on previously published web articles, partly or fully synthetic voices and a largely automated production workflow.

Let's take a quick look at the current UI:

The core functionalities

The app works pretty much like this: Users copy and paste text from any news article, import it via a DW URL, or bring it in directly via MONITIO (a state-of-the-art news monitoring platform co-developed by DW). They can then manually edit the selected news items, change the running order – and will at some point also be able to request automatic summaries (via LLaMA or Alpaca).

Once the news podcast is properly compiled and edited, users prompt the system to build a high-quality audio file, based on a template.

The workflow basically looks like this:

The template provided by the Podcast Creator follows a typical newscast structure: an introduction with music, a main section with the actual stories, and an outro.
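
To make the idea of a running order and a template a bit more tangible, here is a minimal Python sketch. It is purely illustrative and not the Podcast Creator's actual code: the names NewsItem, Segment, move_item and build_newscast, the jingle cue "intro_jingle" and the spoken intro and outro lines are all assumptions made up for this example. It only shows how an edited list of stories could be expanded into an ordered plan of segments that a later step would turn into audio.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NewsItem:
    """One story in the running order (hypothetical structure)."""
    headline: str
    body: str


@dataclass
class Segment:
    """One step of the production plan: a music cue or text to be spoken."""
    kind: str      # "music" or "speech"
    content: str   # cue name for music, text for speech


def move_item(items: List[NewsItem], old_index: int, new_index: int) -> List[NewsItem]:
    """Return a copy of the running order with one item moved to a new position."""
    reordered = list(items)
    reordered.insert(new_index, reordered.pop(old_index))
    return reordered


def build_newscast(items: List[NewsItem], presenter: str) -> List[Segment]:
    """Expand the running order into intro, main section and outro segments."""
    # Introduction with music (the cue name is a placeholder).
    segments = [
        Segment("music", "intro_jingle"),
        Segment("speech", f"This is the news with {presenter}."),
    ]
    # Main section: one spoken segment per story, in running order.
    for item in items:
        segments.append(Segment("speech", f"{item.headline}. {item.body}"))
    # Outro.
    segments.append(Segment("speech", f"That was the news with {presenter}."))
    return segments


if __name__ == "__main__":
    stories = [
        NewsItem("Story A", "Text of story A."),
        NewsItem("Story B", "Text of story B."),
    ]
    # Move the second story to the top, then build the plan.
    for segment in build_newscast(move_item(stories, 1, 0), presenter="Alex"):
        print(segment.kind, "|", segment.content)
```

In a sketch like this, the editorial decisions (which stories, in which order) stay in the list of items, while the template decides what surrounds them. Each speech segment would then be handed to whichever voice the user has picked.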

The name of the presenter and their voice can be customized. And here's where it gets really interesting: Voices can be fully synthetic (i.e. system voices) or based on real people (i.e. news anchors), whose voices are cloned with Coqui TTS. We're also exploring the possibilities of voice shaping (using tools provided by ElevenLabs), which describes the process of taking an artificial voice and making it sound like your own. This allows presenters to read the news in a language that's foreign to them – all based on abstract voice characteristics and AI models.

The benefits

So what makes the app interesting and useful? First of all, the Podcast Creator can help journalists save a lot of time: The process of curating and revising a standard newscast is drastically sped up (without losing editorial control), and the routine task of reading the news is fully automated. Journalists can thus focus on other, more demanding or rewarding work: analysis, comments, big picture explainers, cooperation with other newsrooms, you name it. Shaped, cloned, or fully synthesized voices also ensure the news can be broadcast 24/7, 365 days a year, even when the hosts are unavailable or budgets don't allow a continuous service.

The most exciting (and most radical) feature may be the potential to expand the news service to virtually any region in the world. As long as there's a voice model, the news can be read in that language. And, as pointed out before, AI can also make a sampled and shaped voice perform in a plethora of languages the human anchor would've never been able to master in their lifetime. The system thus also enables the massive "export" of voice talent.

The risks

As with all tools based on automation and AI, there are a number of risks, and we're fully aware of them.

For instance, it's not unlikely that a massive rollout could make a broadcaster less trustworthy and damage its brand, even if journalists and production managers are 100% transparent about the use of AI technology. Audiences might say: "They're faking the voices, so who knows if they're not also faking the news?"

Heavy use of AI voices would also make a news show less authentic. In spite of quantum leaps in the field of HLT, most people can still tell the difference between a computer voice and a human host, especially when listening closely for a couple of minutes. Furthermore, occasional background noises, slightly sore voices, bloopers, and other imperfections hinting at a "manual" production may actually be a good thing in the long run – and something that's very hard to reproduce synthetically.

There's also the risk that in times of austerity and fierce competition, AI technology will simply be used to cut budgets, lay off staff, or increase earnings – instead of using freed resources for better and more inclusive journalism.

What's next?

In the next couple of months, Andy and his team will refine the Podcast Creator, do more user testing, and add more features. But what's even more important: A lot of experts at DW will discuss the app, the underlying technology and all its implications. Before doing anything other than lab experiments with a tool like this, it's highly important to consider ethical and legal aspects. If DW puts AI tools to work at scale, it'll have to be in the form of trusted, transparent, and sustainable AI.

Alexander Plaum