Image generated with Copilot. Prompt: “visual representation of an abstract audio waveform flying through air”.
Verification, Best Practice

Synthetic Audio Detectors Put to the Test

Synthetic media and deepfakes have been around for a while. So why do journalists not have a reliable tool to detect them yet? The answer: Because there is not just one way to create them. Currently, we rely on a mix of using our own senses, known verification methods, and imperfect but useful detection tools. We have tested three of them for you – and here is what we have found.

Detecting synthetic audio is a complex undertaking. Different synthesis models require different detectors – a challenge, especially for researchers. Furthermore, synthetic media are evolving rapidly. In an interview from November 2024, Hany Farid, Berkeley professor and image analysis expert, reflected:

"If you would have asked me in 2021/22 if I worried about deepfakes of politicians, I would have said no, not really, because we had not gotten the voice yet. And the voice is really, really hard. And then, boom, I woke up one day and we got voice cloning.” (Source: Deepfakes, AI and the Battle for Democracy)

Unfortunately, journalists still lack a dependable tool for detecting synthetic audio. However, a few free tools claim to be able to spot synthetic voices. We set out to test three of them: Hiya, Deepfake Total, and DeepFake-O-Meter.

Expectations

As media innovation managers, our goal is to support journalists and make their work easier – by equipping them with the knowledge, methods and technology to navigate both information and disinformation. Based on our own experience in journalism and conversations with colleagues, we have identified the following basic requirements a synthetic audio detection tool should meet:

Usability

  • Support for uploading all common file formats (audio and video) and URLs from social networks
  • No file length restrictions
  • Accessible user interface
  • Easy navigation (playback speed, skip, etc.)
  • Clear error messages
  • Processing time no longer than 30 seconds

Transparency

  • Who develops and funds the tool?
  • Training data used (languages, voices, models)
  • Results must be reproducible, traceable and detailed, with clear AI flags
  • Transparent and comprehensible explanations, including info on potential false positives
  • Clear information on what happens with user data

The three free tools we have tested are currently employed by journalists and the open source intelligence (OSINT) community in real use cases. Two of them have been developed by research institutes, one by a private company. The latter is implemented in the InVID Verification Plugin, which is widely used in the journalism community.

In order to evaluate the tools, we built a diverse dataset of ten samples: fully AI-generated clips, mixed clips combining natural and synthetic audio, and fully natural audio taken from social media. We also produced our own natural test recordings. The dataset covers multiple languages, different lengths, and both male and female voices, in a variety of audio formats such as WAV, M4A and MP3. Before testing, we converted all files to MP3 to ensure comparability. The testing took place in August 2025.
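
If you want to reproduce this preparation step, the snippet below is a minimal sketch of how such a conversion could be scripted in Python. It assumes a local ffmpeg installation on the PATH; the folder names and bitrate are illustrative placeholders, not the exact settings we used.

    # Minimal sketch: batch-convert a folder of audio samples to MP3
    # so that all test files share one format before analysis.
    # Assumes ffmpeg is installed and on the PATH; paths and bitrate
    # are illustrative placeholders.
    import subprocess
    from pathlib import Path

    SRC = Path("samples")        # hypothetical input folder (WAV, M4A, MP3, ...)
    DST = Path("samples_mp3")    # output folder for the converted files
    DST.mkdir(exist_ok=True)

    for src in sorted(SRC.iterdir()):
        if src.suffix.lower() not in {".wav", ".m4a", ".mp3"}:
            continue
        dst = DST / (src.stem + ".mp3")
        # A fixed MP3 bitrate keeps the encoding comparable across samples.
        subprocess.run(
            ["ffmpeg", "-y", "-i", str(src),
             "-codec:a", "libmp3lame", "-b:a", "192k", str(dst)],
            check=True,
        )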

Hiya (in the InVID Plugin)

Screenshot of Hiya’s graphical user interface in the verification plugin.

This tool is available as a Chrome plugin and via the InVID extension (log-in required). As this embedded version allows for a more detailed analysis, we tested Hiya that way.

  • Who develops the tool: Hiya Inc. is a private software company from the US.

  • What they promise: "Free, real-time detection, multi-language, works on any website, detects voices created by all popular voice synthesis tools, requires only 1 second of audio to perform a verification. [...] While we strive to provide the best possible results, we cannot guarantee complete accuracy."

  • What happens with user data: Hiya indicates that uploaded content is "not sold to third parties".

  • Content upload: Via file or audio link. 

  • Formats: No specific formats mentioned – supports common audio formats (MP3, WAV). Video links didn't work.

  • Length: 2s – 5min (longer files failed).

  • Free usage limit: No.  

  • Results presented: Visualized as bar graphs, gauge meters, percentages, and text.

  • Explanation: No detailed guidance; text only repeats visuals and score info.

Our experience with Hiya

Out of ten test cases, Hiya correctly identified four (including three in English), incorrectly identified three, and produced three inconclusive results.

Once enabled in the InVID plugin, the audio feature appears in the left-hand menu. If Hiya can't process your content, you will receive an error message, but it is not always clear what went wrong. The processing time is fast, though.

It takes a moment to familiarize yourself with the interface. On the left, there is a helpful bar graph: It highlights which parts may be AI-generated, indicated by timestamps and different colors (green for authentic, red for synthetic). On the right, you can find a gauge meter which only provides an overall percentage, and this can contradict the graph. As the gauge does not reflect partial synthesis, relying on this alone might lead to false conclusions.

Overall, it is unclear why certain content cannot be processed, which can be frustrating. For example, we had to shorten an audio file, risking the loss of relevant content. Nevertheless, the bar graph can highlight sections that are worth a closer look. The makers of the tool point out that results "may not always be 100% accurate".

Deepfake Total

Screenshot of Deepfake Total’s graphical user interface.

Deepfake Total is a research and educational platform designed to raise awareness about detecting audio deepfakes.

  • Who develops the tool: The Fraunhofer Institute for Applied and Integrated Security (AISEC), a German research institute.

  • What they promise: "Analyse suspicious audio files to detect deepfakes, and automatically share them with the security community."

  • What happens with user data: Not clearly disclosed.

  • Content upload: Via file or URL. 

  • Formats: Supports audio files (MP3, WAV, OGG, M4A, FLAC, AAC) of up to 20 MB. Plus, video links from YouTube, X, and Instagram.

  • Length: No general restriction, but the tool only analyzes a 120-second segment that you get to choose.

  • Free usage limit: No. 

  • Results presented: Visualized as a gauge meter with a percentage, plus an optional bar graph.

  • Explanation: No detailed guidance; text only repeats visuals and score info.

Our experience with Deepfake Total

Deepfake Total correctly identified seven out of ten test cases, incorrectly identified one, and produced two inconclusive results.

Deepfake Total is a stand-alone website that does not require login credentials. It can process uploaded files or social media videos in seconds, automatically extracting the audio. However, you have to select a specific 120-second segment of the audio file, so any AI-generated passages outside that window go unchecked. For the best results, be sure to process the entire file.
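
If a recording is longer than 120 seconds, one way to cover all of it is to split it into 120-second chunks and submit each chunk separately. Here is a minimal sketch, again assuming ffmpeg on the PATH; the file names are placeholders.

    # Minimal sketch: split a long recording into 120-second chunks so
    # that every part of it fits into the detector's 120 s window.
    # Assumes ffmpeg is on the PATH; file names are placeholders.
    import subprocess

    subprocess.run(
        ["ffmpeg", "-y", "-i", "recording.mp3",
         "-f", "segment", "-segment_time", "120",
         "-c", "copy", "chunk_%03d.mp3"],
        check=True,
    )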

Our first impression: The interface feels quite minimalistic. The text size is small, and the results are not immediately visible. The gauge meter in the analysis results shows only a general percentage, which again provides limited insight. It also does not display partial synthesis, so users could overlook subtle AI-generated elements within the content and draw misleading conclusions. Viewing the bar graph requires an extra click; it points to potential AI traces at specific points in time, which makes it particularly useful for deeper investigations.

Fraunhofer, the developer, is a trusted and reputable institute in the domain of synthetic content detection. On Deepfake Total's website, they share a research paper, which is good for transparency, but we presume most users will not take the time to read it. A clear overview of the tool features (e.g. supported languages) would be more useful.

Overall, Deepfake Total leaves a positive impression by showing where it most likely detects AI-generated elements. Still, we are unsure what type of content it handles best (e.g., male voices, English, etc.), because the correct results were spread across test files of different types, with no clear pattern.

In addition, Fraunhofer has developed an interactive game to help detect audio deepfakes. As you play, you will notice your ears becoming more attuned to subtle details such as pauses, emotions, pronunciation and variations in sound. Definitely worth a try!

DeepFake-O-Meter

Screenshot of DeepFake-O-Meter’s graphical user interface.

The DeepFake-O-Meter research platform provides access to various AI-detection models for audio, image and video analysis.

  • Who develops the tool: The Media Forensics Lab (MDFL) at the University at Buffalo.

  • What they promise: "DeepFake-O-Meter is an open-access platform that integrates state-of-the-art, open-source research methods for detecting AI-generated images, videos, and audio. [...] incorporated detection methods are research prototypes and may not always produce accurate or consistent results". 

  • What happens with user data: Not clearly disclosed.

  • Content upload: Via file. 

  • Formats: WAV, MP3.

  • Length: Depends on the model.

  • Free usage limit: Login required, new users get 30 credits. 

  • Results presented: Color-coded scale (from red to green), accompanied by a percentage and a short explanation stating that this rate indicates "AI-Generated Likelihood".

  • Explanation: None, it just repeats the percentage.

Our experience with DeepFake-O-Meter

Once registered and logged in, users can drag and drop or upload files in various formats. Depending on the submitted modality (audio, image or video), the tool then suggests potential models to choose from. Six models are offered for audio.

Caution: When uploading video content, the system chooses the detector models based on the video format, which means they primarily analyze visual aspects rather than audio. If you want to verify the audio, we recommend extracting it from the video first.
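
In practice, that means stripping the video stream before uploading. A minimal sketch, assuming ffmpeg on the PATH; the file names are placeholders.

    # Minimal sketch: drop the video stream and keep only the audio,
    # so an audio detector analyzes sound rather than images.
    # Assumes ffmpeg is on the PATH; file names are placeholders.
    import subprocess

    subprocess.run(
        ["ffmpeg", "-y", "-i", "clip.mp4",
         "-vn",                                   # discard the video stream
         "-codec:a", "libmp3lame", "-b:a", "192k",
         "audio_only.mp3"],
        check=True,
    )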

For our test, we ran all six audio models on all our example files. The models for audio analysis were developed between 2021 and 2025, which immediately raises a question: How accurate are they in detecting cutting-edge synthetic media?

While it was easy to provide an overview of how many of our test cases were identified correctly by the other tools, this was not possible for DeepFake-O-Meter.

The results were confusing, as six different percentage rates were produced for each test audio. In most cases, they ranged widely, sometimes from 0.1% to 99% likelihood of synthesis for the same content. This demonstrates the biggest challenge we faced when using this tool for our journalistic use case: On what basis should I choose a model, and which result can I trust in my specific case?

The provider's transparency regarding the purpose and capabilities of this tool is consistent with our overall assessment: it should be regarded as a "testing ground" and "a free tool to experiment with" for researchers and public users in the detection of synthetic media. Therefore, its outputs should be interpreted with caution and in the appropriate context.

Takeaways

These are our main insights after the tests:

  • Synthetic media are evolving rapidly – but detectors are getting better as well.

  • Nevertheless, we cannot fully recommend a single detector: deepfakes are based on different synthesis models, and each model requires a detection tool trained on it.

  • Forensic traces are weakened when content is shared or edited; whenever possible, get the original file or the first, high-quality upload.

  • Do not hesitate to consult detector developers or domain experts for a second opinion, such as the WITNESS Deepfakes Rapid Response Force.

Also keep in mind that AI detection tools are just one part of the verification puzzle, and we should treat them as such. Other verification techniques are still crucial. For a comprehensive overview of workflows and tools, go to howtoverify.info.

The cat-and-mouse game is ongoing. We will continue to monitor synthetic media, test detectors, and conduct our own research, keeping you informed every step of the way. In the meantime, collaboration is key, so if you have any feedback or are aware of a useful technique or detector, please let us know here.

Transparency

For full contextual transparency we would like to point out that: 

  • Hiya is implemented in Truly Media, a verification and collaboration platform that DW Innovation co-developed with ATC.

  • Fraunhofer is a long-standing research partner of DW Innovation.

Cover image source: Generated with Copilot. Prompt: "visual representation of an abstract audio waveform flying through air".

Authors
Julia Bayer
Anna Schild