
AI, Robotics, XR – and Social Acceptance: A Quick Review of the SERMAS project

...and that's another wrap: last month, we successfully completed the final evaluation of SERMAS, an EU Horizon project we'd been working on with an international consortium for more than three years. Its main focus: Putting together a socially savvy, modular toolkit for AI, robotics, and XR innovation. Here's a quick recap.

SERMAS was kicked off in October 2022, roughly six weeks before the release of ChatGPT, which would trigger the so-called "AI revolution". The eight consortium partners – F6S, KCL, Poste Italiane, Spindox Labs, SUPSI/IDSIA, TUDa, UNIMORE, and DW – thus had to carefully revisit their original plans and designs, as nobody wanted to build an outdated platform. To complicate matters, Brexit-related issues meant that the original lead coordinator, KCL, had to be replaced by UNIMORE. So the initial challenges were substantial, and it took a while to put together a solid list of requirements. However, the consortium turned out to be motivated, resourceful, and well coordinated, which is why everything worked out fine in the end.

SERMAS Toolkit

So what is the project's core outcome and asset? In short, it's the SERMAS toolkit, an advanced modular architecture designed to enhance the flexibility, user-friendliness, and interactivity of AI, robotics, and XR systems. This innovative framework stands out for its versatility, built upon a foundation of immersive web technologies, large language models, cloud infrastructure, edge computing, unified communication protocols, and robotics.

The SERMAS toolkit on GitHub.

SERMAS facilitates seamless integration of use cases and scenarios through its modular design. This means users can easily add or modify functionalities by incorporating the relevant modules. Central to this architecture is a set of fully animated avatars equipped with computer vision capabilities, enabling them to engage and interact with users in multiple languages. These avatars are not only visually appealing, but also capable of providing contextually relevant information and interaction, both on a screen and in the physical world (mounted to robots capable of free movement).

The toolkit is built to support fully distributed environments, ensuring that it operates effectively regardless of where the runtime is located. This level of adaptability is crucial for addressing diverse application scenarios and ensuring that systems can be tailored to meet specific needs.

SERMAS also places an emphasis on socially acceptable solutions. The consortium carried out several user studies which also focused on psychological aspects. The system is designed to ensure that users have a positive, secure, and safe experience. This focus on social context and user satisfaction is what sets SERMAS apart, making it a compelling choice for practical use in various settings and, potentially, broader adoption in the EU.

Another standout aspect of SERMAS is that much of its technology is open source and available via GitHub. This openness not only encourages innovation and collaboration, but also allows users to customize and extend the toolkit according to their needs.

SERMAS pilots and DW's journo companion

As proof of concept and validator of the SERMAS results, the consortium created three real-world applications. This is where DW Innovation played a crucial role: as a scenario developer, a co-lead on requirements, and a manager of user studies.

DW's pilot came to be known as Guardia, a virtual safety/security trainer for journalists and other media professionals. The core idea behind it: Create a supplement to the broadcaster's in-person training (which is expensive and time-consuming) and old-school PDF handouts (which are cumbersome to read).

As of late 2025, the avatar- and AI-driven system offers a selection of interactive lessons and quizzes (how to behave at checkpoints, what to pack in case of emergency, etc.) as well as a free chat mode in which the virtual companion Guardia will answer basically any question regarding personal safety and security regulations ("I'm a Russian-German journalist based in Berlin, and I need to travel to Ukraine for work, what precautions should I take?").

Photo of a test session. Foreground: Guardia poster. Background: A DW colleague testing the system on a laptop and a big screen.
A Guardia test session at DW's "Mindspiration Day" in the summer of 2025.

The system runs in (almost) any browser and allows for various modes of interaction: keyboard, mouse/trackpad, and voice. A retrieval-augmented generation (RAG) approach and several system prompts minimize hallucinations. Guardia also comes with an open source CMS (based on Strapi), which means that users can easily upload their own content (text, images, sounds, 3D objects, etc.) and create other kinds of web-based training. By exchanging background images, colors, fonts, and log-in screens, the tool can be rebranded with little effort. Even though the more immersive functions ultimately did not make it into the final demo, Guardia is also XR-ready, i.e. users can hold up their mobile devices or wear an XR headset and have the avatar appear as a digital layer right next to their desks.
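For readers curious about what "RAG plus system prompt" means in practice, here is a minimal Python sketch of the general idea. It is not the Guardia or SERMAS implementation: the sample documents, the prompt wording, and the function names `retrieve` and `build_prompt` are all made up for illustration, and the simple word-overlap scoring stands in for the embedding-based search a production system would more likely use.

```python
import re
from collections import Counter

# Hypothetical training snippets; a real system would load these from its CMS.
DOCUMENTS = [
    "At a checkpoint, stay calm, keep your hands visible, and follow instructions.",
    "An emergency bag should contain water, a first aid kit, copies of documents, and a power bank.",
    "Before travelling to a conflict zone, register with your embassy and share your itinerary with your editor.",
]

# Hypothetical system prompt that restricts the model to the retrieved context.
SYSTEM_PROMPT = (
    "You are a safety trainer for journalists. Answer only from the context below. "
    "If the context does not cover the question, say that you do not know."
)


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def retrieve(question, documents, top_k=2):
    """Return up to top_k documents that share at least one word with the question."""
    question_words = Counter(tokenize(question))
    scored = []
    for doc in documents:
        doc_words = Counter(tokenize(doc))
        # Counter intersection keeps the minimum count of each shared word.
        overlap = sum((question_words & doc_words).values())
        scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(question, documents):
    """Assemble the text that would be sent to a language model."""
    context = retrieve(question, documents)
    context_block = "\n".join(f"- {c}" for c in context) or "- (no matching context)"
    return f"{SYSTEM_PROMPT}\n\nContext:\n{context_block}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("What should I pack in an emergency bag?", DOCUMENTS))
```

The point of the pattern: the model only sees vetted passages alongside the question, and the system prompt tells it to admit ignorance rather than improvise when nothing relevant is found.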

Two more pilots based on the SERMAS toolkit were conceived and implemented by Poste Italiane: The Post Office Agent explores multilingual support and personalized conversations for all kinds of customers at Italian post offices, while the Receptionist Agent takes care of meeting, identifying, and accompanying business contacts at Poste's HQ in Rome. Unlike Guardia, which "only" relies on web interfaces, XR components, LLMs, and speech-to-text / text-to-speech modules, the Poste pilots also make use of sophisticated computer vision technology and state-of-the-art IoT/robotics components, respectively.

A big robot with a smiling face and a SERMAS logo.
Poste's Receptionist Agent is actually a free-moving communication robot.

SERMAS Open Calls

As envisioned in the Horizon Call and the proposal, SERMAS was also shaped by a mechanism known as cascade funding: The consortium redistributed roughly €1 million of its budget to third parties, who would then do additional research and build more software.

SERMAS launched two open calls (one focusing on development, one on implementation), which received dozens of applications from all over Europe. The consortium eventually funded eight sub-projects; DW Innovation played an active role as user/business reviewer, and as a mentor.

A group picture of the consortium.
The SERMAS consortium and EU reviewers at the final project meeting in Reggio Emilia.

Three open call projects resulted in tools and frameworks that should be particularly interesting for people working at the intersection of media tech and journalism: 3Dify and 3DforXR are all about generating XR-ready 3D models in a smart, user-friendly fashion (details in this post), whereas XR City put an avatar called Olivia on a high-tech kiosk in Torres Vedras, Portugal. The project's aim: enhancing citizen engagement, supporting urban planning, promoting cultural heritage, and overcoming digital barriers.

SERMAS Legacy

As a consortium, the SERMAS partners created an innovative and practical toolkit for AI/XR/robotics systems. It's designed to be adaptable, socially aware, and user-centric, making it an attractive option for a wide range of stakeholders. SERMAS can thus help transform interactive technology and offer practical solutions across various domains. The Guardia pilot can serve as a blueprint for media organisations that want to ease the load on their trainers, tutors, and advisors – or are simply looking for a way to improve their web-based training with immersive and truly interactive elements.

If you're interested in testing and/or adopting Guardia, please reach out to us via innovation@dw.com.

Academics should take a look at the 30+ scientific papers published by the SERMAS consortium, all of them available via Zenodo.

Author
Alexander Plaum