SELMA will help media monitors and journalists make sense of huge content streams (big data analysis) – and also enable them to enrich audiovisual (AV) output through transcription, translation, voice-over and subtitling, thus making it more accessible.
The SELMA consortium aims to build a multilingual open-source platform that can process very large volumes of content. It will feature a self-learning AI system that is able to share information about data streams while keeping the added value of each language through a novel approach. The idea is to create a crosslingual common space: the system will always collect and analyze data in the original language and subsequently translate it into another language upon request.
Five European institutions have teamed up to establish the language platform:
The Laboratoire Informatique d'Avignon (LIA) at Avignon University, the Institute of Mathematics and Computer Science (IMCS) at the University of Latvia, the Portuguese software company Priberam, the Fraunhofer Institute for Intelligent Analysis and Information Systems, and DW Innovation, which will also lead the consortium.
SELMA builds upon several other DW human language technology (HLT) ventures, including the already completed SUMMA and news.bridge projects as well as the ongoing GoURMET project. It will run for three years and will hopefully produce a fully functioning open-source platform by the end of 2023.
To keep up with news, research results and prototypes, make sure to stay connected with SELMA on Twitter. An official project website is coming soon.
Key visual based on an image by geralt/pixabay. Other visuals: SELMA Consortium.