Better AV, Better DBs, Cloud and Edge and Social Sounds

It has been a while since we last compiled a list of interesting media technologies for you–check out this post that aged rather well–but of course tech scouting is still important to us. And now that a number of new projects have been successfully kicked off in spite of the pandemic, we actually found the time to sit down, review some internal notes–and introduce some more tools and concepts that we think could have an impact on our industry.

Before reading this listicle, please note that we deliberately left out some topics that were too special, too geeky, or too far away from the daily routines of people working in media technology with a focus on journalism. Furthermore, this will not be the last article of its kind. End of disclaimer, enjoy this post!

Cloud- and AI-based video production

Coordinated ingestion, collaborative editing, better encoding, rendering, streaming, storing, archiving–all this (and more) can be achieved via innovative video production methods that are based on cloud servers, web content management systems, 5G connectivity and–of course–smart cameras. 

A concrete example will shed more light on the matter: If you run a system like the one outlined above (e.g. Sony Hive + Sony XDCAM Air) and send a reporter to an important event, all she has to do is set up the cam. Subsequently, she can attend to her most important task, namely observing the events around her and talking to relevant people. In the meantime, a remote team may do the following:

  • adjust aperture and light sensitivity
  • record locally in HD, but only send an SD signal to the cloud
  • browse what has already been recorded
  • create an edit decision list first, then upload relevant material in HD
  • live stream if necessary
  • send footage to all kinds of channels (multi-casting)
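
The "edit decision list first, HD upload later" step can be sketched in a few lines of Python. This is only an illustration of the idea, not how Sony's or any other vendor's system works: the clip names, timings, and both function names are made up. The sketch merges overlapping selections per clip, so that no part of the HD material has to be uploaded twice.

```python
from collections import defaultdict

# Hypothetical edit decision list: (clip_id, start_second, end_second).
# Clip names and timings are invented for illustration.
edl = [
    ("interview_01", 12.0, 45.5),
    ("interview_01", 40.0, 60.0),  # overlaps the previous entry
    ("crowd_02", 5.0, 20.0),
]


def segments_to_upload(entries):
    """Merge overlapping EDL ranges per clip so no HD segment is sent twice."""
    by_clip = defaultdict(list)
    for clip_id, start, end in entries:
        by_clip[clip_id].append((start, end))

    merged = {}
    for clip_id, ranges in by_clip.items():
        ranges.sort()
        result = [ranges[0]]
        for start, end in ranges[1:]:
            last_start, last_end = result[-1]
            if start <= last_end:
                # Overlap: extend the previous range instead of adding a new one.
                result[-1] = (last_start, max(last_end, end))
            else:
                result.append((start, end))
        merged[clip_id] = result
    return merged


def total_seconds(merged):
    """Total duration of HD material that still needs to be uploaded."""
    return sum(end - start for ranges in merged.values() for start, end in ranges)


plan = segments_to_upload(edl)
print(plan)
print(total_seconds(plan))
```

In this toy example, only 63 seconds of HD footage would have to leave the camera, no matter how much was recorded locally.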

For broadcasters, properly organized cloud/AI video production can have a lot of positive effects, e.g.

  • faster publication of content
  • faster archiving
  • sustainable, integrative production workflows
  • reduction of costs

Needless to say, it will take a while before traditional hardware and workflows are replaced by video cloud/AI concepts, but this is clearly a path to the future of producing and handling AV content.

Dolt and version control for data

The vast majority of IT people will agree that a decent software project needs version control (e.g. Git)–which is ideally managed in a decentralized, collaborative way (e.g. on GitHub). But what about projects that focus on databases rather than on code? In the last couple of years, we have come across several interesting products that offer solutions for structured data, data pipelines, versioning, and collaboration. A particularly promising one seems to be Dolt.

It is basically like Git, with the difference that commands are not used on files, but on relational databases/tables. Data and schemas can be changed via SQL. Users are able to save everything (commit), do comparisons on a data cell level (diff), or combine databases (merge). Changing an SQL database and tracking those changes thus becomes a lot easier. Like MySQL, Dolt can be used off- and online. Finally, DoltHub is to Dolt what GitHub is to Git: The idea is to also provide a web interface that lets users easily store and collaboratively edit things, in this case their database repositories.
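
To get a feeling for what a diff "on a data cell level" means, here is a minimal Python sketch. It does not use Dolt and makes no claim about how Dolt works internally; the two table versions, the figures in them, and the function name `cell_diff` are assumptions made for illustration. Each table version is modeled as a dictionary keyed by primary key.

```python
# Two hypothetical versions of the same table, keyed by primary key.
# All values are invented for illustration.
old = {
    1: {"city": "Bonn", "population": 330000},
    2: {"city": "Berlin", "population": 3600000},
}
new = {
    1: {"city": "Bonn", "population": 336000},
    3: {"city": "Köln", "population": 1080000},
}


def cell_diff(before, after):
    """Report added rows, removed rows, and individual changed cells."""
    changes = []
    for key in sorted(before.keys() | after.keys()):
        if key not in after:
            changes.append(("removed", key, None, before[key], None))
        elif key not in before:
            changes.append(("added", key, None, None, after[key]))
        else:
            # Row exists in both versions: compare it column by column.
            for column in sorted(before[key].keys() | after[key].keys()):
                old_value = before[key].get(column)
                new_value = after[key].get(column)
                if old_value != new_value:
                    changes.append(("modified", key, column, old_value, new_value))
    return changes


for change in cell_diff(old, new):
    print(change)
```

The point of the sketch is the granularity: instead of flagging a whole file or row as "different", the comparison pinpoints the one cell that changed.
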
While the primary target group of Dolt (and similar tools) is probably admins and developers, the software may also have a future in data-driven journalism (DDJ). This is because data wranglers can

  • improve data set management workflows; handling those is still a bit tedious, especially when there are many models/versions and collaborators (comparing version X to version Y and fixing mistakes can take a long time)
  • (theoretically) acquire and check data samples in an easy fashion–instead of buying an expensive and complex, but potentially useless dataset; in case of a closed deal, the new tool would also ensure changes and updates can be tracked without extra effort
  • easily experiment with data sets; according to the Dolt developers, their software helps users clone open data and build an SQL database in minutes (instead of spending at least an hour on this with CSV files and other tools)

It is still unclear how well tools like Dolt will perform eventually, how stable they are when handling giant datasets, and if full compatibility with MySQL etc. and the implementation of all GitHub features can be guaranteed. However, in the era of DDJ, the general concept of elegant database versioning is quite appealing.

Edge Computing: Local clouds and IoT devices

Nowadays, the concept of edge computing usually refers to two things:

  1. Processing large chunks of data in local miniature computing centres in real time (when regular computers cannot do the job and cloud infrastructure would cause massive latency). In this context, people also talk about fog computing, local clouds, or cloudlets.
  2. Retrieving/reclaiming applications, services, and data from the "big" cloud–and leaving operations and processing to more or less local devices, including phones and gadgets (that later on communicate results to and get updates from the cloud). In this context, people also talk about the internet of things (IoT).

Bigger media organizations can benefit from "the edge" in various ways: Their computing becomes faster (shorter routes, less latency), safer (less externally stored data), and more reliable (as a lot of things work without an internet connection). IT operations also become cheaper in the long run, because huge chunks of data are not sent to some far-off cloud anymore. A practical example of edge computing is content delivery networks (CDNs) that rely on the so-called cloud edge, meaning: A provider distributes content to regional hubs–and users will access it from there.

Looking at other aspects of the technology, it seems that almost all media organizations can benefit from edge computing on the device level, for example when they use

  • smart cams (see AI-based video production) that sort and edit footage right away and only upload what is really needed; smart drones that know where to go and what to film (without a network connection) and get updates only when they are "docked"
  • human language technology (HLT) that trains complex models in the cloud, but will later on run them on a cheap, local instance
  • VR headsets and AR-enabled phones that measure 3D space with integrated sensors and CPUs–no internet connection needed, unless an operating system or firmware update is required
  • blockchain ledgers for media asset management that are pulled from a network, then stored and processed locally
  • smart devices that are unlocked via fingerprints or face recognition (on-device security)
  • a collaborative office suite that works offline and is only synced from time to time
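
The common pattern behind these examples (process locally, only talk to the cloud when needed and possible) can be sketched as follows. This is a toy model under stated assumptions, not a real device API: the class `EdgeDevice`, the relevance scores, and the threshold are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeDevice:
    """Toy model of a device that decides locally and syncs later."""

    threshold: float  # hypothetical relevance cut-off
    outbox: list = field(default_factory=list)

    def process(self, clip_id: str, relevance: float) -> bool:
        """Decide on the device whether a clip is worth uploading."""
        keep = relevance >= self.threshold
        if keep:
            self.outbox.append(clip_id)
        return keep

    def sync(self, online: bool) -> list:
        """Hand over queued items only when a connection is available."""
        if not online:
            return []
        sent, self.outbox = self.outbox, []
        return sent


cam = EdgeDevice(threshold=0.7)
for clip_id, score in [("a", 0.9), ("b", 0.2), ("c", 0.75)]:
    cam.process(clip_id, score)

print(cam.sync(online=False))  # nothing leaves the device while offline
print(cam.sync(online=True))   # queued clips are handed over once connected
```

All decisions are made without a network connection; the cloud only ever sees the clips that passed the local filter.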

Edge computing in a broader sense is already ubiquitous, and markets will continue to grow, especially on the device/IoT level.

Interoperable Master Format (IMF)

Defined by the Society of Motion Picture and Television Engineers (SMPTE) and most prominently advertised and implemented by industry giant Netflix, IMF is a (relatively) new, sustainable standard for versioning, localization, quality assurance, and distribution of AV content.

It is based on the established Digital Cinema Package (DCP) framework and brings modern, object-based tech philosophy to digital TV and broadcasting: Instead of storing all AV info in one file (e.g. MPEG-2), IMF is all about componentization. An IMF package usually features:

  • separate video, audio, and text tracks
  • a composition playlist (CPL) that explains how to assemble things
  • a packing list (PKL) that has metadata on all files
  • an asset map (that describes data paths)
  • an output profile list (OPL) that features additional info for downstream/final edits (e.g. with regard to cropping and scaling)
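
The componentization idea can be illustrated with a small Python sketch. Real IMF packages describe compositions in XML documents defined by SMPTE; the sketch does not reproduce that format. The track IDs, the `Track` class, and the `compose` function are hypothetical and only show how several versions can reuse one set of shared assets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    track_id: str
    kind: str      # "video", "audio" or "text"
    language: str  # "" for language-neutral tracks such as video


# Hypothetical assets shared by all versions of one programme.
ASSETS = [
    Track("v1", "video", ""),
    Track("a_de", "audio", "de"),
    Track("a_en", "audio", "en"),
    Track("t_en", "text", "en"),
    Track("t_es", "text", "es"),
]


def compose(audio_language, subtitle_language=None):
    """Build a playlist of track IDs for one version, reusing the same video."""
    playlist = [t.track_id for t in ASSETS if t.kind == "video"]
    playlist += [
        t.track_id
        for t in ASSETS
        if t.kind == "audio" and t.language == audio_language
    ]
    if subtitle_language is not None:
        playlist += [
            t.track_id
            for t in ASSETS
            if t.kind == "text" and t.language == subtitle_language
        ]
    return playlist


print(compose("de"))        # original German version
print(compose("en", "es"))  # English audio with Spanish subtitles
```

Each version is just a different list of references; the video track itself is stored only once.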

The benefits of this approach are immediately clear to anyone who has done B2B marketing before: You can have one master copy in English, one in Spanish, one in Arabic, one for the Web, and one for in-flight entertainment systems–or a single versatile IMF package.

IMF also makes life easier for editors. For example, when a partner orders a localized version of a German documentary, there is no need to mess with video data or the original soundtrack. Inserts, subtitles, and voice-overs can be added via a set of IMF instructions. IMF is also useful when it comes to fixing corrupted frames or editing mistakes. No need for new encoding–just replace the broken bits.

While IMF is generally respected and considered the ultimate AV standard by some, adoption at more traditional media organizations still seems to be rather slow. This has to do with several challenges when switching to IMF, including the necessity to acquire new software tools, create novel workflows, and spend a lot of time on transcoding at first (due to other video standards in-house).

Social audio

Perhaps a little surprisingly, the latest social media hype is not about extra hi-res, AI-powered and totally immersive videos and games, but about good old-fashioned audio. There are basically three types of applications out there at the moment:

  • audio extensions of big social media platforms (think: voice messages on WhatsApp)
  • established audio chat apps that keep on adding extras and features (think: Discord, TeamSpeak and similar apps for messaging, chatting, voice conferences; a lot of them tied to the gaming scene)
  • novel social audio apps, aimed at smaller or bigger networks of friends, colleagues, and acquaintances (you have probably heard of Clubhouse, and there is also Twitter Spaces now)

While type 1 will probably not be that important for media organizations (at least outside of customer services and experimental projects), a variation of type 2 could transform digital workflows for good (think: Zoom or Microsoft Teams in audio-only mode). Type 3 is quite relevant when testing new forms of audio journalism:

Clubhouse (still iOS-only and invite-only) allows users to join/leave/host topic-driven audio chat rooms/panels ("drop-in audio"); Twitter Spaces (not fully rolled out yet) has a similar approach. These platforms can be used as a supplement to (digital) radio or podcasting formats. Apps like Audlist, HearMeOut and Riffr are about audio status updates and interactive micro podcasting–thus trying to build a Twitter- or Mastodon-like community around audio.

All in all, social audio is an interesting topic to watch, but also a niche where even the most successful events draw only a fraction of a mainstream video live stream audience.

DW Innovation