Audio mining - transcription of audio and video data | scapos AG

Using intelligent multimedia pattern recognition algorithms, audio mining automatically generates a wide range of metadata for media files, converting spoken words into searchable text.

With the audio mining system from Fraunhofer IAIS, audio and video tracks can be searched specifically for original soundbites, and speaker recognition makes it possible to find individual people and jump directly to the passages in which they speak.

Automatic preparation of audio media archives

With automatic speech recognition (“speech-to-text”), audio data can be prepared for searching and tagged automatically. The system also recognizes different speakers and distinguishes speech from other audio content (music, sounds). The metadata of the audio files can be enriched accordingly to support existing search functions.
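As a rough illustration of how such metadata could support search, the sketch below stores recognized segments and builds a simple word index over the speech segments only. The `Segment` record, its fields, and the function names are hypothetical and chosen for this example; they are not the data model of the Fraunhofer IAIS system.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    """One analyzed stretch of audio (hypothetical structure)."""
    start: float                    # start time in seconds
    end: float                      # end time in seconds
    kind: str                       # "speech", "music" or "sound"
    speaker: Optional[str] = None   # speaker label, if one was recognized
    text: str = ""                  # transcript, empty for non-speech


def build_word_index(segments):
    """Map each spoken word to the speech segments that contain it."""
    index = defaultdict(list)
    for seg in segments:
        if seg.kind != "speech":
            continue  # music and other sounds carry no transcript
        for word in sorted(set(re.findall(r"\w+", seg.text.lower()))):
            index[word].append(seg)
    return index


def find_speaker(segments, speaker):
    """Return the (start, end) times of all segments by one speaker."""
    return [(s.start, s.end) for s in segments if s.speaker == speaker]
```

With an index like this, a search for a word returns the time ranges in which it was spoken, and `find_speaker` returns the passages attributed to a given person.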

Benefit & value

Speech recognition not only improves the search function; it also enables further optimization. Based on the spoken words, content is enriched with automatically generated keywords and linked to similar content. Users can then be given recommendations that point them to further content of interest. This extends the time users spend with the content, and older content that is no longer popular continues to be accessed.
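The idea of keyword enrichment and related-content recommendations can be sketched in a few lines. The example below is a deliberately simple stand-in, assuming plain word-frequency keywords and Jaccard overlap as the similarity measure; the source does not say which methods the product uses, and all names and the stop-word list are illustrative.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, far from complete).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "for", "with"}


def extract_keywords(transcript, top_n=5):
    """Return the most frequent non-stop-words of a transcript."""
    words = re.findall(r"[a-z]+", transcript.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


def jaccard(a, b):
    """Overlap of two keyword collections, between 0.0 and 1.0."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_items(item_id, keywords_by_item, min_score=0.2):
    """Rank other items by keyword overlap with the given item."""
    base = keywords_by_item[item_id]
    scored = [
        (other, jaccard(base, keywords))
        for other, keywords in keywords_by_item.items()
        if other != item_id
    ]
    hits = [(other, score) for other, score in scored if score >= min_score]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Calling `related_items` for one archive entry yields the entries whose transcripts share the most keywords with it, which is the kind of list a recommendation feature could present to users.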

Category:

AI Technologies

Developed by

Fraunhofer IAIS

Your contact person

I will be happy to provide you with information about our software products.

Ying Ge-Wolf, Product sales

+49-2241-14-4408

Flexibility and usability

Thanks to its service-oriented architecture and message-based communication, the audio mining system is highly flexible, and its range of functions can be tailored to your individual needs. The system can be integrated into an existing media archive and used, for example, as a metadata enrichment service, or it can function as a stand-alone media archive.
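To make the notion of a message-based metadata enrichment service more concrete, here is a minimal in-process sketch. It assumes a simple publish/subscribe bus with two made-up topics, `media.ingested` and `metadata.ready`; the class names, topic names, and the `analyze` callback are hypothetical and do not describe the actual interfaces of the audio mining system.

```python
import queue
from dataclasses import dataclass


@dataclass
class Message:
    topic: str      # hypothetical topic name, e.g. "media.ingested"
    payload: dict   # message body


class MessageBus:
    """Very small in-memory publish/subscribe bus for illustration."""

    def __init__(self):
        self._queue = queue.Queue()
        self._handlers = {}

    def subscribe(self, topic, handler):
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, message):
        self._queue.put(message)

    def run_until_empty(self):
        # Deliver queued messages; handlers may publish follow-up messages.
        while not self._queue.empty():
            message = self._queue.get()
            for handler in self._handlers.get(message.topic, []):
                handler(message)


def register_enrichment_service(bus, analyze):
    """Answer every 'media.ingested' message with a 'metadata.ready' message."""

    def on_media_ingested(message):
        media_id = message.payload["media_id"]
        metadata = analyze(media_id)  # stand-in for the real analysis
        bus.publish(Message("metadata.ready",
                            {"media_id": media_id, "metadata": metadata}))

    bus.subscribe("media.ingested", on_media_ingested)
```

In this arrangement an existing archive only needs to publish a message for each new file and subscribe to the results; the analysis services behind the bus can be swapped or extended without changing the archive itself.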

According to your requirements

For your version of the audio mining system, we can use existing workflows, e.g. for text mining or audio transcription, or develop new, individual workflows for you. In close cooperation with your team, customer-specific AI models can be trained, new analysis services developed, or additional existing services connected.

Areas of application

Radio and television stations
Media library providers
Organizations that want to discover metadata from large amounts of text, audio, and/or video information
