Assessing speech prosody and the syntax/prosody interface from an automatic speech technology oriented point of view

Informatikai Tudományok Doktori Iskola (Doctoral School of Informatics)
2015. 06. 26
Research objectives:
Speech prosody plays an important role in human speech perception and understanding, and hence in speech technology applications. Prosody contributes to the segmentation of the speech stream and allows a hierarchical layering of the content in terms of its relevance (to the topic or situation) or novelty. Prosody and syntax are closely related, especially in formal speech, although so far this relation has been analyzed only from a linguistic (phonological) point of view. It is of basic interest to explore what prosody can add to automatic speech understanding (which mostly focuses on pure, "blind" statistical speech-to-text conversion), information extraction, content analysis, and keyword spotting. A top-down hierarchical approach to acoustic-prosodic analysis is preferred, in order to see how deep into the prosodic hierarchy automatic analysis can go.
Open problems:
- Stress detection (improvement), automatic phrasing, and a general, easy-to-use prosodic model for machine learning applications that is as language-independent as possible. This also involves evaluating new prosodic-acoustic features and testing several known modeling frameworks.
- Combination of text-based and speech-based analysis tools to reach optimal performance.
- Explore to what extent syntax is recoverable from speech; examine the syntax/prosody interface from a point of view that focuses on applicability in real applications (what is exploitable) rather than on pure description.
- Automatic separation of prominence (syntax-driven stress) from salience (pragmatically or semantically driven stress). Prosody transfer.
- Develop methods of syntactic analysis capable of working on speech recognition output (text or lattices), which usually contains errors (substitutions, insertions, deletions).
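
To illustrate the three error types named in the last item, the following is a minimal sketch of how substitutions, insertions, and deletions could be counted by aligning a recognizer hypothesis against a reference transcript with a standard edit-distance (Levenshtein) alignment. The function name `align_errors` and the word-list input format are assumptions made for this example only; they are not part of any toolkit or of the planned research. When several minimum-cost alignments exist, the split between error types depends on the tie-breaking order (here: match or substitution first, then deletion, then insertion).

```python
from typing import List, Tuple


def align_errors(reference: List[str], hypothesis: List[str]) -> Tuple[int, int, int]:
    """Return (substitutions, insertions, deletions) of one minimum-cost alignment.

    Hypothetical helper for illustration; inputs are lists of word tokens.
    """
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = edit distance between reference[:i] and hypothesis[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i  # only deletions
    for j in range(1, m + 1):
        cost[0][j] = j  # only insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = int(reference[i - 1] != hypothesis[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # match or substitution
                cost[i - 1][j] + 1,             # deletion
                cost[i][j - 1] + 1,             # insertion
            )

    # Trace back from the bottom-right corner to count each error type.
    subs = ins = dels = 0
    i, j = n, m
    while i > 0 or j > 0:
        mismatch = int(i > 0 and j > 0 and reference[i - 1] != hypothesis[j - 1])
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + mismatch:
            subs += mismatch
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return subs, ins, dels
```

A syntactic analyzer intended to work on recognition output would have to remain robust to all three error types, so an alignment of this kind is one possible way to quantify how noisy its input is.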

Research partners:
- MTA Nyelvtudományi Intézet (Research Institute for Linguistics, Hungarian Academy of Sciences), Budapest
- Idiap Research Institute, Martigny, Switzerland

Requirements:
- some experience in speech or language technology (basic knowledge of signal processing, speech recognition, and automatic linguistic analysis)
- good mathematical background (probability theory, statistics, and stochastic processes)
- experience working in Linux

Number of students that can be admitted: 1