Project description: MTBA - Hungarian Telephone Speech Database

[ ID = 0024 ] MTBA - Hungarian Telephone Speech Database 
Project name MTBA - Hungarian Telephone Speech Database
Short name or acronym MTBA - Hungarian Telephone Speech Database
Project URL http://alpha.tmit.bme.hu/speech 
Project description

This project creates a Hungarian speech database based on fixed-line and mobile
telephone recordings.
The goal of the project is to collect a telephone speech database in which some
major dialectal variants are represented. This database could provide a
realistic basis both for the training and testing of present-day
teleservices and, because of its phonetic richness, for the training of truly
speaker-independent recognisers.
The database contains recordings that follow the SpeechDatE definitions for
dialect, age and sex balance and for vocabulary. An important difference from
the SpeechDatE database is that the phonetically rich sentences
and words have been segmented and labelled at the phoneme level. The database
thus makes it possible to train phoneme-based recognisers. In planning
the corpus, we have to take into consideration not only the variety of
dialects but also the special characteristics of the Hungarian language.
Since Hungarian is an agglutinative language, we need to create a larger
vocabulary in some categories than is mandatory. We pay extra
attention to the topic 'phonetically rich sentences and words', in order to
create a phonetically well-balanced speech database for text-independent speech
recognisers. A detailed statistical analysis was prepared to examine the
statistics of phonemes, diphones, triphones and syllables.
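
As an illustration of the kind of unit statistics mentioned above, the following is a minimal sketch, not the project's actual analysis tool. It assumes each utterance is already available as a list of phoneme symbols; the function names and the toy transcriptions are hypothetical. It counts phonemes and contiguous phoneme pairs and triples, which stand in here for diphones and triphones. Syllable statistics would additionally need a syllabification step, which is not covered.

```python
from collections import Counter


def ngram_counts(phonemes, n):
    """Count contiguous n-grams in a single phoneme sequence."""
    return Counter(
        tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)
    )


def corpus_stats(utterances, max_n=3):
    """Accumulate n-gram counts (n = 1..max_n) over all utterances."""
    stats = {n: Counter() for n in range(1, max_n + 1)}
    for phonemes in utterances:
        for n in stats:
            stats[n].update(ngram_counts(phonemes, n))
    return stats


# Hypothetical toy transcriptions, for illustration only.
utterances = [
    ["j", "o:", "n", "A", "p", "o", "t"],
    ["n", "A", "p"],
]
stats = corpus_stats(utterances)
```

Comparing such counts against a reference text corpus is one possible way to judge how well balanced a candidate set of sentences is.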

The voices of 500 speakers have been recorded from all over the country, which
provides a balanced distribution of the dialects.
The speakers have to read a given text into the phone.
After recording, we carry out the so-called annotation and segmentation process.
This means that we listen to every recording and create label files
containing information about the speaker and the speech according to the
database definitions.
An automatic labelling system has been developed for helping the handmade 
Project duration -
Name Ph.D. Klara Vicsi
Organisation Budapest University of Technology and Economics
Address Sztoczek u. 2.
City H-1111 Budapest
Country Hungary 
Last update: 2004-07-22 09:50:31


Page generated 13-02-2008 by Steven Krauwer