Project description: MTBA - Hungarian Telephone Speech Database

[ ID = 0024 ]	MTBA - Hungarian Telephone Speech Database
Project name	MTBA - Hungarian Telephone Speech Database
Short name or acronym	MTBA - Hungarian Telephone Speech Database
Project URL	http://alpha.tmit.bme.hu/speech
Project description	This project creates a Hungarian speech database built from fixed-line and mobile telephone recordings. The goal is to collect a telephone speech database in which the major dialectal variants are represented. The database can provide a realistic basis for training and testing present-day teleservices and, because of its phonetic richness, for training truly speaker-independent recognisers. The database contains recordings that follow the SpeechDatE definitions for dialect, age and sex balance and for vocabulary. An important difference from the SpeechDatE database is that the phonetically rich sentences and words have been segmented and labelled at the phoneme level, so the database makes it possible to train phoneme-based recognisers. In planning the corpus, we had to take into account not only the variety of dialects but also the special characteristics of the Hungarian language. Since Hungarian is an agglutinative language, some categories need a larger vocabulary than the mandatory minimum. We pay extra attention to the 'phonetically rich sentences and words' category in order to create a phonetically well-balanced speech database for text-independent speech recognisers. A detailed statistical analysis was prepared to examine the distribution of phonemes, diphones, triphones and syllables. The voices of 500 speakers were recorded from all over the country, which provided a balanced distribution of the dialects. The speakers read a given text into the telephone. After recording, we carry out the annotation and segmentation process: we listen to every recording and create label files containing information about the speaker and the speech, according to the database definitions. An automatic labelling system has been developed to assist the manual segmentation.
Languages
Funding	unknown
Project duration	-
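
The statistical analysis of phonemes, diphones and triphones mentioned in the description can be illustrated with a minimal sketch. It assumes that each utterance is already available as a list of phoneme symbols; the function name ngram_counts and the toy transcriptions are hypothetical and are not taken from the MTBA corpus or its tools.

```python
from collections import Counter


def ngram_counts(transcriptions, n):
    """Count phoneme n-grams: n=1 phonemes, n=2 diphones, n=3 triphones."""
    counts = Counter()
    for phonemes in transcriptions:
        # Slide a window of length n over the phoneme sequence.
        for i in range(len(phonemes) - n + 1):
            counts[tuple(phonemes[i:i + n])] += 1
    return counts


# Hypothetical toy data, for illustration only (not MTBA material).
sample = [["h", "a", "l", "o"], ["h", "a", "t"]]

phoneme_stats = ngram_counts(sample, 1)
diphone_stats = ngram_counts(sample, 2)
triphone_stats = ngram_counts(sample, 3)
```

Comparing such counts across candidate sentence sets is one way to judge how evenly the phonetically rich material covers the sound inventory; the method actually used for MTBA is not described in this profile.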
Contact
Name	Ph.D. Klara Vicsi
Organisation	Budapest University of Technology and Economics
Address	Sztoczek u. 2.
City	H-1111 Budapest
Country	Hungary
Email	vicsi_at_tmit.bme.hu
Phone	+H-111
Fax	+
Last update: 2004-07-22 09:50:31

Page generated 13-02-2008 by Steven Krauwer