Elsnet
 


Project description: Hungarian speech database

[ ID = 0026 ] SPEECHDAT 
Project name: Hungarian speech database
Short name or acronym: SPEECHDAT
Project URL: http://alpha.tmit.bme.hu/speech
Project description

This project creates a Hungarian speech database recorded over the fixed-line
telephone network.

The work is part of the SpeechDat-E project, which extends the Language
Engineering project LE-4001 SpeechDat to Eastern European languages. The goal
of the project is to collect speech databases over the fixed telephone network
in which all official European languages and some major dialectal variants are
represented. Such a database can provide a realistic basis both for training
and testing present-day teleservices and, because of its phonetic richness,
for training truly speaker-independent recognizers.

The recordings follow the SpeechDat(II) definitions for dialect, age and sex
balance and for vocabulary. In planning the corpus we had to take into account
not only dialectal variety but also the special characteristics of the
Hungarian language. Because Hungarian is an agglutinative language, some
categories need a larger vocabulary than the mandatory minimum. We pay extra
attention to the category 'phonetically rich sentences and words' in order to
create a phonetically well-balanced speech database for text-independent speech
recognizers. A detailed statistical analysis was prepared to examine the
distribution of phonemes, diphones, triphones and syllables.
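
The kind of unit statistics mentioned above can be illustrated with a minimal
sketch. It is not the analysis actually used in the project: it assumes that
each utterance is already available as a list of phoneme symbols, it treats
diphones and triphones simply as contiguous phoneme pairs and triples, and it
leaves out syllables, which would need a Hungarian syllabifier. The function
names and the toy transcriptions are hypothetical.

```python
from collections import Counter


def ngram_counts(phonemes, n):
    """Count contiguous phoneme n-grams within a single utterance."""
    return Counter(
        tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)
    )


def corpus_statistics(utterances, max_n=3):
    """Accumulate n-gram counts (n = 1..max_n) over all utterances.

    n-grams never span utterance boundaries, because each utterance
    is counted separately before the totals are merged.
    """
    totals = {n: Counter() for n in range(1, max_n + 1)}
    for phonemes in utterances:
        for n in totals:
            totals[n].update(ngram_counts(phonemes, n))
    return totals


# Hypothetical toy transcriptions, for illustration only.
utterances = [
    ["s", "a", "k"],
    ["k", "a", "s", "a"],
]

stats = corpus_statistics(utterances)
for n, name in ((1, "phonemes"), (2, "diphones"), (3, "triphones")):
    print(name, stats[n].most_common(3))
```

Comparing such counts against a target distribution is one way to judge
whether a candidate set of sentences is phonetically balanced.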

The voices of 1000 speakers have been recorded from all over the country,
which provided a balanced distribution of dialects. The Hungarian Railway
Company (MÁV Rt.), the MATÁV Rt. Telecommunication Company and several schools
and universities helped us in organizing the speakers.
 

The speakers have to read a given text into the phone.

After recording we carry out the annotation process: we listen to every
recording and create label files containing information about the speaker and
the speech according to the database definitions. This is done with a special
program called A_TOOL, which was written at our laboratory. The Delphi source
code of A_TOOL is in the public domain and can be downloaded as a zip archive
(you will need a decompression program to unpack it).
Languages: Hungarian
Funding: public
Project duration: -
Contact
Name: Ph.D. Klara Vicsi
Organisation: Budapest University of Technology and Economics, Laboratory of Speech Acoustics
Address: Sztoczek u. 2.
City: H-111 Budapest
Country: Hungary
Email: vicsi_at_tmit.bme.hu
Phone: +36 1 463-1940
Fax: +36 1 372-0403
Last update: 2004-07-22 10:02:30

 


 

Page generated 13-02-2008 by Steven Krauwer