Project description
Here we present the project to create a Hungarian speech database recorded
over fixed-line telephones.
The work is part of the SpeechDat-E project, which extends the Language
Engineering Project LE-4001 SpeechDat to the Eastern European languages. The
goal of the project is to collect speech databases over fixed-network
telephones, in which all official European languages and some major dialectal
variants are represented. Such a database could provide a realistic basis both
for training and testing present-day teleservices and, because of its phonetic
richness, for training truly speaker-independent recognizers.
The database contains recordings that follow the SpeechDat(II) definitions for
dialect, age and sex balance and for vocabulary. In planning the corpus, we
have to take into account not only the dialectal variety but also the special
characteristics of the Hungarian language. Since Hungarian is an agglutinative
language, we need a larger vocabulary in some categories than is mandatory. We
pay extra attention to the topic 'phonetically rich sentences and words', in
order to create a phonetically well-balanced speech database for
text-independent speech recognizers. A detailed statistical analysis was
prepared to examine the distribution of phonemes, diphones, triphones and
syllables.
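The source does not describe how this analysis was carried out. As a minimal sketch, assuming each sentence has already been transcribed into a list of phoneme symbols, the unit frequencies could be counted as below. The function name `ngram_counts`, the toy transcriptions and the phoneme symbols are hypothetical, and diphones and triphones are simplified here to runs of two and three adjacent phonemes within a sentence.

```python
from collections import Counter

def ngram_counts(sentences, n):
    """Count runs of n adjacent phoneme symbols.

    `sentences` is a list of phoneme lists; runs never span two sentences.
    """
    counts = Counter()
    for phones in sentences:
        for i in range(len(phones) - n + 1):
            counts[tuple(phones[i:i + n])] += 1
    return counts

# Hypothetical toy transcriptions; the symbols are illustrative only.
corpus = [
    ["j", "o:", "n", "a", "p", "o", "t"],
    ["n", "a", "p"],
]

phonemes = ngram_counts(corpus, 1)   # single phonemes
diphones = ngram_counts(corpus, 2)   # adjacent phoneme pairs
triphones = ngram_counts(corpus, 3)  # adjacent phoneme triples

# Units that occur rarely point to gaps in phonetic coverage.
for unit, freq in diphones.most_common(3):
    print("-".join(unit), freq)
```

Comparing such counts across candidate sentence sets is one way to judge which set covers the phonetic units more evenly.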
The voices of 1000 speakers have been recorded from all over the country,
which provided a balanced distribution of the dialects. The Hungarian Railway
Company (MÁV Rt.), the MATÁV Rt. Telecommunication Company and several schools
and universities helped us recruit the speakers. The speakers had to read a
given text into the phone.
After recording, we carry out the so-called annotation process. This means that
we listen to every recording and create label files containing information
about the speaker and the speech according to the database definitions. This is
done with special software called A_TOOL, which was written at our laboratory.
The source code (Delphi) of A_TOOL is in the public domain; you can download it
here (you will need a zip-file decompression program to unzip it).