This project aims to create a Hungarian speech database based on fixed-line and
mobile telephone speech.
The goal of the project is to collect a telephone speech database in which some
major dialectal variants are represented. Such a database could provide a
realistic basis both for training and testing present-day teleservices and,
because of its phonetic richness, for training truly speaker-independent
recognisers.
The database contains recordings that follow the SpeechDatE definitions for
dialect, age and sex balance and for vocabulary. An important difference from
the SpeechDatE database is that the phonetically rich sentences and words have
been segmented and labelled at the phoneme level. The database therefore makes
it possible to train phoneme-based recognisers. In planning the corpus, we had
to take into account not only the variety of dialects but also the special
characteristics of the Hungarian language. Since Hungarian is an agglutinative
language, some categories need a larger vocabulary than is mandatory. We paid
extra attention to the category 'phonetically rich sentences and words' in
order to create a phonetically well-balanced speech database for
text-independent speech recognisers. A detailed statistical analysis was
prepared to examine the statistics of phonemes, diphones, triphones and
syllables.
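As an illustration of the kind of unit statistics mentioned above, the
following is a minimal sketch, not the project's actual analysis tool. It
assumes each utterance is already available as a list of phoneme symbols, and
it approximates diphones and triphones as runs of two and three adjacent
phonemes. The function name ngram_counts and the sample transcriptions are
hypothetical. Syllable counts are left out because they would need
language-specific syllabification rules.

```python
from collections import Counter


def ngram_counts(phoneme_seqs, n):
    """Count runs of n adjacent phonemes across all utterances."""
    counts = Counter()
    for seq in phoneme_seqs:
        # Slide a window of length n over the utterance.
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return counts


# Hypothetical transcriptions (placeholder symbols, not data from the corpus).
utterances = [
    ["s", "o", "k", "a", "t"],
    ["k", "a", "t", "o"],
]

phonemes = ngram_counts(utterances, 1)
diphones = ngram_counts(utterances, 2)   # approximated as phoneme pairs
triphones = ngram_counts(utterances, 3)  # approximated as phoneme triples

# Units that occur rarely are candidates for extra prompt material
# when balancing the phonetically rich sentences.
for unit, freq in diphones.most_common():
    print("-".join(unit), freq)
```

Sorting the counts this way makes under-represented units easy to spot, which
is one plausible use of such statistics when selecting phonetically rich
sentences.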
The voices of 500 speakers were recorded from all over the country, which
ensured a balanced distribution of the dialects.
The speakers had to read a given text into the telephone.
After recording, we carry out the so-called annotation and segmentation process.
This means that we listen to every recording and create label files
containing information about the speaker and the speech according to the
An automatic labelling system has been developed to assist the manual