ELSNET
Central and Eastern European Survey
Name: ESO Corpus, FIT Corpus, Spoken Czech Corpus
Nature: newspaper texts, computer journals, machine stem dictionaries for Czech, Slovak,
Russian, English, German, French - sizes between 120 000 and 200 000
entries
Language: Czech
Size: approximately 50 million word forms;
spoken: currently about 100 hours
Format: ASCII and SGML, WAV
Coverage: newspapers, computer journals, spoken Czech - interviews and dialogues
Medium: hard disk, CD-ROM
Availability: free for research purposes, partially commercial
products
Software description: Czech, Slovak, and Russian spell checkers; Czech, Slovak, and Russian
lemmatizer and tagger; Czech and Slovak hyphenation programs,
available through personal contact; Czech Electronic Thesaurus;
Czech-English and Czech-German Electronic Dictionary