|
The Institute collects data and processes it into a public, non-commercial set
of corpora called the Czech National Corpus. As of 2002, it includes a
100-million-word representative corpus of contemporary written Czech. Much more
is to appear soon; smaller corpora of the spoken language are also being
prepared (the Prague Spoken Corpus is already available), as are diachronic
corpora. The department
develops its own methodology, offers some classes, and provides nationwide
access to our corpora, including for schools. It is also a primary centre of
corpus research in the country and aims at some major applications as well,
such as a new frequency dictionary. The ICNC is funded by state grants.
|