Institute of the Estonian Language

Tallinn, Estonia

Bimler D.,Massey University | Uuskula M.,Institute of the Estonian Language
Journal of the Optical Society of America A: Optics and Image Science, and Vision | Year: 2014

Cross-cultural comparisons of color perception and cognition often feature versions of the "similarity sorting" procedure. By interpreting the assignment of two color samples to different groups as an indication that the dissimilarity between them exceeds some threshold, sorting data can be regarded as low-resolution similarity judgments. Here we analyze sorting data from speakers of Italian, Russian, and English, applying multidimensional scaling to delineate the boundaries between perceptual categories while highlighting differences between the three populations. Stimuli were 55 color swatches, predominantly from the blue region. Results suggest that at least two Italian words for "blue" are basic, a similar situation to Russian, in contrast to English where a single "blue" term is basic. © 2014 Optical Society of America.
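The idea of treating sorting data as low-resolution similarity judgments can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes each participant's sort is given as a list of groups of stimulus indices, and it uses a simple co-occurrence measure (the fraction of participants who placed two swatches in different groups) as the dissimilarity. The function name and data layout are hypothetical. The resulting matrix is the kind of input a multidimensional scaling routine would take.

```python
from itertools import combinations


def sorting_dissimilarity(sorts, n_items):
    """Estimate pairwise dissimilarity from sorting data.

    sorts   -- list of partitions, one per participant; each partition is
               a list of groups, each group a list of item indices
               (hypothetical layout, assumed for this sketch)
    n_items -- total number of stimuli

    Returns an n_items x n_items matrix whose (i, j) entry is the fraction
    of participants who put items i and j in different groups.
    """
    counts = [[0] * n_items for _ in range(n_items)]
    for partition in sorts:
        # Map each item to the index of the group it was sorted into.
        group_of = {}
        for g, group in enumerate(partition):
            for item in group:
                group_of[item] = g
        # Each separated pair counts as one "dissimilar" judgment.
        for i, j in combinations(range(n_items), 2):
            if group_of[i] != group_of[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    n = len(sorts)
    return [[c / n for c in row] for row in counts]


# Made-up example: three participants sorting four swatches.
example_sorts = [
    [[0, 1], [2, 3]],
    [[0, 1, 2], [3]],
    [[0], [1, 2, 3]],
]
d = sorting_dissimilarity(example_sorts, 4)
```

Pairs that every participant separates get a dissimilarity of 1, and pairs that are never separated get 0, which is why a single sort is coarse but pooled sorts approximate graded similarity.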


Piits L.,Institute of the Estonian Language | Kudritski E.,Institute of the Estonian Language | Kiissel I.,Institute of the Estonian Language | Hein I.,Institute of the Estonian Language
Frontiers in Artificial Intelligence and Applications | Year: 2014

This paper describes an attempt to use Estonian statistical parametric speech synthesis for audio pronunciation of words and word forms in online dictionaries. Two new HTS voices were created and compared for this purpose. The paper gives an overview of the design and evaluation process for these voices. Different errors were detected, including quantity errors, bad sound quality, accent errors, gemination at the boundary of compound word components, etc. The level of correctness and sound quality for the two parametric speech synthesisers ranged from 69% to 76%. The paper demonstrates that voice Eva-2, which can accept text with diacritics as input, produces fewer errors. Still, the error rate of both new voices is too high to meet the criteria of orthoepy in learner's dictionaries. © 2014 The Authors and IOS Press.


Mihkla M.,Institute of the Estonian Language | Hein I.,Institute of the Estonian Language | Kiissel I.,Institute of the Estonian Language | Rapp A.,North Estonian Association of the Blind | And 2 more authors.
Frontiers in Artificial Intelligence and Applications | Year: 2014

Systems for automatic reading and broadcasting subtitles (spoken subtitles) are meant to eliminate the language barrier that TV-viewers with special needs (such as the visually handicapped and the dyslectics) may experience in watching TV films or broadcasts in foreign languages that are provided with subtitles. In such systems, a speech signal synchronised with TV subtitles is generated through a separate audio channel. The present article focuses on the questions that have arisen during the development and application of the system of spoken subtitles for Estonian Public Broadcasting: selection of a TTS system and of a synthetic voice, synchronization between the subtitles and synthetic speech utterances, and the marking of speaking turns. Such subjects as the editor interface of the system for automatic reading and broadcasting subtitles as well as the foreign names pronunciation database are also included. © 2014 The Authors and IOS Press.


Altrov R.,Institute of the Estonian Language | Pajupuu H.,Institute of the Estonian Language
Frontiers in Artificial Intelligence and Applications | Year: 2010

The Estonian Emotional Speech Corpus serves as the acoustic basis for emotional text-to-speech synthesis. Because the Estonian synthesizer is a TTS-synthesizer, we started off by focusing on read texts and the emotions contained in them. The corpus is built on a theoretical model and we are currently at the stage of verifying the components of the model. In the present article we give an overview of the corpus and the principles used in selecting its testers. Some studies show that people who have lived longer in a certain culture can more easily recognize vocal expressions of emotion that are characteristic of the culture without seeing the speaker's facial expressions. We therefore decided not to use people under 30 years of age as testers of emotions in our theoretical model. We used two tests to verify the selection principles for the testers. In the first test, 27 young adults aged under 30 were asked to listen to and identify the emotion (joy, anger, sadness, neutral) of 35 sentences. We then compared the results with those of adults aged over 30. In the second test we asked 32 Latvians to listen to the same sentences, and then compared the results with those of Estonians. Our analysis showed that younger and older testers, and Estonians and Latvians, perceive emotions quite differently. From these test results we can say that the selection principle of corpus testers, using people who are more familiar with Estonian culture, is acceptable. © 2010 The authors and IOS Press. All rights reserved.


Viks U.,Institute of the Estonian Language | Vare S.,Institute of the Estonian Language | Sahkai H.,Institute of the Estonian Language
Frontiers in Artificial Intelligence and Applications | Year: 2010

The paper describes a polyfunctional database of Estonian word families which is based on extensive research and contains detailed word formation information about the Estonian vocabulary. It is an XML database integrated into a dictionary management system which offers various possibilities of structure-based editing and searching, data reuse etc. The design of the database is based on the word families method, which consists in the organization of words on the basis of common stem morphemes and word formation relations. Until now, the word families method has been used in the compilation of word formation dictionaries. Using the method in the compilation of a database is a novel solution which considerably broadens the access to and the possible uses of word formation data. The database provides material for researchers in computational and general linguistics, language learners and teachers, and lexicographers. The data can also be used in several language technology applications like search engines, text-to-speech synthesis etc. © 2010 The authors and IOS Press. All rights reserved.


Kalvik M.-L.,Institute of the Estonian Language | Mihkla M.,Institute of the Estonian Language
Frontiers in Artificial Intelligence and Applications | Year: 2010

The study is focused on Estonian rhythmic structure as revealed in fluent read speech. The core of the study involves determining the distinctive features of the three degrees of Estonian phonetic quantity and assessing the significance of those features by statistical methods. The aim is to enhance the naturalness of synthetic speech by using the features that best identify each quantity degree in fluent Estonian speech. The theory of adjacent phones is tested on a large data set and the role of intensity as a possible feature to identify quantity degrees is investigated. According to the results of phonetic and statistical analysis, the main constitutive factor of quantity degrees and, thus, of speech rhythm is the classical duration ratio of stressed and unstressed syllables, whereas the rest of the duration ratios and tonal characteristics investigated turned out to be less significant for the data analysed. © 2010 The authors and IOS Press. All rights reserved.


Tamuri K.,Institute of the Estonian Language
Frontiers in Artificial Intelligence and Applications | Year: 2012

Currently the Estonian Emotional Speech Corpus is investigated for the distinctive acoustic parameters of three emotions (anger, joy and sadness) and neutral speech, with a view to recognizable synthesis of emotions in Estonian speech. This article is focused on intensity as one of the parameters vital for emotion synthesis. The research question is whether the intensity of Estonian read speech is in any way affected by emotions. The Estonian Emotional Speech Corpus was used as the acoustic basis of the study. The intensity analysis comprised calculations of the means and ranges of the intensities of emotional and neutral speech. In addition, pairwise studies were applied to find out whether intensity differs across emotions and in comparison with neutral speech in utterance-initial and utterance-final positions. The results revealed that mean intensities make a significant difference between concrete emotions as well as in comparison with neutral speech. The highest intensity was measured in neutral speech and the lowest in the utterances of sadness. Intensity ranges, however, were not significantly different between the utterance groups analysed. Intensity at the beginning and end of utterance was also the highest in neutral speech and the lowest with sadness. Those two groups displayed the only statistically significant differences between the intensities of utterance beginnings as well as ends. © 2012 The Authors and IOS Press.
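The per-group means and ranges mentioned in the abstract could be computed along the following lines. This is only a sketch under assumptions: the abstract does not say how intensity was measured or aggregated, so the function name, the dictionary layout, and the decibel values below are all hypothetical and do not come from the corpus.

```python
from statistics import mean


def intensity_summary(values_by_group):
    """Return the mean and range of intensity for each utterance group.

    values_by_group -- dict mapping a group label (e.g. an emotion) to a
                       list of per-utterance intensity values in dB
                       (hypothetical layout, assumed for this sketch)
    """
    return {
        group: {"mean": mean(values), "range": max(values) - min(values)}
        for group, values in values_by_group.items()
    }


# Made-up values for illustration only; not data from the study.
summary = intensity_summary({
    "neutral": [60.0, 70.0],
    "sadness": [50.0, 56.0],
})
```

Testing whether the group means differ significantly, as the study reports, would need a statistical test on top of these summaries; the abstract does not name the test used.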


Nurk T.,Institute of the Estonian Language
Frontiers in Artificial Intelligence and Applications | Year: 2012

The article describes the creation of Hidden Markov Model based speech models for both male and female voices for Estonian text-to-speech synthesis. A brief overview of the text-to-speech synthesis process is given, focusing on statistical parametric synthesis in particular. The HTS system is employed to generate voice models. The creation of the speech corpus of the Institute of the Estonian Language is analyzed. The process of adapting Estonian-related training data and linguistic specification to HTS is described, as well as experiments carried out on data from different speakers, subcorpora and linguistic specifications. The findings from speech model evaluation are given and possible courses of action to improve the quality of the trained HMM-based speech models are proposed. © 2012 The Authors and IOS Press.


Kallas J.,Institute of the Estonian Language | Langemets M.,Institute of the Estonian Language
Frontiers in Artificial Intelligence and Applications | Year: 2012

EELex is a web-based dictionary writing system with Estonian language support including various linguistic resources necessary for dictionary making [1, 2]. Nearly 40 dictionaries of different types (monolingual and bilingual, general and learners' dictionaries, etc.) with standard XML markup make EELex a multipurpose lexicographic database. Using the example of the active Basic Estonian Dictionary [3], this paper describes, from a lexicographer's point of view, the functions of EELex that allow various specialized dictionaries to be generated. We focus on the generation of syntagmatic dictionaries, mainly the valency and collocation dictionaries. © 2012 The Authors and IOS Press.


Viikberg J.,Institute of the Estonian Language
Keel ja Kirjandus | Year: 2014

In most countries a tip or gratuity (Est. 'jootraha') is an extra amount of money given to someone as a reward for good service. Usually the sum is small and goes straight to the attendant (waiter, taxi driver, hairdresser). In Estonian the word (in the form of yotoraa) was first recorded in the 16th century and is a loan translation from Low German (cf. drink-, drinke-gelt 'Trinkgeld'). Initially the word was used in the sense of a sacrifice (drink offering) to house fairies, but later it acquired the meaning of extra money given to the attendant for buying himself a drink. As beer was a customary drink at that time, we may very well call the extra allowance beer money. The Low German loan translation jooduraha can be related to an earlier Estonian word joot (PL usu. joodud) that meant offering food and drink to guests on some family occasion (christening, wedding) or celebrating the completion of a major work (e.g. the building of a boat or a windmill). We can find examples in the folk tradition that jootu joodi ('a drink was had') also to ensure the success of a forthcoming undertaking (seal hunting, letting the cattle out for the first time in spring). By the 19th century the word jooduraha had basically acquired the meaning of a reward to someone (errand boy, postman, coachman) in return for a service. The word jootraha first appeared in dictionaries in 1917. Today (young) Estonians often use the word tipp (< English tip) instead of jootraha.
