Wolk K.,Polish Japanese Academy of Information Technology | Marasek K.,Polish Japanese Academy of Information Technology
Advances in Intelligent Systems and Computing | Year: 2017

The multilingual nature of the world makes translation a crucial requirement today. Parallel dictionaries constructed by humans are a widely available resource, but they are limited and do not provide enough coverage for good-quality translation because of out-of-vocabulary words and neologisms. This motivates the use of statistical translation systems, which are unfortunately dependent on the quantity and quality of training data. Such systems have very limited availability, especially for some languages and very narrow text domains. In this research we present our improvements to current quasi-comparable corpora mining methodologies by re-implementing the comparison algorithms, introducing a tuning script and improving performance using GPU acceleration. The experiments are conducted on the lectures text domain, and bi-data is extracted from a crawl of the WWW. The modifications had a positive impact on the quality and quantity of the mined data and on translation quality, which was evaluated using the BLEU, NIST and TER metrics. By defining proper translation parameters for morphologically rich languages we improve the translation quality and draw conclusions. © Springer International Publishing Switzerland 2017.


Wolk A.,Polish Japanese Academy of Information Technology | Wolk K.,Polish Japanese Academy of Information Technology | Marasek K.,Polish Japanese Academy of Information Technology
Advances in Intelligent Systems and Computing | Year: 2017

The multilingual nature of the world makes translation a crucial requirement today. Within this research we apply state-of-the-art statistical machine translation techniques to the West-Slavic language group. We classify the West-Slavic languages and choose Polish as a representative candidate for our research. The experiments are conducted on written and spoken texts, whose characteristics are defined as well. The machine translation systems are trained within the West-Slavic group as well as into English. Translation systems and data sets are analyzed, prepared and adapted for the needs of West-Slavic-* translation. To evaluate the effects of different preparations on translation results, we conducted experiments and used the BLEU, NIST and TER metrics. By defining proper translation parameters for morphologically rich languages we improve the translation quality and draw conclusions. © Springer International Publishing Switzerland 2017.
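As a rough illustration of the first of the metrics named above, the sketch below computes a simplified sentence-level BLEU score: the geometric mean of clipped n-gram precisions multiplied by a brevity penalty. It assumes a single reference, whitespace tokenization and no smoothing. The function names `ngram_counts` and `bleu` are hypothetical, and this is not the evaluation setup used by the authors. NIST and TER are not shown.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(hypothesis, reference, max_n=4):
    """Simplified single-reference, unsmoothed sentence-level BLEU."""
    hyp = hypothesis.split()
    ref = reference.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngram_counts(hyp, n)
        ref_counts = ngram_counts(ref, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize hypotheses that are shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    print(bleu("the cat sat on the mat", "the cat sat on a mat"))
```

Real evaluations are normally computed at corpus level with an established toolkit, since unsmoothed sentence-level scores drop to zero whenever any n-gram order has no match.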


Kacprzak M.,University of Bialystok | Starosta B.,Polish Japanese Academy of Information Technology | Wegrzyn-Wolska K.,ESIGETEL
Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) | Year: 2015

The paper is devoted to the problem of modeling human attitudes towards imprecise ideas. A metaset is used for representing an imprecise concept, and Opinion Mining techniques are applied to build a preference function which reflects someone's attitude towards the idea. The preferences are then evaluated as real numbers for the sake of comparison and selection of the best matching instance. The core of the idea of representing any imprecise concept with a metaset lies in splitting it into a treelike hierarchy of related sub-concepts. The nodes of the tree determine the membership degrees for metaset members; they are natural language terms which also describe the reasons for a particular member to satisfy the represented idea. Opinion Mining allows for automatic gathering and evaluation of opinions from the Internet. The proposed mechanism is applied to solve the problem of selecting the car best matching the imprecise idea of a good car for a lady. This approach can be applied in decision support systems that help both marketers and customers. © Springer International Publishing Switzerland 2015.


Pawlyta M.,Polish Japanese Academy of Information Technology | Skurowski P.,Silesian University of Technology
Advances in Intelligent Systems and Computing | Year: 2016

The paper describes progress in research on a method for automatically inferring the body structure, the functional body mesh. In the paper we investigate four motion measures and machine learning methods: variants of Gaussian mixture models, DBSCAN and Neural Networks. The results were analyzed both quantitatively and qualitatively using complete and incomplete data from healthy and impaired persons. All the learning methods were on par with each other; however, we identified cases in which a certain method works better. © Springer International Publishing Switzerland 2016.


Polkowski L.T.,Polish Japanese Academy of Information Technology
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2016

Zdzisław Pawlak influenced our thinking about uncertainty by borrowing the idea of approximation from geometry and topology and carrying it into the realm of knowledge engineering. In this way, simple and already well-worn mathematical notions gained a new life, given to them by new notions of decision rules and algorithms, complexity problems, and problems of optimization of relations and rules. In this work, the author presents his personal remembrances of how his work was influenced by Zdzisław Pawlak, interlaced with discussions of highlights of research done in enlivening classical concepts in new frameworks. He then turns to more recent results that stem from those foundations, mostly on applications of rough mereology in behavioral robotics and classifier synthesis via granular computing. © Springer International Publishing AG 2016.


Wolk K.,Polish Japanese Academy of Information Technology | Marasek K.,Polish Japanese Academy of Information Technology
Procedia Computer Science | Year: 2015

The quality of machine translation is rapidly evolving. Today one can find several machine translation systems on the web that provide reasonable translations, although the systems are not perfect. In some specific domains, the quality may decrease. A recently proposed approach to this domain is neural machine translation. It aims at building a jointly-tuned single neural network that maximizes translation performance, a very different approach from traditional statistical machine translation. Recently proposed neural machine translation models often belong to the encoder-decoder family in which a source sentence is encoded into a fixed length vector that is, in turn, decoded to generate a translation. The present research examines the effects of different training methods on a Polish-English Machine Translation system used for medical data. The European Medicines Agency parallel text corpus was used as the basis for training of neural and statistical network-based translation systems. The main machine translation evaluation metrics have also been used in analysis of the systems. A comparison and implementation of a real-time medical translator is the main focus of our experiments. © 2015 The Authors. Published by Elsevier B.V.
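The encoder-decoder idea described above can be sketched in a few lines: a recurrent encoder folds a source sentence of any length into one fixed-length vector, and a decoder unrolls that vector into output tokens. This is a toy under stated assumptions, not the system from the paper. The vocabulary, the hidden size and the names `encode`, `decode` and `step` are hypothetical, and the weights are random and untrained, so the decoded words carry no meaning.

```python
import math
import random

random.seed(0)  # fixed seed so the toy weights are reproducible

HIDDEN = 4  # size of the fixed-length sentence vector (hypothetical choice)
VOCAB = ["<eos>", "pacjent", "ma", "kaszel", "the", "patient", "has", "a", "cough"]


def rand_matrix(rows, cols):
    """Random, untrained weights; a real system would learn these."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


EMBED = {w: [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)] for w in VOCAB}
W_ENC = rand_matrix(HIDDEN, HIDDEN)
W_DEC = rand_matrix(HIDDEN, HIDDEN)
W_OUT = rand_matrix(len(VOCAB), HIDDEN)


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def step(weights, state, x):
    """One recurrent update: new_state = tanh(W @ state + x)."""
    return [math.tanh(s + xi) for s, xi in zip(matvec(weights, state), x)]


def encode(tokens):
    """Fold a sentence of any length into a vector of HIDDEN numbers."""
    state = [0.0] * HIDDEN
    for token in tokens:
        state = step(W_ENC, state, EMBED[token])
    return state


def decode(context, max_len=5):
    """Greedily emit tokens from the context vector until <eos> or max_len."""
    state, prev, output = context, "<eos>", []
    for _ in range(max_len):
        state = step(W_DEC, state, EMBED[prev])
        scores = matvec(W_OUT, state)
        prev = VOCAB[scores.index(max(scores))]
        if prev == "<eos>":
            break
        output.append(prev)
    return output


if __name__ == "__main__":
    context = encode(["pacjent", "ma", "kaszel"])
    print(len(context), decode(context))
```

The point of the sketch is the bottleneck: however long the source sentence is, the decoder only sees the `HIDDEN` numbers returned by `encode`.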


Brocki L.,Polish Japanese Academy of Information Technology | Marasek K.,Polish Japanese Academy of Information Technology
Archives of Acoustics | Year: 2015

This paper describes a Deep Belief Neural Network (DBNN) and Bidirectional Long Short-Term Memory (LSTM) hybrid used as an acoustic model for Speech Recognition. It was demonstrated by many independent researchers that DBNNs exhibit superior performance to other known machine learning frameworks in terms of speech recognition accuracy. Their superiority comes from the fact that these are deep learning networks. However, a trained DBNN is simply a feed-forward network with no internal memory, unlike Recurrent Neural Networks (RNNs), which are Turing complete and do possess internal memory, thus allowing them to make use of longer context. In this paper, an experiment is performed to make a hybrid of a DBNN with an advanced bidirectional RNN used to process its output. Results show that the use of the new DBNN-BLSTM hybrid as the acoustic model for Large Vocabulary Continuous Speech Recognition (LVCSR) increases word recognition accuracy. However, the new model has many parameters, and in some cases it may suffer from performance issues in real-time applications. Copyright © 2015 by PAN - IPPT.


Klec M.,Polish Japanese Academy of Information Technology
Studies in Computational Intelligence | Year: 2016

The experiments described in this paper utilize songs in the MIDI format to train Deep Neural Networks (DNNs) for the Automatic Genre Recognition (AGR) problem. The MIDI songs were decomposed into separate instrument groups and converted to audio. Restricted Boltzmann Machines (RBMs) were trained with the individual groups of instruments as a method of pre-training of the final DNN models. The Scattering Wavelet Transform (SWT) was used for signal representation. The paper explains the basics of RBMs and the SWT, followed by a review of DNN pre-training methods that use separate instrument audio. Experiments show that this approach allows building better discriminating models than those that were trained using whole songs. © Springer International Publishing Switzerland 2016.


Wolk K.,Polish Japanese Academy of Information Technology | Marasek K.,Polish Japanese Academy of Information Technology
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2015

The multilingual nature of the world makes translation a crucial requirement today. Parallel dictionaries constructed by humans are a widely available resource, but they are limited and do not provide enough coverage for good-quality translation because of out-of-vocabulary words and neologisms. This motivates the use of statistical translation systems, which are unfortunately dependent on the quantity and quality of training data. Such data has very limited availability, especially for some languages and very narrow text domains. In this research we present our improvements to Yalign’s mining methodology by reimplementing the comparison algorithm, introducing a tuning script and improving performance using GPU computing acceleration. The experiments are conducted on various text domains, and bi-data is extracted from Wikipedia dumps. © Springer International Publishing Switzerland 2015.


Kubera E.,Lublin University of Life Sciences | Wieczorkowska A.A.,Polish Japanese Academy of Information Technology
Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) | Year: 2015

Identification of particular voices in polyphonic and polytimbral music is a task often performed by musicians in their everyday life. However, the automation of this task is very challenging because of the high complexity of audio data. Usually additional information is supplied, and the results are still far from satisfactory. In this paper, we focus on classical music recordings, without requiring the user to submit additional information. Our goal is to identify musical instruments playing in short audio frames of polyphonic recordings of classical music. Additionally, we extract pitches (or pitch ranges), which combined with instrument information can be used in score-following and audio alignment, see e.g. [9, 20], or in works towards automatic score extraction, which are a motivation behind this work. Also, since instrument timbre changes with pitch, separate classifiers are trained for various pitch ranges for each instrument. Four instruments are investigated, representing stringed and wind instruments. The influence of adding harmonic (pitch-based) features to the feature set on the results is also investigated. Random forests are applied as a classification tool, and the results are presented and discussed. © Springer International Publishing 2015.
