Turan G.,University of Illinois at Chicago | Turan G.,MTA SZTE Research Group on Artificial Intelligence | Yaggie J.,University of Illinois at Chicago
IJCAI International Joint Conference on Artificial Intelligence | Year: 2015

A formal framework is given for the postulate characterizability of a class of belief revision operators, obtained from a class of partial preorders using minimization. It is shown that for classes of posets characterizability is equivalent to a special kind of definability in monadic second-order logic, which turns out to be incomparable to first-order definability. Several examples are given of characterizable and non-characterizable classes. For example, it is shown that the class of revision operators obtained from posets which are not total is not characterizable.
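
The minimization step mentioned in the abstract can be illustrated with a small sketch. It assumes that worlds are tuples of truth values and that the preorder is supplied as a comparison function; the names `minimal_models` and `componentwise_leq` and the example ordering are hypothetical and not taken from the paper.

```python
from itertools import product


def minimal_models(models, leq):
    """Return the models that have no other model strictly below them."""
    def strictly_below(a, b):
        return leq(a, b) and not leq(b, a)

    return {m for m in models
            if not any(strictly_below(other, m) for other in models)}


def componentwise_leq(a, b):
    # Hypothetical partial order on worlds: a <= b iff every atom true in a is true in b.
    return all(x <= y for x, y in zip(a, b))


# Worlds over two atoms (p, q); the revising formula is "p or q".
worlds = set(product((0, 1), repeat=2))
models_of_p_or_q = {w for w in worlds if w[0] or w[1]}

# The revision result is taken to be the set of minimal models of the formula.
result = minimal_models(models_of_p_or_q, componentwise_leq)
print(sorted(result))
```

Because this ordering is not total, the two incomparable worlds `(0, 1)` and `(1, 0)` are both minimal, so both survive the revision.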

Gosztolya G.,MTA SZTE Research Group on Artificial Intelligence
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | Year: 2015

Social signal detection is a task in speech technology which has recently become more popular. In the Interspeech 2013 ComParE Challenge one of the tasks was social signal detection, and since then, new results have been published on the dataset. These studies all used the Area Under Curve (AUC) metric to evaluate the performance; here we argue that this metric is not really suitable for social signal detection. Besides raising some serious theoretical objections, we will also demonstrate this unsuitability experimentally: we will show that applying a very simple smoothing function to the frame-level output scores of state-of-the-art classifiers can significantly improve the AUC scores, yet the smoothed scores perform poorly when employed in a Hidden Markov Model. As the latter is more like real-world applications, we suggest relying on utterance-level evaluation metrics in the future. Copyright © 2015 ISCA.
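
The effect described above can be reproduced on toy data. The abstract does not say which smoothing function was used, so the sketch below assumes a plain moving average; the scores and labels are invented for illustration, and `moving_average` and `auc` are hypothetical helper names.

```python
def moving_average(scores, radius):
    """Smooth frame-level scores with a centered window (truncated at the edges)."""
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - radius)
        hi = min(len(scores), i + radius + 1)
        window = scores[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def auc(scores, labels):
    """Frame-level AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: one spurious high-scoring frame inside a negative region.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
raw = [0.1, 0.9, 0.1, 0.2, 0.6, 0.5, 0.7, 0.6]
smoothed = moving_average(raw, radius=1)

print(auc(raw, labels), auc(smoothed, labels))
```

On this toy sequence the frame-level AUC rises from 0.75 to 0.9375 after smoothing, simply because the isolated spike is averaged away. That says nothing about how the smoothed scores would behave inside a Hidden Markov Model, which is the abstract's point.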

Gosztolya G.,MTA SZTE Research Group on Artificial Intelligence
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | Year: 2015

In recent years, extracting non-trivial information from audio sources has become possible. The resulting data has induced a new area in speech technology known as computational paralinguistics. A task in this area was presented at the ComParE 2013 Challenge (using the SSPNet Conflict Corpus), where the task was to determine the intensity of conflicts arising in speech recordings, based only on the audio information. Most authors approached this task by following standard paralinguistic practice, where we extract a huge number of potential features and perform the actual classification or regression process in the hope that the machine learning method applied is able to completely ignore irrelevant features. Although current state-of-the-art methods can indeed handle an overcomplete feature set, studies show that they can still be aided by feature selection. We opted for a simple greedy feature selection algorithm, by which we were able to outperform all previous scores on the SSPNet Conflict dataset, achieving a UAR score of 85.6%. Copyright © 2015 ISCA.
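
As a rough illustration of the two ingredients named above, the following sketch computes UAR (unweighted average recall, the mean of the per-class recalls) and runs a greedy forward selection loop. The abstract does not specify the exact greedy procedure, so forward selection is an assumption here; `evaluate` stands in for training and scoring a model on a candidate feature subset, and `toy_evaluate` is an invented stand-in.

```python
def uar(y_true, y_pred):
    """Unweighted average recall: mean of the per-class recall values."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def greedy_forward_selection(n_features, evaluate):
    """Add the single best feature each round; stop when the score no longer improves."""
    selected, best = [], float("-inf")
    remaining = set(range(n_features))
    while remaining:
        score, feat = max((evaluate(selected + [f]), f) for f in sorted(remaining))
        if score <= best:
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best


def toy_evaluate(subset):
    # Invented score: features 0 and 2 help, every other feature costs a little.
    useful = {0, 2}
    return len(set(subset) & useful) - 0.1 * len(set(subset) - useful)


selected, best = greedy_forward_selection(4, toy_evaluate)
print(sorted(selected), best)
print(uar([0, 0, 1, 1], [0, 1, 1, 1]))
```

In practice `evaluate` would return a cross-validated UAR for a model trained on the chosen features, which makes each round of the loop considerably more expensive than this toy version suggests.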

Devai R.,University of Szeged | Vidacs L.,MTA SZTE Research Group on Artificial Intelligence | Ferenc R.,University of Szeged | Gyimothy T.,University of Szeged
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2014

Software development in C/C++ languages is tightly coupled with preprocessor directives. While the use of preprocessor constructs cannot be avoided, current IDE support for developers can still be improved. Early feedback from IDEs about misused macros or conditional compilation has positive effects on developer productivity and code quality as well. In this paper we introduce a service layer for Visual Studio that makes detailed preprocessor information accessible to any type of IDE extension. The service layer is built upon our previous work on the analysis of directives. We wrap the analyzer tool and provide its functionality through an API. We present the public interface of the service and demonstrate the provided services through small plug-ins implemented using various extension mechanisms. These plug-ins work together to aid the daily work of developers in several ways. We provide (1) an editor extension through the Managed Extensibility Framework which provides macro highlighting within the source code editor; (2) detailed information about actual macro substitutions and an alternative code view to show the results of macro calls; (3) a managed package for discovering the intermediate steps of macro replacements through a macro explorer. The purpose of this work is twofold: first, we present an additional layer designed to aid the work of tool developers; second, we provide directly usable IDE components to demonstrate its potential. © 2014 Springer International Publishing.

Toth L.,MTA SZTE Research Group on Artificial Intelligence
Eurasip Journal on Audio, Speech, and Music Processing | Year: 2015

Deep convolutional neural networks (CNNs) have recently been shown to outperform fully connected deep neural networks (DNNs) both on low-resource and on large-scale speech tasks. Experiments indicate that convolutional networks can attain a 10–15% relative improvement in the word error rate of large vocabulary recognition tasks over fully connected deep networks. Here, we explore some refinements to CNNs that have not been pursued by other authors. First, the CNN papers published up till now used sigmoid or rectified linear (ReLU) neurons. We will experiment with the maxout activation function proposed recently, which has been shown to outperform the rectifier activation function in fully connected DNNs. We will show that the pooling operation of CNNs and the maxout function are closely related, and so the two technologies can be readily combined to build convolutional maxout networks. Second, we propose to turn the CNN into a hierarchical model. The origins of this approach go back to the era of shallow nets, where the idea of stacking two networks on each other was relatively well known. We will extend this method by fusing the two networks into one joint deep model with many hidden layers and a special structure. We will show that with the hierarchical modelling approach, we can reduce the error rate of the network on an expanded context of input. In the experiments on the Texas Instruments Massachusetts Institute of Technology (TIMIT) phone recognition task, we find that a CNN built from maxout units yields a relative phone error rate reduction of about 4.3% over ReLU CNNs. Applying the hierarchical modelling scheme to this CNN results in a further relative phone error rate reduction of 5.5%. Using dropout training, the lowest error rate we get on TIMIT is 16.5%, which is currently the best result. Besides experimenting on TIMIT, we also evaluate our best models on a low-resource large vocabulary task, and we find that all the proposed modelling improvements give consistently better results for this larger database as well. © 2015, Tóth.
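
The relation between pooling and maxout noted in the abstract can be seen in a few lines: both take a maximum over a group of activations, and they differ only in which axis the group runs along. This is a minimal sketch under that reading, not the paper's architecture; the layout `acts[filter][position]` and the function names are assumptions made for illustration.

```python
def max_pool(acts, size):
    """Max over neighbouring positions within each filter (non-overlapping windows)."""
    return [[max(row[t:t + size]) for t in range(0, len(row), size)]
            for row in acts]


def maxout(acts, k):
    """Max over groups of k different filters at each position."""
    n_pos = len(acts[0])
    return [[max(acts[f + j][t] for j in range(k)) for t in range(n_pos)]
            for f in range(0, len(acts), k)]


# Invented activations: 2 filters, 4 positions.
acts = [[1, 5, 2, 0],
        [3, 4, 6, 1]]

pooled = max_pool(acts, size=2)   # groups along the position axis
merged = maxout(acts, k=2)        # groups along the filter axis
print(pooled, merged)
```

Since both operations are a grouped maximum, applying one after the other amounts to a single maximum over a block of filters and positions, which is one way to see why the two can be combined easily.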
