Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2016
This paper addresses a major challenge in clustering: optimal model selection. It presents new, efficient clustering quality indexes based on feature maximization, an alternative to the usual distributional measures based on entropy or the chi-square metric and to vector-based measures such as Euclidean distance or correlation distance. Experiments compare the behavior of the new indexes with that of usual cluster quality indexes based on Euclidean distance on several kinds of test datasets for which ground truth is available. The comparison highlights the superior accuracy and stability of the new method, its efficiency from low- to high-dimensional data, and its tolerance to noise. © Springer International Publishing Switzerland 2016.
Studies in Classification, Data Analysis, and Knowledge Organization | Year: 2011
Gene expression matrices are numerical tables that describe the level of expression of genes in different situations, characterizing their behaviour. Biologists are interested in identifying groups of genes presenting similar quantitative variations of expression. This paper presents new syntactic constraints for itemset mining in particular Boolean gene expression matrices. A two dimensional gene expression profile representation is introduced and adapted to itemset mining allowing one to control gene expression. Syntactic constraints are used to discover itemsets with significant expression variations from a large collection of gene expression profiles. © Springer-Verlag Berlin Heidelberg 2011.
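To make the notion of itemset mining over a Boolean gene expression matrix concrete, here is a minimal sketch of generic frequent-itemset counting. It is not the paper's method: the syntactic constraints and the two-dimensional profile representation are not specified in the abstract and are not modelled. The matrix layout (rows are situations, columns are genes), the gene names, and the `min_support` threshold are all assumptions made for illustration.

```python
from itertools import combinations

def frequent_itemsets(matrix, genes, min_support):
    """Enumerate gene sets expressed together in at least `min_support` rows.

    Assumption: matrix[r][c] == 1 means gene c is expressed in situation r.
    This is a brute-force enumeration, suitable only for tiny examples.
    """
    result = {}
    n = len(genes)
    for size in range(1, n + 1):
        for cols in combinations(range(n), size):
            # Support = number of situations where every gene in the set is 1.
            support = sum(all(row[c] for c in cols) for row in matrix)
            if support >= min_support:
                result[tuple(genes[c] for c in cols)] = support
    return result

# Hypothetical toy data: 4 situations, 3 genes.
genes = ["g1", "g2", "g3"]
matrix = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [1, 1, 0],
]
itemsets = frequent_itemsets(matrix, genes, min_support=3)
```

A constraint-based miner would prune this search space rather than enumerate it exhaustively; the sketch only shows what an itemset and its support mean on Boolean data.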
Kacem A.,UTIC |
Saidani A.,UTIC |
Proceedings of the International Conference on Document Analysis and Recognition, ICDAR | Year: 2011
In this paper we present a system for reading student information sheets. An algorithm is proposed to locate and label the handwritten answer fields. Because information sheets can be filled in in Arabic and/or French, automatic script language differentiation is a required pre-recognition step in the proposed system. We have developed a robust and fast method for field classification and script language identification, based on a decision tree, to make this processing practical for sheet recognition. To this end, the system uses several novel features (loops, descenders, diacritics) and analyses the lower profile of the script. The classification rates are 92.5% for numeric fields, 94.34% for Arabic scripts and 94.66% for French scripts. Experimental results, obtained on 80 sheets, show that our system provides an effective way to convert printed sheets into computerized format or to collect information for a database from printed sheets. © 2011 IEEE.
Bisson G.,LORIA |
Bisson G.,TU Eindhoven
Journal of Mathematical Cryptology | Year: 2012
We design a probabilistic algorithm for computing endomorphism rings of ordinary elliptic curves defined over finite fields that we prove has a subexponential runtime in the size of the base field, assuming solely the generalized Riemann hypothesis. Additionally, we improve the asymptotic complexity of previously known, heuristic, subexponential methods by describing a faster isogeny-computing routine. © de Gruyter 2011.
Bisson G.,LORIA |
Bisson G.,TU Eindhoven |
Sutherland A.V.,Massachusetts Institute of Technology
Designs, Codes, and Cryptography | Year: 2012
We describe a space-efficient algorithm for solving a generalization of the subset sum problem in a finite group G, using a Pollard-ρ approach. Given an element z and a sequence of elements S, our algorithm attempts to find a subsequence of S whose product in G is equal to z. For a random sequence S of length d log₂ n, where n = #G and d ≥ 2 is a constant, we find that its expected running time is O(√n log n) group operations (we give a rigorous proof for d > 4), and it only needs to store O(1) group elements. We consider applications to class groups of imaginary quadratic fields, and to finding isogenies between elliptic curves over a finite field. © Springer Science+Business Media, LLC 2011.
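The problem statement above can be illustrated with a small brute-force baseline. This is not the Pollard-ρ algorithm of the paper: it enumerates subsequences exhaustively, so its running time is exponential in the length of S. As an assumption for illustration, the group G is taken to be the multiplicative group of integers modulo a small prime p; the function name and the toy values are hypothetical.

```python
from itertools import combinations

def find_subsequence(z, S, p):
    """Return indices i1 < ... < ik with S[i1] * ... * S[ik] == z in (Z/pZ)*.

    Brute-force search over all subsequences, shortest first.
    Returns None if no subsequence has product z.
    """
    target = z % p
    for k in range(len(S) + 1):
        for idx in combinations(range(len(S)), k):
            prod = 1
            for i in idx:
                prod = prod * S[i] % p
            if prod == target:
                return idx
    return None

# Hypothetical toy instance in (Z/101Z)*.
p = 101
S = [2, 3, 5, 7, 11, 13]
z = (3 * 7 * 13) % p  # built from a known subsequence, so a solution exists
idx = find_subsequence(z, S, p)
```

The baseline mainly shows what a valid output looks like: a set of positions in S whose group product equals z. The algorithm described in the abstract targets the same output while storing only O(1) group elements.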