Simon A.,TU Munich | Chen L.,National Laboratory for Parallel and Distributed Processing
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2010

While the revised widening for polyhedra is defined in terms of inequalities, most implementations rely on the double description method for efficiency. We show how standard widening can be implemented in a simple and efficient way using a normalized H-representation (constraint-only), which has become popular in recent approximations to polyhedral analysis. We then detail a novel heuristic for this representation that is tuned to capture linear transformations of the state space while ensuring quick convergence for non-linear transformations for which no precise linear invariants exist. © 2010 Springer-Verlag.
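As a rough illustration of widening on a constraint-only representation, the sketch below keeps a constraint of the first polyhedron only when the second polyhedron contains a constraint with the same normalized coefficient vector and a bound that is no larger. This is a simplified, purely syntactic approximation and not the paper's algorithm: the standard widening tests semantic entailment, which generally needs a linear-programming check, and the revised widening and the paper's heuristic are not modeled at all. The names `normalize` and `widen` and the dictionary encoding are assumptions made for this example.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def normalize(coeffs, bound):
    """Scale the constraint a.x <= b so the integer coefficients have gcd 1.

    Hypothetical helper: returns (coefficient tuple, bound) in a canonical
    form, so equal half-space directions compare equal as dictionary keys.
    """
    g = reduce(gcd, (abs(c) for c in coeffs), 0)
    if g == 0:
        raise ValueError("constraint has no non-zero coefficient")
    return tuple(c // g for c in coeffs), Fraction(bound, g)


def widen(p1, p2):
    """Syntactic widening sketch on normalized constraint sets.

    p1 and p2 map normalized coefficient tuples to upper bounds.
    A constraint of p1 survives only if p2 has a constraint in the same
    direction with a bound that is at least as tight. The result contains
    both inputs, and constraints can only be dropped, so chains are finite.
    """
    return {a: b for a, b in p1.items() if a in p2 and p2[a] <= b}


# Example: x in [0, 1] widened with x in [0, 2] drops the unstable upper bound.
p1 = dict([normalize((1,), 1), normalize((-1,), 0)])
p2 = dict([normalize((1,), 2), normalize((-1,), 0)])
result = widen(p1, p2)
```

In this example the bound `x <= 1` is discarded because the second iterate only guarantees `x <= 2`, while `-x <= 0` is stable and is kept.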


Chen Z.,National Laboratory for Parallel and Distributed Processing | Liu Z.,United International University Dhanmondi
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2010

Compensating CSP (cCSP) is an extension of CSP for modeling long-running transactions. It can be used to specify service orchestration programs written in a programming language like WS-BPEL. So far, only an operational semantics and a trace semantics have been given for cCSP. In this paper, we extend cCSP with more operators and define a stable failures semantics for it in order to reason about non-determinism and deadlock. We give some important algebraic laws for the new operators. These laws can be justified and understood from the stable failures semantics. A case study demonstrates the extended cCSP. © 2010 Springer-Verlag.


Chen Z.,National Laboratory for Parallel and Distributed Processing | Liu Z.,United International University Dhanmondi | Wang J.,National Laboratory for Parallel and Distributed Processing
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2011

Compensating CSP (cCSP) extends CSP for the specification and verification of long-running transactions. The original cCSP is a modest extension of a subset of CSP that does not consider non-deterministic choice, synchronized composition, or recursion. A few further extensions exist. However, it remains a challenge to develop a fixed-point theory of process refinement in cCSP. This paper provides a complete solution to this problem and develops a theory of cCSP, corresponding to the theory of CSP, so that the verification techniques and their tools, such as FDR, can be extended to compensating processes. © 2011 Springer-Verlag.


Xia F.,National Laboratory for Parallel and Distributed Processing | Dou Y.,National Laboratory for Parallel and Distributed Processing | Jin G.,Wuhan Naval University of Engineering
Journal of Supercomputing | Year: 2012

In the field of RNA secondary structure prediction, MFE, SCFG and homologous comparative sequence analysis are three classical computational analysis approaches. However, the parallel efficiency of many implementations on general-purpose computers is greatly limited by complicated data dependencies and tight synchronization. Additionally, large-scale parallel computers are too expensive for many research institutes to use easily. Recently, FPGA chips have provided a new way to accelerate these algorithms by exploiting fine-grained custom design. We propose a unified parallelism scheme and logic circuit architecture for three classical algorithms (Zuker, RNAalifold and CYK), based on a systolic-like master-slave PE (Processing Element) array for fine-grained hardware implementation on FPGA. We partition tasks by columns and assign them to PEs for load balance. We exploit data reuse schemes to reduce the need to load the matrix from external memory. The experimental results show a speedup of 12-14x over the three software versions running on a PC platform with an AMD Phenom 9650 Quad CPU. The computational power of our prototype is comparable to a PC cluster consisting of 20 Intel Xeon CPUs for RNA secondary structure prediction; however, its power consumption is only about 10% of the latter. © Springer Science+Business Media, LLC 2011.


Wang L.,National Laboratory for Parallel and Distributed Processing | Xue J.,Programming Languages and Compilers Group | Yang X.,National Laboratory for Parallel and Distributed Processing
Proceedings - Design, Automation and Test in Europe, DATE | Year: 2010

This paper presents reuse-aware modulo scheduling to maximize stream reuse and improve concurrency for stream-level loops running on stream processors. The novelty lies in the development of a new representation for an unrolled and software-pipelined stream-level loop using a set of reuse equations, resulting in simultaneous optimization of two performance objectives for the loop, reuse and concurrency, in a unified framework. We have implemented this work in the compiler developed for our 64-bit FT64 stream processor. Our experimental results, obtained on FT64 and by simulation using nine representative stream applications, demonstrate the effectiveness of the proposed approach. © 2010 EDAA.
