Autoesl Inc.

Cupertino, CA, United States

Niu Y.,Beijing University of Posts and Telecommunications | Kuang J.,Beijing University of Posts and Telecommunications | Dai Z.,Beijing University of Posts and Telecommunications | Ye Q.,Autoesl Inc.
TriSAI 2011 - Proceedings of Triangle Symposium on Advanced ICT 2011 | Year: 2011

This paper studies and develops a simplified digital camera SOPC system based on a Nios II CPU and an FPGA platform. The whole system, including both software and hardware, is implemented and verified on the FPGA. The SOPC system uses a JPEG encoder IP core developed with a high-level synthesis methodology. It can encode and decode real-time images collected by a digital camera and transmit the compressed data over Ethernet to a PC terminal, where the data is written to a file to form a complete JPEG picture. The verification results indicate that the compression ratio improves to 2.2 times that of JPEG image software, with lower power consumption and high efficiency.

Chen D.,University of Illinois at Urbana - Champaign | Cong J.,University of California at Los Angeles | Fan Y.,Autoesl Inc. | Wan L.,University of Illinois at Urbana - Champaign
IEEE Transactions on Very Large Scale Integration (VLSI) Systems | Year: 2010

In this paper, we present a low-power architectural synthesis system (LOPASS) for field-programmable gate-array (FPGA) designs with interconnect power estimation and optimization. LOPASS includes three major components: 1) a flexible high-level power estimator for FPGAs considering the power consumption of various FPGA logic components and interconnects; 2) a simulated-annealing optimization engine that carries out resource selection and allocation, scheduling, functional unit binding, register binding, and interconnection estimation simultaneously to reduce power effectively; and 3) a κ-cofamily-based register binding algorithm and an efficient port assignment algorithm that reduce interconnections in the data path through multiplexer optimization. The experimental results show that LOPASS produces promising results on latency optimization compared to an academic high-level synthesis tool SPARK. Compared to an early commercial high-level synthesis tool, namely, Synopsys Behavioral Compiler, LOPASS is 61.6% better on power consumption and 10.6% better on clock period on average. Compared to a current commercial tool, namely, Impulse C, LOPASS is 31.1% better on power reduction with an 11.8% penalty on clock period. © 2006 IEEE.

Cong J.,University of California at Los Angeles | Liu B.,University of California at Los Angeles | Majumdar R.,University of California at Los Angeles | Zhang Z.,Autoesl Inc.
ACM Transactions on Design Automation of Electronic Systems | Year: 2010

Many techniques for power reduction in advanced RTL synthesis tools rely explicitly or implicitly on observability don't-care conditions. In this article we propose a systematic approach to maximize the effectiveness of these techniques by generating power-friendly RTL descriptions in behavioral synthesis. This is done using operation gating, that is, explicitly adding a predicate to an operation based on its observability condition, so that the operation, once identified as unobservable at runtime, can be avoided using RTL power optimization techniques such as clock gating. We first introduce the concept of behavior-level observability and its approximations in the context of behavioral synthesis. We then propose an efficient procedure to compute an approximated behavior-level observability of every operation in a dataflow graph. Unlike previous techniques which work at the bit level in Boolean networks, our method is able to perform analysis at the word level, and thus avoids most computation effort with a reasonable approximation. Our algorithm exploits the observability-masking nature of some Boolean operations, as well as the select operation, and allows certain forms of other knowledge to be considered for stronger observability conditions. The approximation is proved exact for (acyclic) dataflow graphs when non-Boolean operations other than select are treated as black boxes. The behavior-level observability condition obtained by our analysis can be used to guide the operation scheduler to optimize the efficiency of operation gating. In a set of experiments on real-world designs, our method achieves an average of 33.9% reduction in total power; it outperforms a previous method by 17.1% on average and gives close-to-optimal solutions on several designs. To the best of our knowledge, this is the first time behavior-level observability analysis and optimization are performed during behavioral synthesis in a systematic manner. 
We believe that our idea can be applied to compiler transformations in general. © 2010 ACM.

Cong J.,Autoesl Inc. | Cong J.,University of California at Los Angeles | Liu B.,Autoesl Inc. | Liu B.,University of California at Los Angeles | And 4 more authors.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | Year: 2011

Escalating system-on-chip design complexity is pushing the design community to raise the level of abstraction beyond register transfer level. Despite the unsuccessful adoption of early generations of commercial high-level synthesis (HLS) systems, we believe that the tipping point for transitioning to HLS methodology is happening now, especially for field-programmable gate array (FPGA) designs. The latest generation of HLS tools has made significant progress in providing wide language coverage and robust compilation technology, platform-based modeling, advancement in core HLS algorithms, and a domain-specific approach. In this paper, we use AutoESL's AutoPilot HLS tool coupled with domain-specific system-level implementation platforms developed by Xilinx as an example to demonstrate the effectiveness of state-of-the-art C-to-FPGA synthesis solutions targeting multiple application domains. Complex industrial designs targeting Xilinx FPGAs are also presented as case studies, including comparison of HLS solutions versus optimized manual designs. In particular, the experiment on a sphere decoder shows that the HLS solution can achieve an 11-31% reduction in FPGA resource usage with improved design productivity compared to a hand-coded design. © 2006 IEEE.
