Liu M.,CAS Institute of Computing Technology | Liu M.,Loongson Technologies Corporation Ltd | Liu M.,University of Chinese Academy of Sciences | Yan C.,CAS Beijing Institute of Acoustics
Gaojishu Tongxin/Chinese High Technology Letters | Year: 2011

Because the computational performance of multi-channel multiplexed finite impulse response (FIR) filters needs to be improved, this paper exploits the temporal locality of their coefficients and the spatial locality of their data to propose a method for optimizing the software implementation of multi-channel FIR filters whose input data use a time-multiplexed transport mechanism. The method adjusts the software framework and the storage location of the multi-channel input data. Experiments on the Godson-2 prototype system show that, compared with the typical software implementation of multi-channel FIR filters, the proposed method achieves a higher degree of locality and better performance when the number of channels is larger or when the difference between the filter order and the amount of continuous data on a single channel is greater.
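To make the data-layout question concrete, the following is a minimal Python sketch of filtering a time-multiplexed (interleaved) multi-channel stream in two ways: directly on the interleaved buffer with strided accesses, and after copying each channel into its own contiguous buffer. It is not the paper's method, whose details the abstract does not give. The function names, the zero initial history, and the sample layout (channel 0, channel 1, ..., repeated for each time step) are all assumptions made for illustration.

```python
def fir(samples, coeffs):
    """Direct-form FIR on one contiguous channel, assuming zero history:
    y[n] = sum_k coeffs[k] * samples[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out


def deinterleave(stream, num_channels):
    """Split an interleaved stream into one contiguous list per channel."""
    return [stream[ch::num_channels] for ch in range(num_channels)]


def filter_interleaved(stream, num_channels, coeffs):
    """Filter in place on the interleaved layout.
    Consecutive samples of one channel are num_channels elements apart,
    so each tap reads with a stride of num_channels."""
    out = [0.0] * len(stream)
    for i in range(len(stream)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            j = i - k * num_channels  # previous sample of the same channel
            if j >= 0:
                acc += c * stream[j]
        out[i] = acc
    return out


def filter_per_channel(stream, num_channels, coeffs):
    """Relocate each channel to contiguous storage, then filter it."""
    return [fir(ch, coeffs) for ch in deinterleave(stream, num_channels)]


# Two channels, three samples each, interleaved as c0, c1, c0, c1, ...
stream = [1.0, 10.0, 2.0, 20.0, 3.0, 30.0]
coeffs = [0.5, 0.5]  # two-tap moving average
interleaved_result = filter_interleaved(stream, 2, coeffs)
per_channel_result = filter_per_channel(stream, 2, coeffs)
```

Both functions compute the same outputs; they differ only in memory access pattern. The strided version touches elements far apart as the channel count grows, while the per-channel version reads each channel sequentially and reuses the same coefficient array for every channel.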

Guo Q.,CAS Institute of Computing Technology | Guo Q.,University of Chinese Academy of Sciences | Guo Q.,Loongson Technologies Corporation Ltd
Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics | Year: 2012

One of the most critical issues in functional verification is generating highly effective stimuli. As verification proceeds, the effectiveness of verification stimuli decreases. To improve it, an online filtering technique for processing generated stimuli is proposed. This technique employs one-class support vector machines to construct a classifier online that predicts whether a newly generated stimulus is redundant; stimuli predicted to be redundant are not sent for simulation. We also propose an instruction sequence kernel to measure the similarity among instruction sequences. Experimental results demonstrate that this technique can eliminate about 83% of stimuli and 79% of verification time compared with conventional constrained random generation.

Wang W.,CAS Institute of Computing Technology | Wang W.,Loongson Technologies Corporation Ltd | Wang W.,University of Chinese Academy of Sciences | Shen H.,CAS Institute of Computing Technology | Shen H.,Loongson Technologies Corporation Ltd
Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics | Year: 2011

Sub-pixel interpolation is one of the most computation-intensive parts of various HD video decoding processes. Existing sub-pixel interpolation architectures have difficulty achieving high performance and flexibility simultaneously. This paper presents a reconfigurable sub-pixel interpolation architecture for multi-standard video decoding. Based on an analysis and comparison of the commonalities and differences among the interpolation algorithms of various standards, a novel reconfigurable parallel-serial-mixed filtering architecture is proposed, which allows dynamic configuration of the data transfer path, the I/O data pattern and the filter computation unit. It supports various video coding standards including VC-1, H.264/H.263, AVS and MPEG-1/2/4. The experimental results show that this design achieves real-time multi-standard HDTV 1080p (1920x1088@30 fps) video decoding. Compared with previous work, the proposed design supports more HD video coding standards while consuming the same amount of silicon resources. It has been applied in a multimedia SoC chip.
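As a concrete instance of the kind of filter such an architecture has to support, the sketch below computes horizontal half-sample luma values with the 6-tap filter (1, -5, 20, 20, -5, 1) used by H.264, followed by rounding, a right shift by 5, and clipping to 8 bits. It covers only this one standard and one direction, and it says nothing about the paper's hardware design. The function names and the edge handling (replicating the border pixel) are assumptions made for illustration.

```python
# H.264 half-sample luma filter taps; they sum to 32.
H264_TAPS = (1, -5, 20, 20, -5, 1)


def clip8(value):
    """Clamp an intermediate result to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel_row(row):
    """Return the half-sample values between row[i] and row[i + 1].

    Assumption: positions outside the row reuse the nearest edge pixel.
    """
    n = len(row)
    out = []
    for i in range(n - 1):
        acc = 0
        for t, tap in enumerate(H264_TAPS):
            # Taps cover integer positions i-2 .. i+3 around the half sample.
            j = min(max(i - 2 + t, 0), n - 1)
            acc += tap * row[j]
        # Round to nearest, divide by 32, then clip.
        out.append(clip8((acc + 16) >> 5))
    return out


flat = half_pel_row([100] * 8)
edge = half_pel_row([0, 0, 0, 255, 255, 255])
```

On a flat row the filter reproduces the input value, and at a sharp edge the negative taps produce undershoot and overshoot that the final clip removes. Other standards use different tap sets and rounding, which is what makes a single configurable filter unit attractive.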

Ren T.,University of Chinese Academy of Sciences | Ren T.,CAS Institute of Computing Technology | Xue S.,CAS Institute of Computing Technology | Xue S.,Loongson Technologies Corporation Ltd | And 4 more authors.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2015

The browser is the entry point to cloud computing services, and the performance of JavaScript, with which web applications are built, has become critically important to the user experience. The key to efficient JavaScript execution is just-in-time (JIT) compilation. At present, Firefox is one of the most popular cross-platform browsers. However, there is no MIPS code generator in IonMonkey, Firefox’s next-generation optimizing JavaScript JIT compiler, leaving the low-performance interpreter as the only option for executing JavaScript on the MIPS platform in Firefox. In this paper, we implement an efficient and reliable MIPS code generator for IonMonkey. We examined the inner mechanism of IonMonkey and solved a series of platform-related problems such as the double-layer cross-platform architecture, patching, the jump source chain, and the ABI. Additionally, we optimized IonMonkey for the MIPS architecture using a series of methods such as short-distance jump optimization, range analysis for arithmetic operations, and peephole optimization. With the JIT port and these optimizations, the V8 benchmark score rose from 38.8 to 957, and the running time of the SunSpider benchmark fell from 20428.7 ms to 2689.5 ms. The efficiency of the JavaScript engine on MIPS was significantly improved. © Springer International Publishing Switzerland 2015.

Guo Q.,CAS Institute of Computing Technology | Guo Q.,University of Chinese Academy of Sciences | Chen T.,CAS Institute of Computing Technology | Chen T.,Loongson Technologies Corporation Ltd | And 6 more authors.
Microprocessors and Microsystems | Year: 2013

Predictive modeling is an emerging methodology for microarchitectural design space exploration. However, this method suffers from the high cost of constructing predictive models, especially when unseen programs are employed in performance evaluation. In this paper, we propose a fast predictive model-based approach for microarchitectural design space exploration. The key to our approach is utilizing inherent program characteristics as prior knowledge (in addition to microarchitectural configurations) to build a universal predictive model. Thus, no additional simulation is required to evaluate new programs on new configurations. In addition, because the approach employs a model tree technique, it can provide insight into the design space for early design decisions. Experimental results demonstrate that our approach is comparable to previous approaches in prediction accuracy for performance and energy. Meanwhile, the training time of our approach achieves a 7.6-11.8× speedup over previous approaches for each workload. Moreover, the training cost of our approach can be further reduced via an instrumentation technique. © 2012 Elsevier B.V. All rights reserved.
