Grenoble, France

Hadj Salem K.,LCIS Laboratory | Kieffer Y.,LCIS Laboratory | Mancini S.,TIMA Laboratory
Conference on Design and Architectures for Signal and Image Processing, DASIP | Year: 2017

Embedded vision systems design faces a memory-wall kind of challenge: images are big, and therefore the memories containing them have high latency, yet high performance is still desired. For the case of non-linear processing, Mancini and Rousseau (Proc. DATE 2012) have designed a software generator of ad hoc memory hierarchies, called Memory Management Optimization (MMOpt). While the performance of the generated circuits is very good, design-time decisions have to be made regarding their operation in order to finely balance the usual metrics of design area, energy consumption, and performance. This study tackles the optimization challenge set by the design of the operational behavior of the memory hierarchy generated by MMOpt. After a precise formulation as a 3-objective optimization problem is given, two algorithms are proposed, and their performance is analyzed on real-world processing cases against the previously proposed algorithms. The results show a reduction of the amount of transferred data by 17% on average, and of the computing times by 11.7%, for the same design area. © 2016 ECSI.


Foroutan S.,TIMA Laboratory | Thonnart Y.,CEA Grenoble | Petrot F.,TIMA Laboratory
IEEE Transactions on Computers | Year: 2013

The trend toward integrated many-core architectures makes network-on-chip (NoC) technology the on-chip communication infrastructure of choice. However, as opposed to a simple bus, and due to its distributed and complex nature in terms of topology, wire size, routing algorithm, and so on, the timing behavior and thus the performance of the infrastructure is difficult to predict. Therefore, one of the important phases in the NoC design flow is performance evaluation, which extracts performance metrics to verify whether a specific instance from the NoC design space satisfies the requirements of the entire system. In this sense, reducing the time to obtain the NoC performance, and consequently speeding up the design space exploration, is one of the keys that can considerably reduce the design-flow time and cost. In an effort in this direction, we propose in this paper a novel analytical performance evaluation method that can be used in the earliest stages of the design flow, before using time-consuming simulations. The analytical method is used to evaluate the performance of a general-purpose NoC, and we show that it can predict the router latency, end-to-end per-flow latency, and network saturation point with an accuracy comparable to a cycle-accurate simulation. To systematically analyze the accuracy of our method compared to the corresponding simulation model, we also present an innovative accuracy analysis method. © 2013 IEEE.


Hamdioui S.,Technical University of Delft | Gizopoulos D.,National and Kapodistrian University of Athens | Guido G.,IMEC | Guido G.,Catholic University of Leuven | And 3 more authors.
Proceedings - Design, Automation and Test in Europe, DATE | Year: 2013

Forthcoming technology nodes are posing major challenges to the manufacturing of reliable (real-time) systems: process variations, accelerated degradation aging, as well as external and internal noise are key examples. This paper focuses on real-time systems reliability and analyzes the state of the art and the emerging reliability bottlenecks from three different perspectives: technology, circuit/IP and full system. © 2013 EDAA.


Cherkaoui A.,Hubert Curien Laboratory | Fischer V.,Hubert Curien Laboratory | Aubert A.,Hubert Curien Laboratory | Fesquet L.,TIMA Laboratory
Proceedings - International Symposium on Asynchronous Circuits and Systems | Year: 2013

Self-timed rings are oscillators in which several events can evolve evenly spaced in time thanks to analog effects inherent to the ring stage structure. One of their interesting features is that they provide precise high-speed multiphase signals. This paper presents a true random number generator that exploits the jitter of events propagating in a self-timed ring with a high entropy. Designs implemented in Altera Cyclone III and Xilinx Virtex 5 devices provide high-quality random bit sequences passing FIPS 140-1 and NIST SP 800-22 statistical tests at a high bit rate. © 2013 IEEE.


Pasca V.,TIMA Laboratory | Anghel L.,TIMA Laboratory | Rusu C.,TIMA Laboratory | Benabdenbi M.,TIMA Laboratory
Proceedings of the 2010 IEEE 16th International On-Line Testing Symposium, IOLTS 2010 | Year: 2010

Three-dimensional (3D) Thru-Silicon-Via (TSV) integration is emerging as a key enabling technology for future high performance systems. The TSV manufacturing defect rates lead to significant interconnect yield loss. For intra-die and inter-die interconnects, techniques such as via widening, via spreading and spare via insertion have been successfully used to improve the yield. However, for high fault rates these solutions are less effective and lead to unacceptable overheads. In this paper, configurable serial fault tolerant links are proposed for inter-die communication in 3D integrated systems. For high TSV fault rates, serial data transmission and signal remapping on fault-free wires are jointly used to ensure correct data transmission. After the interconnect tests, if faulty wires are detected, the link serializes data transmission such that only fault-free wires are used. In the proposed link, any subset of data bits can be mapped on any subset of functional wires. Selecting a threshold serialization rate above which the link fails enables optimal link designs that target interconnect technologies with high fault rates. The impact of inter-die configurable serial fault tolerant links on the performance and area overheads of 3D mesh networks-on-chip (3D NoC) is analyzed. The results show that for an 80% interconnect fault rate the latency degradation is up to 14% and area overheads go up to 30%. © 2010 IEEE.
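
As a rough illustration of the serialization idea described in this abstract, the sketch below computes how many transfer cycles a link would need when data bits are remapped onto fault-free wires only. The function names, the ceiling-division model of the serialization rate, and the `max_rate` threshold parameter are assumptions made for illustration; they are not taken from the paper.

```python
import math


def serialization_rate(data_bits: int, functional_wires: int) -> int:
    """Cycles needed to send one word over the fault-free wires (assumed model)."""
    if functional_wires <= 0:
        raise ValueError("no functional wire left: the link cannot transmit")
    return math.ceil(data_bits / functional_wires)


def link_is_usable(data_bits: int, wire_ok: list, max_rate: int) -> bool:
    """Hypothetical check: the link is declared failed above the threshold rate."""
    functional = sum(1 for ok in wire_ok if ok)
    if functional == 0:
        return False
    return serialization_rate(data_bits, functional) <= max_rate


def remap(data_bits: int, wire_ok: list) -> list:
    """Build a per-cycle schedule of (bit, wire) pairs using fault-free wires only."""
    good = [i for i, ok in enumerate(wire_ok) if ok]
    if not good:
        raise ValueError("no functional wire left: the link cannot transmit")
    schedule = []
    for start in range(0, data_bits, len(good)):
        bits = range(start, min(start + len(good), data_bits))
        # zip stops at the shorter sequence, so the last cycle may be partial
        schedule.append(list(zip(bits, good)))
    return schedule
```

With this model, a 4-bit word sent over three wires of which the middle one is faulty takes two cycles, and each cycle uses wires 0 and 2 only.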


Cherkaoui A.,Hubert Curien Laboratory | Fischer V.,Hubert Curien Laboratory | Aubert A.,Hubert Curien Laboratory | Fesquet L.,TIMA Laboratory
Proceedings - Design, Automation and Test in Europe, DATE | Year: 2012

Many True Random Number Generators (TRNGs) use jittery clocks generated in ring oscillators as a source of entropy. This is especially the case in Field Programmable Gate Arrays (FPGAs), where sources of randomness are very limited. Inverter Ring Oscillators (IROs) are relatively well characterized as entropy sources. However, it is known that they are very sensitive to working conditions. This fact makes them vulnerable to attacks. On the other hand, Self-Timed Rings (STRs) are currently considered a promising solution for generating robust clock signals. Although many studies deal with their temporal behavior and robustness in Application Specific Integrated Circuits (ASICs), no equivalent study exists for FPGAs. Furthermore, these oscillators have not been analyzed and characterized as entropy sources aimed at TRNG design. In this paper, we analyze STRs as entropy sources for TRNGs implemented in FPGAs. Next, we compare STRs and IROs when serving as sources of randomness. We show that STRs represent a very interesting alternative to IROs: they are more robust to environmental fluctuations and they exhibit lower extra-device frequency variations. © 2012 EDAA.


Papavramidou P.,TIMA Laboratory | Nicolaidis M.,TIMA Laboratory
IEEE Transactions on Computers | Year: 2016

In modern SoCs, embedded memories should be protected by ECC against field failures to achieve acceptable reliability. They should also be repaired after fabrication to achieve acceptable fabrication yield. In technologies affected by high defect densities, conventional repair induces very high costs. To reduce them, we can use ECC-based repair, which consists in using the ECC to fix words comprising a single faulty cell and self-repair to fix all other faulty words. However, as we show in this paper, for high defect densities the diagnosis circuitry required for ECC-based repair may induce a very large hardware cost. To fix this issue, we introduce a new family of memory test algorithms that exhibit a property we term "single-read double-fault detection". This approach gains interest in ultimate CMOS and post-CMOS technologies, where the defect densities are expected to increase significantly, and/or in very-low-power design, as very low voltage sharply increases defect densities. © 2015 IEEE.
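
The split that ECC-based repair makes between the two mechanisms can be sketched as follows, assuming a single-error-correcting code per word. The function name `plan_repair` and the fault-list representation are hypothetical and chosen only for illustration; the sketch does not model the paper's test algorithms or diagnosis circuitry.

```python
from collections import Counter


def plan_repair(faulty_cells):
    """Split faulty words between ECC and self-repair (illustrative only).

    faulty_cells: iterable of (word_address, bit_index) pairs, as a
    hypothetical diagnosis step might report them.
    Returns (ecc_words, self_repair_words), both sorted lists of addresses.
    """
    # Deduplicate so a cell reported twice is not counted as two faults.
    per_word = Counter(addr for addr, _bit in set(faulty_cells))
    # Exactly one faulty cell: left to the ECC (assumed single-error-correcting).
    ecc_words = sorted(a for a, n in per_word.items() if n == 1)
    # Two or more faulty cells: handed to self-repair.
    self_repair_words = sorted(a for a, n in per_word.items() if n >= 2)
    return ecc_words, self_repair_words
```

For example, with faults at `(0, 3)`, `(5, 1)`, `(5, 7)` and `(9, 0)`, words 0 and 9 would be left to the ECC and word 5 would go to self-repair.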


Shen H.,Eve Company | Hamayun M.-M.,TIMA Laboratory | Petrot F.,TIMA Laboratory
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | Year: 2012

Integration of multiple heterogeneous processors into a single system-on-a-chip is a clear trend in embedded devices. Designing and verifying these devices requires high-speed and easy-to-build simulation platforms. Among the software simulation approaches, native simulation is a good candidate since the embedded software is executed natively on the host machine, and no instruction set simulator development effort is necessary. However, in existing native simulation approaches the simulated software shares the memory space of the modeled hardware modules and the host operating system, which makes it impractical to support legacy code running on the target platform. To overcome this issue, which is seldom mentioned in the literature, we propose the addition of a transparent address space translation layer to separate the target address space from that of the host simulator. For this, we exploit the hardware-assisted virtualization technology now available on most general-purpose processors. Experiments show that this solution does not degrade the native simulation speed, while keeping the ability to accomplish software performance evaluation. © 2012 IEEE.


Guironnet De Massas P.,TIMA Laboratory | Petrot F.,TIMA Laboratory
Design Automation for Embedded Systems | Year: 2010

This paper presents a novel simulation-based approach which targets the performance estimation of cache coherence protocol implementations. Our approach models a cache coherence protocol in which coherence transactions take zero cycles and do not generate communication accesses, in the hope that it will provide a close lower bound on latency and traffic. The protocol modeling approach relies on cycle-accurate simulation models in which components can instantaneously and transparently access the internal states of other components. Using this strategy, the access time and the traffic due to cache misses are taken into account as they would be on a multiprocessor system without cache coherence. However, the proposed approach still ensures that processors receive coherent data. We detail the implementation of this approach in a cycle-accurate multiprocessor simulation environment. To show its effectiveness, we implement cache and memory models for two coherence protocols, both with and without our omniscient cache coherence (OCC) proposal. We show with a formal method that this approach makes it possible to preserve the consistency models implied by the cache coherence protocols, and experimentally that the OCC strategy gives a close lower bound on latency and traffic. © 2010 Springer Science+Business Media, LLC.


Foroutan S.,TIMA Laboratory | Sheibanyrad A.,TIMA Laboratory | Petrot F.,TIMA Laboratory
Proceedings - Design Automation Conference | Year: 2012

This paper addresses link-buffer capacity allocation in the design process of best-effort 3DNoCs holding hotspot memory ports. We show that in 3DSoCs with integrated wide I/O DRAMs, congestion spreads differently than in SoCs with external DRAMs: the bottlenecks are no longer the external memory ports but the network links, which become saturated and propagate the congestion backward. The distribution of bottleneck links is directly affected by the traffic directed to the hot memory ports. Using an analytical performance evaluation method, we determine network link buffer capacities according to the given workload, composed of regular and hotspot traffic. © 2012 ACM.
