Berkeley Wireless Research Center

Federal Way, CA, United States

News Article | December 24, 2015

Engineers demo first processor that uses light for ultrafast communications

Engineers have successfully married electrons and photons within a single-chip microprocessor, a landmark development that opens the door to ultrafast, low-power data crunching.

The researchers packed two processor cores with more than 70 million transistors and 850 photonic components onto a 3-by-6-millimeter chip. They fabricated the microprocessor in a foundry that mass-produces high-performance computer chips, proving that their design can be easily and quickly scaled up for commercial production.

The new chip, described in a paper to be published Dec. 24 in the print issue of the journal Nature, marks the next step in the evolution of fiber optic communication technology by integrating into a microprocessor the photonic interconnects, or inputs and outputs (I/O), needed to talk to other chips.

"This is a milestone. It's the first processor that can use light to communicate with the external world," said Vladimir Stojanović, an associate professor of electrical engineering and computer sciences at the University of California, Berkeley, who led the development of the chip. "No other processor has the photonic I/O in the chip."

Stojanović and fellow UC Berkeley professor Krste Asanović teamed up with Rajeev Ram at the Massachusetts Institute of Technology and Milos Popović at the University of Colorado, Boulder, to develop the new microprocessor.

"This is the first time we've put a system together at such scale, and have it actually do something useful, like run a program," said Asanović, who helped develop the free and open architecture called RISC-V (reduced instruction set computer), used by the processor.

Greater bandwidth with less power

Compared with electrical wires, fiber optics support greater bandwidth, carrying more data at higher speeds over greater distances with less energy. While advances in optical communication technology have dramatically improved data transfers between computers, bringing photonics into the computer chips themselves had been difficult.

That's because no one until now had figured out how to integrate photonic devices into the same complex and expensive fabrication processes used to produce computer chips without changing the process itself. Doing so is key since it does not further increase the cost of the manufacturing or risk failure of the fabricated transistors.

The researchers verified the functionality of the chip with the photonic interconnects by using it to run various computer programs, requiring it to send and receive instructions and data to and from memory. They showed that the chip had a bandwidth density of 300 gigabits per second per square millimeter, about 10 to 50 times greater than packaged electrical-only microprocessors currently on the market.

The photonic I/O on the chip is also energy-efficient, using only 1.3 picojoules per bit, equivalent to consuming 1.3 watts of power to transmit a terabit of data per second. In the experiments, the data was sent to a receiver 10 meters away and back.

"The advantage with optical is that with the same amount of power, you can go a few centimeters, a few meters or a few kilometers," said study co-lead author Chen Sun, a recent UC Berkeley Ph.D. graduate from Stojanović's lab at the Berkeley Wireless Research Center. "For high-speed electrical links, 1 meter is about the limit before you need repeaters to regenerate the electrical signal, and that quickly increases the amount of power needed. For an electrical signal to travel 1 kilometer, you'd need thousands of picojoules for each bit."

The achievement opens the door to a new era of bandwidth-hungry applications. One near-term application for this technology is to make data centers more green. According to the Natural Resources Defense Council, data centers consumed about 91 billion kilowatt-hours of electricity in 2013, about 2 percent of the total electricity consumed in the United States, and the appetite for power is growing exponentially.

This research has already spun off two startups this year with applications in data centers in mind. SiFive is commercializing the RISC-V processors, while Ayar Labs is focusing on photonic interconnects. Earlier this year, Ayar Labs - under its previous company name of OptiBit - was awarded the MIT Clean Energy Prize. Ayar Labs is getting further traction through the CITRIS Foundry startup incubator at UC Berkeley.

The advance is timely, coming as world leaders emerge from the COP21 United Nations climate talks with new pledges to limit global warming.

Further down the road, this research could be used in applications such as LIDAR, the light radar technology used to guide self-driving vehicles and the eyes of a robot; brain ultrasound imaging; and new environmental biosensors.

'Fiat lux' on a chip

The researchers came up with a number of key innovations to harness the power of light within the chip.

Each of the key photonic I/O components - such as a ring modulator, photodetector and a vertical grating coupler - serves to control and guide the light waves on the chip, but the design had to conform to the constraints of a process originally thought to be hostile to photonic components.

To enable light to move through the chip with minimal loss, for instance, the researchers used the silicon body of the transistor as a waveguide for the light. They did this by using available masks in the fabrication process to manipulate doping, the process used to form different parts of transistors.

After getting the light onto the chip, the researchers needed to find a way to control it so that it can carry bits of data. They designed a silicon ring with p-n doped junction spokes next to the silicon waveguide to enable fast and low-energy modulation of light.

Using the silicon-germanium parts of a modern transistor - an existing part of the semiconductor manufacturing process - to build a photodetector took advantage of germanium's ability to absorb light and convert it into electricity.

A vertical grating coupler that leverages existing poly-silicon and silicon layers in innovative ways was used to connect the chip to the external world, directing the light in the waveguide up and off the chip. The researchers integrated electronic components tightly with these photonic devices to enable stable operation in a hostile chip environment.

The authors emphasized that these adaptations all worked within the parameters of existing microprocessor manufacturing systems, and that it will not be difficult to optimize the components to further improve their chip's performance.

Other co-lead authors on this paper are Mark Wade, Ph.D. student at the University of Colorado, Boulder; Yunsup Lee, a Ph.D. candidate at UC Berkeley; and Jason Orcutt, an MIT graduate who now works at the IBM Research Center in New York.

The Defense Advanced Research Projects Agency (DARPA) helped support this work.
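
As a rough check on the energy figure quoted in the article, the sketch below converts energy per bit into link power at a given data rate. The 1.3 pJ/bit and 1 Tb/s values come from the text. The function name and the 2,000 pJ/bit electrical figure are illustrative assumptions only; the article says no more than "thousands of picojoules" for a 1-kilometer electrical link.

```python
def link_power_watts(energy_per_bit_pj: float, data_rate_tbps: float) -> float:
    """Link power in watts: energy per bit (J) times bits per second."""
    joules_per_bit = energy_per_bit_pj * 1e-12
    bits_per_second = data_rate_tbps * 1e12
    return joules_per_bit * bits_per_second


# Photonic I/O figure from the article: 1.3 pJ/bit at 1 Tb/s.
optical_w = link_power_watts(1.3, 1.0)

# Hypothetical electrical link at 2,000 pJ/bit (assumed value, not from the article).
electrical_w = link_power_watts(2000.0, 1.0)

print(f"optical: {optical_w:.2f} W, electrical (assumed): {electrical_w:.0f} W")
```

Because a picojoule per bit and a terabit per second cancel exactly, the energy-per-bit figure in pJ reads directly as watts at 1 Tb/s, which is why the article can state the two numbers as equivalent.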

Lin M.,University of Central Florida | Chen S.,Berkeley Wireless Research Center | Demara R.F.,University of Central Florida | Wawrzynek J.,Berkeley Wireless Research Center | Wawrzynek J.,University of California at Berkeley
Microprocessors and Microsystems | Year: 2015

Emerging integrated CPU + FPGA hybrid platforms, such as the Extensible Processing Platform architecture from Xilinx [1], offer an unprecedented opportunity to achieve both multifunctionality and real-time responsiveness for memory-intensive embedded applications. However, how to cost-effectively synthesize application-specific hardware constructs that fully exploit memory-level parallelism remains a key challenge. To address this problem, we propose a new FPGA-based embedded computer architecture, ASTRO (Application-Specific Hardware Traces with Reconfigurable Optimization). Our main contribution is the development of an integrated methodology that focuses on how to construct an application-specific memory access network capable of extracting the maximum amount of memory-level parallelism on a per-application basis. In particular, our proposed ASTRO architecture can (1) perform dynamic memory analysis to maximally extract the target application's instruction, loop and memory-level parallelism for performance enhancement, (2) synthesize highly efficient accelerators that enable parallelized memory accesses, and therefore (3) accomplish effective data orchestration by utilizing the capabilities of modern FPGA devices: abundant distributed block RAMs and reprogrammability. To empirically validate our ASTRO methodology, we have implemented a baseline embedded processor platform, a conventional CPU + accelerator with a centralized single memory, and a prototype ASTRO machine based on Xilinx MicroBlaze technology. Our experimental results show that, on average for 10 benchmark applications from SPEC2006 and MiBench [2], the ASTRO machine achieves a speedup of 8.6 times over the baseline embedded processor platform and 1.7 times over a conventional CPU + accelerator platform. More interestingly, the ASTRO platform achieves a reduction of more than 40% in energy-delay product compared to a conventional CPU + accelerator with a centralized memory.
© 2015 Elsevier B.V. All rights reserved.
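
The two headline numbers in this abstract can be related through the definition of energy-delay product (EDP = energy × delay). The sketch below is an illustration only: it treats the reported averages (1.7 times speedup, 40% EDP reduction, both against the conventional CPU + accelerator) as if they applied to a single workload, which the abstract does not claim, and the function name is hypothetical.

```python
def implied_energy_ratio(speedup: float, edp_reduction: float) -> float:
    """Energy of the new design relative to the reference design.

    EDP_new / EDP_ref = (E_new / E_ref) * (T_new / T_ref), so
    E_new / E_ref = (1 - edp_reduction) / (1 / speedup).
    """
    delay_ratio = 1.0 / speedup        # T_new / T_ref
    edp_ratio = 1.0 - edp_reduction    # EDP_new / EDP_ref
    return edp_ratio / delay_ratio


# Figures from the abstract: 1.7x speedup and a 40% EDP reduction.
ratio = implied_energy_ratio(1.7, 0.40)
print(f"implied energy ratio: {ratio:.2f}")
```

Under that simplifying assumption the ratio comes out close to 1, suggesting that the EDP gain at the 40% level is explained largely by the shorter run time. Since the abstract reports "more than 40%", the actual energy ratio could be lower.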

Alioto M.,University of Siena | Alioto M.,Berkeley Wireless Research Center
ISCAS 2010 - 2010 IEEE International Symposium on Circuits and Systems: Nano-Bio Circuit Fabrics and Systems | Year: 2010

In this paper, the layout density of three-terminal FinFET logic circuits is extensively analyzed. As opposed to previous works, which focus either on single devices or simplistic circuits, this analysis explicitly includes the geometric constraints that are imposed by the standard cell approach. The impact of the fin technology is analyzed by comparing the lithography- and spacer-defined approaches, as well as evaluating the dependence of layout density on the fin height. Results show that FinFET standard cells have a layout density that is better than bulk cells even for moderately tall fins. The fin height is also shown to be a powerful knob to improve the layout density in FinFET cells. Analysis also shows that the usually claimed 2X density improvement of the spacer-defined technology over the lithography-defined technology is dramatically reduced in real standard cells, and can be negligible for tall fins. All results are justified through considerations at the physical level of abstraction. Various versions of a 32-nm 44-gate library are laid out to carry out the analysis. ©2010 IEEE.

Alioto M.,University of Siena | Alioto M.,Berkeley Wireless Research Center
ISCAS 2010 - 2010 IEEE International Symposium on Circuits and Systems: Nano-Bio Circuit Fabrics and Systems | Year: 2010

In this paper, subthreshold static CMOS logic is analyzed in terms of DC noise immunity in a closed form for the first time. Simplified circuit models of MOS transistors in subthreshold are developed to gain a deeper understanding of the degradation in the DC characteristics under ultra-low voltages, as well as its dependence on design and process parameters. The noise margin is explicitly evaluated and modeled with a simple expression. The impact of PMOS/NMOS imbalance is also explicitly analyzed. Results are validated with simulations in a 65-nm CMOS technology. ©2010 IEEE.

Alioto M.,University of Siena | Alioto M.,Berkeley Wireless Research Center
2011 20th European Conference on Circuit Theory and Design, ECCTD 2011 | Year: 2011

In this paper, the impact of the NMOS/PMOS imbalance on Ultra-Low Voltage (ULV) circuits and their design is discussed within a unitary framework for the first time. Variations are shown to dramatically affect imbalance due to the long-tailed probability density and high variability. The impact of the imbalance on the minimum supply voltage VDD,min ensuring correct gate switching is studied analytically. The results theoretically justify the experimental results in [1], which agree very well with the predictions. The impact of the imbalance on the leakage energy in VLSI systems is also analyzed through a simple but representative example. An analytical model is presented to predict such leakage energy increase due to imbalance. Extensive results in 65-nm CMOS are shown to agree with the design considerations and quantitative models presented. © 2011 IEEE.

Kuo N.-C.,Berkeley Wireless Research Center | Yang B.,Berkeley Wireless Research Center | Wu C.,Berkeley Wireless Research Center | Kong L.,Berkeley Wireless Research Center | And 5 more authors.
Proceedings - 2014 IEEE Asian Solid-State Circuits Conference, A-SSCC 2014 | Year: 2015

This paper demonstrates a CMOS digital polar transmitter with flip-chip interconnection to low-temperature co-fired ceramic (LTCC) interposers. The LTCC interposers contain the PA output balun targeting different operating frequency bands, and the reconfiguration in the carrier frequency is achieved by selecting an appropriate LTCC interposer. The same CMOS core transmitter is reused for different frequency bands. In this design, an output power higher than 22 dBm from 0.6 to 2.4 GHz is demonstrated, with peak power of 27.1 dBm and peak efficiency of 52%. The polar transmitter includes 9-bit phase interpolation and 8-bit amplitude modulation, suitable and verified as a multi-standard universal digital modulator. © 2014 IEEE.
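
For readers less used to logarithmic power units, the sketch below converts the dBm figures in this abstract to watts using the standard definition P(W) = 10^(dBm/10) / 1000. The function name is an illustrative choice, not something from the paper.

```python
def dbm_to_watts(power_dbm: float) -> float:
    """Convert power in dBm (decibels relative to 1 mW) to watts."""
    return 10 ** (power_dbm / 10.0) / 1000.0


# Figures from the abstract.
floor_w = dbm_to_watts(22.0)   # minimum output power across 0.6 to 2.4 GHz
peak_w = dbm_to_watts(27.1)    # peak output power

print(f"22 dBm   = {floor_w * 1000:.0f} mW")
print(f"27.1 dBm = {peak_w * 1000:.0f} mW")
```

So the transmitter delivers a little over 150 mW across the band and roughly half a watt at its peak.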

Niknejad A.M.,Berkeley Wireless Research Center
IEEE Microwave Magazine | Year: 2010

Silicon-based 60 GHz is a promising technology for high-data-rate communication. The research team at IBM demonstrated full transceiver front-ends in a SiGe BiCMOS. A dual-conversion superheterodyne radio architecture was selected over a homodyne approach due to its lower carrier feed-through in the transmitter and better I/Q quadrature accuracy. The low-noise amplifier (LNA) is at the lower left, and the spiral inductors in the receiver mixer and intermediate frequency (IF) variable gain amplifier (IF VGA) are visible to the right of the LNA. The frequency tripler is in the center, and the phase locked loop (PLL) occupies the right third of the chip. On-wafer measurements were made on the full receiver, including the PLL. The PLL occupies the right third of the chip, and the baseband-to-IF mixer contains the two spiral inductors at the top center. The PA is operated from a 1-V supply to improve reliability and has a simulated small signal gain of 14 dB at 60 GHz. CW measurements verify that the PA can deliver 11 dBm of saturated output power with a peak PAE of 14.6%.

Chien J.-C.,Berkeley Wireless Research Center | Kuo N.-C.,Berkeley Wireless Research Center | Niknejad A.M.,Berkeley Wireless Research Center
Digest of Papers - IEEE Radio Frequency Integrated Circuits Symposium | Year: 2014

This paper presents a new passive coupling technique for quadrature voltage-controlled oscillators (QVCOs) with low phase error. The significance of the coupling devices' nonlinearity is emphasized. Based on the phase error analysis, modified bi-directional coupling diodes are proposed with higher third-order nonlinearity and lower capacitive loading. The proposed prototype and a standard 26-GHz QVCO are fabricated in 65-nm CMOS for comparison. Measurements confirm that the phase error is reduced from 1.5° to 0.36°. Phase error immunity against LC-tank mismatch is significantly improved. © 2014 IEEE.
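
One common reason quadrature phase error matters, although the abstract itself does not discuss it, is image rejection in a quadrature receiver. The sketch below uses the textbook expression for the image rejection ratio with phase error φ and no amplitude mismatch, IRR = cot²(φ/2), to give a feel for the two measured phase errors. Treating amplitude mismatch as zero is an assumption made here for illustration, and the function name is hypothetical.

```python
import math


def image_rejection_db(phase_error_deg: float) -> float:
    """Image rejection ratio in dB for a pure quadrature phase error.

    Assumes perfectly matched I/Q amplitudes, so IRR = cot^2(phi / 2).
    """
    half_angle = math.radians(phase_error_deg) / 2.0
    irr_linear = 1.0 / math.tan(half_angle) ** 2
    return 10.0 * math.log10(irr_linear)


# Phase errors reported in the abstract.
for phi in (1.5, 0.36):
    print(f"phase error {phi} deg -> IRR about {image_rejection_db(phi):.1f} dB")
```

Under this idealized model, cutting the phase error from 1.5° to 0.36° raises the phase-limited image rejection from roughly 38 dB to roughly 50 dB.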

Gambini S.,Berkeley Wireless Research Center | Crossley J.,University of California at Berkeley | Alon E.,University of California at Berkeley | Rabaey J.M.,University of California at Berkeley
IEEE Journal of Solid-State Circuits | Year: 2012

We present an ultra-wideband transceiver designed for ultra-low-power communication at sub-10 cm range. The transceiver operates at a 5.6 GHz carrier frequency, chosen to minimize path loss when using a 1 cm antenna, and can switch its architecture between self-synchronous rectification and low-IF to adapt its power consumption to the channel characteristics in real time. A low-power digital circuit exploits redundancy in the modulation scheme to provide a real-time BER estimate used to close the mode-switching loop. Implemented in 65 nm CMOS, the transceiver consumes 25 μW when transmitting and 245 μW when receiving in low-power mode, plus 45 μW in the clock generator, and only requires an external antenna. Dual-mode operation allows range extension and mitigates interference. © 2012 IEEE.

Wu C.,Berkeley Wireless Research Center | Alon E.,Berkeley Wireless Research Center | Nikolic B.,Berkeley Wireless Research Center
IEEE Journal of Solid-State Circuits | Year: 2014

A wide-tuning-range low-power sigma-delta-based direct-RF-to-digital receiver architecture is implemented in 65 nm CMOS. A flat signal transfer function is chosen to support wide-frequency-range radios. A multilevel (two-bit) nonreturn-to-zero DAC improves jitter immunity to enable a high dynamic range, and, with a class-AB low-noise transconductance amplifier, guarantees a highly linear front end. For a 4 MHz signal, the peak SNDR of the receiver exceeds 68 dB and is better than 60 dB across the 400 MHz to 4 GHz carrier frequency range. By virtue of utilizing a negative feedback digitizer close to the antenna, an IIP3 of +10 dBm is achieved while dissipating only 40 mW from 1.1 V/1.5 V supply voltages. © 2014 IEEE.
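
The SNDR figures in this abstract can be translated into an effective number of bits (ENOB) with the standard relation ENOB = (SNDR − 1.76 dB) / 6.02 dB, which assumes a full-scale sine-wave input. The sketch below applies it to the two values reported; the function name is an illustrative choice.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits for a given SNDR, full-scale sine input assumed."""
    return (sndr_db - 1.76) / 6.02


# SNDR values from the abstract: 68 dB peak, 60 dB across the tuning range.
for sndr in (68.0, 60.0):
    print(f"SNDR {sndr:.0f} dB -> about {enob_from_sndr(sndr):.1f} bits")
```

By this measure the receiver behaves like a digitizer with about 11 effective bits at its best operating point and a little under 10 bits across the 400 MHz to 4 GHz range.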
