Nvidia Corporation is an American worldwide technology company based in Santa Clara, California. Nvidia manufactures graphics processing units , as well as system-on-a-chip units for the mobile computing market. Nvidia's primary GPU product line, labeled "GeForce", is in direct competition with AMD's "Radeon" products. Nvidia also joined the gaming industry with its handheld Shield Portable and Shield Tablet, as well as the tablet market with the Tegra Note 7.In addition to GPU manufacturing, Nvidia provides parallel processing capabilities to researchers and scientists that allow them to efficiently run high-performance applications. They are deployed in supercomputing sites around the world. More recently, Nvidia has moved into the mobile computing market, where it produces Tegra mobile processors for smartphones and tablets, as well as vehicle navigation and entertainment systems. In addition to Advanced Micro Devices, its competitors include Intel and Qualcomm. Wikipedia.
Nvidia | Date: 2017-03-27
Embodiments related to managing lazy runahead operations at a microprocessor are disclosed. For example, an embodiment of a method for operating a microprocessor described herein includes identifying a primary condition that triggers an unresolved state of the microprocessor. The example method also includes identifying a forcing condition that compels resolution of the unresolved state. The example method also includes, in response to identification of the forcing condition, causing the microprocessor to enter a runahead mode.
Nvidia | Date: 2017-02-10
This description is directed to a dynamic random access memory (DRAM) array having a plurality of rows and a plurality of columns. The array further includes a plurality of cells, each of which are associated with one of the columns and one of the rows. Each cell includes a capacitor that is selectively coupled to a bit line of its associate column so as to share charge with the bit line when the cell is selected. There is a segmented word line circuit for each row, which is controllable to cause selection of only a portion of the cells in the row.
Nvidia | Date: 2017-01-18
A method, computer readable medium, and system are disclosed for performing tree traversal with backtracking in constant time. The method includes the steps of traversing a tree, maintaining a bit trail variable and a current key variable during the traversing, where the bit trail variable includes a first plurality of bits indicating tree levels on which a node has been postponed along a path from the root of the tree during the traversing, and the current key variable includes a second plurality of bits indicating a number of a current node within the tree, and performing backtracking within the tree during the traversing, utilizing the bit trail variable and the current key variable.
Nvidia | Date: 2017-01-09
A method, computer readable medium, and system are disclosed for detecting and classifying hand gestures. The method includes the steps of receiving an unsegmented stream of data associated with a hand gesture, extracting spatio-temporal features from the unsegmented stream by a three-dimensional convolutional neural network (3DCNN), and producing a class label for the hand gesture based on the spatio-temporal features.
Nvidia | Date: 2017-04-03
In one embodiment, a system comprises: a global clock input for receiving a global clock, a plurality of partitions; and a skew tolerant interface configured to compensate for clock skew differences between a global clock from outside at least one of the partitions and a balanced local clock within at least one of the partitions. The partitions can be test partitions. The skew tolerant interface can cross a mesochronous boundary. In one exemplary implementation, the skew tolerant interface includes a deskew ring buffer on communication path of the at least one partition. pointers associated with the ring buffer can be free-running and depend only on clocks being pulsed when out of reset. The scheme can be fully synchronous and deterministic. The scheme can be modeled for the ATPG tools using simple pipeline flops. The depth of the pipeline can be dependent on the pointer difference for the read/write interface. The global clock input can be part of a scan link.
Nvidia | Date: 2017-02-08
A method for displaying a near-eye light field display (NELD) image is disclosed. The method comprises determining a pre-filtered image to be displayed, wherein the pre-filtered image corresponds to a target image. It further comprises displaying the pre-filtered image on a display. Subsequently, it comprises producing a near-eye light field after the pre-filtered image travels through a microlens array adjacent to the display, wherein the near-eye light field is operable to simulate a light field corresponding to the target image. Finally, it comprises altering the near-eye light field using at least one converging lens, wherein the altering allows a user to focus on the target image at an increased depth of field at an increased distance from an eye of the user and wherein the altering increases spatial resolution of said target image.
Nvidia | Date: 2017-01-03
A system and method uses the capabilities of a geometry shader unit within the multi-threaded graphics processor to implement algorithms with variable input and output.
Nvidia | Date: 2017-02-27
The description covers a system and method for operating a micro-processing system having a runahead mode of operation. In one implementation, the method includes providing, for a first portion of code, a runahead correlate. When the first portion of code is encountered by the micro-processing system, a determination is made as to whether the system is operating in the runahead mode. If so, the system branches to the runahead correlate, which is specifically configured to identify and resolve latency events likely to occur when the first portion of code is encountered outside of runahead. Branching out of the first portion of code may also be performed based on a determination that a register is poisoned.
Nvidia | Date: 2017-02-20
One embodiment of the present invention is a parallel processing unit (PPU) that includes one or more streaming multiprocessors (SMs) and implements a replay unit per SM. Upon detecting a page fault associated with a memory transaction issued by a particular SM, the corresponding replay unit causes the SM, but not any unaffected SMs, to cease issuing new memory transactions. The replay unit then stores the faulting memory transaction and any faulting in-flight memory transaction in a replay buffer. As page faults are resolved, the replay unit replays the memory transactions in the replay bufferremoving successful memory transactions from the replay bufferuntil all of the stored memory transactions have successfully executed. Advantageously, the overall performance of the PPU is improved compared to conventional PPUs that, upon detecting a page fault, stop performing memory transactions across all SMs included in the PPU until the fault is resolved.
Nvidia | Date: 2017-01-20
One embodiment of the present invention includes a parallel processing unit (PPU) that performs pixel shading at variable granularities. For effects that vary at a low frequency across a pixel block, a coarse shading unit performs the associated shading operations on a subset of the pixels in the pixel block. By contrast, for effects that vary at a high frequency across the pixel block, fine shading units perform the associated shading operations on each pixel in the pixel block. Because the PPU implements coarse shading units and fine shading units, the PPU may tune the shading rate per-effect based on the frequency of variation across each pixel group. By contrast, conventional PPUs typically compute all effects per-pixel, performing redundant shading operations for low frequency effects. Consequently, to produce similar image quality, the PPU consumes less power and increases the rendering frame rate compared to a conventional PPU.