Nvidia launched its latest line of Tesla GPU compute engines at the company’s
GPU Technology Conference in San Jose today. One model shipping immediately is
based on the existing GK104 chip used in the recently released GTX 680. Dubbed
the Tesla K10, the board delivers as much as 4.6 teraflops of single-precision
floating point performance, roughly three times that of the older, Fermi-based
Tesla. The card also offers an aggregate memory bandwidth of 320GB per second.
The board is aimed at oil exploration, signal processing and seismic processing
applications.
The more intriguing announcement is the Tesla K20. Built on a monster chip with
7.1 billion transistors, the K20 isn’t slated for release until Q4. Nvidia’s
CEO, Jen-Hsun Huang, described the K20 as the largest, most complex
semiconductor chip ever built. It will likely use the same 28nm manufacturing
process used for the GTX 680. The K20 is designed for computationally intensive
HPC environments, particularly finite element analysis (FEA), finance and
physics applications. It offers three times the double-precision floating point
performance of previous-generation Tesla products. In addition to the huge
transistor count, the K20 will sport a 384-bit memory interface.
New Features
In addition to improved compute performance, the K20 will support several key
features designed to keep the chip busy when it is fed compute chores. Hyper-Q
increases the number of work queues from a single queue in the
previous-generation Fermi chip to 32. This improves GPU utilization, keeping
more of the compute cores humming when running parallel compute applications.
Dynamic Parallelism behaves like a kind of parallel branch predictor. When fed
tasks, the K20 can keep track of dependent tasks and spawn new compute kernels
to complete them, rather than having to request more work from the CPU.
Huang demonstrated a simulation of particles colliding, starting with the
last-generation Fermi chip. That GPU could handle 20,000 bodies colliding in
real time at high frame rates. He then went on to demonstrate real-time
modeling of the Andromeda and Milky Way galaxies colliding, which is not
something we need to worry about for the time being, since it won’t happen for
3.8 billion years. That simulation ran on a Kepler-based Tesla, showing over
208,000 bodies colliding.
The GPU in the K20, code-named GK110, is expected to be used in the new Titan
supercomputer being built at the Oak Ridge National Laboratory and in the Blue
Waters system at the National Center for Supercomputing Applications at the
University of Illinois at Urbana-Champaign.
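To put the demo's jump from 20,000 to 208,000 bodies in perspective, here is a rough back-of-the-envelope sketch. It assumes a brute-force, all-pairs N-body step, where every body interacts with every other body. Nvidia did not say which algorithm the demo used, so this is an illustration of how the workload could scale, not a measurement of either GPU. The helper name `pair_count` and the constants are made up for this example.

```python
def pair_count(bodies: int) -> int:
    """Unique body pairs in one all-pairs (direct-sum) N-body step.

    Assumption: the simulation is brute force. An approximate method
    such as a tree code would scale more gently.
    """
    return bodies * (bodies - 1) // 2


FERMI_BODIES = 20_000     # figure quoted for the Fermi demo
KEPLER_BODIES = 208_000   # "over 208,000" quoted for the Kepler demo

# Ratio of body counts and of pairwise interactions per simulation step.
body_ratio = KEPLER_BODIES / FERMI_BODIES
pair_ratio = pair_count(KEPLER_BODIES) / pair_count(FERMI_BODIES)

print(f"bodies: {body_ratio:.1f}x, pairwise interactions: {pair_ratio:.0f}x")
# prints "bodies: 10.4x, pairwise interactions: 108x"
```

Under that assumption, roughly ten times as many bodies means on the order of a hundred times as many pairwise interactions per step. That does not by itself mean the Kepler part is a hundred times faster, since the two demos may not have used the same method or run at the same frame rate.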