It was two years ago that we saw the introduction of Fermi, NVIDIA's previous architectural change. The GF100 featured 512 stream processors in a 16x32 arrangement, built by TSMC on a 40nm process. The cards sported OpenGL 4.0 and DirectX 11 support, and enthusiasts climbed over each other to get the latest and greatest GPU from NVIDIA. While the performance was there, power efficiency was poor: the cards drew a lot of power and ran very hot.
Enter the 5 series eight months later, and many would say the GTX580 was the card the GTX480 should have been. It was more powerful thanks to the fully unlocked core (all 16 SMs enabled, fed by the 6x64bit memory controller), ran cooler and thus became a firm favourite in the high-end GPU market.
For a little over a year the GTX580 sat atop the GPU pile (save for dual-GPU variants), and it wasn't until the AMD HD7970 emerged in December 2011 that it lost its performance crown. NVIDIA, however, had not been resting on their laurels...
GTX680 GPU
The successor to Fermi is codenamed Kepler. Listening to feedback on Fermi from gamers around the world, NVIDIA sought to create a graphics card that was not only the best money could buy in terms of performance but, crucially, also ran cool and quiet and made efficient use of power. Kepler addresses these issues in two ways: GPU Boost, which we will discuss further in the overclocking section of the review, and a redesigned streaming multiprocessor.
Kepler Core
The GTX680 is equipped with a GK104 core, which NVIDIA claim is their highest-performing and most power-efficient GPU to date. It is fabricated on the 28nm process, and every component was designed with power efficiency in mind to ensure the GPU gives the best performance per watt possible.
While the 28nm manufacturing process accounts for the bulk of the power savings, the way the new architecture works is also key to reducing Kepler's power consumption. SMX, an evolution of Fermi's SM, is NVIDIA's new streaming multiprocessor:
The SMX now runs at the graphics clock speed rather than at 2x that speed as before. However, because the GK104 has 1536 CUDA cores (8xSMX) against the 512 CUDA cores (16xSM) of Fermi's GF110, Kepler allows the GTX680 to deliver twice the performance per watt of the GTX580.
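To make the arithmetic behind those figures concrete, here is a minimal Python sketch that uses only the numbers quoted above. The `Gpu` class and its field names are hypothetical, made up for illustration. The 2x multiplier for Fermi models the shader clock running at twice the graphics clock. Actual clock frequencies are deliberately left out, so this counts core operations per graphics clock, not real-world performance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    """Hypothetical record holding only the figures quoted in the article."""
    name: str
    sm_count: int                 # number of SM (Fermi) or SMX (Kepler) units
    cuda_cores: int               # total CUDA cores on the chip
    shader_clock_multiplier: int  # shader clock relative to the graphics clock

    @property
    def cores_per_sm(self) -> int:
        # Cores are spread evenly across the multiprocessors.
        return self.cuda_cores // self.sm_count

    @property
    def core_ops_per_graphics_clock(self) -> int:
        # A core on a 2x shader clock does two operations per graphics clock.
        return self.cuda_cores * self.shader_clock_multiplier


GF110 = Gpu("GF110 (GTX580)", sm_count=16, cuda_cores=512, shader_clock_multiplier=2)
GK104 = Gpu("GK104 (GTX680)", sm_count=8, cuda_cores=1536, shader_clock_multiplier=1)

for gpu in (GF110, GK104):
    print(f"{gpu.name}: {gpu.cores_per_sm} cores per SM, "
          f"{gpu.core_ops_per_graphics_clock} core ops per graphics clock")

ratio = GK104.core_ops_per_graphics_clock / GF110.core_ops_per_graphics_clock
print(f"Kepler vs Fermi, per graphics clock: {ratio:.2f}x")
```

On these numbers each SMX holds 192 cores against 32 in a Fermi SM, and even without the doubled shader clock the GK104 gets through 1.5x as many core operations per graphics clock. The rest of the gain has to come from the higher graphics clock itself.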
The per-clock throughput of FMA2, SFU and texture operations has been significantly increased, and while some operations retain the same per-clock speed as on the GTX580, the GTX680's much higher core clockspeed ensures a substantial increase for all GPU operations.
To feed the new SMX, each unit has four warp schedulers, each capable of firing off two instructions per warp, every clock. The scheduling logic has been redesigned to be a lot less complex and much more streamlined: the compiler determines which instructions will be issued and provides this information directly to the hardware, rather than the hardware going around the houses with multi-port decode, a register scoreboard and a dependency check before each instruction gets issued. This method is both quicker and, more importantly, more power efficient.
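The idea of moving dependency checks from the hardware into the compiler can be illustrated with a toy model. This is only a sketch under simplifying assumptions: one instruction issued in order per cycle, fixed latencies known in advance, and made-up register names. The names `Instr`, `compile_stalls` and `hardware_issue` are hypothetical and do not reflect how NVIDIA's compiler or hardware actually encode this information.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str      # register written by the instruction
    srcs: tuple    # registers read by the instruction
    latency: int   # fixed latency in cycles, assumed known to the compiler


def compile_stalls(program):
    """Compiler side: work out how many cycles each instruction must wait."""
    ready_at = {}  # register -> cycle at which its value becomes available
    cycle = 0      # earliest cycle at which the next instruction could issue
    stalls = []
    for ins in program:
        # Issue once every source operand is ready.
        issue = max([cycle] + [ready_at.get(reg, 0) for reg in ins.srcs])
        stalls.append(issue - cycle)
        ready_at[ins.dest] = issue + ins.latency
        cycle = issue + 1
    return stalls


def hardware_issue(program, stalls):
    """Hardware side: no scoreboard, just honour the precomputed stall counts."""
    cycle = 0
    issue_cycles = []
    for _ins, stall in zip(program, stalls):
        cycle += stall
        issue_cycles.append(cycle)
        cycle += 1
    return issue_cycles


program = [
    Instr("r1", ("r0",), 4),  # r1 = f(r0)
    Instr("r2", ("r1",), 4),  # depends on r1, so it has to wait
    Instr("r3", ("r0",), 4),  # independent of the first two
]

stalls = compile_stalls(program)
issue_cycles = hardware_issue(program, stalls)
print("stall counts:", stalls)
print("issue cycles:", issue_cycles)
```

In this toy model the hardware loop never looks at which registers an instruction reads or writes. All of that work happens once, ahead of time, which is the essence of the power saving described above.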
Perhaps the biggest change to the processor core is the omission of the old shader clock. The shader clock dates back to the old Tesla architecture, where it was introduced as an area optimisation: running the execution units at a higher clock rate meant fewer copies of each unit were needed to reach a given throughput. Kepler takes the opposite approach, using more execution units running at the lower graphics clock, which costs die area but saves power.
Another improvement over Fermi that arrives with SMX is the redesigned Polymorph engine (2.0), which takes care of the GTX680's tessellation workload. The new design aims to ensure that the tessellation we see on screen has as little impact as possible on rendering performance. What you may find bizarre, though, is that NVIDIA have cut the Polymorph engine count in half, to 8 from Fermi's 16, yet claim it is almost twice as quick. This is thanks to the improved core and memory clockspeed.
Overall then, it seems NVIDIA have outdone themselves: higher clockspeeds on both core and memory and 3x the number of stream processors, yet streamlined by eliminating the shader clock to cut power consumption without compromising overall performance.
The story doesn't end there, though, as the GTX680 has a whole host of features to offer...