For consumers who have never heard of the GeForce GTX 960 and the GPU that powers it, the GM206 can be considered a further cut-down version of the Maxwell core that powers the GeForce GTX 980 series. To envision this, imagine the GeForce GTX 980’s ‘GM204-200’ core with its 4 Graphics Processing Clusters (GPCs), with each of these GPCs being made up of four streaming multiprocessors (SMMs), one 64-bit memory controller and one Rasterizer engine. Each SMM consists of four 32-core processing blocks, for 128 CUDA cores in total, and 8 Texture Units – plus other low level goodies. The Raster Operation Pipelines (ROPs) sit alongside the memory controllers, 16 per controller.
Now imagine slicing off one SMM in three of the four GPCs and you have the GeForce GTX 970’s version of the GM204 core – the ‘GM204-400’.
Now imagine taking that selfsame 980 Maxwell core (GM204-200) and having the factory cut off two full Graphics Processing Clusters instead of just paring off a few SMMs. That, in a nutshell, is what the GM206 is: two Maxwell Graphics Processing Clusters with 8 SMMs and two memory controllers instead of 4 GPCs, 16 SMMs and 4 memory controllers. In other words, on paper the GM206 has approximately half the horsepower of the ‘original’ Maxwell GM204-200, with only two memory controllers and only ‘needing’ 2GB of memory instead of 4GB.
In fact it is pretty much safe to say that is what NVIDIA’s designers did when designing the GM206 – AKA the GeForce GTX 960 series. They took the GM204-200 Maxwell core and cut it in half. However, this is a touch simplistic: while yes, it does indeed have exactly half the ROPs/Texture Units/CUDA cores of a GeForce GTX 980’s GM204, and while yes, its memory bus has also been halved from 256-bit to 128-bit, the GM206 is not simply a GM204 with half of the GPCs disabled.
Instead NVIDIA did the cutting on the drafting table and created a new, smaller core than the GM204. The end result is not only a new Maxwell core, but one that is more efficient than if they had simply used ‘defective’ GM204s with half the cores disabled. In fact the new GM206 weighs in at 56.54% of the GM204-200’s transistor count, with a respectable 2.94 billion transistors instead of the 2.6 billion it would have had if it were just a halved GM204-200.
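For readers who want to check that percentage, here is a minimal Python sketch of the arithmetic. It assumes the full GM204 carries exactly twice the 2.6 billion transistors quoted for a ‘halved’ GM204; the constant and function names are ours and purely illustrative.

```python
# Transistor counts quoted in the text, in billions.
GM206_TRANSISTORS = 2.94
HALVED_GM204_TRANSISTORS = 2.6  # what a GM204 cut exactly in half would have


def full_gm204_transistors(halved=HALVED_GM204_TRANSISTORS):
    """Assumption: the full GM204 is simply twice the 'halved' figure."""
    return halved * 2


def gm206_share_of_gm204(gm206=GM206_TRANSISTORS, halved=HALVED_GM204_TRANSISTORS):
    """GM206 transistor count as a percentage of the full GM204."""
    return gm206 / full_gm204_transistors(halved) * 100


if __name__ == "__main__":
    print(f"GM206 is {gm206_share_of_gm204():.2f}% of the GM204")
```

Anything above 50% here is transistor budget the GM206 spends on logic that cannot simply be halved along with the shader clusters.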
In order to boost performance compared to the previous x60 series (which has more CUDA cores and a larger memory bus), NVIDIA has kept the base and boost clock speeds at much the same level as on its GM204 counterparts. This is because they had the luxury of having a TDP that is not 50% of the 980’s but is closer to 73%. To be precise, a reference GM204-200 ‘GeForce GTX 980’ comes with a base clock of 1,126MHz that can be boosted to 1,216MHz and stay within its 165 watt TDP limit. Compare and contrast that to a reference GM206 core, which has a base clock of 1,127MHz that can go to 1,178MHz and has to stay within a 120 watt TDP specification. Mix in a much more efficient architecture and on paper this new GM206 is up to 1.4 times more potent in an ‘apples to apples’ comparison to previous x60s.
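The TDP comparison is simple enough to verify by hand, but a short Python sketch makes the ratio explicit. It uses only the 120 watt and 165 watt reference figures quoted above; the names are ours and not anything NVIDIA publishes.

```python
# Reference TDP figures quoted in the text, in watts.
GTX_980_TDP_W = 165
GTX_960_TDP_W = 120


def tdp_share(card_tdp_w, baseline_tdp_w):
    """Return card TDP as a percentage of the baseline TDP."""
    return card_tdp_w / baseline_tdp_w * 100


if __name__ == "__main__":
    share = tdp_share(GTX_960_TDP_W, GTX_980_TDP_W)
    print(f"GTX 960 TDP is {share:.1f}% of the GTX 980's")
```

With half the shader hardware but roughly 73% of the power budget, each remaining cluster has noticeably more thermal room to hold its clocks.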
In previous generations the idea of a 128-bit memory bus on a mid-tier card would have been a non-starter, as the underlying design would not have been optimized properly to make use of it. The GM206, on the other hand, has been optimized to take full advantage of the smaller pipeline, and at lower resolutions this anemic memory bandwidth of 112.16GB/s should not be an issue. At higher resolutions it may start to cause problems, but inexpensive video cards’ performance levels always suffer at higher resolutions.
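To show where the 112.16GB/s figure comes from, here is a rough Python sketch of the usual peak-bandwidth arithmetic: effective memory clock times bus width, divided by 8 bits per byte. It assumes the 7,010MHz effective memory speed quoted for the reference card and decimal units (1GB/s = 1,000MB/s); the function name is ours.

```python
def memory_bandwidth_gbs(effective_clock_mhz, bus_width_bits):
    """Theoretical peak memory bandwidth in GB/s.

    effective_clock_mhz: effective (data-rate) memory clock in MHz
    bus_width_bits: total memory bus width in bits
    """
    megabytes_per_second = effective_clock_mhz * bus_width_bits / 8
    return megabytes_per_second / 1000


if __name__ == "__main__":
    # GM206: two 64-bit controllers, so a 128-bit bus.
    print(f"128-bit: {memory_bandwidth_gbs(7010, 128):.2f} GB/s")
    # Same memory speed on a 256-bit bus, for comparison.
    print(f"256-bit: {memory_bandwidth_gbs(7010, 256):.2f} GB/s")
```

At the same memory speed, the halved bus delivers exactly half the theoretical bandwidth, which is why the narrower pipe is the first thing to bite as resolution climbs.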
This at least was the theory NVIDIA was going on when they designed the GM206 core. Sadly, reality always has the right of way, and just like the proverbial Mack truck barreling down a one-way street it ran over NVIDIA’s elegant theory. The reality of the matter is that the combination of a small 128-bit bus and a rather small CUDA/ROP/etc. count made the 960 rather easy meat for the AMD R9 280X. Since both cards compete in the same $200 price range, the only area where a reference GeForce GTX 960 was a clear winner was in the power and noise department. Everywhere else the Tahiti XTL at the $200 price point proved to be the better overall value for most consumers.
This is where the PNY GeForce GTX 960 Elite OC enters the equation. Due to the design limitations of the GM206 architecture there was nothing PNY could do on the memory controller side of the equation, so in an interesting and controversial move they have left the memory entirely alone. This means you can expect ‘your’ 960 Elite OC to come standard with 2GB of memory that is clocked at an effective speed of 7,010MHz with a bandwidth of 112.16GB/s – just like a reference GeForce GTX 960.
This is controversial because the most obvious bottleneck is on the memory side of the equation and yet PNY has done nothing to help alleviate it. Instead PNY feels confident that the right solution was to boost the performance of those two Graphics Processing Clusters, and has cranked the performance dial all the way to ‘ludicrous speed’…and maybe they’ve even gone to plaid. Concisely put, they have bestowed upon the Elite OC a 16% factory overclock that raises the base clock from 1,127MHz to 1,304MHz and the boost clock from 1,178MHz to a whopping 1,367MHz. How big a boost in real world performance this 16% factory overclock translates to remains to be seen, but it should be on the significant end of the spectrum.
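As a quick sanity check on that 16% figure, the following Python sketch works out the factory overclock from the reference and Elite OC clocks quoted above. The constant and function names are ours, chosen purely for illustration.

```python
# Clock speeds quoted in the text, in MHz.
REFERENCE_BASE_MHZ, ELITE_OC_BASE_MHZ = 1127, 1304
REFERENCE_BOOST_MHZ, ELITE_OC_BOOST_MHZ = 1178, 1367


def overclock_percent(reference_mhz, overclocked_mhz):
    """Return the overclock as a percentage gain over the reference clock."""
    return (overclocked_mhz - reference_mhz) / reference_mhz * 100


if __name__ == "__main__":
    base_gain = overclock_percent(REFERENCE_BASE_MHZ, ELITE_OC_BASE_MHZ)
    boost_gain = overclock_percent(REFERENCE_BOOST_MHZ, ELITE_OC_BOOST_MHZ)
    print(f"Base clock gain:  {base_gain:.1f}%")
    print(f"Boost clock gain: {boost_gain:.1f}%")
```

Both the base and boost gains land within a fraction of a percent of the 16% headline number, so the figure is a fair summary of either clock.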
To help keep such a large overclock cool and stable, PNY has carried over one of their smaller previous XLR8 ‘OC’ custom dual slot cooler designs. This design consists of a dual tower, dual fan, multi-heat pipe setup that was originally created to handle a much hotter running core. Specifically, it was created to keep an overclocked 170 watt TDP ‘GK104’ GeForce GTX 760 happy and cool. This means that when paired with the 120 watt GM206 this cooler has an additional 50 watts of cooling potential that can be put towards keeping both noise and temperatures in check.
Further helping to keep the VRM and memory components cool and long-lived, PNY has included a secondary heatsink. This large flat heatsink covers nearly the entire PCB and actually has cooling fins which extend past the PCB. To further increase its cooling efficiency, this secondary heatsink actively recycles the waste air from the two downdraft coolers. Basically, after the air passes through the main towers that keep the GM206 core cool, it then flows over this black heatsink. Then and only then is the air allowed to exit the Elite OC.
To ensure that most of the air from the fans actually makes it to the secondary heatsink – and does not just leak out the sides of the main cooling towers – PNY uses a full fascia covering. As with most of PNY’s XLR8 series, this robust aluminum fascia has been painted an aggressive flat black with yellow racing stripes. We do still prefer the PNY 980’s brushed aluminum aesthetics to the typical XLR8’s more ‘gaming’ orientation, but the Elite OC is still darn good looking.
Unfortunately the back of the Elite OC has not been covered with a backplate, but given the lower cost nature of this new model, it is an acceptable omission. Interestingly enough, even though this is a heavily overclocked card, PNY has been able to keep power consumption extremely low. PNY states that this card should still have a TDP of only 120 watts.
While we have our doubts about the Elite OC not consuming more power than a reference GeForce GTX 960, what is certain is that only a single 6-pin PCIe connector was needed.
The output selection of the Elite OC 960 consists of pretty much the standard options found on any GeForce GTX 960 – it is actually the same as on PNY’s reference GeForce GTX 960. In total you get one DVI connector, one full size HDMI port, and three full size DisplayPorts. This covers all the bases nicely and does so without needing adapters. Equally important, due to the layout there are plenty of rear ventilation slits, so some of the hot air should still exhaust out the back of the case – even though this card makes use of a custom dual downdraft cooler instead of a reference blower design.