AMD FX-8150 'Bulldozer' CPU Review

By Alex Hull, 11-10-11
Features and Specifications

This new generation of CPUs is very different from previous models. The core architecture has been redesigned from scratch to optimise the multi-core, high end experience, and AMD is focussing very much on multi-threaded applications. The new AMD FX processors are the world’s first 8-core desktop processors and support various new instruction sets to optimise performance with next-generation applications, including FMA4, XOP, AES, AVX and SSE 4.2. All the AMD FX parts are also unlocked for the greatest customisation and overclocking potential, so they are geared towards the high end user.
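For readers who want to check which of these extensions a given machine reports, here is a minimal sketch in Python. It assumes a Linux system, where the kernel lists CPU features on the `flags` line of `/proc/cpuinfo`. The helper names `parse_flags` and `missing_features` are our own, chosen for illustration.

```python
# Flag names as the Linux kernel spells them in /proc/cpuinfo (assumption:
# this sketch only targets Linux; other systems expose features differently).
FX_FEATURES = {
    "FMA4": "fma4",
    "XOP": "xop",
    "AES": "aes",
    "AVX": "avx",
    "SSE 4.2": "sse4_2",
}


def parse_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line found."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            return set(line.split(":", 1)[1].split())
    return set()


def missing_features(cpuinfo_text):
    """List the instruction sets named above that the CPU does not report."""
    flags = parse_flags(cpuinfo_text)
    return [name for name, flag in FX_FEATURES.items() if flag not in flags]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or the file is unreadable
    print("Not reported:", missing_features(text) or "none")
```

On a system that is not Linux the file is absent, so every feature is listed as not reported; that reflects the limits of this sketch, not the CPU.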

Today, there are four new AMD FX processors being launched:



As mentioned, we will be testing the FX-8150 in this review. Also, we can consider what is available in the 9-series chipset market at this point, and Vortez has already looked at several of these motherboards using Phenom II processors.



This gives plenty of flexibility on specification, but also the option to create a very high end gaming PC with multiple graphics cards and a high end FX CPU for a fairly reasonable price.

The technology behind this redesign is complex, and we will spare you the tiny details: generally we’re less interested in how it reaches its performance levels than in what those levels are. AMD does have plenty to say, however, so here’s a little info on the CPU architecture:

AMD FX Processor

“Bulldozer” was designed to balance performance, cost and power consumption on multi-threaded applications. The architecture focuses on high-frequency and resource sharing to achieve optimal throughput and blistering speed in next generation applications. AMD FX Processors offer up to eight high-performance, power-efficient cores. These represent the first generation of a new execution-core family from AMD (Family 15h). Other specs include:

– 128 KB of Level1 Data Cache, 16 KB/Core, 64-byte cacheline, 4-way associative, write-through
– 8 MB of Level2 Cache, 2 MB/“Bulldozer” module, 64-byte cacheline, 16-way associative

Integrated Northbridge which controls:
– 8 MB of Level3 Cache, 64-byte cacheline, 16-way associative, MOESI
– Two 72-bit wide DDR3 memory channels
– Four 16-bit receive/16-bit transmit HyperTransport™ links
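The L1 and L2 figures in the list above are easy to sanity-check. The short Python sketch below recomputes the totals and derives the number of cache sets from capacity, line size and associativity. The helper name `cache_sets` is ours, and the module count of four is inferred from the 8 MB total and 2 MB per module quoted above.

```python
KB = 1024
MB = 1024 * KB

CORES = 8
MODULES = 4  # inferred: 8 MB of L2 in total at 2 MB per module


def cache_sets(size_bytes, line_bytes, ways):
    """Number of sets = capacity / (line size * associativity)."""
    return size_bytes // (line_bytes * ways)


l1d_per_core = 16 * KB
l2_per_module = 2 * MB

total_l1d_kb = CORES * l1d_per_core // KB
total_l2_mb = MODULES * l2_per_module // MB
l1d_sets = cache_sets(l1d_per_core, line_bytes=64, ways=4)
l2_sets = cache_sets(l2_per_module, line_bytes=64, ways=16)

print("Total L1 data cache (KB):", total_l1d_kb)
print("Total L2 cache (MB):", total_l2_mb)
print("L1 data cache sets per core:", l1d_sets)
print("L2 cache sets per module:", l2_sets)
```

The totals come out at 128 KB and 8 MB, matching the list, with 64 sets in each L1 data cache and 2048 sets in each L2.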




Dual Core Building Blocks

When evaluating design ideas for the next generation x86 processor core, AMD engineers looked at ways of optimizing core power and area. Analyzing the bursty nature of today’s PC applications led engineers to look for a way to maximize peak bandwidth across the different cores, and maximize the use of silicon area through the use of shared modules.
The result was to design dual core building blocks that would effectively optimize the resources within the processor. Functions with high utilization (such as Integer pipelines and Level1 data caches) are dedicated in each core.
The other units are now effectively shared between two cores and include: Fetch, Decode, Floating point pipelines, and the Level2 cache.
This design allows two Cores to each use a larger, higher-performance function unit (e.g. the floating point unit) as they need it, with less total die area than having separate, smaller function units for each Core.
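To make the dedicated/shared split concrete, here is a small Python sketch that counts how many copies of each unit a chip would carry for a given number of modules. It is only a bookkeeping model under the description above; the names `DEDICATED_PER_CORE`, `SHARED_PER_MODULE` and `unit_counts` are hypothetical, and each entry stands for one block of that kind rather than a count of individual pipelines.

```python
# One entry per block type, following the description above.
DEDICATED_PER_CORE = ("integer pipeline block", "L1 data cache")
SHARED_PER_MODULE = ("fetch", "decode", "floating point unit", "L2 cache")


def unit_counts(modules, cores_per_module=2):
    """Return how many instances of each block exist across the whole chip."""
    counts = {}
    for unit in DEDICATED_PER_CORE:
        counts[unit] = modules * cores_per_module  # one per core
    for unit in SHARED_PER_MODULE:
        counts[unit] = modules  # one per module, shared by both cores
    return counts


if __name__ == "__main__":
    for unit, count in unit_counts(modules=4).items():
        print(f"{unit}: {count}")
```

For a four-module, eight-core part this gives eight integer blocks and L1 data caches, but only four of each shared unit, which is where the die area saving comes from.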


Floating Point

The floating point unit in “Bulldozer” has also undergone a complete redesign. It has been improved to support many new instructions and now allows resource sharing between Cores. There are two 128-bit FMACs shared per module, allowing for two 128-bit instructions per Core or one 256-bit instruction per dual-core module.
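As a rough guide to what two 128-bit FMACs per module mean for peak throughput, the Python sketch below counts floating-point operations per cycle, treating a fused multiply-add as two operations per element. This is a theoretical upper bound only; the function names are ours, and the clock speed passed to `peak_gflops` is an illustrative input, not a figure taken from the text above.

```python
def fma_flops_per_cycle(fmacs=2, fmac_bits=128, element_bits=32):
    """Peak FLOPs per cycle for one module, if every FMAC issues an FMA."""
    lanes = fmac_bits // element_bits  # elements processed side by side
    return fmacs * lanes * 2           # FMA = one multiply + one add


def peak_gflops(modules, clock_ghz, element_bits=32):
    """Theoretical chip-wide peak; real code will fall well short of this."""
    return modules * fma_flops_per_cycle(element_bits=element_bits) * clock_ghz


if __name__ == "__main__":
    print("Single precision, per module:", fma_flops_per_cycle(element_bits=32))
    print("Double precision, per module:", fma_flops_per_cycle(element_bits=64))
    # 3.6 GHz is an assumed example clock for illustration.
    print("Four modules at 3.6 GHz (SP):", round(peak_gflops(4, 3.6), 1), "GFLOPS")
```

A single 256-bit instruction occupying both FMACs processes the same number of elements per cycle, so the peak is unchanged; what changes is that the two cores can no longer each issue their own 128-bit operation in that cycle.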

On forward-looking benchmarks, the new floating point unit is at its best, able to perform quick 128-bit instructions, as well as support acceleration of FMA and XOP operations. Applications using older floating-point instructions are typically unable to take advantage of the full performance of the floating-point unit, which is optimized for the newer FMAC instructions.




“Bulldozer” Front End

The front-end unit is responsible for driving the processing pipeline, and was designed to make sure that the Cores are constantly fed with information. It works with each dual core unit and allocates threads to the individual cores. AMD has made heavy changes that include decoupled predict and fetch pipelines, as well as prediction-directed instruction prefetchers. A Prediction Queue manages direct and indirect branches, fed by an L1 and an L2 Branch Target Buffer, which store destination addresses.
“Bulldozer” modules can decode up to 4 instructions per cycle (vs 3 on AMD Phenom™ II processors).
The prediction pipeline produces a sequence of fetch addresses. The Fetch pipeline does a lookup in the instruction cache, and pulls 32 bytes per cycle into the fetch queue which feeds the decoders.
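The fetch and decode figures above can be combined into a very simple first-order model: the front end cannot deliver more instructions per cycle than the decoders accept, nor more than fit in the bytes fetched. The Python sketch below expresses that; the function name and the average instruction lengths used are assumptions for illustration, and the model ignores branches, cache misses and the sharing of the front end between two cores.

```python
def front_end_rate(avg_instr_bytes, fetch_bytes=32, decode_width=4):
    """Upper bound on instructions per cycle delivered by fetch + decode."""
    fetch_limit = fetch_bytes / avg_instr_bytes  # instructions that fit per fetch
    return min(decode_width, fetch_limit)


if __name__ == "__main__":
    for length in (4, 8, 10):
        rate = front_end_rate(length)
        print(f"average {length} bytes/instruction: up to {rate:.1f} per cycle")
```

Under this model the 32-byte fetch only becomes the limit once instructions average more than 8 bytes; below that, the 4-wide decode is the ceiling.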

“Bulldozer” uses a physical register file (PRF) which is a single location that holds the register results of executed instructions. This reduces power by eliminating unnecessary data movement and data replication (keeps one copy instead of broadcasting the data).
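The idea of a physical register file can be illustrated with a toy model in Python: each result is written to one physical slot, and renaming an architectural register only updates an index rather than copying the value. This is a sketch of the general technique, not AMD’s implementation; the class and its names are hypothetical, and old registers are freed immediately for simplicity, whereas real hardware waits until the overwriting instruction retires.

```python
class PhysicalRegisterFile:
    """Toy PRF: values are stored once; renaming only moves indices."""

    def __init__(self, num_physical):
        self.values = [None] * num_physical
        self.free = list(range(num_physical))  # unallocated physical registers
        self.rename_map = {}  # architectural register name -> physical index

    def write(self, arch_reg, value):
        """Allocate a physical register for a new result and remap arch_reg."""
        if not self.free:
            raise RuntimeError("no free physical registers (pipeline would stall)")
        phys = self.free.pop(0)
        self.values[phys] = value
        previous = self.rename_map.get(arch_reg)
        self.rename_map[arch_reg] = phys
        if previous is not None:
            # Simplification: freed at once instead of at instruction retire.
            self.free.append(previous)
        return phys

    def read(self, arch_reg):
        """Follow the rename map to the single stored copy of the value."""
        return self.values[self.rename_map[arch_reg]]


if __name__ == "__main__":
    prf = PhysicalRegisterFile(num_physical=4)
    prf.write("rax", 1)
    prf.write("rbx", 2)
    prf.write("rax", 3)  # remaps rax to a fresh slot; the value 1 is not moved
    print(prf.read("rax"), prf.read("rbx"), prf.rename_map)
```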


Caches

Each Core is equipped with a 16 KB Level 1 Data cache, a 32-entry fully associative data TLB, and a fully out-of-order load/store unit capable of two 128-bit loads per cycle or one 128-bit store per cycle. Each dual Core module includes a 2 MB 16-way unified L2 cache and a 1024-entry, 8-way L2 TLB that services both instruction and data requests. “Bulldozer” supports up to 23 outstanding L2 cache misses for memory system concurrency.
Finally, AMD has designed a shared 8 MB L3 cache with 64-way associativity that serves the cores in every “Bulldozer” module.
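The figure of 23 outstanding L2 cache misses matters because the number of requests in flight caps how much data can be pulled in while waiting on memory. A back-of-the-envelope estimate using Little’s law (bandwidth = bytes in flight / latency) is sketched below in Python. The 100 ns latency is a purely hypothetical input chosen for illustration, and the function name is ours.

```python
def miss_limited_bandwidth_gbs(outstanding_misses=23, line_bytes=64,
                               latency_ns=100.0):
    """Little's law: bytes in flight / latency. Bytes per ns equals GB/s."""
    bytes_in_flight = outstanding_misses * line_bytes
    return bytes_in_flight / latency_ns


if __name__ == "__main__":
    # 100 ns is an assumed memory latency, not a measured value.
    print(f"{miss_limited_bandwidth_gbs():.2f} GB/s at 100 ns latency")
    print(f"{miss_limited_bandwidth_gbs(latency_ns=50.0):.2f} GB/s at 50 ns latency")
```

The point of the exercise is the relationship, not the absolute numbers: halving the latency or doubling the misses in flight doubles the bandwidth a module can sustain from misses alone.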




