The IBM A2 is a massively multicore-capable, multithreaded 64-bit Power ISA processor core designed by IBM to the Power ISA v.2.06 specification. Processors based on the A2 core range from a 16-core version running at 2.3 GHz and consuming 65 W to a less powerful four-core version running at 1.4 GHz and consuming 20 W. Each A2 core is capable of four-way multithreading and has a 16 KB instruction cache and a 16 KB data cache. All core variants execute instructions in order.
The A2 core is a processor core designed for customization and embedded use in system-on-chip devices. It implements the 64-bit Power ISA v.2.06 Book III-E embedded platform specification with support for the embedded hypervisor features. It is a 4-way simultaneous multithreaded core with 4×32 64-bit general-purpose registers (GPRs), one set of 32 per hardware thread, and full support for both little- and big-endian byte ordering.
The core has a fine-grained branch prediction unit (BPU) with eight 1024-entry branch history tables. It has a 16 KB 8-way set-associative level-1 data cache and a 16 KB 4-way set-associative level-1 instruction cache. Its simple in-order pipeline can issue two instructions per cycle: one to the 6-stage arithmetic logic unit (ALU) and one to the optional auxiliary execution unit (AXU).
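To make the cache parameters above concrete, the following minimal sketch derives the number of sets in each level-1 cache from its size and associativity. The 64-byte cache line is an assumption made for illustration, since the text does not state the line size; `cache_sets` is a hypothetical helper, not part of any IBM tooling.

```python
def cache_sets(size_bytes: int, ways: int, line_bytes: int) -> int:
    """Return the number of sets in a set-associative cache."""
    bytes_per_set = ways * line_bytes
    if size_bytes % bytes_per_set != 0:
        raise ValueError("cache size must be a multiple of ways * line size")
    return size_bytes // bytes_per_set


L1_SIZE_BYTES = 16 * 1024  # 16 KB, from the text
LINE_BYTES = 64            # ASSUMPTION: line size is not given in the text

# 8-way data cache and 4-way instruction cache, both 16 KB
dcache_sets = cache_sets(L1_SIZE_BYTES, ways=8, line_bytes=LINE_BYTES)
icache_sets = cache_sets(L1_SIZE_BYTES, ways=4, line_bytes=LINE_BYTES)

print(f"L1 data cache sets:        {dcache_sets}")
print(f"L1 instruction cache sets: {icache_sets}")
```

Under that assumption the two caches hold the same amount of data but are organized differently: the data cache has half as many sets as the instruction cache, with twice as many ways per set.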
The core includes a memory management unit but no floating-point unit (FPU). Floating-point and similar facilities are handled by the AXU, which supports any number of standardized or customized macros, such as floating-point units, vector units, DSPs, media accelerators and other units with instruction sets and registers that are not part of the Power ISA. The core has a system interface unit used to connect to other on-die cores, with a 256-bit interface for data writes and a 128-bit interface for instruction and data reads, both running at full core speed.
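As a rough illustration of what these interface widths imply, the sketch below computes the theoretical peak bandwidth per core. It assumes one transfer per core clock cycle, which the text does not state, and uses the 1.6 GHz Blue Gene/Q clock quoted elsewhere in this article as an example frequency; `peak_bandwidth_gb_s` is a hypothetical helper.

```python
def peak_bandwidth_gb_s(width_bits: int, clock_ghz: float) -> float:
    """Peak bandwidth in GB/s, ASSUMING one full-width transfer per cycle."""
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * clock_ghz


CORE_CLOCK_GHZ = 1.6  # example value: the Blue Gene/Q clock from this article

write_bw = peak_bandwidth_gb_s(256, CORE_CLOCK_GHZ)  # 256-bit write interface
read_bw = peak_bandwidth_gb_s(128, CORE_CLOCK_GHZ)   # 128-bit read interface

print(f"peak write bandwidth: {write_bw:.1f} GB/s")
print(f"peak read bandwidth:  {read_bw:.1f} GB/s")
```

These are upper bounds on the interface itself; sustained bandwidth in a real chip depends on the interconnect and memory system the core is attached to.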
The PowerEN (Power Edge of Network), or the "wire-speed processor", is designed as a hybrid between a regular networking processor, which does switching and routing, and a typical server processor, which manipulates and packages data. It was revealed on February 8, 2010, at ISSCC 2010.
Each chip has 8 MB of cache as well as a multitude of task-specific engines besides the general-purpose processors, such as XML, cryptography, compression and regular expression accelerators, each with an MMU of its own, plus four 10 Gigabit Ethernet ports and two PCIe lanes. Up to four chips can be linked in an SMP system without any additional support chips. According to Charlie Johnson, chief architect at IBM, the chips are extremely complex; they use 1.43 billion transistors on a 428 mm² die fabricated using a 45 nm process.
The Blue Gene/Q processor is an 18-core chip running at 1.6 GHz with special features for fast thread context switching, a quad SIMD floating-point unit, a 5D torus chip-to-chip network and 2 GB/s external I/O. The cores are linked by a crossbar switch running at half core speed to a 32 MB eDRAM L2 cache. The L2 cache is multi-versioned and supports transactional memory and speculative execution. A Blue Gene/Q chip has two DDR3 memory controllers running at 1.33 GHz, supporting up to 16 GB of RAM.
The chip uses 16 cores for computing and one core for operating system services. This 17th core takes care of interrupts, asynchronous I/O, MPI flow control, and RAS functionality. The 18th core is a spare, used in case one of the other cores is permanently damaged (for instance in manufacturing), and is otherwise shut down during normal operation. The Blue Gene/Q chip is manufactured on IBM's copper SOI process at 45 nm, delivers a peak performance of 204.8 GFLOPS at 1.6 GHz and draws about 55 watts. The die measures about 19×19 mm (359.5 mm²) and holds 1.47 billion transistors.
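The peak performance figure can be reproduced from the numbers given above with a short calculation. The sketch below counts only the 16 compute cores and assumes that each of the four SIMD lanes completes one fused multiply-add per cycle, counted as two floating-point operations; that per-lane rate is an assumption for illustration and is not stated in the text.

```python
COMPUTE_CORES = 16        # the 17th and 18th cores do not run application code
THREADS_PER_CORE = 4      # four-way multithreading per A2 core
CLOCK_GHZ = 1.6
SIMD_LANES = 4            # quad SIMD floating-point unit
FLOPS_PER_LANE_CYCLE = 2  # ASSUMPTION: one fused multiply-add per lane per cycle


def peak_gflops(cores: int, clock_ghz: float, lanes: int, flops_per_lane: int) -> float:
    """Theoretical peak GFLOPS = cores * lanes * flops per lane per cycle * GHz."""
    return cores * lanes * flops_per_lane * clock_ghz


compute_threads = COMPUTE_CORES * THREADS_PER_CORE
chip_peak = peak_gflops(COMPUTE_CORES, CLOCK_GHZ, SIMD_LANES, FLOPS_PER_LANE_CYCLE)

print(f"compute hardware threads: {compute_threads}")
print(f"peak performance:         {chip_peak:.1f} GFLOPS")
print(f"GFLOPS per watt (~55 W):  {chip_peak / 55:.2f}")
```

With these inputs the calculation gives 204.8 GFLOPS, matching the quoted figure, which suggests the published peak counts the 16 compute cores only.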