Mac Musings

The Power of the G5 and the Future of the Mac

Daniel Knight - 2003.07.10

The Power Macintosh G5 is probably the most tightly optimized hardware design since the original Macintosh, which cleverly combined a huge ROM with toolbox routines, innovative use of timing to manage sound and video, Wozniak's brilliant floppy controller, extremely high speed serial ports, and asynchronous response to input to squeeze the most out of 7.83 MHz, 128 KB of RAM, a 400K floppy, and a 512 x 342 pixel black and white GUI display.

The original Macintosh made remarkably efficient use of its resources, and its 8 MHz and 16 MHz siblings only improved on the original idea. Most future Macs went in different directions, using video cards or dedicated video circuitry with separate memory, sometimes adding additional I/O processors or digitizers (as in the Quadra AV models), and often allowing the savvy end user to overclock the computer.

Overclocking is one thing that comes to an end with the Power Macintosh G5. Instead of running at some multiple of the system's bus speed, the PowerPC 970 requires that the memory run at half CPU speed. If you want a 2.0 GHz CPU, you must have a 1.0 GHz memory system. If it is possible to overclock a G5 system - and I'm sure someone will try - it will mean overclocking both the CPU and the RAM to keep them in sync.
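To make the fixed ratio concrete, here is a minimal sketch of the arithmetic. It assumes nothing beyond the "half CPU speed" rule described above; the function names are my own and purely illustrative.

```python
# Illustrative sketch of the "memory at half CPU speed" rule described above.
# The names here are hypothetical, not anything published by Apple or IBM.

CPU_TO_MEMORY_RATIO = 2  # CPU clock : memory-system clock


def memory_clock_ghz(cpu_ghz):
    """Memory-system clock (GHz) demanded by a given CPU clock (GHz)."""
    return cpu_ghz / CPU_TO_MEMORY_RATIO


def overclocked(cpu_ghz, percent):
    """Raise the CPU clock by `percent`; return the (cpu, memory) clocks in GHz."""
    new_cpu = cpu_ghz * (1 + percent / 100)
    return new_cpu, memory_clock_ghz(new_cpu)


if __name__ == "__main__":
    print(f"2.0 GHz CPU needs a {memory_clock_ghz(2.0):.1f} GHz memory system")
    cpu, mem = overclocked(2.0, 10)
    print(f"A 10% overclock means {cpu:.2f} GHz CPU and {mem:.2f} GHz memory")
```

The point of the second function is that there is no way to raise one clock without the other: any overclock of the CPU drags the memory system along with it.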

Macs haven't had this kind of fixed ratio between memory speed and CPU speed since the earliest Power Macs. Older Macs get overclocked by goosing the whole motherboard, making the whole system bus run faster. On more recent Power Macs, particularly G3 and G4 systems, it's often only the multiplier between the system bus and the CPU that's changed, so only the processor itself is running faster. (Changing the system bus can throw off PCI timing, among other things.)

Everyone has been talking about how fast the PowerPC 970 processor is, but no matter how powerful it is, other bottlenecks can slow it down. In this article, we'll look at the steps Apple has taken to make the whole system as powerful as possible by eliminating one bottleneck after another.

HyperTransport

The G5 doesn't have a conventional system bus. It has multiple subsystems linked with HyperTransport, a high speed, low latency, point-to-point I/O technology that can move up to 8 GB of data per second between the CPUs and the memory system of the 2.0 GHz G5.* An advanced system controller chip manages communication between the CPU(s), memory, AGP video card, and the I/O subsystems, which include the PCI or PCI-X slots and the I/O ports (Serial ATA, ethernet, FireWire, USB 2.0, and audio).

* HyperTransport supports up to 12.8 GB/sec., but we'll need faster CPUs or more of them to require that much bandwidth. Individual HyperTransport buses range from 200 MHz to 800 MHz at 2 to 32 bits, allowing up to 6.4 GB/sec. on a single bus.

PCI-X systems can move 2 GB per second between PCI-X cards and memory, video, or the CPU. The I/O controllers use two bidirectional 16-bit 800 MHz HyperTransport connections to allow up to 3.2 GB per second throughput. (See Bandwidth to Burn for more on G5 architecture.)
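The HyperTransport figures quoted above fall out of simple arithmetic. The sketch below assumes two transfers per clock cycle (double data rate) and counts one direction of a link at a time; the function name is hypothetical.

```python
# Rough peak-bandwidth arithmetic for a HyperTransport link.
# Assumption: two transfers per clock (double data rate), one direction only.

def link_gb_per_sec(width_bits, clock_mhz, transfers_per_clock=2):
    """Theoretical peak of one direction of a link, in GB/sec (10^9 bytes)."""
    bytes_per_transfer = width_bits / 8
    transfers_per_sec = clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_sec / 1e9


if __name__ == "__main__":
    widest = link_gb_per_sec(32, 800)   # widest, fastest bus in the footnote
    io_link = link_gb_per_sec(16, 800)  # the 16-bit I/O connections
    print(f"32-bit at 800 MHz: {widest:.1f} GB/sec per direction")
    print(f"Both directions:   {2 * widest:.1f} GB/sec")
    print(f"16-bit at 800 MHz: {io_link:.1f} GB/sec per direction")
```

Under those assumptions, the 6.4 GB/sec. single-bus figure is one direction of a 32-bit 800 MHz link, 12.8 GB/sec. is both directions added together, and 3.2 GB/sec. is one direction of a 16-bit link.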

All in all, HyperTransport is as revolutionary a change from the traditional system bus as the Power Mac G5 is from the Power Macs that preceded it. But it's not something you can overclock.

Bottlenecks

Until the G5, all Power Macs have been bottlenecked by their memory, expansion slot, and I/O systems.

Memory

Apple, Motorola, and IBM have tried to make the best of slow memory by offering ever increasing multipliers for the CPU, in some cases letting the processor run at 16x bus speed (as in those 800 MHz upgrades for the 50 MHz bus of PCI Power Macs and clones). The first point of attack was making the level 1 cache larger, so the CPU would need to access system memory less frequently.

The next point of attack was adding a level 2 cache, typically somewhere between 256 KB and 1 MB, on the system bus between the CPU and system memory. With lower multiplier systems, such as the x100 Power Macs, a 1 MB level 2 (L2) cache could do wonders for performance.

With the G3 processor, motherboard L2 caches gave way to backside caches that were in the same socket as the CPU. These caches typically ran at half the speed of the CPU, which was much faster than the system bus (33-50 MHz on pre-G3 models, 66-100 MHz on the G3s). In some iMacs the L2 cache ran at only 40% of CPU speed, but even that (93 MHz on a 233 MHz iMac) was faster than the 66 MHz system bus.
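The clock relationships in this section are easy to check. This is a small sketch using only numbers from the text; the helper names are made up for illustration.

```python
# Sketch of the clock arithmetic behind CPU multipliers and backside caches.
# Helper names are illustrative; the figures come from the text above.

def cpu_clock_mhz(bus_mhz, multiplier):
    """CPU clock produced by running at `multiplier` times the bus speed."""
    return bus_mhz * multiplier


def backside_cache_mhz(cpu_mhz, ratio):
    """Backside L2 cache clock when it runs at `ratio` of the CPU clock."""
    return cpu_mhz * ratio


if __name__ == "__main__":
    print(cpu_clock_mhz(50, 16), "MHz upgrade on a 50 MHz bus")
    imac_cache = backside_cache_mhz(233, 0.40)
    print(f"{imac_cache:.0f} MHz L2 cache vs. a 66 MHz system bus")
```

Even at the slowest cache ratio, the backside cache comfortably outruns the system bus, which is the whole reason for moving the cache off the motherboard.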

The third advance in L2 caches came with their inclusion on the CPU itself. The tradeoff was cache size for speed. While a backside cache was often 512 KB, the early onboard L2 caches were only 256 KB - but they ran at full processor speed, not 40-50% of CPU speed like the backside caches.

The smaller L2 caches led to the use of level 3 caches, which were faster than motherboard memory, larger than the onboard L2 caches (1 MB was common), and helped minimize the memory bottleneck.

Interestingly enough, although the G3, G4, and POWER architecture all support a level 3 cache, the PowerPC 970 does not. Instead, it insists on a fast memory system functioning at half CPU speed, which reduces the need for a third level of caching. (The POWER4, which is closely related to the PPC 970, uses huge L3 caches.)

It's conceivable that at some future date IBM could add L3 cache support to a next generation G5 processor, but with the fastest memory system on any personal computer (the Intel world has only reached 800 MHz), Apple should be set for some time to come. The only conceivable problem is keeping RAM speed advancing as fast as CPU speed, since the PPC 970 must have a memory system running at half processor speed. That means a 1.5 GHz memory system for the 3.0 GHz Power Mac G5 we can expect to see next summer.

Expansion Cards: PCI

Early IBM PCs had an 8-bit 4.77 MHz expansion bus, which was displaced by a 16-bit 8 MHz "AT" bus a few years later. There were two competing standards to replace the AT bus, but neither EISA nor MicroChannel really took off. Both were displaced by the PCI bus, which moves 32 bits of data at 33 MHz.

Apple's primary bus in the early days was NuBus, a 32-bit 10 MHz bus that survived through the first generation of Power Macs. After that, Apple adopted the PCI bus, but with the b&w G3 it expanded the bus by offering 66 MHz slots and 64-bit slots. And that's where PCI has been on the Mac until the Power Mac G5.

PCI-X starts by doubling the maximum speed of the PCI bus, moving from 66 MHz to 133 MHz, and standardizing on 64-bit slots. PCI-X also adds some new features that make for more efficiency, making it a better bus than PCI. Best of all, PCI-X is fully compatible with PCI cards.
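For a sense of scale, here is a back-of-the-envelope sketch of what each step in width and clock speed buys. It assumes one transfer per clock and uses the rounded clock speeds quoted above, so these are theoretical peaks for a single bus, not measured throughput; the names are illustrative.

```python
# Theoretical peak bandwidth of a parallel bus: bytes per transfer x clock.
# Assumption: one transfer per clock, rounded clocks (33, 66, 133 MHz).

def peak_mb_per_sec(width_bits, clock_mhz):
    """Peak MB/sec for a bus of the given width and clock speed."""
    return width_bits // 8 * clock_mhz


BUSES = [
    ("PCI, 32-bit at 33 MHz", 32, 33),
    ("PCI, 64-bit at 66 MHz", 64, 66),
    ("PCI-X, 64-bit at 133 MHz", 64, 133),
]

if __name__ == "__main__":
    for name, width, clock in BUSES:
        print(f"{name}: {peak_mb_per_sec(width, clock)} MB/sec peak")
```

Doubling the width and doubling the clock each double the peak, so a 64-bit 133 MHz PCI-X bus works out to roughly eight times the original 32-bit 33 MHz PCI bus.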

Expansion Cards: AGP

Video cards have had a faster bus for some years. AGP first came to Macs with the Sawtooth Power Mac G4, which uses an AGP 2x socket. AGP was a faster bus than PCI, and it was specifically designed for video cards. Over time Apple moved to AGP 4x, which was twice as fast, and the AGP 8x in the Power Mac G5 has twice as fast a bus as that.

Additionally, AGP 8x introduces isochronous operation and texturing abilities to the AGP standard.
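The doubling from one AGP generation to the next can be sketched the same way. This assumes a 32-bit AGP bus with a base clock rounded to 66 MHz, where the "2x", "4x", and "8x" labels are the number of transfers per clock; the names are hypothetical.

```python
# Sketch of AGP peak bandwidth by mode.
# Assumptions: 32-bit bus, base clock rounded to 66 MHz, and the mode number
# (2, 4, 8) is the number of transfers per clock cycle.

AGP_WIDTH_BITS = 32
AGP_BASE_CLOCK_MHZ = 66


def agp_peak_mb_per_sec(mode):
    """Theoretical peak MB/sec for AGP running in the given mode."""
    return AGP_WIDTH_BITS // 8 * AGP_BASE_CLOCK_MHZ * mode


if __name__ == "__main__":
    for mode in (2, 4, 8):
        print(f"AGP {mode}x: {agp_peak_mb_per_sec(mode)} MB/sec peak")
```

The clock and width never change; only the number of transfers per clock does, which is why each generation is exactly twice as fast as the one before.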

Hard Drives

Ignoring the original Macintosh hard drive, Macs used SCSI hard drives exclusively from 1986 until 1994, when they began to use IDE drives in some lower cost models (the PowerBook 150 and Quadra 630).

SCSI is a more intelligent protocol, and with older, slower Macs, there were huge advantages to letting the drive itself do a lot of the work. Early IDE drives were pretty dumb in comparison, but over the years the IDE standard evolved, growing in speed and gaining direct memory access (DMA), which let the drive write data directly to memory.

From that point on the performance of IDE drives drew much closer to that of SCSI drives, and the significantly lower price made IDE drives Apple's choice for the Power Mac G3, the first top-end Mac to feature an IDE drive.

Eleven years of SCSI didn't disappear completely, but no Power Macs after the beige G3 included SCSI as a standard feature. Apple offered build-to-order configurations that used the more efficient SCSI protocol, but only those who needed the most performance chose to go with SCSI drives in their Macs.

The IDE specification has evolved over the years. The Power Mac G3 had a 16.7 MB/sec. IDE bus, the blue & white G3 doubled that, and the various G4 models included Ultra33, 66, and even 100.

That's all going to come to an end over the coming years as Serial ATA displaces the older parallel ATA protocols. Serial ATA is not only faster (up to 150 MB/sec., which is faster than Ultra133), it's much simpler. It doesn't require wide ribbon cables, and drives don't share cables. Each Serial ATA device is connected directly to its own Serial ATA port.

For external drives, SCSI gave way to FireWire 400, which is faster than most SCSI implementations, faster than most hard drives, but only one-third the speed of Serial ATA (50 MB/sec. vs. 150 MB/sec.). Apple's adoption of FireWire 800 on recent high-end models allows external FireWire drives to move data as quickly as internal Ultra100 drives.
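FireWire speeds are quoted in megabits per second while drive interfaces are quoted in megabytes, which makes them awkward to compare. Here is a quick sketch of the conversion, assuming 8 bits per byte and ignoring protocol overhead; the table and function names are my own.

```python
# Put FireWire (rated in Mbit/sec) on the same footing as ATA (MB/sec).
# Assumption: 8 bits per byte, no allowance for protocol overhead.

def mbit_to_mbyte(mbit_per_sec):
    """Convert a rate in megabits/sec to megabytes/sec."""
    return mbit_per_sec / 8


INTERFACES_MB_PER_SEC = {
    "FireWire 400": mbit_to_mbyte(400),
    "FireWire 800": mbit_to_mbyte(800),
    "Ultra100": 100,
    "Ultra133": 133,
    "Serial ATA": 150,
}

if __name__ == "__main__":
    ranked = sorted(INTERFACES_MB_PER_SEC.items(), key=lambda item: item[1])
    for name, rate in ranked:
        print(f"{name}: {rate:.0f} MB/sec")
```

FireWire 400 lands at 50 MB/sec., a third of Serial ATA, and FireWire 800 matches Ultra100 at 100 MB/sec.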

At this point, however, drives simply aren't fast enough to challenge Ultra133, Serial ATA, or FireWire. Where they come into their own is with RAID, where two or more drives are teamed up to work as a single, larger, faster drive from the computer's perspective. RAID setups can reach the speed limits of any of these protocols.

That limit can be raised, too, by adding multiple ATA or FireWire controllers to the system, which can produce even higher throughput than RAID drives connected to a single bus. (See Bare Feats for benchmark tests.)

I/O

There's really no need for a faster bus for mice and keyboards, but USB 1.1 has been a bottleneck for things like scanners. USB 2.0, which is 40x as fast, has grown as an alternative to FireWire for two reasons. The biggest reason is that USB 2.0 ports can replace USB 1.1 ports, and Intel made the 2.0 chipset available for the same cost as 1.1 chipsets. The second is that FireWire never really caught on in the Windows world, being used primarily on Macs and digital video cameras.
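To see why USB 1.1 is a bottleneck for something like a scanner, consider a rough sketch using the nominal signaling rates of 12 Mbit/sec. for USB 1.1 and 480 Mbit/sec. for USB 2.0. The 100 MB scan is a hypothetical example, and real-world throughput will be lower than these theoretical figures.

```python
# Best-case transfer times over USB 1.1 vs. USB 2.0.
# Assumptions: nominal signaling rates, no protocol overhead; the 100 MB
# file size is a made-up example.

USB_1_1_MBIT = 12
USB_2_0_MBIT = 480


def transfer_seconds(size_mb, link_mbit_per_sec):
    """Seconds to move `size_mb` megabytes over a link rated in Mbit/sec."""
    return size_mb * 8 / link_mbit_per_sec


if __name__ == "__main__":
    scan_mb = 100  # hypothetical high-resolution scan
    print(f"Speed ratio: {USB_2_0_MBIT // USB_1_1_MBIT}x")
    print(f"USB 1.1: {transfer_seconds(scan_mb, USB_1_1_MBIT):.1f} seconds")
    print(f"USB 2.0: {transfer_seconds(scan_mb, USB_2_0_MBIT):.1f} seconds")
```

A transfer that takes over a minute in the best case on USB 1.1 drops to under two seconds on USB 2.0, at least on paper.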

G5, a Whole New Mac

There is almost nothing legacy about the Power Mac G5. It has an entirely new CPU, system architecture, and drive system. The video, PCI-X, and USB 2.0 are the next generation implementation of standards used on older Macs. Only FireWire, ethernet, and possibly the controller for the Combo drive or SuperDrive (which I suspect is still a parallel ATA device) are carried over from the last generation of Power Macs.

Every subsystem uses the latest technology, reducing one bottleneck after another to create the most powerful desktop computer on the planet, one that could rival a lot of workstations.

G5, a Whole New Generation of Macs

But the Power Mac is just the beginning. Over time the PowerPC 970 will work its way into the Xserve, where the HyperTransport architecture will be even more beneficial. This could move Apple from a competitive player in the server market to a top choice.

Next up will probably be the PowerBook G5, which will also benefit from HyperTransport, Serial ATA notebook drives, AGP 8x video, and all the rest.

After that I anticipate it will move to the consumer models - the flat panel iMac, the eMac, and the iBook. It may take a year or two before the whole line goes G5, and there's a lot of speculation that the iBook will use a next generation IBM G3 with a velocity engine, but eventually the whole line will settle on HyperTransport and the PowerPC 970.

And some time later, IBM will push the envelope and move beyond the 970.