Ian's SGI Depot: FOR SALE! SGI Systems, Parts, Spares and Upgrades

(check my current auctions!)

Origin300/3000 Series

Overview and Architecture

Last Change: 14/Nov/2008

By Ian Mapleson

Following on from the success of the Origin2000 (and related Onyx2/Octane systems), in July 2000 SGI launched the Origin3000 series, based on an improved modular design called NUMAflex [1st Press Release | 2nd Press Release], as well as the corresponding graphics-based system, the Onyx3000. The less-scalable versions were Origin300 and Onyx300.

The earlier Origin2000 allowed one to grow a system step by step, based on one or more deskside units: 2 units to a rack, multiple racks linked together via high-speed connections to create a single-image system of up to 128 CPUs - a revolutionary change in how systems could be scaled in performance. However, even a minimum Origin2000 system meant purchasing a large deskside unit with room for 8 CPUs, with the same I/O capabilities always included. This was ok much of the time, but many users would have liked to grow RAM, I/O capacity, storage, etc. rather than CPUs, or have a mix, so the Origin2000 design could mean paying for hardware that went unused. Plus, the maximum 128-CPU system required an extra 9th rack to act as the routing mechanism for the other 8 racks (the CrayOrigin2000).

Thus, SGI designed NUMAflex as a solution to the scalability problem. Instead of deskside units, each modular building block, or 'brick', is dedicated to a specific function, eg. CPUs, base I/O, PCI/XIO expansion, RAM, storage, system expansion/scalability or (for the equivalent Onyx line) graphics. These bricks are then installed into a short rack or full rack in whatever combination is required, multiple racks linked together for larger systems. This meant much greater flexibility in how a system could be configured and later expanded, and a lower initial outlay for a system with fewer CPUs. With the system scalability functions now placed into a dedicated brick, system expansion to hundreds of CPUs also no longer required a separate rack unit to act as a metarouter. With respect to performance, the CPU/RAM bandwidth was doubled and the CPU density also doubled - initial CPU bricks (C-Brick) had 2 or 4 CPUs per brick, with the final version offering 16 CPUs in a single brick. The XTown connection speeds were also doubled, giving 3.2GB/sec per link (1.6GB/sec in each direction).
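As a quick check of the link figures quoted above, here is a small Python sketch of the arithmetic. The function name is my own invention, and the "implied earlier speed" assumes that "doubled" means exactly 2X, so treat it as an illustration rather than a specification.

```python
# Per-link figures quoted in the text for Origin3000.
LINK_TOTAL_GB_S = 3.2          # both directions combined
LINK_PER_DIRECTION_GB_S = 1.6  # each direction


def implied_previous_total(current_total: float, factor: float = 2.0) -> float:
    """Link speed before the increase, assuming 'doubled' means exactly 2X."""
    return current_total / factor


# The two directions should add up to the quoted per-link total.
assert abs(2 * LINK_PER_DIRECTION_GB_S - LINK_TOTAL_GB_S) < 1e-9

previous = implied_previous_total(LINK_TOTAL_GB_S)
print(f"Implied earlier link speed: {previous:.1f}GB/sec "
      f"({previous / 2:.1f}GB/sec each way)")
```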

From a business point of view, this new design meant money was not being wasted on hardware that is not going to be used, eg. if one did not need PCI expansion, then one simply did not include a PCI brick in the system configuration. Likewise, one could configure additional I/O and storage requirements to precisely match the number of CPUs, relevant to the target application. In addition to the new brick-based design, the system as a whole could be partitioned at the software level to act as multiple separate systems, each with their own separate OS kernel image, thus giving more options for software management, cluster solutions, etc., yet still benefiting from the shared memory design when desirable.

At the entry level, Origin300 scales from 2 to 32 CPUs and up to 32GB RAM (2 PCI slots), or (and this is the key point) up to 16 CPUs with as many as 56 PCI slots, or numerous other possible combinations in between. Other bricks could be installed for dedicated storage, XIO expansion for high-performance I/O, or graphics expansion (Onyx300 series). Without a doubt, the brick-style NUMAflex design is far more flexible than Origin2000, with twice as many CPUs in a single rack. The later updated Origin350 range doubled the maximum RAM to 64GB and offered a 4-slot 3.5" high PCIX expansion brick in addition to the standard 12-slot 7" high PCI brick.

The Origin3000 line was released in four initial main flavours: the 3200, 3200C, 3400 and 3800, plus a fifth final update to the series, the 3900. The 3200 scaled from 2 to 8 CPUs, the 3400 from 4 to 32 CPUs, and the 3800 from 16 to 512 CPUs. The 3200C was a cluster version of the 3200, using a cluster interconnect brick to link multiple 3200 systems together to create clusters with up to thousands of CPUs. The 3900 series was the final version of the line, using a new CPU brick with 16 CPUs (Cx-Brick), 4X more than the earlier version, thus giving up to 128 CPUs and 256GB RAM in a single rack. New higher-density versions of the base I/O and PCIX bricks were also released for the 3900 series, giving 11 PCIX slots in the base I/O brick, and 12 PCIX slots in the new PCIX brick.
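The 3900 rack figures follow directly from the per-brick numbers, which the short Python sketch below works through. The constant and function names are my own, and the 32GB-per-Cx-Brick value is the maximum given in the brick summary on this page; nothing here comes from an SGI tool.

```python
CPUS_PER_C_BRICK = 4       # original C-Brick
CPUS_PER_CX_BRICK = 16     # Cx-Brick (3900 series)
RAM_GB_PER_CX_BRICK = 32   # maximum RAM per Cx-Brick
CPUS_PER_RACK_3900 = 128   # quoted maximum per rack


def bricks_per_rack(cpus_per_rack: int, cpus_per_brick: int) -> int:
    """Number of CPU bricks needed to reach a given CPU count."""
    if cpus_per_rack % cpus_per_brick:
        raise ValueError("CPU count is not a whole number of bricks")
    return cpus_per_rack // cpus_per_brick


cx_bricks = bricks_per_rack(CPUS_PER_RACK_3900, CPUS_PER_CX_BRICK)
print(cx_bricks)                                # 8 Cx-Bricks in a full rack
print(cx_bricks * RAM_GB_PER_CX_BRICK)          # 256 (GB RAM in that rack)
print(CPUS_PER_CX_BRICK // CPUS_PER_C_BRICK)    # 4 (density gain over C-Brick)
```

So eight fully loaded Cx-Bricks account for both the 128-CPU and the 256GB per-rack figures.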

Here is a summary of the initial bricks for the 3200, 3400 and 3800 systems:

C-Brick (CPU Module): 4 CPUs and up to 8GB RAM (base brick in Origin300 could have just 2 CPUs). One NUMAlink port to connect to another C-Brick or R-brick and one Xtown2 port to connect to an I/O brick.

I-Brick (Base I/O Module): Included in all systems, contains base-level I/O consisting of a fibre channel (FC) system disk, CDROM, five hot-pluggable PCI slots, 10/100 Ethernet, one IEEE-1394 port and two USB channels, and two Xtown2 ports for connection to Xtown2 ports on C-Bricks.

G-brick (Graphics Expansion): This graphics module is effectively the same 2-pipe graphics module as used in Origin2000, with 1 or 2 G-Bricks per rack, initially released as InfiniteReality3 (max of one 2RM pipe and one 4RM pipe). Each pipe of a G-brick connects to an Xtown2 channel of an I- or X-brick via a 'DNet' cable. Later updated to InfiniteReality4 which is much faster than IR3.

X-Brick (XIO Expansion): Four half-height XIO slots that are fully compatible with the XIO slots used in Origin2000, thus allowing older XIO option cards to be used in Origin3000 systems. Includes two Xtown2 ports for connection to Xtown2 ports on C-Bricks.

N-Brick (CPU/GFX routing): For systems with more than one IR graphics pipe, the N-Brick allows for more efficient connectivity between G-Bricks and C-Bricks for situations where extra I-Bricks or X-Bricks are not required (see PDF documents below for full details).

P-Brick (PCI Expansion): 12 x 64bit hot-pluggable PCI slots using six buses (two slots per bus). Each bus can run its two slots at 33MHz or 66MHz. Includes two Xtown2 ports for connection to Xtown2 ports on C-Bricks.

R-Brick (Router Interconnect): the building block for creating larger systems. Has four ports for connection to the NUMAlink port on C-bricks, with the remaining four ports only for connection to other R-Bricks.

D-Brick (Disk Storage): A 4U module supporting either FC JBOD or RAID storage. The early version had space for up to 12 drives, the later version supported up to 18 drives. The D-Brick is intended to be connected to an FC PCI card in an I-Brick or P-Brick.

Power Bay: Supports from 3 to 6 hot-swap Distributed Power Supplies (DPSs) to supply 48V DC power to the other bricks. Always configured as an N+1 setup to offer redundancy in the event of a DPS failure.
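To make the N+1 idea for the Power Bay concrete, here is a minimal Python sketch. It assumes only what is stated above (3 to 6 supplies, always N+1); the function names are hypothetical and no wattage figures are modelled, since none are given here.

```python
MIN_DPS = 3  # fewest supplies fitted to a power bay (from the text)
MAX_DPS = 6  # most supplies a power bay can hold (from the text)


def load_bearing_supplies(installed: int) -> int:
    """Return N for an N+1 setup: the supplies the load may depend on."""
    if not MIN_DPS <= installed <= MAX_DPS:
        raise ValueError(f"a power bay holds {MIN_DPS} to {MAX_DPS} supplies")
    return installed - 1


def still_powered(installed: int, failed: int) -> bool:
    """True if the remaining supplies still cover the N the load was sized for."""
    return installed - failed >= load_bearing_supplies(installed)


for count in range(MIN_DPS, MAX_DPS + 1):
    print(f"{count} fitted: load sized for {load_bearing_supplies(count)}, "
          f"one failure ok: {still_powered(count, 1)}, "
          f"two failures ok: {still_powered(count, 2)}")
```

In other words, the bricks are never allowed to draw more than N supplies can deliver, so any single DPS can fail (or be hot-swapped) without the rack losing power.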

The following are the enhanced bricks released for the 3900 series, but which can be used with the earlier models as well (system ID and license issues notwithstanding):

Origin3900 Bricks
Cx-Brick (CPU Module): A 'super-brick' with up to 16 CPUs and 32GB RAM.

IX-Brick (Base I/O Module): 11 PCIX slots over six 64bit/133MHz PCIX buses, system disk, CDROM, etc.

PX-Brick (PCI Expansion): 12 hot-pluggable PCIX slots using 6 buses. Each bus can operate at 33 or 66 MHz and is also PCI-compatible. The brick is designed to allow every bus to operate without bandwidth contention at any time.

D-Brick2 (Disk Storage): Up to 16 FC drives, each disk can be connected to either of two separate FC loops for maximum performance, supports both JBOD and RAID configurations.

V-brick (InfinitePerformance Graphics Expansion): A 7" (4U) module with up to two independent InfinitePerformance graphics pipes, each with dual-channel capability and supporting digital video I/O. A Scalable Graphics Compositor can be used to run multiple pipes in parallel. InfinitePerformance is basically the same V12 as used in Octane2/Fuel/Tezro.

And here is a summary of the various systems:

              CPUs         Max RAM   Router-Type   Max Bandwidth
Origin300:    2 to 32      32GB      8-port        44.8GB/sec
Origin350:    2 to 32      64GB      8-port        44.8GB/sec
Origin3200:   2 to 8       16GB      None          11.2GB/sec
Origin3200C:  2 to 8       16GB      Switch (#1)   11.2GB/sec
Origin3400:   4 to 32      64GB      6-port        44.8GB/sec
Origin3800:   16 to 512    1TB       8-port        716GB/sec
Origin3900:   16 to 1024   ?TB       8-port        1432GB/sec (#2)


#1: Uses very fast, low-latency switches to connect multiple 8-CPU nodes together, to create cluster systems with hundreds or
thousands of CPUs. System management and storage is controlled via the SGI Advanced Cluster Environment and CXFS clustered filesystem.

#2: A newer high-density version of the 3800, supports up to 128 CPUs and 256GB RAM in one rack, ie. 4X better than the 3800. I'm not sure what its max RAM is though, and the bandwidth figure is just an extrapolation of the O3800 number.

SGI documents say the Origin3900 was available up to 1024 CPUs, but none of the PDF documents give specification details for this configuration. Presumably though, it would just be 2X the max RAM and bandwidth of the 512-CPU setup.
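The bandwidth column in the summary above works out to roughly the same figure per CPU for every system, which is what makes a simple linear extrapolation for the 3900 plausible. The Python sketch below shows the arithmetic; the dictionary and function names are mine, and the 1024-CPU result is only the same guess as in the table, not an SGI specification.

```python
# (max CPUs, max bandwidth in GB/sec), taken from the summary table.
SYSTEMS = {
    "Origin300": (32, 44.8),
    "Origin3200": (8, 11.2),
    "Origin3400": (32, 44.8),
    "Origin3800": (512, 716.0),
}

for name, (cpus, bandwidth) in SYSTEMS.items():
    print(f"{name}: {bandwidth / cpus:.2f}GB/sec per CPU")


def extrapolate_bandwidth(base_cpus: int, base_bandwidth: float,
                          target_cpus: int) -> float:
    """Scale bandwidth linearly with CPU count (an assumption, not a spec)."""
    return base_bandwidth * target_cpus / base_cpus


# Doubling the 512-CPU Origin3800 gives the table's Origin3900 estimate.
print(extrapolate_bandwidth(512, 716.0, 1024))  # 1432.0
```

Each system comes out at about 1.4GB/sec per CPU, so the 1432GB/sec estimate simply assumes the 3900 keeps that ratio at 1024 CPUs.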

Reference Documents

Here is a list of relevant PDF documents for the Origin3000 series:

Related Desktop Systems

As with Origin2000, SGI released desktop systems that used the Origin3000 architecture. The Fuel system is a single-CPU desktop, while the Tezro supports up to 4 CPUs. Both systems can have either V10 or V12 graphics, though Tezro supports dual-gfx pipes whereas Fuel has just one pipe. Both can use the dual-channel option card. Fuel's minimum CPU is the R14K/500 (2MB L2), maximum CPU is the R16K/900 (8MB L2), max RAM 4GB. Tezro's minimum CPU configuration is dual-R16K/700 (4MB L2), maximum configuration is quad-R16K/1GHz (16MB L2), max RAM 16GB.

Full details on Fuel and Tezro are elsewhere on my site, but one thing about Fuel is worth mentioning: hobbyists often think less of the Fuel because of its cheaper build design, the presence of internal cables, etc., more like a PC - criticising these aspects of Fuel is a mistake. Remember that Fuel is basically a single-CPU Origin3000 with graphics; as such, it's fast, easily outperforming an Octane/Octane2. The memory is 2X faster than Octane, the XIO speed is 2X faster, much better CPUs can be used, the SCSI is 4X better, and because of all this Fuel can take good advantage of fast SCSI drives much more than Octane, ie. an Octane's system disk would never give more than about 35MB/sec sustained (because of the limits of the UW bus) whereas even an R14K/500 Fuel can easily push a good modern SCSI drive over 100MB/sec (eg. Maxtor Atlas 15K II, diskperf gives up to 98MB/sec in Fuel, and my own tests with four such disks gave 280MB/sec). This means application loading times are much faster than Octane, while processing lots of small files is also better (eg. compiling code, searching contents of emails).

More importantly, the cheaper build design of Fuel served an important purpose: to offer a cheaper entry-level price to the SGI product line. Fuel's starting list price was far lower than the equivalent price for Octane when Octane first came out. Hobbyists tend to forget these points. Of course, everyone would love to have a Tezro, but they are harder to find, still highly valued and thus too expensive. If what you want is CPU power though, eg. for animation rendering, then keep an eye out for 2nd-hand Origin300 systems, or Origin3200 desksides.

Historical Perspective

The way in which SGI increased the CPU density in Origin3000 reveals something interesting about the history of the Origin product line. When Origin2000 was launched, it was designed with the expectation that SGI would be able to utilise the IA64 CPU series in the same chassis as well, but Itanium generates much more heat than an R10K MIPS CPU, so the node boards were designed to hold just 1 or 2 CPUs per board. However, IA64 was late coming out and so was never put into the O2K series. This is why Origin2000 never had more than 16 CPUs per rack. With the 3900 series, SGI could finally show just how many MIPS CPUs can be packed into the same rack space, given the lower heat output per CPU. Meanwhile, the Altix product line combined the same Origin3000 NUMAflex architecture with Itanium2 CPUs, giving similar scalability options but with improved performance. Since then, Altix has improved the Origin3000 design further, eg. with the release of NUMALink4, giving 6.4GB/sec per port for scalability and device I/O.

Other historical aspects of the Origin3000 are worth mentioning. For reasons unknown, when SGI released the 1GHz R16000 CPU with 16MB L2, it was barely mentioned. Hardly any PR was written about it and the Origin/Tezro web pages were not updated to mention it, even though it was possible to order 1GHz CPU bricks for the 3000 series and one could buy a quad-1GHz Tezro. I started trying to persuade SGI to add the info about the 1GHz CPU to the Tezro/Origin home pages, but with little success (sadly, it wasn't up to the SGI web admin, David Kascht). Time passed and about a year later I tried again, eventually with better success; after 6 months, David managed to persuade the marketing people to allow the site to be updated, but an even bigger surprise was Marketing's instruction that David could also update the Fuel page to mention the R16K/900MHz (8MB L2) CPU for Fuel! I didn't know this CPU was available for Fuel, and neither did David.

Why were these CPUs so under-marketed? I can only imagine it was because by that time SGI had fully committed itself to moving to Altix with Itanium2, and back then I guess it wasn't yet obvious that Prism was not going to be a success. The Itanium2 is certainly faster than the R16K, but there were/are still plenty of customers who would have been interested in the faster 900MHz/1GHz R16Ks, especially since the large 8MB/16MB L2 would have given significant speed benefits for certain tasks such as 2K film compositing. Tezro is still widely used for film/video tasks, especially HD, though it is losing out now to modern Linux PC systems because processing Sparks in Discreet applications is done in software. Even so, many companies tell me they continue to prefer IRIX systems for reliability reasons. A rough approximation would be that a quad-1GHz Tezro has about the same compute power as two modern dual-core Athlon CPUs. However, since licensing issues often incur far greater costs than basic hardware prices, Tezro will continue to be used in the film industry for some time. Indeed, many film companies are still using Octane systems, never mind Tezro.

Away from graphics/video issues though, the market has definitely moved on somewhat from IRIX/MIPS Origin3000 systems. Altix systems with Itanium2 are much faster, even when using emulation to run MIPS binaries. However, given the scalability of the O3K series, this does mean that - just as when older high-end SGIs entered the 2nd-hand market - it's now possible for individuals and small/medium-sized businesses to obtain really good O3K systems at very low prices, in certain cases offering performance levels far better than any PC, purely because of the number of CPUs, huge amount of RAM and the type of architecture involved (single-system image). Typically, a 64-CPU R14K/500 Origin3000 will have 64GB RAM; even SGI has been selling systems like this for less than $40K - that's about 50% cheaper than a maxed-out Dell Precision 690 with the same amount of RAM, but much faster (equivalent to 16 dual-core Athlons) and with great scope for expansion and scalability to deal with bigger data sets in the future.

Credits and Miscellaneous

My thanks to Toby Jennings of 3D System Sales for supplying details of the N-Brick and extra PDF documents. When I asked what the N-Brick is for, Toby replied:

"Space saving and biggest factor were cost. The n-brick is a simple
way of adding extra graphics pipes to a machine. Normally you'd need
to connect to either the C-brick or I brick, but if you don't need
more CPUs or I/O but do need the extra pipe this acted as an interface.
It's basically the same as the R-bricks but for graphics pipes."

More later...
