The source scene file is now available for download (1.4MB gzip file; my thanks to John Harwood for giving permission), so feel free to run your own tests and send me the results! To run the tests, there must be a subdirectory called 'pix' in the same directory that contains the scene file, 'DNA_Green_Purple_Test_Sm' (remember to gunzip the archive first). Thus, if the scene file is in /var/tmp, one would enter:
    cd /var/tmp
    mkdir pix
    raytracer DNA_Green_Purple_Test_Sm
or for systems with multiple CPUs:
The normal output from each command gives the total render time. NB: to obtain consistent and sensible times, it is a good idea to shut down all unnecessary background processes before commencing the test (mediad, httpd, etc.)
Here are the results:
           Num   --------- CPU --------  Time
System     CPUs  Type     MHz  L2/L3     (h:mm:ss)  Run with...  NOTES
Origin350    32  R16000   700  4MB       0:01:35    powertracer
Tezro         4  R16000  1000  16MB      0:02:16    powertracer
Tezro         4  R16000   700  8MB       0:02:49    powertracer  Node board came from an O3K CX-Brick, hence the 8MB L2.
Origin300     8  R14000   500  2MB       0:03:05    powertracer
Origin300     4  R14000   600  4MB       0:03:20    powertracer
Origin300     4  R14000   500  2MB       0:04:18    powertracer
Onyx2         4  R14000   500  8MB       0:04:30    powertracer
Onyx2         4  R12000   400  8MB       0:04:54    powertracer
Tezro         2  R16000   700  4MB       0:05:07    powertracer
Origin300     2  R14000   600  4MB       0:05:39    powertracer
Onyx         16  R10000   195  2MB       0:05:40    powertracer
Tezro         1  R16000  1000  16MB      0:06:03    raytracer
Fuel          1  R16000   900  8MB       0:06:27    raytracer
Onyx         12  R10000   195  2MB       0:06:43    powertracer
Onyx         20  R10000   195  1MB       0:06:57    powertracer
Onyx         16  R10000   195  1MB       0:07:09    powertracer
Octane        2  R14000   600  2MB       0:07:12    powertracer
Fuel          1  R16000   800  4MB       0:07:28    raytracer
Origin350     1  R16000   700  4MB       0:08:08    raytracer
Onyx         12  R10000   195  1MB       0:08:09    powertracer
Fuel          1  R16000   700  4MB       0:08:20    raytracer
Onyx          8  R10000   195  2MB       0:08:41    raytracer
Origin300     1  R14000   600  4MB       0:09:15    raytracer
Octane        2  R12000   400  2MB       0:09:16    powertracer
Fuel          1  R14000   600  4MB       0:09:27    raytracer
Onyx          8  R10000   195  1MB       0:09:34    powertracer
Octane        2  R12000   300  2MB       0:11:18    powertracer
Onyx          4  R10000   195  2MB       0:11:30    powertracer
Octane        1  R14000   600  2MB       0:11:34    raytracer
Octane        2  R12000   350  1MB       0:12:32    powertracer  [hinv] (CPU mod, stage 2. Overclocked from 250 to 350)
Onyx2         1  R12000   400  8MB       0:12:36    raytracer
Octane        1  R14000   550  2MB       0:13:38    raytracer
Fuel          1  R14000   500  2MB       0:13:49    raytracer
Onyx          4  R10000   195  1MB       0:14:41    powertracer
Octane        1  R12000   400  2MB       0:14:44    raytracer
Onyx2         2  R10000   195  4MB       0:15:29    powertracer
Octane        2  R12000   250  1MB       0:15:36    powertracer  (CPU mod, stage 1. Not yet overclocked)
Octane        2  R10000   250  1MB       0:15:41    powertracer
Octane        1  R12000   360  2MB       0:15:52    raytracer
Octane        2  R10000   195  1MB       0:18:37    powertracer
Octane        1  R12000   300  2MB       0:18:48    raytracer
Octane        2  R10000   175  1MB       0:20:58    powertracer
Octane        1  R10000   250  2MB       0:23:07    raytracer
Onyx          4  R4400    250  4MB       0:24:00    powertracer  [hinv]
Onyx2         1  R10000   195  4MB       0:25:57    raytracer
Octane        1  R10000   250  1MB       0:26:13    raytracer
O2            1  R12000   400  2MB       0:28:36    raytracer
Octane        1  R10000   195  1MB       0:34:52    raytracer
Octane        1  R10000   175  1MB       0:34:52    raytracer
Onyx          1  R10000   195  2MB       0:35:27    raytracer
O2            1  R7000    600  256K/1MB  0:38:49    raytracer    [hinv]
Indigo2       1  R10000   195  1MB       0:42:36    raytracer
O2            1  R12000   300  1MB       0:44:12    raytracer
Onyx          1  R10000   195  1MB       0:45:09    raytracer
O2            1  R12000   270  1MB       0:47:22    raytracer
O2            1  R10000   250  1MB       0:47:39    raytracer
O2            1  R7000    350  1MB       0:48:26    raytracer
O2            1  R10000   225  1MB       0:53:21    raytracer
O2            1  R10000   195  1MB       0:53:56    raytracer
O2            1  R5200    300  1MB       0:59:25    raytracer
O2            1  R10000   175  1MB       1:12:27    raytracer
O2            1  R5000    200  1MB       1:12:49    raytracer
O2            1  R10000   150  1MB       1:17:36    raytracer
Indigo2       1  R4400    250  2MB       1:28:00    raytracer
O2            1  R5000    180  512K      1:29:31    raytracer
Indigo2       1  R4400    200  2MB       1:38:34    raytracer
Indy          1  R5000    180  512K      1:48:43    raytracer
Indy          1  R5000    150  512K      1:48:54    raytracer
Indigo2       1  R4400    200  1MB       1:50:02    raytracer
Indy          1  R4400    200  1MB       1:50:32    raytracer
Indy          1  R4400    150  1MB       2:18:54    raytracer
O2            1  R5000    180  -         2:20:14    raytracer
Indy          1  R4600    133  512K      2:21:21    raytracer
Indy          1  R5000    150  -         2:37:42    raytracer
Indy          1  R4000    100  1MB       3:24:16    raytracer
Indy          1  R4600    133  -         3:58:26    raytracer
Indy          1  R4600    100  -         3:59:22    raytracer
This is a rather complex scene file, with results in stark contrast to the Maya render test.
Notice that at lower clock speeds (eg. 300MHz) multiple CPUs scale quite nicely in Octane for this particular Alias scene, but at higher clocks (400 and 600) the dual-CPU Octane doesn't scale so well, suggesting memory bandwidth and/or speed may be becoming a bottleneck, ie. it's likely the render is doing a lot of main memory access.
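As a rough check on the scaling remarks above, here is a minimal Python sketch that computes the dual-CPU speedup for the three 2MB-L2 Octane pairs in the table. The helper names (`to_seconds`, `dual_speedup`) are my own; note also that the single-CPU times were run with raytracer and the dual-CPU times with powertracer, so the ratios include any powertracer overhead.

```python
def to_seconds(hms):
    """Convert an 'h:mm:ss' string from the results table to seconds."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s

# Octane results with 2MB L2, copied from the table:
# clock (MHz) -> (1-CPU raytracer time, 2-CPU powertracer time)
OCTANE_2MB = {
    300: ("0:18:48", "0:11:18"),  # R12000
    400: ("0:14:44", "0:09:16"),  # R12000
    600: ("0:11:34", "0:07:12"),  # R14000
}

def dual_speedup(mhz):
    """Ratio of single-CPU time to dual-CPU time at a given clock."""
    single, dual = OCTANE_2MB[mhz]
    return to_seconds(single) / to_seconds(dual)

for mhz in sorted(OCTANE_2MB):
    s = dual_speedup(mhz)
    print(f"{mhz}MHz: dual is {s:.2f}x the single-CPU speed "
          f"({s / 2:.0%} of an ideal 2x)")
```

The differences between the three clock speeds are only a few percent, so they are best read as a hint rather than proof of a memory bottleneck.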
Also, the table shows a Fuel at 600MHz completes the render in 18% less time than an Octane with the same speed CPU (9:27 vs. 11:34). Based on the Octane R10K/250 results for 1MB vs. 2MB L2, the data suggests that roughly one third of the speedup is due to the Fuel's faster memory, while two thirds is due to the Fuel's larger 4MB L2. Comparing Onyx R10K/195 with 1MB vs. 2MB L2 supports this. So, faster memory definitely helps, but for this particular test a somewhat larger L2 is twice as useful, which bodes well for Origin3K systems that have 8MB L2 per CPU. Indeed, despite having the same older O2K architecture as Octane, the quad-400 Onyx2 does reasonably well, most likely because of its much larger 8MB L2. The data still shows that this test benefits from faster RAM access, though; this is why, despite a much smaller L2, the quad-500MHz (2MB) Origin300 is faster than the quad-400 Onyx2.
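The arithmetic behind the L2 vs. memory split can be sketched in Python as follows. This is only an illustration using times from the table; the function names are hypothetical, and it assumes (as the reasoning above does) that the gain seen from 1MB to 2MB L2 on an R10K/250 Octane is a fair stand-in for the gain from 2MB to 4MB at 600MHz.

```python
def to_seconds(hms):
    """Convert an 'h:mm:ss' string from the results table to seconds."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s

def time_saved(slow, fast):
    """Fraction of render time saved going from `slow` to `fast`."""
    return 1 - to_seconds(fast) / to_seconds(slow)

# Octane R14000/600 (2MB L2) vs. Fuel R14000/600 (4MB L2)
total = time_saved("0:11:34", "0:09:27")

# Octane R10000/250 with 1MB L2 vs. 2MB L2 (same memory system,
# so the difference is attributed to the cache alone)
l2_only = time_saved("0:26:13", "0:23:07")

# Assumption: the cache effect carries over unchanged to the Fuel case.
l2_share = l2_only / total

print(f"Fuel saves {total:.1%} of the Octane's render time")
print(f"Doubling L2 on the Octane saves {l2_only:.1%}")
print(f"Estimated share due to L2: {l2_share:.0%}, "
      f"remainder due to memory: {1 - l2_share:.0%}")
```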
Especially interesting is that a dual-600 Octane is not that much faster than a 700MHz Fuel (7:12 vs. 8:20), and as expected a Fuel/900 is faster than a dual-600 Octane. I had expected a Fuel/800MHz to beat a dual-600 Octane as well, but the table shows it falls just short (7:28 vs. 7:12). A dual-600 Origin300 is also much quicker than a dual-600 Octane (again, this is due to the larger L2 and faster RAM in the O300).
Meanwhile, a quad-600 O300 does offer a good speedup over a dual-600 O300, though the performance improvement is starting to tail off slightly; if the increase were linear, the quad-600 O300 would take 2 mins 50 secs, when in fact it takes 3 mins 20 secs, but that's still almost 70% faster for 2X more CPUs. powertracer can use any number of CPUs, but for rendering a single image there is definitely a degree of diminishing returns (see below for more on this re the Onyx rack results).
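The O300 figures quoted above can be reproduced with a few lines of Python. This is just a sketch of the arithmetic; the variable names are my own.

```python
def to_seconds(mmss):
    """Convert an 'm:ss' string to seconds."""
    m, s = (int(x) for x in mmss.split(":"))
    return m * 60 + s

dual_600 = to_seconds("5:39")   # Origin300, 2 x R14000/600, powertracer
quad_600 = to_seconds("3:20")   # Origin300, 4 x R14000/600, powertracer

# With perfectly linear scaling, doubling the CPUs would halve the time.
ideal_quad = dual_600 / 2

# Actual speedup from doubling the CPU count, and how close it is to 2x.
speedup = dual_600 / quad_600
efficiency = speedup / 2

print(f"ideal quad time: {ideal_quad:.1f}s, actual: {quad_600}s")
print(f"speedup: {speedup:.3f}x ({efficiency:.0%} of linear)")
```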
With respect to O2, just like other complex fp tests I have done, it is clear O2 is not a good solution for this sort of task. Even the most expensive R12K/400 O2 is barely half the speed of an Octane with the same CPU (which is much cheaper) and can't even beat an R10K/250 Octane. O2 has a much slower conventional CPU/RAM link compared to Octane, with higher latency as well; O2's strengths lie with tasks involving video and complex 3D, video as texture, MJPEG processing, real-time and volumetric imaging, etc.
The Onyx results are most intriguing. Having more CPUs does speed things up, but the gains quickly tail off beyond 8 CPUs and eventually max out with 20 CPUs (the result for 24 CPUs is slower, 7 mins 5 secs, so the overhead processing presumably outweighs the benefits). The results for the R10Ks with 1MB L2 imply that such a system would be better exploited by rendering frames using separate groups of 4 or 8 CPUs; indeed, even using 4 CPUs is about 3 minutes slower than a 4X linear increase over 1 CPU. Thus, overall, if one is rendering lots of frames, it's best to render on a system like this using raytracer with 1 frame per CPU (and to some extent this is true of all the multi-CPU systems), but if one is rendering a single frame (eg. a large advert poster) then obviously using multiple CPUs with powertracer is very useful indeed.
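To put numbers on the Onyx scaling, the following Python sketch tabulates speedup and frames per hour for the R10K/195 1MB results. The names are hypothetical, and the "one frame per CPU" throughput assumes that independent raytracer processes do not slow each other down, which is an optimistic upper bound rather than something measured here.

```python
def to_seconds(hms):
    """Convert an 'h:mm:ss' string from the results table to seconds."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s

# Onyx R10000/195 with 1MB L2: CPUs -> time for one frame.
# The 24-CPU figure (7 mins 5 secs) is the one quoted in the text.
ONYX_1MB = {1: "0:45:09", 4: "0:14:41", 8: "0:09:34", 12: "0:08:09",
            16: "0:07:09", 20: "0:06:57", 24: "0:07:05"}
SECS = {n: to_seconds(t) for n, t in ONYX_1MB.items()}

def speedup(n):
    """Speedup of an n-CPU single-frame render over 1 CPU."""
    return SECS[1] / SECS[n]

def frames_per_hour_single_image(n):
    """Throughput when all n CPUs work on one frame at a time."""
    return 3600 / SECS[n]

def frames_per_hour_one_per_cpu(n):
    """Throughput with one raytracer job per CPU (assumes no contention)."""
    return n * 3600 / SECS[1]

fastest = min(SECS, key=SECS.get)           # CPU count with the lowest time
shortfall_4 = SECS[4] - SECS[1] / 4         # seconds lost vs. linear 4x

for n in sorted(SECS):
    print(f"{n:2d} CPUs: speedup {speedup(n):4.2f}x, "
          f"{frames_per_hour_single_image(n):5.2f} vs. "
          f"{frames_per_hour_one_per_cpu(n):5.2f} frames/hour")
```

Under that assumption, one job per CPU gives the higher frame throughput at every CPU count above 1, which is the point made above about rendering lots of frames.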
One final example of the difference L2 makes: R4600PC/133 Indy is painfully slow, but with an added 512K L2 the speedup is phenomenal - almost as fast as an R5000PC/180 O2. At one point it was expected SGI would release an R4600 at 250MHz or higher, which would have been quite good with a 1MB or 2MB L2. In the end though, SGI switched to the R5000 instead for low-end desktops, though it's strange that the R4600SC/133 was also available for Indigo2.
Using powertracer may involve some overhead compared to raytracer, so here's a test to check this, running raytracer vs. powertracer for the same system/CPU:
          Num   ------- CPU -------   Time
System    CPUs  Type    Speed   L2    (mm:ss)  Run with...
Octane    1     R14000  550MHz  2MB   13:38    raytracer V11
Octane    1     R14000  550MHz  2MB   14:31    powertracer V11
Thus, the powertracer overhead is about 6%, which should be taken into account when comparing raytracer vs. powertracer results, though it's not a huge issue most of the time. Other renderers may be more or less efficient.
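For anyone wanting to apply this correction to their own comparisons, here is a small Python sketch. The `without_overhead` helper is hypothetical, and it assumes the relative overhead measured on the single R14000/550 Octane applies equally to other systems and CPU counts, which has not been tested.

```python
def to_seconds(mmss):
    """Convert an 'mm:ss' string to seconds."""
    m, s = (int(x) for x in mmss.split(":"))
    return m * 60 + s

raytracer_time = to_seconds("13:38")     # Octane R14000/550, raytracer
powertracer_time = to_seconds("14:31")   # same system, powertracer

# Relative extra time taken by powertracer on a single CPU.
overhead = powertracer_time / raytracer_time - 1

def without_overhead(powertracer_seconds, overhead=overhead):
    """First-order estimate of a powertracer time with the overhead removed.

    Assumption: the same relative overhead applies to the system in question.
    """
    return powertracer_seconds / (1 + overhead)

print(f"powertracer overhead: {overhead:.1%}")
print(f"dual-600 Octane, 7:12 measured, roughly "
      f"{without_overhead(to_seconds('7:12')):.0f}s after correction")
```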
One other factor which might affect the results is to what extent powertracer is even using multiple CPUs at each stage of the render. I tested this and found that, apart from a short period at the beginning and end of each test (just a few seconds), both CPUs are used just fine.
Lastly, some unanswered questions:
Feedback is most welcome! :)
Dual-R12K/360, Dual-R12K/270, Dual-R10K/225, Single-R12K/270, Single-R10K/225
Fuels not yet tested:
O2s not yet tested:
Indigo2s not yet tested:
R10K/175, R4K/175 (1MB), R4K/150 (1MB), R4K/100 (1MB), R4K/100, R4600SC/133 (512K), R8000/75 (2MB)
Indys not yet tested: