Moana data set used as benchmark
Recently I had an opportunity to get my hands on a Threadripper 3970X and test it for a few days. This CPU is a monster with brilliant single-core and multi-core performance. I benchmarked it with different CPU modes and various fan configurations. The main goals were to test system stability, compare speed improvements, and record maximum temperature.
Stability itself can be tested with Prime95. For performance tests, however, I needed something that always performs the same amount of computation, so that run times are comparable. I quickly realized that monitoring temperature was going to be a problem, since I wasn’t aware of any benchmark that would keep this CPU under full load long enough to reach its peak temperature. Cinebench R20 multi-core finishes on the Threadripper 3970X in about 23 seconds, which is just too fast for a CPU cooling test.
One could of course install Houdini and go all crazy with benchmarking. In this particular case that wasn’t desirable, however: the test had to be easily portable (without being locked to a specific Houdini version). This was a perfect opportunity to test the huge Moana Island Scene with pbrt-v3. This article describes the few simple steps needed to get this “benchmark” up and running on Windows 10.
- Download the BASE and PBRT packages from the link above, decompress them, and merge them into a single island directory.
- Compile pbrt-v3. This part is pretty straightforward and is also covered in this video by Stig Atle Steffensen.
- At this point you should be able to render the Moana island scene with pbrt. However, there seems to be a problem with higher thread counts (the Threadripper 3970X with SMT enabled has 64 threads). Partway through scene parsing I started to get a bunch of "Too many open files" errors. A similar problem is described here. The scene itself somehow starts to render, but no output is produced due to a "failed to open file for writing" error.
- Edit source code and recompile. One way to overcome this limitation is to raise the maximum number of simultaneously open files at the stream I/O level. On Windows this can be done by calling _setmaxstdio(), a Microsoft C runtime function declared in <stdio.h>.
By default, up to 512 files can be open simultaneously at the stream I/O level. C run-time I/O now supports up to 8,192 files open simultaneously.
Even though I have fairly limited C++ knowledge, I tried adding _setmaxstdio(8192) to pbrt.cpp and was surprised to see it compile. (I am pretty sure the developers of pbrt-v3 would complain about my _setmaxstdio placement, but it is just fine in this case.) And it didn’t just compile, it also worked as expected! No errors were raised during the render and the output image was produced correctly.
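For illustration, the change could look roughly like the sketch below. The helper name RaiseStdioLimit is hypothetical and not part of pbrt-v3, and the exact placement inside pbrt.cpp is an assumption; the only real API used is _setmaxstdio from the Microsoft C runtime. The #ifdef keeps the file compiling on other platforms, where the call does not exist.

```cpp
#include <cstdio>

// Hypothetical helper (not part of pbrt-v3): raise the C runtime's limit on
// simultaneously open stdio streams.
// Returns the limit now in effect, or -1 if the request was rejected.
int RaiseStdioLimit(int requested) {
#ifdef _WIN32
    // _setmaxstdio is a Microsoft CRT extension declared in <stdio.h>.
    // It returns the new maximum on success and -1 on failure.
    return _setmaxstdio(requested);
#else
    // Other platforms have no _setmaxstdio; their open-file limit is
    // usually adjusted outside the program, so this is a no-op here.
    return requested;
#endif
}

// Assumed usage: call once near the top of main() in pbrt.cpp, before any
// scene file is opened, for example:
//
//     if (RaiseStdioLimit(8192) == -1)
//         std::fprintf(stderr, "Could not raise the open-file limit\n");
```

Checking the return value is optional, but it makes a rejected request visible instead of letting the "Too many open files" errors reappear later during parsing.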
- Render scene. After you have compiled the new version of pbrt-v3, you should be able to render the Moana data set on CPUs with high thread counts as well (such as Intel Xeon, AMD Epyc or Threadripper). Just open PowerShell and start the render by passing island.pbrt to the pbrt executable.
- Customize the render for your needs. This involves editing the island.pbrt file.
- You may want to increase render time to see peak CPU temperature or to better test system stability. Just increase the resolution or sample count and you are good to go. (The default camera at 2K resolution with 1024 samples renders for about 1 hour on the Threadripper 3970X.)
- If the scene doesn’t fit into your memory, try commenting out some parts of the scene to lower its footprint. (Please note that the full scene can use up to 107 GB of RAM.)
- Do you want to test single-core performance? No problem, just add the --nthreads <num> flag to your command.
- I haven’t tried it, but you might want to use pbrt-parser to minimize the time spent on scene parsing.
- Monitor your system with HWiNFO. HWiNFO is an amazing, free hardware monitoring application. You can use it to record logs and then inspect them using Generic Log Viewer.
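If you only need the peak value from a recorded log, a few lines of code can stand in for a viewer. The sketch below assumes the log is a comma-separated file with a header row naming each sensor. The function names are hypothetical, the column name you pass in must match whatever your log actually contains, and quoted fields containing commas are not handled.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Split one CSV line on commas, dropping a trailing '\r' from Windows line
// endings. Simplification: quoted fields containing commas are not handled.
std::vector<std::string> SplitCsvLine(std::string line) {
    if (!line.empty() && line.back() == '\r') line.pop_back();
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ',')) fields.push_back(field);
    return fields;
}

// Find the largest numeric value in the named column.
// Returns false if the column is missing or contains no numbers.
bool PeakOfColumn(std::istream& log, const std::string& column, double& peak) {
    std::string line;
    if (!std::getline(log, line)) return false;
    const std::vector<std::string> header = SplitCsvLine(line);
    std::size_t index = header.size();
    for (std::size_t i = 0; i < header.size(); ++i) {
        if (header[i] == column) { index = i; break; }
    }
    if (index == header.size()) return false;

    bool found = false;
    while (std::getline(log, line)) {
        const std::vector<std::string> fields = SplitCsvLine(line);
        if (index >= fields.size()) continue;
        try {
            const double value = std::stod(fields[index]);
            if (!found || value > peak) { peak = value; found = true; }
        } catch (const std::exception&) {
            // Non-numeric cell (for example a repeated header row): skip it.
        }
    }
    return found;
}
```

To use it, open the log with std::ifstream and pass the stream together with the header text of the temperature column you want to inspect.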
I hope this was helpful and that you will enjoy benchmarking with this beautiful production scene. If you want to run it on another computer, it is now just a matter of copying the data set and binaries.
If you are interested in how Threadripper 3970X performed in my tests, here are some quick thoughts.
I was using an Asus ROG Zenith II Extreme motherboard and a Noctua NH-U14S TR4-SP3 cooler. They are both excellent in my opinion. I ran tests with PBO (Precision Boost Overdrive) both disabled and enabled and was satisfied to see how well (and how stably) PBO performed. For more information on PBO take a look at this article by AMD.
PBO’s best use case is a short multi-core load (such as the Cinebench R20 multi-core benchmark), where it can provide a significant performance boost. During longer renders (such as the Moana data set) it provides less of a gain, since temperature won’t allow for much higher clock speeds. Nevertheless, even during long periods under full load PBO performs slightly better (the CPU cooling setup is a key factor here). When rendering the Moana data set with PBO disabled, CPU temperature was about 70°C and CPU power consumption about 280 W. With PBO enabled, the values rose to 80°C and 360 W. During Cinebench R20, CPU power consumption (with PBO enabled) could easily exceed 400 W.
Auto OC, on the other hand, targets short single-core boosts, as it can automatically overclock the CPU by 200 MHz. However, I didn’t notice any clock or speed increase when using Auto OC. I guess my CPU temperature just doesn’t allow for any additional clock increase. The experience might be totally different with some exotic water cooling setup.
When it comes to the Noctua NH-U14S TR4-SP3, two Noctua NF-A15 PWM fans produced the best results. The fan that comes with the cooler has a maximum of 1500 rpm. The second one, however, should be the 1200 rpm variant for best acoustics. You might be tempted to buy the Noctua NF-A15 HS-PWM chromax.black.swap instead in order to get 1500 rpm on the second fan as well, but I strongly advise against that. Vibrations produced by two 1500 rpm fans are quite loud. Thick rubber pads used in conjunction with a second 1200 rpm fan eliminate the noise caused by vibrations. The temperature difference between the 1500 rpm and 1200 rpm variants (used as the second fan) is negligible. In case you can’t use thick rubber pads (because of RAM slots), you could also use a Noctua NF-A12x25 PWM as the second fan. It doesn’t produce any unwanted vibrations while providing the same cooling performance as the 1200 rpm Noctua NF-A15 PWM.
| Threadripper 3970X | No PBO (1500 rpm A15) | PBO (1500 rpm A15) | PBO (1500 rpm and 1200 rpm A15) |
|---|---|---|---|
| Cinebench R20 (multi-core score) | 17,053 | 18,000 | |
| Cinebench R20 (single-core score) | 518 | 518 | |
| Moana “benchmark” (time) | 3,512 s | 3,460 s | 3,452 s |
| Superposition 720p low (score) | 24,320 | 24,394 | |
| Superposition 1080p extreme (score) | 6,950 | 6,950 | |

All results above are averages over multiple runs.