I posted this on the software forum, but since it is directly relevant to the hardware topics touched on in this thread, I'll post the references here as well. It's pretty interesting that Linpack seems to be used for benchmarking. This evolution might very quickly progress into something that was totally unthinkable a short while ago. Remember the 66 MHz machines of 1994? Nitro, keep going, the wind may blow in your direction.

http://www.internetnews.com/ent-news/article.php/3414721

Blue Gene/L runs Linux. Read more about Blue Gene/L at:

http://news.com.com/IBM+details+Blue+Gene+supercomputer/2100-1008_3-1000421.html?tag=st.rc.targ_mb

Be good, nononsense
64-bit Windows running on Quad Opteron:
======
Report file for timing the various timers.
*** Key number is the avg time. The smaller this number, the faster the timer.

QueryPerformanceFrequency() freq = 0 1804000000

method 0: QueryPerfCntr..()   100 times  tot: 0 37602    avg: 376.020000  avg time: 2.08437e-007
method 0: QueryPerfCntr..()   500 times  tot: 0 184766   avg: 369.532000  avg time: 2.0484e-007
method 0: QueryPerfCntr..()  1000 times  tot: 0 368402   avg: 368.402000  avg time: 2.04214e-007
method 0: QueryPerfCntr..() 10000 times  tot: 0 3680579  avg: 368.057900  avg time: 2.04023e-007

method 1: GetTickCount()   100 times  tot: 0 2276    avg: 22.760000  avg time: 1.26164e-008
method 1: GetTickCount()   500 times  tot: 0 7209    avg: 14.418000  avg time: 7.99224e-009
method 1: GetTickCount()  1000 times  tot: 0 13423   avg: 13.423000  avg time: 7.44069e-009
method 1: GetTickCount() 10000 times  tot: 0 130595  avg: 13.059500  avg time: 7.23919e-009

method 2: TimeGetTime()   100 times  tot: 0 13170    avg: 131.700000  avg time: 7.30044e-008
method 2: TimeGetTime()   500 times  tot: 0 53613    avg: 107.226000  avg time: 5.94379e-008
method 2: TimeGetTime()  1000 times  tot: 0 106387   avg: 106.387000  avg time: 5.89728e-008
method 2: TimeGetTime() 10000 times  tot: 0 1060592  avg: 106.059200  avg time: 5.87911e-008

method 3: Pentium internal high-freq cntr()   100 times  tot: 0 1223   avg: 12.230000  avg time: 6.77938e-009
method 3: Pentium internal high-freq cntr()   500 times  tot: 0 4580   avg: 9.160000   avg time: 5.07761e-009
method 3: Pentium internal high-freq cntr()  1000 times  tot: 0 8026   avg: 8.026000   avg time: 4.449e-009
method 3: Pentium internal high-freq cntr() 10000 times  tot: 0 78559  avg: 7.855900   avg time: 4.35471e-009
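For anyone who wants to reproduce numbers like these, here is a minimal sketch of how per-call timer overhead can be measured. This is my own reconstruction, not the harness that produced the report above; the file name and build line are assumptions. It covers method 0 (QueryPerformanceCounter) and method 3 (the CPU's time-stamp counter):

    /* timer_overhead.c -- minimal sketch (my own reconstruction, not
     * the harness that produced the report above) of measuring
     * per-call timer overhead.  Covers method 0 (QueryPerformanceCounter)
     * and method 3 (the CPU time-stamp counter).  Build line is an
     * assumption:  cl /O2 timer_overhead.c
     */
    #include <windows.h>
    #include <intrin.h>   /* __rdtsc() */
    #include <stdio.h>

    int main(void)
    {
        LARGE_INTEGER freq, t0, t1, dummy;
        unsigned __int64 c0, c1;
        const int N = 10000;
        int i;

        QueryPerformanceFrequency(&freq);
        printf("QueryPerformanceFrequency: %I64d Hz\n", freq.QuadPart);

        /* Method 0: average cost of one QueryPerformanceCounter() call. */
        QueryPerformanceCounter(&t0);
        for (i = 0; i < N; i++)
            QueryPerformanceCounter(&dummy);
        QueryPerformanceCounter(&t1);
        printf("QueryPerformanceCounter: %.6e s per call\n",
               (double)(t1.QuadPart - t0.QuadPart) / (double)freq.QuadPart / N);

        /* Method 3: average cost of one RDTSC read, in CPU cycles. */
        c0 = __rdtsc();
        for (i = 0; i < N; i++)
            (void)__rdtsc();
        c1 = __rdtsc();
        printf("RDTSC: %.1f cycles per call\n", (double)(c1 - c0) / N);

        return 0;
    }

Note that dividing the report's "avg" column by the 1.804 GHz frequency reproduces the "avg time" column, so on this box the raw counts are effectively CPU cycles.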
I've attached my timer report. Your quad system probably gets lower overheads due to its higher clock speed. My CPUs run at 1.6 GHz. What CPUs are you using? 850s?

Yesterday I added 1 GB, 2x512MB, to my existing 4x256MB (two per CPU). First I moved 2x256MB from CPU1 to CPU0, then installed the 2x512MB on CPU1. Immediately I started getting a blue screen whenever I ran Matlab with CPU0 affinity for more than a minute or two. Running Matlab with CPU1 affinity caused no problem, which is strange, because the new DIMMs were installed on CPU1.

I had been getting "machine check" warnings ever since I got the machine, but never any blue screens or crashes. This is the error:

Event Type: Warning
Event Source: WMIxWDM
Event Category: None
Event ID: 106

I should have investigated these. I suspected some of the DIMMs were bad, so I removed the 2x256MB in CPU0 banks 2 and 3, and the blue screens and machine check errors were eliminated. One of the DIMMs had oxidation spots on the contacts and heat spreader. I believe some thermal paste dripped from CPU1 onto this DIMM; I did not assemble this board. I'll get the vendor to replace the DIMMs. It's interesting that this DIMM worked for two months on CPU1, under very heavy use, without ever crashing the system until I moved it to CPU0.
The very first number, QueryPerformanceFrequency() freq = 0 1804000000, tells you the frequency of the CPU: 1.8 GHz. So these are 844s. The 8xx-series Opterons support 4-way and larger configurations and have an extra HyperTransport link over regular Opterons. I am not sure any of that makes a difference in this case; it is probably one case where CPU clock speed is all that matters. My goal was to get the most bang for the buck now, and then spend on dual cores when they come out to get, in effect, an 8-way system.

nitro
My $60 Celeron does 2.4 BILLION operations a second (actually more, because it can do some operations in parallel)
Ok, just finished colocating the Tyan. It was an exercise in patience. I had not opened the package that had my rack rails in it, so I only discovered at the colocation center that Tyan had included a 1U rack kit instead of a 2U kit. In theory this does not matter, but apparently it does: the included screws did not fit my case, so I had to go running all over the place looking for screws that fit. I finally found some at a local Ace Hardware. I had to be careful, because I could not find screws shorter than 3/8 inch in the required width, and I needed to make sure I was not going to run a screw into anything inside the case.

All went well except for one of the rear holes on the rack. For some unknown reason, we could not get this screw in; probably the hole was simply too small, and it kept stripping the screws. Fortunately, we had attached enough of the rail screws to the case that not fastening this one didn't matter much. Although I am very satisfied with the Tyan, they can be sloppy with their QA and with the case in general.

nitro
Speaking of QA and attention to detail, all motherboard manufacturers, Tyan included, need to document BIOS settings much better. The manual that came with my Tyan K8W MB was a whole 63 pages long, and only two pages were devoted to node/bank interleaving and ECC settings, neither of which is explained in any depth. They need to devote a whole page to each BIOS setting, especially the memory and HyperTransport options, describing the performance, stability, and compatibility issues for each one. Otherwise we are stuck wasting our time benchmarking these settings, and testing for stability differences can take a long time.

Nitro, do you remember your MB's memory configuration: the bank interleave and node interleave settings? The choices for both are Disabled or Auto. I have bank interleave set to Auto and node interleave set to Disabled. Node interleave interleaves memory addresses between the CPUs, defeating NUMA; I get better performance with it disabled (NUMA enabled). Tyan's FAQ at http://www.tyan.com/support/html/f_s2885.html alludes to these settings but never actually documents them, and apart from that I can't find any better explanation of them. Have you seen any?
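One way to sanity-check the node interleave setting from software, rather than inferring it from benchmarks, is to ask Windows how many NUMA nodes it sees: with node interleave disabled, a two-CPU Opteron board with memory on both CPUs should show up as two nodes; with it enabled, as one. Here is a minimal sketch using the Win32 NUMA APIs; the file name, build line, and error handling are my own assumptions:

    /* numa_check.c -- minimal sketch: report the NUMA nodes Windows
     * sees and the memory available on each.  Needs Windows Server
     * 2003 / XP x64 or later.  Build line is an assumption:
     *   cl /O2 numa_check.c
     */
    #define _WIN32_WINNT 0x0501   /* expose the NUMA APIs in older SDKs */
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        ULONG highest, node;
        ULONGLONG bytes;

        if (!GetNumaHighestNodeNumber(&highest)) {
            fprintf(stderr, "NUMA query failed, error %lu\n", GetLastError());
            return 1;
        }
        /* With node interleave disabled, a two-CPU Opteron board with
         * memory on both CPUs should report nodes 0 and 1. */
        printf("Highest NUMA node number: %lu\n", highest);

        for (node = 0; node <= highest; node++) {
            bytes = 0;
            if (GetNumaAvailableMemoryNode((UCHAR)node, &bytes))
                printf("  node %lu: %.0f MB available\n",
                       node, bytes / (1024.0 * 1024.0));
        }
        return 0;
    }

If this reports a single node even with node interleave disabled, the OS probably isn't running in a NUMA-aware configuration, which ties into the PAE point below.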
All of my settings are the default settings. Using 64-bit operating systems, I get tremendous throughput compared to other machines in its class, but if I run 32-bit Windows, this machine is average by today's standards. As I mentioned earlier, you not only have to have NUMA turned on in the BIOS, along with the correct memory interleaving, but if you are running Windows you also have to turn on PAE (Physical Address Extension).

nitro
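For what it's worth, whether PAE is actually in effect can be checked from a running program via IsProcessorFeaturePresent(). A minimal sketch (mine, not from the thread; file name and build line are assumptions):

    /* pae_check.c -- minimal sketch: ask Windows whether PAE is in
     * effect.  Build:  cl /O2 pae_check.c
     */
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        /* PF_PAE_ENABLED reports whether the OS is running with the
         * processor's Physical Address Extension enabled. */
        if (IsProcessorFeaturePresent(PF_PAE_ENABLED))
            printf("PAE is enabled.\n");
        else
            printf("PAE is NOT enabled.\n");
        return 0;
    }

On 32-bit Windows, PAE is what lets the kernel address physical memory above 4 GB (typically via the /PAE boot switch); the 64-bit editions always run with it enabled.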
I am installing the Gentoo Linux distro onto an old machine now, in order to gain experience installing and optimizing it before I put it on the Tyan. I can tell you that installing Gentoo is definitely not for beginners. It's relatively easy for me, but man, this had better be WAAAAY superior to my old friend FreeBSD, considering all this effort. The reason for Gentoo is to see if I can get every ounce of performance out of the machine; the reason for Linux in general is that I cannot wait around for a retail 64-bit version of Windows. I am going to experiment with several Linux distros to see which one gives me the best SMP performance. Really, what I need is the best I/O (networking) performance and the best kernel. I have been away from *nix systems for so long that I feel a little out of touch.

nitro