09-16-12 03:56 PM
Quote from pfranz:
I read this long thread quickly, so I don't know whether I'm repeating points someone else has already made.
When programming complex systems like today's computers, you never know how a particular approach will perform until you test it. And testing often shows that approaches that look cumbersome produce the fastest results.
This matters far more than the language you are using. So, though I believe that C# wastes machine resources, it can have parts that are well optimized and outperform standard C++.
Let me give you an example. Some years ago I read a study from AMD where they tried to do a simple task, copying memory, as fast as possible. They were just using Assembly.
They began with a standard REP MOVSD and compared the results with the memory bandwidth: the speed was much lower, so they went on experimenting.
Any x86 programmer knows that REP MOVSD sucks a lot and is there for compatibility only. So he would think that using MOV, some jump instructions, and a bit of loop unrolling, interleaving instructions so that the superscalar pipe doesn't stall, would solve the problem.
That's what the study tried. It improved the results, but they remained far from the bandwidth limit.
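For anyone who hasn't seen the technique, here is a rough C sketch of the "MOV plus loop unrolling" idea. It is not the code from the AMD study: the function name copy_words_unrolled and the unroll factor of four are made up for illustration, and an optimizing compiler may well rearrange this loop on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy n_words 32-bit words, four per iteration. */
void copy_words_unrolled(uint32_t *dst, const uint32_t *src, size_t n_words)
{
    assert(n_words == 0 || (dst != NULL && src != NULL));

    size_t i = 0;

    /* Unrolled main loop: the four loads are issued before the four stores,
       so the independent operations have a chance to overlap. */
    for (; i + 4 <= n_words; i += 4) {
        uint32_t a = src[i];
        uint32_t b = src[i + 1];
        uint32_t c = src[i + 2];
        uint32_t d = src[i + 3];
        dst[i]     = a;
        dst[i + 1] = b;
        dst[i + 2] = c;
        dst[i + 3] = d;
    }

    /* Tail loop: copy the 0 to 3 words left over. */
    for (; i < n_words; i++) {
        dst[i] = src[i];
    }
}
```

Whether this beats a plain loop or memcpy depends on the compiler and the CPU, which is exactly the point about testing.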
To cut a long story short, they ended up examining the cache structure, and adding a loop - before the actual copy - which read a word from each cache line IN REVERSE ORDER to fill up the cache, using some specific AMD instructions (MFENCE, I believe).
This complicated program would nearly reach memory bandwidth.
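A minimal C sketch of that two-pass idea, as I understand it from the description above: first touch every cache line of a block in reverse order, then do the actual copy. The 64-byte line size, the 4 KiB block size, and the name block_prefetch_copy are all assumptions for illustration. It uses plain volatile reads of one byte per line rather than a word or any AMD-specific instructions, and memcpy stands in for the hand-written copy loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_LINE  64u    /* assumed cache-line size in bytes */
#define BLOCK_BYTES 4096u  /* assumed block size; tune by measurement */

/* Hypothetical helper: copy n bytes block by block, pre-reading each block. */
void *block_prefetch_copy(void *dst, const void *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));

    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t done = 0;

    while (done < n) {
        size_t len = n - done;
        if (len > BLOCK_BYTES) {
            len = BLOCK_BYTES;
        }

        /* Pass 1: read one byte per cache line, last line first.
           The volatile qualifier keeps the compiler from dropping the reads. */
        const volatile unsigned char *p = s + done;
        (void)p[len - 1];  /* covers a trailing partial line if src is unaligned */
        size_t off = ((len - 1) / CACHE_LINE) * CACHE_LINE;
        for (;;) {
            (void)p[off];
            if (off == 0) {
                break;
            }
            off -= CACHE_LINE;
        }

        /* Pass 2: the actual copy, which should now mostly hit the cache. */
        memcpy(d + done, s + done, len);
        done += len;
    }
    return dst;
}
```

On current hardware with good hardware prefetchers this may gain nothing or even lose, so treat it as an illustration of the idea rather than a recommendation.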
Had someone written all that in C (or maybe even C#), it would have outperformed a straightforward ASM loop with MOV instructions, loop unrolling, and instruction interleaving.
Yet I still use ASM in my Visual Basic 6 software and get speed improvements, which are quite useful for reducing order submission latency and for handling data for many symbols in fast markets, even on old hardware.
It's good to know someone out there is using ASM.
As for your example, I agree. We (or actually I) sometimes completely forget that the people who build compilers have entire teams dedicated to each of these simple functions, memory copying being one of them. A lot of these standard tasks are handled superbly by C++ and are often faster than straightforward ASM code. But certain loops and complex conditionals get mangled and slowed down by C++ compilers, so that's the only case where I'd say ASM can be useful, and only if it's critical, i.e. your latency issue is a matter of nanoseconds because of that one little part. So in most cases, the whole practice of writing ASM ends up being a fun experiment more than something truly useful.
In your case however, coding in anything other than VB6 would be faster (except in a few special cases). ;)