Oh, but I'm pretty sure you will. That's how many ASM coders get their jobs. Compilers "try" to do it right, and there are many good ones, but better parallelization is always done by a human. There are just too many variations on any number of tasks for a compiler to take them all into account. Try decompiling some code you think should be faster than hand-written ASM and looking at the disassembly in OllyDbg. It's machine language, and a human doing the compiling (by writing machine language directly) does a better job because of the nature and range of opcodes available. A compiler isn't creative and won't get ideas along the way; a human will, or at least should.
Of course you can find use cases for assembly in offline research, but you didn't specify the criteria, and my response reflected my personal perspective only; it was a mistake for me to imply I was answering on behalf of everyone else. Most use cases I'm aware of, though, do not justify the use of assembly for offline data mining. And yes, I have done a fair amount of professional data mining and never needed assembly to achieve it. That doesn't mean your experience can't be different. These days hardware is very cheap. If the cost of the hardware is what is driving you to use assembly, then so be it, congratulations. If you think you can be as productive writing assembly as writing C, more power to you - I have no idea why C and other higher-level languages were created. But seriously, why bother with assembly when you can buy hardware that does the job even faster for you? In the cases I have experience with, renting commodity hardware in the cloud has been more than sufficient. I see what the real purpose of the thread is now - for you to tell everyone about your superior assembly programming skills. Congratulations on that! You should be able to find failing strategies twice as fast as those hedge funds you're talking about.
We use inline assembly and GPGPU programming where appropriate. Taking it a step further, our friends at www.tradingsystemlab.com run machine code with genetic programming. Their trading system, designed in machine code, holds the #1 Futures Truth position at 276.55%.
Right... he is clearly asking a question he already knows the answer to (or should know, at least). I probably know less about programming than anyone in this thread, but I know that it's impractical and that many don't code in ASM for the reasons already stated. It's very good to know ASM, but the practicality isn't there, as the community has obviously moved on to other things... The people I know who code assembly use it inline, as the mod said above, since you can usually combine languages now. Code execution speed versus hardware latency was obviously not a big enough issue to prevent the move to higher-level languages.
A one-time hardware investment is of no concern when you are running a billion-dollar hedge fund. The issue with data mining has primarily to do with bias in the results. Hedge funds do not care if the data-mining software takes one more hour as long as the results have low bias. I also know of no hedge funds that employ real-time data mining, because of the bias issue. In other words, results must be manually checked. Thus, I see no point in developing in assembler other than for HFT applications. What have you got, by the way? Can you post the results from a system your data-mining application has come up with? I hear a lot of talk in these threads about this and that, but where is the meat?
Quite possibly true. I haven't really thought about scaling up, so I guess hardware could come out cheaper in the end. Oh no, that wasn't the purpose of the thread; I really didn't want to come across that way. I guess a bit of elitism does come out, as much as one tries to hold it back. Spending time talking to other ASM coders does that to a person. I was really looking for opinions like yours, so I think it's reasonable to say coding in ASM isn't quite as important for some trading applications as I thought it would be.
I'm a little apprehensive about showing the meat. I didn't mean to say I had anything but statistically interesting results in the first place, so maybe I don't. If I did, I would still like to keep it to myself for a while, to be sure I'm not talking out of my behind (and to reduce the risk of various online privacy-infringing threats). Isn't it always like that, though? Success is something people hide once they find it in this industry, for a whole host of reasons.
The odds that assembler code will contain a programming error are much higher than with a higher-level language. It's not an either-or decision: you can use C/C++ and write only the critical sections in assembler. As already mentioned, with today's complex processors it is difficult for a human to optimize code for the target CPU. In the past, optimizing assembler was a straightforward proposition; not anymore. It's much cheaper, quicker and safer to throw more hardware at a problem than to switch to assembler. Only those already running the fastest everything (CPU, disk, network) should even consider worrying about the difference between compiler-optimized code and hand-written assembler. There aren't many people skilled enough in assembler to beat compiler-optimized code over any non-trivial amount of code. In short, looking at a cost/risk/benefit analysis, I doubt that switching to coding everything in assembler wins out very often these days.
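To make the "critical sections in assembler" point concrete, here is a minimal sketch of the mixed approach (assuming GCC or Clang on x86-64; the popcount routine is a hypothetical stand-in for whatever hot spot profiling identifies):

#include <stdint.h>
#include <stdio.h>

/* Hypothetical hot spot: population count done with a single x86 instruction.
   Everything around it stays in plain C. Requires a CPU that supports POPCNT. */
static inline uint64_t popcount_asm(uint64_t x)
{
    uint64_t result;
    __asm__ ("popcnt %1, %0" : "=r"(result) : "r"(x) : "cc");
    return result;
}

int main(void)
{
    uint64_t total = 0;
    for (uint64_t i = 0; i < 1000000; ++i)
        total += popcount_asm(i);   /* the loop and the I/O remain ordinary C */
    printf("total bits: %llu\n", (unsigned long long)total);
    return 0;
}

The point is that the inline block is a few lines inside an otherwise normal C program, so you keep the compiler's optimizations, type checking and portability everywhere except the one spot you chose to hand-tune.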
I mostly agree with what you said, but for a non-commercial application the processors and the hardware will be very specific, e.g. for a pre-determined type of data centre. In that case you don't have to worry about the processors much, except for the possibility that future hardware upgrades will require changes. Even then, the instruction sets of the past several years have been fairly similar across new processors, and it's not as much of a hassle to upgrade code as it sounds, usually involving minor changes if the initial code was written with any kind of responsibility. Finally, for offline data mining I agree that throwing hardware at the problem can often be a solution, but there ARE numerous good, low-paid ASM developers out there who could potentially reduce the hardware cost by a factor of 5 to 20 for longer operations. On a case-by-case basis, there are certainly cases where coding in ASM could be beneficial, but it all depends (and it could be rare, agreed). Anyway, thanks for all your opinions.