General-purpose solutions are always a compromise. On the other hand, a custom-tailored solution is not easy either. But after weighing the options, we came to the conclusion that the time spent learning and testing all these products would be better spent coding a custom-tailored solution from scratch.
I'm glad to see this thread is still going... The timestamp issue is a problem, but it happens not to be much of a problem for me. A lot of the data I'm working with is from IQFeed, which only has one-second resolution. The data does have an implied order within each second, but I threw that out. I really should have added a column to store the relative ordering, or a subsecond column for my data that has accurate timestamps. That would at least solve the problem of retaining order with ORDER BY, though you would have to do some work to get it into a datetime type in C#, for example. I can't imagine MySQL will ignore milliseconds forever, so maybe there will be a better solution in the future.

One thing I learned is that I had to stop thinking in the row-based mindset and include as many columns as I think could ever be useful. I really didn't go far enough when designing the schema, even with 14 columns total. After using this more, I would have added more columns (dollar value of the trade, where the trade occurred relative to the book, etc.). For most types of data it is essentially free to add columns, so don't hold back if you are starting from scratch! Queries that touch only a single column will always be faster than queries that touch multiple columns, because of the way a column store lays out data.

At one point I put together a binary flat-file format that was extremely fast for the type of query I was running, but in the end I've quit using it and moved completely over to ib. My main use case so far is to take a (relatively) large amount of data, perform a transformation on it that returns a tiny fraction of the data, and analyze that offline. I've ended up storing the transformed data in both ib and SQL Server for analysis, and I imagine I will do the same as I find more transformations I want to apply. I have really come to appreciate that ib isn't a time series database.
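To make the relative-ordering idea concrete, here is a minimal sketch. It uses Python's sqlite3 as a stand-in for MySQL, and the table and column names (ticks, ts, seq, price) are made up for illustration: a sequence column alongside the second-resolution timestamp lets ORDER BY recover the original trade order.

```python
import sqlite3

# Illustrative only: sqlite3 stands in for MySQL, and the schema
# (ticks, ts, seq, price) is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticks (ts INTEGER, seq INTEGER, price REAL)")

# Three trades in the same second, inserted out of order.
rows = [(1000, 2, 30.02), (1000, 0, 30.00), (1000, 1, 30.01)]
conn.executemany("INSERT INTO ticks VALUES (?, ?, ?)", rows)

# The seq column restores the implied order within each second.
ordered = conn.execute(
    "SELECT price FROM ticks ORDER BY ts, seq"
).fetchall()
print([p for (p,) in ordered])  # → [30.0, 30.01, 30.02]
```

The seq column is just an integer, so it survives round-trips through any client; mapping it back into a single datetime value (as the C# remark notes) would still take extra work.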
Adding computed columns to hold commonly used data is a partial solution, but the bottom line is that it sucks at the things time series databases really excel at. It will return results in a heartbeat for all sorts of complex (and useful) queries, but try to get it to compute a moving average, or tell you at what price MSFT was trading every time the S&P downticked yesterday, and you will cripple it. If I ever have the kind of resources to invest in kdb, I'll ditch ib and switch without thinking twice. Needless to say, I'm not at that point yet, and I haven't come across any better solutions so far.
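For contrast, the moving average that ties a column store in knots is a tiny sliding-window computation in an array-oriented tool. A rough pure-Python sketch (the prices are made up; this just illustrates the kind of windowed operation that kdb-style systems do natively):

```python
# Made-up tick prices; a sketch of the rolling computation that
# time-series-oriented databases handle natively.
prices = [30.00, 30.01, 30.03, 30.02, 30.05]

def moving_average(xs, n):
    """Simple moving average over a sliding window of size n."""
    return [sum(xs[i - n + 1 : i + 1]) / n for i in range(n - 1, len(xs))]

sma = moving_average(prices, 3)
print([round(x, 4) for x in sma])  # → [30.0133, 30.02, 30.0333]
```

Expressing the same thing in classic SQL means a self-join or correlated subquery over every row, which is exactly where a general-purpose RDBMS falls over on tick data.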
I am glad too; it is an important topic. Column-oriented is part of the picture, but it is still too general. Time-series-oriented comes closer to what we need. Still, there is one more component: computations. kdb has its own language. If we have to use a programming language, the first things I look for are whether it is popular and whether it is alive. kdb is very proprietary and practically dead.
I am analysing a large database of historical options data, stored in MySQL with ENGINE=MyISAM. It is horribly slow.
My experience with RDBMSs holding large amounts of price data was similar -- slow, with a large memory/disk footprint.
For analysis, you might want to check out SQL Server Analysis Services, or another database solution built for analysis/OLAP.