> I'm building "KDB for the rest of us" Already done... it's called QuantServer, and it processes 1M ticks per second. www.smartquant.com Cheers, Anton
Anton, Would you care to elaborate on your technology? An architectural overview would help. Unless of course you were just dumping serialized .NET objects to disk and compressing the data stream. I don't think that would compare to KDB. Thanks, Joel Disclaimer: For a year I sold the predecessor to QuantDeveloper, and that was all I did. I also looked at the source code and still chat with customers from time to time. There are some long-term disagreements between Anton and me. Anton: I _specifically_ do not want this thread to degenerate into a "mine is bigger" competition. Please post an overview of your technology for the benefit of everyone.
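For anyone wondering what that naive approach would look like in practice, here is a minimal C# sketch of "dumping serialized .NET objects to disk and compressing the stream". The Tick type and file layout are invented for illustration; nothing here is taken from QuantServer:

using System;
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class Tick
{
    public DateTime Time;
    public double Price;
    public int Size;
}

class NaiveStore
{
    // One compressed member appended per tick: simple to write,
    // but slow and hard to seek into - Joel's point exactly.
    static void Append(string path, Tick tick)
    {
        using (FileStream fs = new FileStream(path, FileMode.Append))
        using (GZipStream gz = new GZipStream(fs, CompressionMode.Compress))
        {
            new BinaryFormatter().Serialize(gz, tick);
        }
    }
}

The problem is visible right away: there is no index, so any time-range query means decompressing and deserializing from the start of the file.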
I'm more interested in how to describe chart patterns in computer programs; the time frame of the chart is irrelevant. Could anyone point me to resources on this, ideally including source code?
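One concrete way to frame the question: a chart pattern can be encoded as a predicate over a window of bars, which is what makes the time frame irrelevant - the same code runs on 1-minute or daily bars. A hypothetical C# sketch (the Bar type and the tolerance rule are invented for illustration):

using System;

struct Bar { public double Open, High, Low, Close; }

static class Patterns
{
    // "Double top": two local peaks of similar height with a dip
    // between them. Tolerance is a fraction of the first peak.
    public static bool IsDoubleTop(Bar[] w, double tolerance)
    {
        if (w.Length < 5) return false;
        int first = 0, second = 0;
        for (int i = 1; i < w.Length - 1; i++)        // find local maxima
            if (w[i].High > w[i - 1].High && w[i].High > w[i + 1].High)
            {
                if (first == 0) first = i; else second = i;
            }
        if (first == 0 || second == 0) return false;
        double diff = Math.Abs(w[first].High - w[second].High) / w[first].High;
        double trough = double.MaxValue;
        for (int i = first; i <= second; i++)
            trough = Math.Min(trough, w[i].Low);
        return diff < tolerance && trough < w[first].High * (1 - tolerance);
    }
}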
That's exactly what I think. In fact, the initial poster correctly talks about gigabyte sizes; I'm running closer to 100 gigabytes. I couldn't imagine any rational way of handling and exploiting this store of tick data WITHOUT a sophisticated database infrastructure. If you try to do this without a database, you're back in the stone age of computing and you'll drive yourself nuts reinventing a poor database kludge. As to 'speed', I honestly don't see the problem. I'm collecting huge amounts of tick data in real time and I'm still very far from reaching the capabilities of my DB. Of course, it is easy to write a crippled piece of software that chokes. IMHO, it means that you have to think harder. WITHOUT A DATABASE, YOU'LL STAY A LOSER
>All due respect Anton, but isn't there any place else to peddle your way over-priced sw? OK, continue discussing open source and $99 solutions. The right price for a product is the one the market accepts... IMHO. If you think it's overpriced, the only way to prove that statement is to either write it yourself or point to a cheaper alternative. KDB is in the $100K range, QuantServer is in the $1K range. Both perform about the same when it comes to market data capture and playback for strategy simulations and historical data requests. As for the underlying technology... Well, KDB writes a large flat binary file with time-ordered data records, so data processing operations run at SCSI/IDE I/O speed, no surprise. I guess a DateTime search looks like Stream.Seek(...). QuantServer introduces buffering and compression. The underlying technology is not a secret: it's based on the root.cern.ch TTree concept. The CERN guys write and process terabytes of data with a GB/sec incoming load (nuclear events). QuantServer uses a similar approach tuned for time series financial data processing. So here it is. No need to discuss which one is bigger (partly because you don't have any at all to start with) - go and get it for free. PS. I don't think that Joel's comments are relevant. He left SmartQuant Ltd before we launched the QuantServer and QuantDeveloper projects, so "looking into the source code" is somewhat misleading. Regards, Anton
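To make the flat-file guess concrete: if records are fixed-size and time-ordered, a DateTime lookup reduces to a binary search over Stream.Seek calls, i.e. O(log n) seeks per query. This is only a reading of Anton's guess about KDB, with an invented 20-byte record layout (8-byte DateTime ticks, 8-byte price, 4-byte size):

using System;
using System.IO;

static class FlatFile
{
    const int RecordSize = 20;   // assumed layout, see above

    // Returns the byte offset of the first record with time >= t.
    // Assumes a seekable stream (e.g. FileStream) of sorted records.
    public static long FindFirstAtOrAfter(Stream s, DateTime t)
    {
        long lo = 0, hi = s.Length / RecordSize;
        byte[] buf = new byte[8];
        while (lo < hi)
        {
            long mid = (lo + hi) / 2;
            s.Seek(mid * RecordSize, SeekOrigin.Begin);
            s.Read(buf, 0, 8);
            DateTime midTime = new DateTime(BitConverter.ToInt64(buf, 0));
            if (midTime < t) lo = mid + 1; else hi = mid;
        }
        return lo * RecordSize;
    }
}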
nononsense, Thanks for your informative post. At the risk of sounding like a "loser", I'm just learning about DBs and Access. Looks cool and reasonable enough. Note: NOT THAT I WOULD use ACCESS for tick data storage or anything that serious. I understand all the benefits of a DB - security, blah blah, etc. What I don't understand is the connection between the DB and the backtesting software. Don't you still have to pull the data out of the DB and store it in some format? An array? Or some complex data structure that you populate? I'm a bit confused. So, one would write SQL commands to pull the data, put it into a data structure, and then use that to do tick-level backtesting? If you can clarify, that would help a lot. Also, isn't SQL more of an interactive prompt at the DB end, not at the programming-language end (VB, C++, Java, Python, etc.)? Doesn't one have to use some kind of ADO.NET or other DB API? Please help! Thanks. trader99
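That is essentially the workflow trader99 is asking about: a DB API such as ADO.NET runs the SQL from inside your language, and you populate a plain in-memory structure that the backtester then iterates over. A C# sketch with hypothetical table and column names:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;

class Tick
{
    public DateTime Time;
    public double Price;
    public int Size;
}

class TickLoader
{
    // Pull one symbol's ticks for a date range into a List<Tick>.
    public static List<Tick> Load(string connString, string symbol,
                                  DateTime from, DateTime to)
    {
        var ticks = new List<Tick>();
        using (var conn = new SqlConnection(connString))
        using (var cmd = new SqlCommand(
            "SELECT TickTime, Price, Size FROM Ticks " +
            "WHERE Symbol = @s AND TickTime BETWEEN @f AND @t " +
            "ORDER BY TickTime", conn))
        {
            cmd.Parameters.AddWithValue("@s", symbol);
            cmd.Parameters.AddWithValue("@f", from);
            cmd.Parameters.AddWithValue("@t", to);
            conn.Open();
            using (var reader = cmd.ExecuteReader())
                while (reader.Read())
                    ticks.Add(new Tick {
                        Time = reader.GetDateTime(0),
                        Price = reader.GetDouble(1),
                        Size = reader.GetInt32(2) });
        }
        return ticks;
    }
}

For serious tick volumes you would stream the reader straight into the backtest loop instead of materializing the whole list, but the shape of the code is the same.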
I'm working on a system now that stores current data in a SQL Server DB, and then I use cubes for my archives. Not sure this is the best approach, but it's what I'm familiar with. I'm a developer, but this will be my first trading app (personal use only).
From Wikipedia: "Although T-trees seem to be widely used for main-memory databases, recent research indicates that they actually do not perform better than B-trees on modern hardware" ... Is anyone interested in joining this project of a local market data server built on top of Microsoft SQL Server 2012 and .NET/C++/C#? http://github.com/kriasoft/market-data
Just got done architecting an expansive equities tick repository. Some stats:

Symbols: 20,653
Period: 2008 - 2012
Bars (25 ms): 27,741,118,213
Bars (1 sec): 14,007,345,833
Bars (1 min): 2,141,219,516
Messages: 742,640,774,253
Ask changes: 24,825,915,500
Bid changes: 24,722,845,608
Orders: 37,906,709,939
Volume: 10,485,098,567,764
Dedicated servers: 5
Data storage: 20 TB

We chose a hybrid Hadoop-style implementation with SQL access. To say we were I/O bound was an understatement. We are now able to locate and access any tick of any instrument nearly instantaneously (<10 ms). The data is stored multiple times using different optimizations for accelerating performance: different structures are used for pairs analysis, graphing bars, index analysis, etc., with extensive use of covering indexes (where the index itself contains the answer data). One of our driving forces in building out this data repository was that the commercially available consolidated data is fundamentally flawed, being built around last-trade data. Exchange tape data is too slow to process for most of our algos. We build out our bars differently, using ask/bid changes as the trigger rather than last-trade data (sketched below). Consequently our backtested results nearly match our real-time executions. This is especially true when trading pairs and other cross-exchange correlated instruments. We're contemplating making access to these structures available as a service... renting out VMs with direct access to our 20 TB repository... Send me a PM if you're interested.
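The quote-driven bar idea is easy to sketch: instead of updating a bar on each trade print, update it whenever the bid or ask changes, here using the mid-quote. The Quote and Bar types are hypothetical; the poster's actual implementation isn't public:

using System;
using System.Collections.Generic;

struct Quote { public DateTime Time; public double Bid, Ask; }
class Bar { public DateTime Start; public double Open, High, Low, Close; }

class QuoteBarBuilder
{
    // Buckets quotes into fixed-span bars keyed on the mid-price,
    // so a bar updates on every bid/ask change, not just on trades.
    public static List<Bar> Build(IEnumerable<Quote> quotes, TimeSpan span)
    {
        var bars = new List<Bar>();
        Bar cur = null;
        foreach (var q in quotes)
        {
            double mid = (q.Bid + q.Ask) / 2.0;
            DateTime start = new DateTime(q.Time.Ticks / span.Ticks * span.Ticks);
            if (cur == null || start != cur.Start)
            {
                cur = new Bar { Start = start, Open = mid,
                                High = mid, Low = mid, Close = mid };
                bars.Add(cur);
            }
            cur.High = Math.Max(cur.High, mid);
            cur.Low = Math.Min(cur.Low, mid);
            cur.Close = mid;
        }
        return bars;
    }
}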