Tick Database Implementations

Discussion in 'Data Sets and Feeds' started by fatrat, Nov 25, 2006.

  1. nitro

    People are mixing two different uses of "Tick Database" in this thread. As I pointed out earlier in this thread, databasing (persisting) the ticks to disk is trivial and of no consequence (to me). The real value lies in being able to extract intelligence from the data and, since this is the time domain, to do it in a way that makes sense in the markets. [Although I will add that it would be nice to be able to turn a time stream into a time-frequency domain stream and either do research on this stream or trade it in realtime.]
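
    As a rough illustration of turning a time stream into a time-frequency stream, here is a minimal sketch of a sliding-window discrete Fourier transform over an incoming series. The function name `sliding_dft` and the `window` and `hop` parameters are assumptions made for this example, it uses a plain rectangular window, and it presumes the ticks have already been resampled to a regular interval.

```python
import cmath
import math
from collections import deque
from typing import Iterable, Iterator, List


def sliding_dft(samples: Iterable[float], window: int, hop: int = 1) -> Iterator[List[float]]:
    """Yield one magnitude spectrum per full window of the incoming stream.

    Each yielded list holds |X[k]| for k = 0 .. window // 2, so the stream of
    lists is a time-frequency view of the input stream.
    """
    buf = deque(maxlen=window)  # keeps only the most recent `window` samples
    seen = 0
    for x in samples:
        buf.append(x)
        seen += 1
        # Emit a frame once the buffer is full, then every `hop` samples.
        if len(buf) == window and (seen - window) % hop == 0:
            frame = list(buf)
            yield [
                abs(sum(v * cmath.exp(-2j * math.pi * k * n / window)
                        for n, v in enumerate(frame)))
                for k in range(window // 2 + 1)
            ]


# Usage: a pure sine with 4 cycles per 32-sample window peaks in bin 4.
signal = [math.sin(2 * math.pi * 4 * n / 32) for n in range(64)]
for spectrum in sliding_dft(signal, window=32, hop=16):
    peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
    print(peak_bin)
```

    Because the function consumes any iterable, the same code can run over stored history or over a live feed wrapped in a generator.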

    So for me, the actual database-to-disk strategy is something I don't really care about. For example, I am talking to the Xenomorph Timescape people (linked to above), where the actual database will be SQL Server, Oracle, HBase, or perhaps their proprietary format. Who cares? What I care about is that if I stop using the Xenomorph Query Language software, I can still get at my data. That probably rules out their proprietary data store format. The real value these things add is the query language that sits on top of some database and allows you to ask questions of your data very simply, and then to turn those questions into a trading system with little or no modification of your code, whether it is from Excel or C# or whatever. So the code to do research and the code to turn that into a trading system should be almost identical.
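
    One way to picture "research code and trading code should be almost identical" is to write the analytic against a generic stream of ticks, so that a stored history and a live feed are interchangeable inputs. The sketch below is only an illustration of that idea: the `Tick` type, the `crossover_signals` function, and the moving-average rule are all hypothetical and are not taken from any product mentioned here.

```python
from collections import deque
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Tick:
    ts_ns: int    # timestamp in nanoseconds (hypothetical field)
    price: float


def crossover_signals(ticks: Iterable[Tick], fast: int = 2, slow: int = 3) -> Iterator[Tuple[int, str]]:
    """Yield (ts_ns, 'BUY' | 'SELL') when the fast average crosses the slow one."""
    window = deque(maxlen=slow)  # only the last `slow` prices are needed
    prev_diff = None
    for t in ticks:
        window.append(t.price)
        if len(window) < slow:
            continue
        recent = list(window)
        diff = sum(recent[-fast:]) / fast - sum(recent) / slow
        if prev_diff is not None:
            if prev_diff <= 0 < diff:
                yield (t.ts_ns, "BUY")
            elif prev_diff >= 0 > diff:
                yield (t.ts_ns, "SELL")
        prev_diff = diff


def live_feed(source: Iterable[Tick]) -> Iterator[Tick]:
    """Stand-in for a realtime feed: hands over ticks one at a time."""
    for tick in source:
        yield tick


prices = [5, 4, 3, 2, 1, 2, 3, 4, 5, 4, 3, 2, 1]
history = [Tick(ts_ns=i, price=float(p)) for i, p in enumerate(prices)]

research = list(crossover_signals(history))            # backtest over stored ticks
trading = list(crossover_signals(live_feed(history)))  # same code over a stream
print(research == trading)
```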

    TradeStation is probably one of the most seamless ways to do research and turn that research into a realtime system, but it is not an institutional-grade product because you are stuck with EasyLanguage and all analytics have to be applied to a chart instead of a database. If TradeStation turned EasyLanguage loose on a generic database [more accurately, a generic stream] without requiring charts, it would be far more interesting. But even then, EasyLanguage is not as semantically rich as these time-based query languages, which allow for far more complex mathematical analysis.

    Vector or column databases are not commonplace because of the languages needed to query them, which for market analytics are proprietary more often than not.
     
    #101     Jun 1, 2012
  2. You mentioned things others misunderstand and database solutions you would not use, but I was wondering: do you mind sharing, for the benefit of all, what specific database/query language you yourself favor and would use?

    I strongly disagree with your point on the separation between storage technology and query language. Almost always the two are inseparable for efficiency purposes (I am not saying separation does not exist, but the most efficient solutions cannot be split). You can always segregate by running a binary data store and translating back and forth between binary reads/writes on one side and queries on the other, but this for sure does not come even close to being the most efficient approach (in speed or memory). Kx's query language is very closely integrated, down to the finest detail, with its storage logic in memory and on the physical medium. I think my point is clear when considering that running SQL queries on a Kx database is orders of magnitude slower than using the built-in query logic.
    Thus, I disagree with your point on this issue.
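
    To make the coupling argument concrete, here is a small sketch that contrasts a layout-agnostic range query (decode every record, then filter) with one that exploits the storage layout (binary search over fixed-width, time-sorted records, decoding only the matching slice). The record layout and all function names are assumptions made for this example and say nothing about how Kx or any other product works internally.

```python
import struct
from typing import Iterable, List, Tuple

# Hypothetical fixed-width record: int64 timestamp (ns) + float64 price.
RECORD = struct.Struct("<qd")


def pack_ticks(ticks: Iterable[Tuple[int, float]]) -> bytes:
    """Serialize (timestamp_ns, price) pairs, assumed sorted by timestamp."""
    return b"".join(RECORD.pack(ts, px) for ts, px in ticks)


def generic_range(blob: bytes, start: int, end: int) -> List[Tuple[int, float]]:
    """Layout-agnostic query: decode every record, then filter. O(n)."""
    return [(ts, px) for ts, px in RECORD.iter_unpack(blob) if start <= ts < end]


def _lower_bound(blob: bytes, target: int) -> int:
    """Index of the first record whose timestamp is >= target."""
    lo, hi = 0, len(blob) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        # Read only the 8-byte timestamp at the record's known offset.
        ts = struct.unpack_from("<q", blob, mid * RECORD.size)[0]
        if ts < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


def layout_aware_range(blob: bytes, start: int, end: int) -> List[Tuple[int, float]]:
    """Query that knows the layout: O(log n) seek, decode only the hits."""
    i, j = _lower_bound(blob, start), _lower_bound(blob, end)
    return list(RECORD.iter_unpack(blob[i * RECORD.size:j * RECORD.size]))


blob = pack_ticks((i * 1000, 100.0 + i) for i in range(100))
print(generic_range(blob, 10_000, 20_000) == layout_aware_range(blob, 10_000, 20_000))
```

    Both functions return the same rows; the difference is how much of the store each has to touch, which is the efficiency gap the argument above is about.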

    Care to share what specific solution you propose to tackle the issues I described in my previous post, where I described typical use cases? I am well aware that logic involving queries on historical data can be very different from supplying data to charting libraries (via data bindings).

     
    #102     Jun 5, 2012
  3. why wouldn't you just continue to use your binaries?

    if you can push 6.5M ticks, i'd think you'd be competent enough to write yourself a loader/scanner ("cep engine").

    my2c... hype and marketing slow down real work much more than they solve it. keep it simple. roll your own.
     
    #103     Jun 6, 2012
  4. With all due respect, I think you did not get my point. I have no problem implementing scan algorithms or other apps. I am looking to exchange ideas about efficient database structures for tick data and custom time series, for both reads and writes, with APIs that expose rich query functionality. Rolling my own is a huge waste of time, especially when something already exists. Even my own binary data store and reader uses open source components that I did not develop myself. I am wondering whether anyone has experience using Redis and RavenDB and what they have to say about their efficiency for time series data storage and retrieval.


     
    #104     Jun 6, 2012
  5. nitro

    #105     Jun 7, 2012
  6. nitro

    #106     Jun 7, 2012
  7. #107     Jun 7, 2012
  8. #108     Jun 7, 2012
  9. nitro

    It is free, performant, reliable, it is meant for Time Series, and it sits on top of HBase which has its own advantages.

    The problem is its lack of resolution. Market time series need timestamps with at least microsecond resolution, and I would argue even nanosecond resolution.
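
    As a small illustration of why the timestamp representation matters at this resolution, the sketch below compares a float64 count of seconds with a signed 64-bit count of nanoseconds since the Unix epoch. The sample timestamp is an arbitrary instant in June 2012 chosen for the example, and the 8-byte little-endian encoding is an assumption, not the format of any store discussed here.

```python
import struct

NS_PER_S = 1_000_000_000

# Arbitrary instant in June 2012, in nanoseconds since the Unix epoch.
ts_ns = 1_339_142_400_123_456_789

# float64 seconds: the 53-bit mantissa cannot hold nanoseconds at this
# magnitude, so a round trip through float changes the value.
as_float_s = ts_ns / NS_PER_S
recovered = round(as_float_s * NS_PER_S)
float_error_ns = abs(recovered - ts_ns)

# int64 nanoseconds: exact round trip in 8 bytes.
packed = struct.pack("<q", ts_ns)
restored = struct.unpack("<q", packed)[0]

print("float64 round-trip error (ns):", float_error_ns)
print("int64 round trip exact:", restored == ts_ns, "bytes:", len(packed))
```

    A signed 64-bit nanosecond counter covers roughly 292 years on either side of 1970, so it comfortably spans market data while keeping records fixed-width and directly comparable.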
     
    #109     Jun 8, 2012
  10. Just store the timestamp as a string... What timer do you plan to use for ns precision?


     
    #110     Jun 8, 2012