Why use a database?

Discussion in 'Data Sets and Feeds' started by onelot, Oct 9, 2004.

  1. Isn't mySQL fairly simplistic in comparison to the also-free PostgreSQL?
    I remember mySQL being used more for small tasks and web-oriented deployment. Also, mySQL's GPL licensing requires a paid commercial license for some commercial deployments, while PostgreSQL uses the permissive Berkeley (BSD) license?
    Also, mySQL's default MyISAM tables only support table-level locking (row-level locking requires the InnoDB engine), while PostgreSQL has MVCC with more advanced locking, including row-level locks. PostgreSQL also has PL/pgSQL stored procedures.

    But I guess neither free alternative is what one would look for if the database grows very big, with lots of intraday data for many symbols. Then backup, replication etc. might weigh in, as the data itself becomes more of an investment of time, resources and money.
     
    #81     Oct 15, 2004
  2. linuxtrader

    linuxtrader Guest

    They are both useful ... The newer versions of mySQL offer familiar features found elsewhere .....
     
    #82     Oct 15, 2004
  3. I'm not that familiar with recent versions - I've only used them years and years back - but I still get the impression that mySQL is faster on simple databases and simple operations than PostgreSQL, which has some more advanced features.
    I guess the licenses also determine usage a lot. Two great and free products anyway. :)
     
    #83     Oct 15, 2004
  4. prophet

    prophet

    The point I keep trying to make is that optimization is actually easier, much easier than you portray. Yes, beginning programmers shouldn’t have to worry about it. However, it is wrong to say they should neglect it completely. They should know just enough about proper algorithm design and use a profiler as a learning tool such that their code is decently optimized from the start. I’m not suggesting heavy profiling to get every last drop of performance or generating less readable code. I mean a basic algorithm understanding and exploratory profiling to see what runs fast and what doesn’t. Not too much trouble if you bother to learn about it. What problem would anyone have with this philosophy?
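
    The kind of exploratory profiling described above can be sketched with Python's standard cProfile module. The moving-average functions below are made-up examples, not anything from this thread; the point is just that a profile report immediately shows which implementation is eating the time.

```python
# Exploratory profiling sketch: run two implementations of the same
# calculation under cProfile and see where the time actually goes.
import cProfile
import io
import pstats
import random

def naive_moving_average(prices, window):
    # O(n * window): recomputes the window sum for every bar
    return [sum(prices[i - window:i]) / window
            for i in range(window, len(prices) + 1)]

def running_moving_average(prices, window):
    # O(n): maintains a running sum instead of re-summing the window
    out = []
    s = sum(prices[:window])
    out.append(s / window)
    for i in range(window, len(prices)):
        s += prices[i] - prices[i - window]
        out.append(s / window)
    return out

prices = [random.random() for _ in range(50_000)]

profiler = cProfile.Profile()
profiler.enable()
slow = naive_moving_average(prices, 200)
fast = running_moving_average(prices, 200)
profiler.disable()

# Print the five most expensive calls by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

    A couple of minutes with a report like this is usually enough to tell a decently designed algorithm from an accidentally quadratic one, without any heavy tuning.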

    Maybe we’ll never agree on stuff… sadly. I feel that flat files have tremendous advantages starting out. ASCII files can be edited, understood and debugged more easily than databases. I store all my market data in CSV files and cache it in binary files as it’s used. Why? I sometimes need to correct for market data gaps, patching my primary server’s data with the backup server’s data. Just edit the CSV file and delete the binary cache file. I use flat binary files because they are fast, and I can manage them more easily, compressing and archiving data I don't need anymore. If I had to use a database, I would be adding, subtracting, purging, and archiving many GB of data per day through the database. Let's hope the database is smart enough to handle the internal layout so it doesn't take hours to process.
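
    A minimal sketch of that CSV-primary / binary-cache scheme, using only the Python standard library. The file names and the (timestamp, price, volume) record layout are assumptions for illustration, not prophet's actual format.

```python
# CSV as the editable primary store, a flat binary file as the fast cache.
# Deleting the cache file forces a rebuild from the (possibly hand-patched) CSV.
import csv
import os
import struct

RECORD = struct.Struct("<ddi")  # timestamp (double), price (double), volume (int)

def load_bars(csv_path, cache_path):
    """Read bars from the binary cache, rebuilding it from CSV if absent."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            data = f.read()
        return [RECORD.unpack_from(data, i * RECORD.size)
                for i in range(len(data) // RECORD.size)]
    # Cache miss: parse the editable CSV, then write the binary cache.
    bars = []
    with open(csv_path, newline="") as f:
        for ts, price, vol in csv.reader(f):
            bars.append((float(ts), float(price), int(vol)))
    with open(cache_path, "wb") as f:
        for bar in bars:
            f.write(RECORD.pack(*bar))
    return bars
```

    To patch a data gap: edit the CSV by hand, delete the cache file, and the next load rebuilds it from the corrected data.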

    I agree on that last point.

    You know I hate saying this. Unfortunately, here you go again portraying things as black-or-white, one-or-the-other. The truth of the matter is that one can achieve a very nice combination of both performance and expressiveness, with surprisingly little effort. Both performance and expressiveness can speed the path to profitability. How does pushing performance limits necessarily interfere with profitability? I don’t see how one can negate the other, except in the case of poor, unplanned or uneducated designs... incompetence.

    You have it backwards. More market data allows greater statistical significance. Less data leads to over-fit results. Please prove your opposite point of view.

    Sure, plenty of quants have thrown hardware at a failed system, only to still end up with a failed system. It was their design at fault, not the use of hardware. I threw hardware at my systems, which resulted in smoother and more substantial returns, by virtue of greater diversification to more markets and more systems per market. You’ll find examples either way. The question is does extra computation necessarily hurt? If yes, then why do you use any computation to begin with?

    It is ridiculous to claim there is a certain optimal amount of computation, beyond which it is detrimental to profitability. That is what you are suggesting. It doesn't make sense logically, unless you are assuming an inherent amount of incompetence. In that case the problem is the core incompetence of the designer, not the amount of computation. They will fail with any amount of computation.
     
    #84     Oct 15, 2004
  5. linuxtrader

    linuxtrader Guest

    All of the debate can be boiled down to saying that if the system meets your present and future needs then you are done: no need to optimize further, change the design or do anything else.

    A good system design balances cost and performance, and also scales appropriately to meet changes in capacity and demand that are within the design limits of the system.

    You can use any combination of flat files, database systems and algorithms that accomplishes your design goals. Unless another system is identical to yours, comparing designs is largely irrelevant.
     
    #85     Oct 15, 2004
  6. nice to learn about your tricks.
     
    #86     Oct 15, 2004
  7. prophet

    prophet

    Future needs? Markets change in unpredictable ways, especially long-term. You can’t predict that.

    Like I said to Sparohok earlier, what happens when the market changes, rendering your systems unprofitable? You will fault yourself for not trying to improve profitability in the past while there were more market opportunities to profit from, and you could have ramped up your analysis efforts.

    You and Sparohok are both suggesting something very dangerous... namely contentment with the status quo. Many people and fortunes have been destroyed by that.

    Design limits? Cost? "Scales appropriately"? We’re not talking airplanes here! In the pursuit of trading systems there are often NO design limits when it comes to achievable processing speed, scalability, amount of data processed, number of systems traded, profitability, etc. Your motivation is your only limit. You mentioned cost. Computing hardware, books and self-education are cheap. There's no excuse for not using them. As a Linux user you already know how to cut costs.

    A better educated programmer can do the work of 10 or 100 less educated programmers, and multiply the effective processing power of a computer by factors of a thousand or more.
     
    #87     Oct 15, 2004
  8. kc11415

    kc11415

    kc11415>>1) Is your timing of this query fresh after the database is started?

    marist89>Give me a little credit.

    Please accept my apologies. I was just curious ;-)
    _______________________________________________

    linuxtrader>As far as answering a laundry list of inquiries about database configuration and query execution, my advice to the other poster who asked is to read their Oracle documentation: all of those issues are discussed there, if they need confirmation of how to handle implementation and optimization - which you graciously answered .....

    LinuxTrader, FYI: That tiny "laundry list" would not be known to someone who hadn't already RTFM'd ;-)

    However, in hindsight the question about hash vs. bitmap indexes shouldn't have been asked, since good index performance accompanied by a high degree of selectivity implies a b-tree (or hash) index rather than a bitmap index.
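
    SQLite only offers b-tree indexes, so it can't show the bitmap side of the comparison, but its EXPLAIN QUERY PLAN output illustrates the selectivity point: a highly selective equality predicate on an indexed column lets the optimizer search the index instead of scanning the table. The table and column names below are made up for the example.

```python
# Index selectivity sketch: 1000 distinct symbols over 10,000 rows makes
# an equality predicate on `symbol` selective, so the planner uses the index.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticks (symbol TEXT, ts REAL, price REAL)")
conn.executemany(
    "INSERT INTO ticks VALUES (?, ?, ?)",
    [("SYM%d" % (i % 1000), float(i), 100.0) for i in range(10_000)],
)
conn.execute("CREATE INDEX idx_ticks_symbol ON ticks (symbol)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM ticks WHERE symbol = 'SYM42'"
).fetchall()
print(plan)  # the plan detail should name idx_ticks_symbol
```

    On a low-selectivity column (say, a boolean flag) the same planner would typically fall back to a full table scan, which is exactly the regime where Oracle's bitmap indexes earn their keep.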
     
    #88     Oct 15, 2004
  9. Grizli

    Grizli

    We are trying to find the best decisions for future trades using backtesting. Many people have said that a decision that was good for a past period may not remain good in the future. The question is the following: what information do you hope to get from a historical database, for example?
     
    #89     Oct 15, 2004
  10. linuxtrader

    linuxtrader Guest

    On the first point I can tell you that I would never approve a project where the design engineer did not know the limitations of their system design: if they can't predict how their system will respond to a spike in demand/load or a change in the problem regime, then I simply send them back to their desk to rework their idea before I approve a dollar of funding towards implementation time.

    On the second point our experience differs: at a certain level in most businesses people are not wholly incompetent. I've never met anyone who produces a design that cannot be improved upon in subsequent iterations. However, if you start with a good system design that matches the problem regime and you are careful in your implementation, then you can arrive at something that requires very little change over a broad spectrum of applications. The idea that most programmers are incompetent is not true today: very few techniques or practices are secret anymore ... part of the reason why software people are being commoditized and seeing their incomes decrease or stagnate.
     
    #90     Oct 15, 2004