Java - Storing data in memory for post-runtime access

Discussion in 'App Development' started by jtrader33, Jul 16, 2013.

  1. lwlee

    lwlee

Geez, talk about twisting words. A RESTful API would only matter if he wanted to take this app to another level. Using a basic HTTP server, which took him a single DAY (with the code sample I had, probably more like 2-3 hrs) to complete, was the easy answer.

I hate the idea of juggling multiple open source DBs. I already have MySQL, Postgres, SQL Server, and Oracle on my machine. Do I really need to deal with yet another specialized database? But I'm not gonna hate on a product that I haven't used. If the OP is willing to experiment with RedisDB, I would hope he shares the pros and cons of using such a solution.

     
    #31     Jul 28, 2013
  2. mutluit

    mutluit

I don't know if this is possible in Java, but in C/C++ you can split the application into a main program and put the rest into a dynamic link library (DLL).
The main program would accept commands from the user.
One of the commands could be "exec my.dll".
So the data is kept in memory, and the changing logic is placed in the DLL.
Besides the socket solutions already mentioned (TCP/IP, HTTP, etc.), one can alternatively embed a scripting language in the main program, but this slows everything down.
The DLL approach is the most elegant and fastest method, IMO.
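A rough Java analog of this split (class and method names here are illustrative, not from the thread) keeps the data resident in a long-lived map and loads the "changing logic" by class name via reflection, much as the DLL host would re-execute a freshly built library against data it already holds:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of the "main program + swappable logic" split in Java.
// The data stays resident in DATA across logic changes; logic classes
// are looked up by name and instantiated via reflection, so rebuilt
// logic can run against the same in-memory data.
public class HostProgram {
    // Long-lived in-memory data, analogous to what the DLL host keeps.
    static final Map<String, double[]> DATA = new HashMap<>();

    // Any logic class just has to implement Function<double[], Double>.
    public static double exec(String className, String series) throws Exception {
        @SuppressWarnings("unchecked")
        Function<double[], Double> logic = (Function<double[], Double>)
            Class.forName(className).getDeclaredConstructor().newInstance();
        return logic.apply(DATA.get(series));
    }

    // Example "plugin": in practice this is what gets recompiled.
    public static class MeanLogic implements Function<double[], Double> {
        public Double apply(double[] xs) {
            double sum = 0;
            for (double x : xs) sum += x;
            return sum / xs.length;
        }
    }

    public static void main(String[] args) throws Exception {
        DATA.put("ES", new double[] {1.0, 2.0, 3.0, 4.0});
        System.out.println(exec(MeanLogic.class.getName(), "ES")); // prints 2.5
    }
}
```

For a true hot reload you would create a fresh `URLClassLoader` per rebuild, since `Class.forName` caches classes in the current loader; this sketch only shows the data/logic split itself.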
     
    #32     Jul 28, 2013
  3. * No serious profiling platform loads data over a REST API. The question here is not how to serve a profiling platform frontend but how to load data into a backtesting platform. REST is probably the worst choice of all, because it is slow.

    * You are not alone in disliking the idea of running multiple databases. Well, first of all, the advice was not pointed in your direction. If you struggle with too many databases to handle, then maybe you should cut down on some. I have no idea why someone would run SQL Server AND Oracle at the same time: given that no corporate entity would ever approve the cost of a user running two platforms at once, you either have too much money to spend, you made up a story, or you use illegally downloaded copies.

    * If you have never heard of Redis and never used it, then I seriously doubt you are much into programming. It is one of the most lightweight, fastest in-memory key/value stores there is, and it has hooks in pretty much every conceivable language.

    Wanting to keep this discussion on a civil level, I am not trying to accuse you of anything; I am merely saying that recommending REST to solve this particular problem is a bad recommendation. The easiest and simplest of all solutions is a binary flat-file data store. (Look at the following for something already done: http://discretelogics.com/teafiles/. I have no affiliation with them; I just came across that library when I was looking for charting solutions for my own binary data store.)

    Next up, he should use an in-memory DB if he thinks he can and wants to store all the data in memory. Simple as that; let's not over-complicate things with clients/servers, HTTP, REST, ....
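The flat-file idea above can be sketched in a few lines of `java.nio`; the class name and the headerless fixed-width layout are illustrative assumptions, not TeaFiles' actual format:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Minimal flat binary data store: fixed-width doubles (JVM default
// big-endian order), no headers, no server, no SQL. Reading back is
// a single memory-mapped sequential scan.
public class FlatFileStore {
    // Write a price series as raw doubles.
    public static void write(Path file, double[] prices) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(prices.length * Double.BYTES);
        for (double p : prices) buf.putDouble(p);
        buf.flip();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ch.write(buf);
        }
    }

    // Memory-map the file and read every double back.
    public static double[] read(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            double[] out = new double[(int) (ch.size() / Double.BYTES)];
            for (int i = 0; i < out.length; i++) out[i] = map.getDouble();
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("series", ".bin");
        write(f, new double[] {100.5, 101.25, 99.75});
        System.out.println(read(f)[1]); // prints 101.25
    }
}
```

A real store would add a small header (symbol, field layout, timestamps), but the point stands: loading is a straight scan with no query layer in between.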





     
    #33     Jul 28, 2013
  4. Craig66

    Craig66

    For what it's worth, I and all the people I know who do this sort of thing implement some version of the flat binary file idea.
     
    #34     Jul 28, 2013
  5. Which makes a lot of sense... it all comes down to using the right tool for the task. Relational databases to load time series? NO! REST APIs for high-performance throughput? NO!

    The reason I mentioned Redis is that I know guys whose time series data, over which they profile strategies, is small enough to fit into memory. Given that, one can capitalize by using one and only one solution (Redis in this case): write strategy results and other strategy parameters back to such an in-memory database, from which they can easily be queried out of R and statistically analyzed. This is somewhat of an advantage over having to store such data in a file and read it back into an R-legible format. But in the end, both approaches need a mapping that translates the binary format back into R data sets.

    I myself use my own binary flat-file storage for all historical data loading, and I use Redis to transport and hold temporary strategy results that I then visualize and statistically analyze in R/Matlab.


     
    #35     Jul 28, 2013
  6. lwlee

    lwlee

    I did a little digging into RedisDB. Apparently it falls under the relatively recent NoSQL phenomenon, e.g. MongoDB, CouchDB, etc. Your presumption that those who haven't heard of it are "non-programmers" is fairly arrogant. You should work on that. I've worked as a programmer on Wall Street for 16 years. Most financials aren't on the bleeding edge. Consider that it took Spring at least 3 years to gain traction before it was accepted as an Enterprise Java solution. While RedisDB may have the potential to fill the OP's need, it is a niche product that isn't widely used. Dice has maybe 20 jobs in the NYC area asking for Redis expertise, and those are ancillary skills.

    One of the criteria the OP mentioned was the ability to continuously create new application builds. This requires starting/stopping his main profiling engine. Constantly serializing/deserializing data across many restarts seems like a waste of time, hence going with a simple lightweight client/server setup, which, again, he was able to accomplish fairly rapidly.

    The fact of the matter is that RedisDB ON PAPER seems to be an ideal fit for what the OP wants to do. It is specifically an in-memory datastore for collection classes. But you need to understand that it is a form of client/server: RedisDB == server. A RedisDB server is still a separate process he would have to start and manage.

    As for the performance of RESTful services, you seem to have a weak implementation in mind. RESTful is basically HTTP. It's extremely fast, with little overhead compared to something like SOAP, which is slower.

    Next time, just present your thoughts on why RedisDB might be a good solution rather than trying to hate on other people's ideas. Your tone has been pretty arrogant.
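The "simple lightweight client/server setup" described here can be sketched with the JDK's built-in `com.sun.net.httpserver.HttpServer` (the endpoint path and payload are made up for illustration): the long-running server process holds the data, and each freshly rebuilt client just fetches it over localhost instead of re-deserializing it from disk on every restart.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Bare-bones data server: the series lives in the server process;
// a rebuilt client pulls it with one GET rather than reloading it.
public class DataServer {
    public static String fetchOnce() throws Exception {
        // Server side: serve an in-memory series on an ephemeral port.
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/series", exchange -> {
            byte[] body = "100.5,101.25,99.75".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        try {
            // Client side: one GET pulls the cached series.
            int port = server.getAddress().getPort();
            URL url = new URL("http://localhost:" + port + "/series");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fetchOnce()); // prints 100.5,101.25,99.75
    }
}
```

In a real setup the server would stay up across client rebuilds; this sketch starts and stops it in one process only so it is self-contained.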

     
    #36     Jul 28, 2013
  7. Craig66

    Craig66

    I use Rcpp to load the binary data into R.
    This seems to be a fairly popular solution.
     
    #37     Jul 28, 2013
  8. * You are probably right that most financials operate on legacy technology as long as it does not directly solve imminent front-office problems. That is the reason why DB and other investment banks run tens of thousands of applications: not because they are needed, but because nobody dared to consolidate and cut legacy backward compatibility at the expense of higher IT budgets. It's very unfortunate, because it is a huge cost driver due to expensive bugs and errors that originate from such a chaotic stack, but it's most likely not going to change, because IT managers are paid to solve today's problems, not tomorrow's.

    * May I please correct you: Redis is not a niche product. It's used in the back end of some of the highest-traffic websites worldwide and also by quite a number of large-cap corporations. Almost everybody involved in C++, Java, or .NET has heard of Redis. Lol, and the reason no Redis experts are sought on job boards is that you don't need an expert!!! It's easy to configure and just works, contrary to the total SQL mess. I looked at Entity Framework, and it's laughable how complicated the solutions in the ORM world are, just to pull some data out of traditional RDBMSs. I am not saying RDBMSs do not have their place, but it is certainly not anywhere in time series data management.

    * You do not seem to understand the task at hand, hence my questioning of your expertise: NOBODY needs to serialize data when loading historical data into a profiling/testing platform. The data is deserialized from binary data structures and loaded into the platform. This process is an order of magnitude faster than anything REST or an RDBMS could ever manage. No Oracle, no SQL Server, no other database can by definition compete with a flat binary data structure, because of all the overhead such databases force on you.

    * Now you suddenly advocate Redis as the solution for the OP? Confused....

    * Lol, dude, REST is the preferred way to serve content on web pages over SOAP these days, but it is incredibly slow for the problem the OP has at hand, which is loading historical data from a data source. Please get real!!! I am not sure what kind of data throughput is acceptable to you, but if you have no problem with parsing data to and from string-based formats, then we are already in different camps.

    I do not hate your ideas; I simply say that your idea is a poor choice to solve this problem. And the post I just replied to shows me that you yourself may want to inform yourself a bit better before you make claims that are not substantiated at all.
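The string-parsing point above can be made concrete with a small sketch (no benchmark claims, just the two decode paths side by side): a text transport has to split and parse every number from a string, while a binary layout is a fixed-offset read with no intermediate strings.

```java
import java.nio.ByteBuffer;

// Contrast the two decode paths the thread argues about: text-based
// transports (e.g. CSV over HTTP) parse each field from a string,
// while a binary layout is just a sequence of fixed-width reads.
public class DecodePaths {
    // Text path: split, then one allocation + parse per field.
    public static double[] fromCsv(String line) {
        String[] parts = line.split(",");
        double[] out = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Double.parseDouble(parts[i]);
        }
        return out;
    }

    // Binary path: fixed-width reads, no parsing, no temporary strings.
    public static double[] fromBinary(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        double[] out = new double[bytes.length / Double.BYTES];
        for (int i = 0; i < out.length; i++) out[i] = buf.getDouble();
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(2 * Double.BYTES);
        buf.putDouble(100.5).putDouble(99.75);
        System.out.println(fromCsv("100.5,99.75")[0]);  // prints 100.5
        System.out.println(fromBinary(buf.array())[1]); // prints 99.75
    }
}
```

Both paths yield the same values; the difference is that the binary path does no tokenizing or number parsing at all, which is the crux of the throughput argument.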



     
    #38     Jul 28, 2013
  9. Fair point, I like that package as well. In the end it probably comes down to the same time consumption (Redis/memory-mapped vs. files to move data from your app to R); the difference is whether you want to persist all data on disk or just run a quick investigation in R.

     
    #39     Jul 28, 2013
  10. Another thing everybody has overlooked is that his current system may not have enough RAM to hold all the data. If so, a goodly portion may be swapped to/from disk at some point during the tests.....

    In the long term, he'll want a system with 64-256 GB of RAM if he wants to use memory store in the future *AND* have the system usable for other things...

    In the short term, he can just buy 3-6 SATA III SSD drives, configure as RAID-0, and go the binary file store route.... Plenty of space and speed....
     
    #40     Jul 28, 2013