Discussion in 'Programming' started by chromosome, Dec 11, 2011.
Either for strategy backtesting/development or actual algo trading through an API?
All the way.
Best kept secret out there, although not sure if one can call it a secret, because it's open source, most likely bunch of people just don't go around evangelically preaching about it. By the way, since living completely in the open source world for several years, I can breathe again.
Back to Python environment.
Beautiful simple syntax, easy to use, gazillion libraries. Most important, excellent performance, contrary to most uninformed opinions out there. That would be most obvious in backtesting where you would most likely face some serious volume. In order to get there you have to grasp properly issues surrounding large volume processing (keyword "vectorization") and get yourself some popular libs (scipy/numpy). As a result, you have the ridiculously easy environment to code with performance approaching C levels. If you really crave terminal speed, with some expanded coding you can use Cython and get the C compiled code (not really necessary in most cases). If not the most popular environment in scientific community, it is definitely one of the most popular. Best combo of simplicity in use and performance.
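To make the "vectorization" keyword concrete, here is a minimal sketch comparing a plain Python loop with a NumPy version of the same calculation. The moving-average task and the function names are assumptions chosen for illustration, not something from the post.

```python
import numpy as np

def sma_loop(prices, window):
    """Simple moving average with a plain Python loop (slow on large inputs)."""
    out = []
    for i in range(window - 1, len(prices)):
        out.append(sum(prices[i - window + 1:i + 1]) / window)
    return out

def sma_vectorized(prices, window):
    """Same moving average via a cumulative sum, with no Python-level loop."""
    prices = np.asarray(prices, dtype=float)
    # Prepend 0 so that csum[k] holds the sum of the first k prices.
    csum = np.cumsum(np.insert(prices, 0, 0.0))
    return (csum[window:] - csum[:-window]) / window

if __name__ == "__main__":
    sample = [10.0, 11.0, 12.0, 13.0, 14.0]
    print(sma_loop(sample, 3))
    print(sma_vectorized(sample, 3))
```

Both functions return the same values; the vectorized one pushes the inner loop down into compiled NumPy code, which is where the speedup on large tick arrays would come from.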
Coming from 25+ years in the corporate world, I really, really can't stand what's going on within IT these days, namely and primarily with Java crap, all those stupid frameworks and insane levels of OO-obsession. I don't care at all about the closure, apparently it's a big deal for some people. That's exactly the thinking that derailed the whole thing after C++ became fashionable. The OO-obsession has its place, of course, but instead of having it in moderation and in places where it is appropriate, there comes Java and now I cannot even fart without wrapping it into a class and letting my kids inherit it instead of wafting it in a straightforward and efficient manner. Whatever man ...
If you try using Python native code and build some objects, inherit them 55 times and add a couple of loops, you will be waiting for half an hour to sum up a single symbol's daily ticks. I do not suggest going through this painful discovery exercise. My comparison is to the database world: "normalize until it hurts and then de-normalize until it works", where in programming it's "try objects until it hurts and go back to functions until it works". Don't get me started on the web crap either, somebody is going to try to show me how much I am missing in XML "programming".
I am rather grumpy these days, because I am not getting good results out of some strategies I was hoping for, hence the negative tone, but I really mean it. At the same time I do not have intention to convince anybody because I do not care, neither do I have intention to respond or argue, because I do not care. Adult people should use and do whatever they see fit.
Having said that, you've asked about the python and here are some practical steps, don't have time to put the proper links, so google is your friend (but only as a search engine):
- Although it works perfectly on Windowz, try to move yourself to Linux, you will feel much better (I just can't resist recruiting), and every now and then there are some things that are less awkward and patchy, hence more natural in that environment. Best example is multi-processing (not talking about multi-threading), where Linux forks processes much more efficiently and the Python code is almost laughably simple, unlike the Java counterpart (I just can't resist).
- Python3 is well advanced, but I still use 2.6.6 and do not feel any pain.
- The SciPy lib is a must, or at least NumPy, which SciPy is built on.
- As I said, if you crave terminal speed, get Cython.
- If you need Large Hadron Collider speed, get F2PY and use existing or write your own Fortran procedures. Hell, there is a PyCUDA library too.
- If you want/have to use database, Sqlite3 has a module that comes with standard library. If you need a big one, I recommend PostgreSQL and supporting module psycopg2.
- I do incorporate a fair amount of AI and ML into my strategies and in that area scikit-learn covers almost everything. PyBrain is well documented and versatile, very good for development and learning, but slow and sluggish if you hit some volume (due to heavy use of Python objects). FANN is a very good C library with multiple Python bindings. There are also smaller, isolated modules (Kohonen SOM, some clustering, etc).
- For adhoc charting/plotting get Matplotlib (see attached sample).
- For GUI development wxWidgets is the king and wxPython is Python bindings for it (see attached sample).
- For charts within the GUI app, I recommend ChartDirector, which is the only non open source software of all mentioned, but is available for unlimited trial, and it's dirt cheap, can't believe the quality for that price. Still goes against my open source principles, I guess you can't win them all. It is one of the best libraries I have ever worked with. I am not associated with the author and I ended up only prototyping some stuff.
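As a rough illustration of the multi-processing point in the list above, here is a small sketch using the standard library's multiprocessing.Pool. The per-symbol volume task and the sample data are made up for the example.

```python
import multiprocessing as mp

def total_volume(ticks):
    """Sum traded size for one symbol; ticks is a list of (price, size) pairs."""
    return sum(size for _price, size in ticks)

def volumes_by_symbol(data, processes=2):
    """Hand each symbol's ticks to a worker process and collect the totals."""
    symbols = list(data)
    with mp.Pool(processes=processes) as pool:
        totals = pool.map(total_volume, [data[s] for s in symbols])
    return dict(zip(symbols, totals))

if __name__ == "__main__":
    # Hypothetical sample data, one list of ticks per symbol.
    sample = {
        "EURUSD": [(1.3385, 100), (1.3386, 250)],
        "USDJPY": [(77.60, 300)],
    }
    print(volumes_by_symbol(sample))
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by spawning a fresh interpreter instead of forking, which is one of the places where Linux feels less awkward.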
That will pretty much cover you from head to toe, however there is bunch of other stuff too. It's just too much to squeeze into single list. As for myself, besides proprietary stuff, I've developed some API modules that I intend (eventually) to release into public domain, but I am nasty busy right now. Also, they are not polished to perfection, but mostly in pretty good shape.
- IB TWS API client. I have struggled for a while with that horrible piece of software. The protocol design is horrendous and all the ongoing patches and extensions have made it into a Frankenstein creature. I have finished a working version and have parked it aside, because I will not be using IB as a broker for the time being.
- FIX API client. As you probably know, that's the big boys' standard protocol. It is (as every other should be as well) platform agnostic and rather well conceived, however it (inevitably) sustained some silly/stupid extensions. It ended up somewhat bloated and not compatible between the parties involved, because everybody decided to pick and choose the standard parts that they are going to conform to and to do custom extensions for everything else. Hence it's very tough to have a version that is compatible between different brokers, unless of course you shove the whole protocol into the library, in which case it's not lean and mean any more. Initially I did a version for the Deutsche Bank FX retail brokerage, which is actually the FXCM white label software. After I was almost done, I bailed out when I realized (my impression at least) how amateurish the whole department is (running the whole thing over an unsecured connection and wondering why I am asking). Next, I have a Dukascopy FIX client and that worked pretty well, however I've decided to switch to their Java client (arrrggghhh) for specific reasons. I believe this FIX client code base can be adjusted with minimal effort to work with the OANDA FIX API as well as the IB FIX API.
- Dukascopy Java API client. Just to be clear, my client is Python and their client/trading app is Java. I had to write two "strategies" in Java that run a TCP server inside the trading app, which I connect to with my client.
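For readers who have not seen FIX on the wire, here is a minimal sketch of the tag=value framing only: BodyLength(9) and CheckSum(10) around SOH-delimited fields. It is not the client described above; the function names are made up, and a real session message needs more header fields (SendingTime and so on) than this shows.

```python
SOH = "\x01"  # FIX field delimiter

def fix_checksum(data: bytes) -> str:
    """CheckSum(10): sum of all bytes modulo 256, as three digits."""
    return "%03d" % (sum(data) % 256)

def build_message(begin_string, msg_type, fields) -> bytes:
    """Frame fields with BeginString(8), BodyLength(9) and CheckSum(10)."""
    body = "".join(f"{tag}={value}{SOH}" for tag, value in [(35, msg_type), *fields])
    # BodyLength counts the bytes after its own SOH up to the CheckSum field.
    head = f"8={begin_string}{SOH}9={len(body)}{SOH}"
    raw = (head + body).encode("ascii")
    return raw + f"10={fix_checksum(raw)}{SOH}".encode("ascii")

def parse_message(raw: bytes):
    """Split into (tag, value) string pairs and verify the trailing checksum."""
    cut = raw.rfind(b"\x0110=") + 1  # index where the CheckSum field starts
    if cut == 0 or not raw.endswith(b"\x01"):
        raise ValueError("no trailing CheckSum(10) field")
    pairs = [tuple(f.split("=", 1)) for f in raw.decode("ascii").split(SOH) if f]
    if pairs[-1][1] != fix_checksum(raw[:cut]):
        raise ValueError("checksum mismatch")
    return pairs

if __name__ == "__main__":
    # Illustrative sender/target IDs; not a complete, session-valid message.
    msg = build_message("FIX.4.4", "0", [(49, "CLIENT"), (56, "BROKER"), (34, 1)])
    print(parse_message(msg))
```

The framing itself is this small; the incompatibility between brokers comes from which tags each side requires and which custom ones they add on top.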
All those API clients are scalable, multi-processing, using non-blocking TCP sockets, SSH included, database or binary packed files as output, and other bla, bla... As I said, I do not have any intention to commercialize that software and will be looking to set up a development project in the public domain.
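"Binary packed files" can be as simple as fixed-size records written with the standard struct module. A minimal sketch follows; the record layout (timestamp in epoch milliseconds, bid, ask, size) is an assumption for illustration, not the layout those clients actually use.

```python
import struct

# Hypothetical tick record: epoch milliseconds, bid, ask, size.
# "<" means little-endian with no padding: int64, float64, float64, uint32.
TICK = struct.Struct("<qddI")

def pack_ticks(ticks) -> bytes:
    """Pack an iterable of (ts_ms, bid, ask, size) tuples into one byte string."""
    return b"".join(TICK.pack(*tick) for tick in ticks)

def unpack_ticks(data: bytes):
    """Unpack a byte string produced by pack_ticks back into a list of tuples."""
    return list(TICK.iter_unpack(data))

if __name__ == "__main__":
    ticks = [(1323561600000, 1.3385, 1.3387, 100000)]
    blob = pack_ticks(ticks)
    print(len(blob), unpack_ticks(blob))
```

Writing the result of pack_ticks to a file opened in "wb" mode gives a compact file that can be read back in one call, or loaded straight into a NumPy structured array with a matching dtype.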
There you have it, I feel much better now. Going back to cave to continue fighting my strategies. Those neural networks are not that clever as many people would like us to believe, trust me...
If you are serious with Python and need some help contact me on python at grupadinar dot com, and time permitting I will try to help.
In the spirit of the open source community, anybody else as well...
great post 6yaNYCjm5m.
I am starting the process of converting over.
research platform is almost all python with some R.
As for FIX (or connectivity), I use Java in the gateway for orders and market data, then convert all messages to an internal format and redistribute them; the consumers (bots) can then be written in anything.
Wow you wrote a charting application using wxpython.. looks pretty nice! How's the graphics performance for things like scrolling around, zooming, redrawing, etc?
My point with communication protocols is that they should be plain character, with good message structure, verification, etc., where the parties obey clear docs on how to process the messages. Simplified, but that's the way we stay platform agnostic: I couldn't care less if you are using a wooden stick on your side, as long as you obey the protocol. And I am not forced to use any "client" libs or any specific environment. As a matter of fact, ALL connectivity, deep down, IS plain character; that's the only way two machines can talk to each other.
Case in point is the IB TWS client, which masks the complete absence of any structure in the communication protocol itself by having all the logic crammed into a web of convoluted classes. As a result, you have bad code, patched all over the place, forcing you to live with bugs in every new release. It's a travesty that such a big retail business is running on something like that, but hey, what can I do about that...
every place I have been converts all external messages to an internal (in house) format (ie. xml, json, binary something) so a market data message from IB looks like a market data message from FIX4.3 or whatever.
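A minimal sketch of that normalization idea: one in-house quote type plus a small adapter per source. The Quote class and the JSON key names are made up for illustration; the FIX tags used (55 Symbol, 132 BidPx, 133 OfferPx) are standard ones.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quote:
    """Hypothetical in-house market data message."""
    symbol: str
    bid: float
    ask: float
    source: str

def from_fix(fields: dict) -> Quote:
    """Adapter for a FIX-style tag/value dict (55=Symbol, 132=BidPx, 133=OfferPx)."""
    return Quote(fields["55"], float(fields["132"]), float(fields["133"]), "fix")

def from_vendor_json(msg: dict) -> Quote:
    """Adapter for a made-up JSON feed; the key names are assumptions."""
    return Quote(msg["sym"], float(msg["b"]), float(msg["a"]), "json")

if __name__ == "__main__":
    print(from_fix({"55": "EUR/USD", "132": "1.3050", "133": "1.3052"}))
    print(from_vendor_json({"sym": "EUR/USD", "b": 1.3050, "a": 1.3052}))
```

Downstream consumers only ever see Quote objects, so adding another broker means writing one more adapter and touching nothing else.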
Wow, great post 6yaNYCjm5m. I'm very familiar with the modules you mention. Another one of note is CVXOPT, which is an optimization package.
That's only the prototype, I don't really have any intention to develop a full blown app, although 75% is already there. It's utilizing the ChartDirector lib, which is a C++ lib, and wxPython is just a wrapper for wxWidgets, which is also C++, so the whole thing is pretty fast. The charting lib is not natively built for real-time charting, but it can generate an image and pass it as a binary string to the wxWidgets control and it works OK. I actually keep it at a 1 sec refresh interval, although it could easily go down to 0.2 sec, maybe even lower, depending on how much stuff you load into it. That's when you would be pumping the real-time feed; in offline mode it's irrelevant, you can't tell any lag while scrolling, zooming, etc. And I don't have any fancy graphics card, still running passive cooling, can't stand the noise...
What people are missing in real-time charting is, it doesn't matter that CME pumps out at 4ms, you have network latency to start with and your screen can not refresh at that speed and your eye blinks in 150ms and your reaction is another 100 ms and people still believe they can scalp something if they click fast enough...
If doing something serious (maybe one day) I would probably develop core stuff from scratch mixing Cairo and wxPython, and that would look nice, be super fast, beat the crap out of .NET and Java app, run on everything.... As you can see I am constantly mean to Microsoft, Sun/Oracle and above everything to the new mean FASHION empire of Apple