Fully Automated Stocks Trading

qlai · Jun 15, 2020

guru said:
but they're very realistic real-time paper trades

Does this mean you are getting real time market data for all stocks? What about options market data?

Same Lazy Element · Jun 15, 2020

Very exciting.
- what do you think are the median market cap and ADV among your traded names?
- what is your pnl/tradeval and do you think that correlates with ADV or market cap?

guru · Jun 15, 2020

qlai said:
Does this mean you are getting real time market data for all stocks? What about options market data?

Yes, I’m getting real-time market data for all stocks, but not for options.
It actually would be too much data to process all the options in real-time, so I’m testing them with eod data, but I’m able to estimate the price I’d want to pay for some option combos and can place orders that may go through at any time during the day. Basically trying to scalp them, or at least get a decent price.

ValeryN · Jun 15, 2020

Wow, so many questions in just under 24 hours!

I didn't expect that at all.

After replying to the last one I'll try to sit down and write a first piece on something valuable about automated trading and mechanical strategies, as initially intended.

Same Lazy Element said:
Very exciting.
- what do you think are the median market cap and ADV among your traded names?
- what is your pnl/tradeval and do you think that correlates with ADV or market cap?

When exploring raw statistical edge, I certainly do see that there are more inefficiencies in small caps: the smaller, the better. But it might be impossible to trade them profitably live due to slippage and lack of liquidity. For example, when I started I used to backtest including some stocks with volume like 1,000 or 5,000 shares per day, but quickly discovered that live trades were either not entered or had just terrible slippage, like a few percent.

Here is the PL distribution by market cap for 2020, plus stats.

Note that I'm grouping trades into "bins" here. I'll explain how it works just in case: all trades are sorted by MarketCap and divided into 20 equal groups; each group is then labeled with its MarketCap range, and its value is the average PL%. MarketCap is in billions of dollars. For example, the first "bin" on the left represents the 29 trades with the lowest market cap, ranging from $0.07 to $0.33 billion, with an average PL% of ~0.75% (Y-axis).
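The binning described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `(market_cap_bil, pl_pct)` tuple layout and the function name `bin_trades` are hypothetical and not taken from the actual reporting code.

```python
from statistics import mean


def bin_trades(trades, n_bins=20):
    """Sort trades by market cap and split them into n_bins near-equal groups.

    `trades` is a list of (market_cap_bil, pl_pct) tuples (hypothetical layout).
    Returns one dict per bin: market-cap range, trade count and average PL%.
    """
    ordered = sorted(trades, key=lambda t: t[0])
    n = len(ordered)
    bins = []
    for i in range(n_bins):
        # Integer boundaries spread any remainder across the bins.
        lo, hi = i * n // n_bins, (i + 1) * n // n_bins
        group = ordered[lo:hi]
        if not group:
            continue  # fewer trades than bins
        bins.append({
            "cap_min": group[0][0],
            "cap_max": group[-1][0],
            "count": len(group),
            "avg_pl_pct": mean(pl for _, pl in group),
        })
    return bins
```

Each returned dict corresponds to one bar of the chart: the market-cap range goes on the X-axis and `avg_pl_pct` on the Y-axis.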

ValeryN · Jun 15, 2020

guru said:
...
I came up with my system from scratch and didn't read any books, as I wanted to make sure that I come up with original ideas on my own. But it started with brute force testing of every indicator and combination of indicators, as well as coming up with my own indicators. Then evolved into a self-adjusting or self-evolving system.
....

What an interesting path! Thank you for sharing with details and context.

Same Lazy Element · Jun 15, 2020

ValeryN said:
Note that I'm grouping trades into "bins" here. I'll explain how it works just in case: all trades are sorted by MarketCap and divided into 20 equal groups; each group is then labeled with its MarketCap range, and its value is the average PL%. MarketCap is in billions of dollars. For example, the first "bin" on the left represents the 29 trades with the lowest market cap, ranging from $0.07 to $0.33 billion, with an average PL% of ~0.75% (Y-axis).

Thanks! I assume you are scaling your target by the signal strength and attenuating your position size by the realized impact? That's what I'd do if I was running a small account and targeting smaller names, but there are many ways to skin that cat.

I don't know how much GMV that is, but it definitely looks like you are perfectly taking advantage of the less liquid space which is great. For comparison, an institutional stat arb book would be using some sort of liquidity scaling. E.g. usually it's some function that targets the smaller of volume participation and max dollar exposure - anywhere from 60-80% of the names traded would get clipped based on the lack of volume.

Are your alphas mostly single-name or cross-sectional?

ValeryN · Jun 16, 2020

There are some really good posts on this forum about trading systems development, so I am not going to start a big post on that, just a small intro. Btw, my favourite posts are from the early 2000s by Acrary. Now it is 2020 and, perhaps, a lot of people are asking themselves if that stuff still works. I'll start with: yes, it does, and it is certainly worth studying.

I feel like the general process of developing and assessing trading systems doesn't change that much. New techniques are being added, along with alternative sources of data, alternative methods of optimization, edge exploration, etc., but in essence it is about a trader going through numerous cycles with improved efficiency over time as:

- Understanding of the markets evolves
- Good assumptions are developed and validated with live trading
- The trader becomes proficient at running those cycles
- The trader finds what he is comfortable with and what he wouldn't be able to tolerate in live trading
- The trader develops a good sense of what looks right and what doesn't

Here is what a typical cycle looks like:

1. Looking for an edge which can be formally described and validated
2. Packing it into a trading system
3. Assessing if it is possible to exploit after factoring in trading costs and real-life assumptions
4. Running a lot of various types of testing
5. Looking at correlations with current systems
6. If all acceptance checks pass, adding it to a system of systems

On top of it there can be different levels of automation. In my case everything after #6 is automated, meaning data updates, setup generation, entry/exit management, and most of the routine reporting. Research, for the most part, is never automated, but some steps can be.
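The correlation check in the cycle above is one of the steps that is easy to automate. Below is a minimal sketch, assuming each system is represented by a list of daily returns over the same dates. The function names and the choice of Pearson correlation are assumptions for illustration; the post does not say which measure is used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long return series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a flat series carries no co-movement information
    return cov / sqrt(vx * vy)


def max_abs_correlation(candidate, existing):
    """Highest absolute correlation between a candidate system and the
    systems already traded. `existing` maps system name -> daily returns."""
    return max(abs(pearson(candidate, r)) for r in existing.values())
```

A candidate whose `max_abs_correlation` is close to 1 adds little diversification to the existing set, whichever acceptance threshold is chosen.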

Over time I’ll try to write on edge exploration, types of testing I do, data, typical mistakes, tools and automation, but let’s start with something really basic to set a common terminology for future posts.

Below is an example of a complete mechanical system. "Complete" doesn't mean it is suitable for live trading or performs well. It just has the minimal set of things we need to define so we can consistently trade it day after day, expecting that live results will not deviate too much from the model run over the same time period. Having a complete system is a prerequisite for trading it and for future automation.

Notice the minimal essential parts that need to be defined, such as:

- Trading universe
- Setup
- Entry condition
- Exit conditions (PT, SL, timed, etc.)
- Position size and daily/total limits
- Costs such as commissions / slippage
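One way to keep these parts explicit is to hold them in a single structure and refuse to trade anything with a blank field. The sketch below is purely illustrative: the class name, field names and every value are hypothetical placeholders, not the system from the post.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SystemSpec:
    universe: str                # which instruments are eligible
    setup: str                   # condition that puts a name on the watch list
    entry: str                   # trigger that actually opens the position
    profit_target_pct: float     # PT exit
    stop_loss_pct: float         # SL exit
    max_hold_days: int           # timed exit
    position_pct: float          # position size as % of account
    max_new_per_day: int         # daily limit
    max_open: int                # total limit
    commission_per_share: float  # cost assumption
    slippage_pct: float          # cost assumption


def undefined_parts(spec):
    """Names of fields that are still None or a blank string."""
    missing = []
    for f in fields(spec):
        value = getattr(spec, f.name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(f.name)
    return missing


# Placeholder values only, chosen to show the shape of a complete spec.
EXAMPLE_FIELDS = dict(
    universe="placeholder universe", setup="placeholder setup",
    entry="placeholder entry", profit_target_pct=1.0, stop_loss_pct=1.0,
    max_hold_days=1, position_pct=1.0, max_new_per_day=1, max_open=1,
    commission_per_share=0.0, slippage_pct=0.0,
)
example = SystemSpec(**EXAMPLE_FIELDS)
```

`undefined_parts(example)` returns an empty list; a draft with, say, an empty `entry` string would be reported as incomplete.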

This is certainly not rocket science and is well described in some good books on mechanical/automated trading, but I assume not everyone has read them, and having those established will be important for future reference.

Now here is an example of this system's equity curve and some popular stats. Looks nice, right? But this data is garbage. Even though it does include reasonable trading cost assumptions, it represents a period of only 2 years. The big problem with that is it doesn't give us enough data over different market cycles. I intentionally picked a good-looking one to show how Return/DD/Sortino/Sharpe/Win%/Expectancy and many other popular metrics can only be useful within a very specific context, but are almost never telling enough to conclude whether a system is good to trade or not.

Now let's look at some other period of a few years, 2012-2015 for example. Same system. Expectancy is now negative, win% is lower, and max DD has doubled.

I hope those examples I started with will give people some idea of how deceiving equity curves and widely used stats can be. Also, they don't have predictive power on their own. Meaning: if someone tells you their system has this win%, max DD, ARR, expectancy, etc., it doesn't mean it will be this way in the future.
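To see how much these stats depend on the window, the same trade list can be scored over several periods. The sketch below assumes trades are stored as `(year, pl_pct)` pairs in chronological order and measures drawdown on the running sum of trade PL% without compounding. Both are simplifying assumptions, and the function names are hypothetical.

```python
def trade_stats(pl_pct):
    """Win rate, expectancy (average PL% per trade) and max drawdown
    of the running PL% sum for a chronological list of trade results."""
    if not pl_pct:
        raise ValueError("no trades in sample")
    n = len(pl_pct)
    win_rate = sum(1 for r in pl_pct if r > 0) / n
    expectancy = sum(pl_pct) / n
    running = peak = max_dd = 0.0
    for r in pl_pct:
        running += r
        peak = max(peak, running)
        max_dd = max(max_dd, peak - running)  # deepest drop from a prior peak
    return {"trades": n, "win_rate": win_rate,
            "expectancy": expectancy, "max_dd": max_dd}


def stats_by_period(trades, periods):
    """trades: list of (year, pl_pct); periods: {label: (first_year, last_year)}."""
    out = {}
    for label, (first, last) in periods.items():
        sample = [pl for year, pl in trades if first <= year <= last]
        if sample:
            out[label] = trade_stats(sample)
    return out
```

Running `stats_by_period` with one entry per market regime puts the numbers side by side, which makes it harder to judge a system from a single flattering window.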

Nevertheless, tests like that are pieces of a puzzle, and the ability to produce them, plus backtesting skills, is important for building mechanical systems.

ValeryN · Jun 16, 2020

Same Lazy Element said:
Thanks! I assume you are scaling your target by the signal strength and attenuating your position size by the realized impact? That's what I'd do if I was running a small account and targeting smaller names, but there are many ways to skin that cat

Could you explain what that means? I feel like there is something I can learn here.

In a general sense I'm trying to keep things simple and avoid some execution challenges by using "micro" positions. Many traders say not to risk more than 2% per trade; my whole individual trade value might be 5% of my account, and that helps tremendously with exploiting small opportunities.

Same Lazy Element said:
I don't know how much GMV that is

If I'm guessing correctly that GMV is "Gross merchandise value" then, just to give some idea, I don't have any positions bigger than $30,000 USD in value. And I don't do penny stocks. So typically my order sizes are not exactly disrupting the market.

There is no doubt that I'm taking advantage of small account size.

Same Lazy Element said:
Are your alphas mostly single-name or cross-sectional?

If you mean single stock or multiple, I rarely have positions in the same stock. It depends on the strategy, but in general most individual stocks have very few lifetime opportunities of the kind I'm looking for.

My universe is large enough to generate lots of trades. Plus, I trade both the long and short side and multiple strategies.

Same Lazy Element · Jun 16, 2020

Sorry, working in the industry you get used to a specific jargon and expect other people to follow - my bad.

ValeryN said:
Could you explain what that means? I feel like there is something I can learn here.

The idea is simple enough.
- Let's assume that you have a collection of buy/sell signals that are continuous in nature (z-scores or something similar) and that you can have an existing position in the security
- Instead of defining a binary threshold, you would take a bigger position if the signal is stronger and a smaller position if the signal is weaker, with some reasonable cap to avoid extremes/errors.
- Define your target size as a function of that signal, with some throttling to avoid flicker and with some max sizes to avoid taking positions that are too large from a risk or liquidity perspective. Also, apply your name-level risk limits:
Code:
mod_target = max(min(target + trade_threshold, cur_pos), target - trade_threshold)
lim_target = sign(mod_target) * min(ADV * adv_lim, MV * mv_lim, abs(mod_target))
trd_value = lim_target - cur_pos
- You break down your trd_value into chunks/orders and once you start executing, you monitor your realized price impact and stop if you exceed some pre-defined level of impact.
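The three lines of pseudocode above translate almost directly into runnable Python. The sketch below makes some assumptions: all quantities are signed dollar values, `mv` is whatever dollar base the `mv_lim` fraction applies to (the post does not define it), and impact is measured against the arrival price, which is one common convention but not necessarily the one meant here.

```python
def sign(x):
    """-1, 0 or +1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def trade_value(target, cur_pos, trade_threshold, adv, adv_lim, mv, mv_lim):
    """Dollar amount to trade (positive = buy, negative = sell)."""
    # Throttle: if cur_pos is within trade_threshold of target, keep it;
    # otherwise move only to the nearest edge of the band around target.
    mod_target = max(min(target + trade_threshold, cur_pos),
                     target - trade_threshold)
    # Clip by liquidity (fraction of ADV) and by max dollar exposure.
    cap = min(adv * adv_lim, mv * mv_lim)
    lim_target = sign(mod_target) * min(cap, abs(mod_target))
    return lim_target - cur_pos


def impact_bps(arrival_price, avg_fill_price, side):
    """Realized cost versus arrival price in basis points.
    side is +1 for buys and -1 for sells; positive means adverse."""
    return side * (avg_fill_price - arrival_price) / arrival_price * 1e4
```

With `trade_threshold=5_000` and generous caps, a position of 18,000 against a target of 20,000 yields a trade of zero, which is the anti-flicker behavior described above; the executor would then stop slicing once `impact_bps` exceeds the chosen limit.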

This way, you take advantage of illiquid names to the max before you start pushing them. As I said, there are many ways to skin this cat and you can come up with different approaches depending on your specific setup.

ValeryN said:
If I'm guessing correctly that GMV is "Gross merchandise value" then, just to give some idea, I don't have any positions bigger than $30,000 USD in value. And I don't do penny stocks. So typically my order sizes are not exactly disrupting the market.

GMV is gross market value, i.e. the sum of the absolute values of all your positions in dollar terms. It's a pretty reasonable way to think about overall exposure. I was primarily asking about your total size to get a liquidity perspective (for example, at my previous job, an average statarb PM would have several billion in GMV), but it is also common to think about your returns, drawdowns, etc. in terms of return on GMV.
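As a small worked illustration of that definition, assuming positions are kept as a mapping from ticker to signed dollar value (long positive, short negative); the tickers and function names are made up:

```python
def gross_market_value(positions):
    """Sum of absolute dollar values of all positions."""
    return sum(abs(v) for v in positions.values())


def net_market_value(positions):
    """Signed sum: long dollars minus short dollars."""
    return sum(positions.values())


def return_on_gmv(pnl, gmv):
    """PnL expressed as a fraction of gross market value."""
    if gmv <= 0:
        raise ValueError("GMV must be positive")
    return pnl / gmv


# Hypothetical book: two longs and one short.
book = {"AAA": 30_000.0, "BBB": -20_000.0, "CCC": 10_000.0}
```

For this book the GMV is 60,000 while the net is only 20,000, which is why gross rather than net is the usual base for judging total size and liquidity needs.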

ValeryN said:
There is no doubt that I'm taking advantage of small account size.

Which is smart, it's a big part of your edge.

ValeryN · Jun 17, 2020

Same Lazy Element said:
GMV is gross market value, i.e. the sum of the absolute values of all your positions in dollar terms. It's a pretty reasonable way to think about overall exposure. I was primarily asking about your total size to get a liquidity perspective (for example, at my previous job, an average statarb PM would have several billion in GMV), but it is also common to think about your returns, drawdowns, etc. in terms of return on GMV.

Average overnight exposure as a % of account is 77.19% this year. With very high historical volatility there were occasional spikes up to ~190% which normally might not happen over 5-10 years.

Designing and testing over the last 70 years of data certainly helped a lot with the current market, as I wasn't trading through 1987, the dot-com bubble, or 2008.

In $ terms, even including margin, this is under $1 million of exposure. Attached is the overnight exposure graph by strategy. Above 0 is long, below is short.