Building AI Trading Systems

henning|5 years ago

I tried doing some forecasting with various neural network models after assembling what I thought was a good amount of forex data. The neural net (I tried various architectures) couldn't do any better than chance. After playing around with it and trying to double-check everything, that was as far as I could get. This puts me ahead of most traders, since most of them lose money, then quit.

This makes me wonder what kind of trading systems can actually have any kind of edge, since some kind of autoregressive time series forecasting system seems pretty unreliable.

On a more general note, how do you move beyond it being gambling? Just because a system backtests well doesn't mean a phenomenon will continue to happen, especially if your system will significantly impact the market you're in. If you make a trend-following system, every time you trade, you're gambling that the trend is more likely to continue than not. If you're right, you'll come out ahead over many trades. But if, like most beginners, you don't have enough capital to withstand drawdowns, you won't be able to last long enough for whatever phenomenon you've found to average out.

It takes a lot of time, effort and risk to do all this, so, this is a long-winded way of saying I don't think it's for me. If you build a SaaS product and it fails, at least you can talk about what you learned from building it and use that in future endeavors. If you lose money trading because your algorithm doesn't work, what do you learn from that besides that your algorithm doesn't work?

smabie|5 years ago

The most popular kind of quant trading is using a factor model. The first step is developing some alpha factor, a number that is predictive of how much money you'll make from each stock. So let's say my alpha factor is "companies with good earnings per share will go up." So I first take the EPS for all the stocks in my universe and maybe rank, then zscore. Now I have some positive numbers and some negative numbers. These represent the weights of my portfolio. The positive weighted companies I go long, the negative ones I go short. The bigger the number, the larger my allocations.

Now that I have my alpha factor I backtest it and whatever. Since the mean of a zscore is zero, I know I'm market neutral, so (ignoring some stuff) my factor should have little exposure to the market.
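A minimal sketch of the rank-then-zscore step described above, in plain Python. The tickers and EPS numbers are made up, ties are not handled, and the final scaling to a gross exposure of 1 is an added assumption, not something the comment specifies.

```python
from statistics import mean, pstdev

def rank_zscore_weights(eps_by_ticker):
    """Rank stocks by EPS, z-score the ranks, scale so |weights| sum to 1.

    Needs at least two stocks, otherwise the standard deviation is zero.
    """
    # Ascending rank: lowest EPS gets rank 0, highest gets rank n-1.
    ordered = sorted(eps_by_ticker, key=eps_by_ticker.get)
    ranks = {ticker: float(i) for i, ticker in enumerate(ordered)}

    mu = mean(ranks.values())
    sigma = pstdev(ranks.values())
    zscores = {t: (r - mu) / sigma for t, r in ranks.items()}

    # Assumption: normalise to unit gross exposure (longs + |shorts| = 1).
    gross = sum(abs(z) for z in zscores.values())
    return {t: z / gross for t, z in zscores.items()}

# Hypothetical universe of five stocks and their EPS.
eps = {"AAA": 5.1, "BBB": -0.4, "CCC": 2.3, "DDD": 0.9, "EEE": 7.8}
weights = rank_zscore_weights(eps)
for ticker, w in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{ticker}: {w:+.3f}")
print("net exposure:", round(sum(weights.values()), 12))
```

Positive weights are longs and negative weights are shorts. Because the z-scores have zero mean, the weights net out to roughly zero, which is the dollar-neutral property the comment relies on (dollar-neutral is not the same as beta-neutral, which is part of the "ignoring some stuff").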

If I think it's good, I add it to my other alpha factors and combine them somehow. Could be as simple as adding them all up, or maybe something like using random forests to figure out the best way to combine them, or whatever. Now that I have a bunch of alpha factors all combined, I can run them through the optimization engine.

The optimization engine will adjust the weights of my "ideal" portfolio in order to reduce exposure to various risk factors (thus lowering volatility). My optimizer will also figure out how often I need to rebalance. There's generally a bunch of terms in there that try to reduce trading costs and zero out exposure while not diluting the "ideal" portfolio too much (or else the alpha could be wiped out).

Now, after all of this, I'm ready to trade.

In short, what we're trying to do is reduce our exposure to as many factors as possible and just get exposure to our alpha factor. We don't want the market, price of oil, sex scandal of a CEO, or anything else affecting our portfolio. We are trying to dig up this latent, unearthed, alpha that exists in the market, but doesn't belong to one company or asset.

darkteflon|5 years ago

Most people seem to think indexing is a boring cop-out, but imo it’s the place of humility you get to after you’ve dashed yourself against the rocks of trying to outperform for a few years - or decades - and then realising the whole endeavour is insanity.

It’s accepting that you’ll receive what the market gives you and not a dollar more, and that that’s the best you’re ever going to get.

wavepruner|5 years ago

You need more data to input besides just the price time-series. Successful human traders balance and synthesize a myriad of data sources to make decisions.

I depend on an in-depth understanding of human psychology as one of my data sources. You can't turn something like that into data and input to a model. It is something learned through life experience and study.

tcgv|5 years ago

> since some kind of autoregressive time series forecasting system seems pretty unreliable.

A few months ago I tried to evaluate autoregressive behavior in stock returns. To my surprise it seemed strong on some periods, but then weak on others [1], and as you said not reliable enough to rely on.

My impression is that a lot more information aggregation and processing is required to obtain a sustainable edge worth trading on than what a single developer can achieve in his/her spare time.

Top investment shops have dedicated teams of sw engineers just to deal with the infrastructure that supports their data pipelines, financial model backtesting and deployment.

[1] https://thomasvilhena.com/2020/01/likelihood-of-autoregressi...
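One simple way to look at "strong in some periods, weak in others" is a rolling lag-1 autocorrelation of returns. This is only a sketch of that idea, not the method used in the linked post; the return series and window length are made up.

```python
def lag1_autocorr(returns):
    """Sample lag-1 autocorrelation of a list of returns."""
    n = len(returns)
    m = sum(returns) / n
    var = sum((r - m) ** 2 for r in returns)
    if var == 0:
        return 0.0  # constant series: define autocorrelation as 0
    cov = sum((returns[i] - m) * (returns[i - 1] - m) for i in range(1, n))
    return cov / var

def rolling_lag1_autocorr(returns, window):
    """Lag-1 autocorrelation over each consecutive window of the series."""
    return [
        lag1_autocorr(returns[i:i + window])
        for i in range(len(returns) - window + 1)
    ]

# Hypothetical daily returns: a mean-reverting stretch, then a trending one.
returns = [0.01, -0.01] * 5 + [0.001 * k for k in range(1, 11)]
for start, rho in enumerate(rolling_lag1_autocorr(returns, window=10)):
    print(start, round(rho, 3))
```

If the estimate swings between clearly negative and clearly positive across windows, as it does for this toy series, a single fitted AR coefficient over the whole history will not say much about the next window.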

gorgoiler|5 years ago

In a literal market, you can find alpha in tomato trading by:

(1) monitoring other people’s tomato transactions in as much detail as possible;

(2) spying on people who are about to buy tomatoes and rapidly make changes behind the scenes just before they make a purchase; or

(3) pointing voice analysis at The Food Network looking out for recipes that call for fresh tomatoes, tracking tomato tankers in major sea lanes, monitoring storm tracks in the top tomato growing zones, etc, and adjusting your position appropriately.

It sounds like you are trying (1) when (3) might be better, or even (2) if you are not jail-averse (or your local jurisdiction has institutionalized high speed market fiddling to the point of being legal.)

KKKKkkkk1|5 years ago

Let's say I want to predict the crop yield of a field. Sure, looking at the yield in previous years would help. But the yield is just a nonlinear projection of a point in high-dimensional space that has dimensions like weather, water availability, pest infestations, farmer skill, etc. All of these dimensions are incredibly relevant to forecasting, but once we've projected our points onto the yield axis, most of this information is gone. So if you want to take advantage of this information, you need to do your fitting in the original high-dimensional space.

user5994461|5 years ago

>>> This makes me wonder what kind of trading systems can actually have any kind of edge.

The secret is simply to have an edge.

If you're trading on behalf of clients, you don't care what happens to the market because you don't depend on the highs or lows to make money.

If you're buying or selling for yourself, same thing. Guess who's buying coal and oil: power plants, refineries and the like. They sell what they have and buy what they need.

If you're making money on arbitrage, you're making sure that New York and London show the same USD to GBP to EUR price and vice versa. You could make money, but you'd better be faster than other corporations and more careful at the same time, because you're not the only one doing that. Anytime you buy one side, the other side might have changed before you can balance out.
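As a sketch of the consistency check behind that kind of arbitrage: multiply the quoted rates around a currency loop and see whether you end up with more than you started with. The quotes below are invented, and fees, spreads and the execution race mentioned above are all ignored.

```python
def cycle_return(rates):
    """Gross return from converting through every rate in the loop once.

    rates[i] is units of the next currency received per unit of the current one.
    A result of 0.0 means the quotes are consistent; > 0 is a gross arbitrage.
    """
    product = 1.0
    for rate in rates:
        product *= rate
    return product - 1.0

# Hypothetical quotes: USD->GBP, GBP->EUR, EUR->USD.
quotes = [0.80, 1.15, 1.10]
edge = cycle_return(quotes)
print(f"gross edge around the loop: {edge:.4%}")
```

Any real edge has to survive transaction costs on all three legs, and the quotes can move between the first and last fill.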

There are clear factors that drive many markets. When the weather is cold, people consume more energy for heating. When it's hot, they go out to have barbecues and buy more sausages. When a drought or crop disease wipes out farms, prices of food and meat go up. Those are some examples that are easy to understand.

The stock market is not about speculation. It's about buying real items in the real world and providing services.

hogFeast|5 years ago

There are lots of ways to produce an edge. Forex is slightly different because you are trading a currency (this actually makes things easier in some ways) but, a few years ago, a lot of the cutting edge was news releases.

So inflation comes out at X% and then you try to jump ahead of other people reacting to the news.

Speaking very generally, you are looking for data that has information about future returns. So this may include past values of the time series (this is kind of complex though because a stock price does trend, that company is investing capital to earn a return which compounds in the price so stationarity is...complex) but may include other time series/their past values i.e. price of other stocks, economic data, etc.

So this could be responding to changes in liquidity, it could be seeing some repeatable behaviour by investors and jumping ahead of it, etc.

Quant is not about adding to the efficiency of markets though. They aren't using these models to determine the value of something, they are more about looking at the value of other things to determine the value of a given asset. So these strategies end up being correlated to liquidity in a lot of instances (but not all). This is a generalisation but...it is a very odd thing to have occurring in society...would this exist if investors didn't have an irrational demand for microsecond liquidity? Probably not.

Also, determining whether something is a real signal is just part of statistics, isn't it? This has definitely been an area where there has been quite a lot of innovation as increases in computational power have made non-parametric stuff more feasible (I am not an expert on this, it is just my understanding).

Btw, I should add I used to work in finance and I have some experience with this kind of thing as I do quite a bit of "quant investing" but in gambling (it is far easier to just copy what people do in finance and apply it to gambling than come up with it yourself). And just based on my experience, it makes most sense to employ a mixed approach. So learn about the business valuation, and then build a five-factor model...watch what it does, and then filter its picks with your knowledge. A lot of quant strategies are vaguely ludicrous if you have an understanding of the fundamentals of investing, like you are trying to use a computer to replicate a human...and people wonder why it doesn't work? It is an overcomplicated shortcut (to give you a concrete example, the blowup of value and funds like AQR was very obvious...you just had to look at the utter garbage stocks they owned). So I think a combination of human and computer beats either separately (one fund that does is Marshall Wace).

dbs|5 years ago

You don't learn. There are two rules: you don't talk about what works and you don't talk about what doesn't work. I worked professionally in the quant space up to 2008, and I still get calls for interviews, with people wanting to dig out what I am doing nowadays. What is popular or known loses alpha pretty quickly due to overcrowding.

anonu|5 years ago

Just a reminder: nobody ever wrote about their super successful trading strategy. It's just never happened. If you have the wherewithal to research and build a trading system that works, then you're smart enough to know that the moment you reveal your edge to the world, it disappears. Even if you don't discuss the innards of your strategy, but you talk about your process or the system your strategy is built on, you've revealed too much.

dennybritz|5 years ago

Yup. The only reason I am starting to write about it now is that I am no longer running the system. You could argue that it's not useful to write about systems that worked in the past, but I would disagree. New systems can work 99% the same way, but get an additional edge from somewhere else, like new data or better models. Most of the engineering will always be the same.

kristjansson|5 years ago

With the notable exception of Ed Thorp, who managed to write Beat the Market first, and then start a hedge fund to exploit the strategy 7 years later, and only when a reader proposed they go into business together.

Though it helped that the period was 1967 to 1974. The piranhas were a little slower back then

Erlich_Bachman|5 years ago

This is a simplified version of the truth. There is a lot of information that you can safely share because the number of people that will know where to look for it, know how to implement it, what to even do with it, how not to make any one of 100 possible stupid mistakes while implementing it - is very low.

Case in point: Warren Buffett. All of his process is public knowledge, he constantly writes and talks about it. And yet somehow it didn't make him lose his edge.

marketgod|5 years ago

I agree except if your strategy is something everyone uses then it becomes a self-fulfilling prophecy of winning.

GrumpyNl|5 years ago

The first thing that jumped to mind was Moneytron: they were predicting the market, and it turned out to be a fraud.

halfcat|5 years ago

I find most of this article to be “successful people can’t explain why they are successful so they say a bunch of arbitrary things they’ve noticed”.

He found success pursuing relative advantages, infrastructure advantages, and building custom tools from scratch.

But absolute vs relative advantages, plumbing together canned solutions vs building your own from scratch, infrastructure-level advantages vs decision making advantages...all of those contrasts exist in other businesses everywhere. None of those are specific to trading.

> “in my experience, nothing beats learning by doing or finding a mentor”

This hits the nail on the head.

The best way to become a profitable trader is with a mentor, but it’s nearly entirely luck. You drive an Uber or tend bar and happen to make friends with someone successful who is willing to guide you. Trying to seek out a mentor online is nearly impossible, as everyone who is findable and willing is almost certainly a better marketer than trader.

The other way to become a profitable trader is to start trading with real money. It’s amazing how quickly one can learn how to mend a boat, when the boat starts sinking.

shoo|5 years ago

Readers may also be interested in Benter's paper "Computer Based Horse Race Handicapping and Wagering Systems: A Report" -- https://www.gwern.net/docs/statistics/decision/1994-benter.p...

> This paper examines the elements necessary for a practical and successful computerized horse race handicapping and wagering system. Data requirements, handicapping model development, wagering strategy, and feasibility are addressed. A logit-based technique and a corresponding heuristic measure of improvement are described for combining a fundamental handicapping model with the public's implied probability estimates. The author reports significant positive results in five years of actual implementation of such a system. This result can be interpreted as evidence of inefficiency in pari-mutuel racetrack wagering. This paper aims to emphasize those aspects of computer handicapping which the author has found most important in practical application of such a system

Arguably the paper describes the state of the art from three decades ago, applied to betting on Hong Kong horse races, not market price movements.
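A rough sketch of what "combining a fundamental handicapping model with the public's implied probability estimates" can look like: a multinomial logit over the log of each probability, renormalised across the horses in a race. This is my reading of the logit-based technique and should be checked against the paper; `alpha` and `beta` would be fitted on historical races, and the numbers below are invented.

```python
import math

def combine_probabilities(model_probs, public_probs, alpha, beta):
    """Blend two sets of win probabilities for one race.

    combined_i is proportional to exp(alpha * ln(model_i) + beta * ln(public_i)),
    normalised so the combined probabilities sum to 1 across the field.
    All input probabilities must be strictly positive.
    """
    scores = [
        alpha * math.log(f) + beta * math.log(p)
        for f, p in zip(model_probs, public_probs)
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical three-horse race.
model = [0.50, 0.30, 0.20]    # fundamental model's estimates
public = [0.40, 0.40, 0.20]   # implied by the betting pool
combined = combine_probabilities(model, public, alpha=0.6, beta=0.5)
print([round(c, 3) for c in combined])
```

With `alpha=1, beta=0` this returns the model's own probabilities, and with `alpha=0, beta=1` it returns the public's, so the fitted coefficients indicate how much weight each source deserves.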

mafm|5 years ago

The parent comment is the most useful one in the thread so far for anyone who seriously wants to learn about quantitative trading.

Sports betting is essentially the same thing as proprietary trading in financial markets. The paper gives a good summary of a technique that was very successful in its day.

There is very little publicly available material on quantitative techniques that are useful for proprietary trading. Lo and MacKinlay's "A Non-Random Walk Down Wall Street" was good, but that's 20 years old.

The mathematical literature on gambling is a lot more accessible. It's also probably easier to consistently make at least small money gambling, because the barriers to entry are lower.

linus_torvalds|5 years ago

Yeah this is a great paper on the subject. Although horse betting is different than financial markets due to the parimutuel system.

rezahussain|5 years ago

Writing AI trading systems is the coding I have done for fun since 2012. I'm a little under break-even so far, but I keep at it because I find it so interesting. Since I started, every single week I have learned a new way of thinking about a problem I encountered or a new approach to problems that still stand in my way.

Questions like, how do you choose a stoploss? Well, you can pick it statistically based on history or you can use a supervised label. You can even use stock A's calculated stoploss to pick the stoploss you use on stock B because you found a condition under which those two stocks became almost identically correlated. How do you want to pick the supervised label? You can do spectral analysis to pick the stoploss too. You can use sentiment as a stoploss, sourced from Google News or Twitter or StockTwits.

It doesn't have to be, 'well I measured the average profitable stoploss to use over the last 10 years across all stocks and that isn't working so I quit'

Things like that, you get to fit the ideas together and then test them in the real world.

There are some things I would like to share.

1. Just because you have a good forecast doesn't translate into cash. It has to be paired with a trading strategy. This is probably why the author thinks the answer is RL, because coincidentally if you approach this problem with RL, it does the forecasting + strategy.

2. I have measured a correlation between heavier processing (using a higher big O) and better out-of-sample performance.

The criticisms with the NN approach like non stationary data have obvious solutions that a 'by the book' trading approach + ml approach don't really teach beginners so they dismiss it.

It is my belief right now that there are people who are prepping data from sources like iextrading then using things like sagemaker to develop good enough forecasting and combining it with a statistics+rules based trading strategy to make living wages.

That said, I have 5k account size for my NN obsessions, and my 401k is 'by the book'.

person_of_color is totally right when he says it is a Moby Dick of programming.

dennybritz|5 years ago

> Just because you have a good forecast doesn't translate into cash. It has to be paired with a trading strategy. This is probably why the author thinks the answer is RL, because coincidentally if you approach this problem with RL, it does the forecasting + strategy.

Exactly, this is one of the nice things about RL. You don't need to do a bunch of handwaving to turn your predictions into a strategy.

discordance|5 years ago

It sounds like a lot of fun! I love the idea that there’s one metric ($) to measure the effectiveness of your strategy/code.

Any recommendations or hints on where to get started (assuming I’m decent with python/pandas etc)?

person_of_color|5 years ago

Don't do this. It's the programmer's Moby Dick. You are better off self-learning stats/ML skills in your free time and joining a quant fund than trying to do it yourself.

keyle|5 years ago

Agreed. It's a goose chase, the house always wins and even a winning system works one week and not the next.

DoctorOetker|5 years ago

I would love to try trading as a hobby with a little side money, but I would abhor a hobby that reduces to effectively buying the trader-feel-good experience, where you're essentially sponsoring incumbents as a fanboy chipping in his pocket money.

What I would require from a trading platform:

1) decentralized and permissionless

2) provably fair trading

By 'provably fair trading' I mean the protocol should be such that I can prove you are not simply held captive by an intermediary, in whatever shape or form. It should also be fair with respect to latency.

For example, consider a trading market where token X can be exchanged for token Y and vice versa. Each holder of X demands her minimum of Y per X, and each holder of Y demands his minimum of X per Y. What if everyone hashed their demands with a salt and paid the market contract (proportional to how much they will actually be allowed to trade) to register their salted hash? When the round has closed, people reveal their salt and plaintext, and the incompatible trading offers get their money back (minus a usage fee perhaps). The compatible ones can have their trades go through at the rate of 'total compatible X offered' to 'total compatible Y offered' (or some variation thereof, say rewarding those that helped close the gap). In this way there is no high frequency trading, and you could have a family of such markets operating at different timescales...
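A toy sketch of that commit-reveal round in plain Python, to make the moving parts concrete. The order encoding, the use of SHA-256, and especially the rule for deciding which offers are "compatible" (drop offers whose minimum is not met at the pooled rate, then recompute until nothing changes) are my own assumptions; the comment leaves them open. Deposits, fees and the on-chain contract are left out.

```python
import hashlib
import secrets

def commit(side, amount, min_rate, salt):
    """Salted hash of an order; only this is registered while the round is open."""
    plaintext = f"{side}|{amount}|{min_rate}|{salt}"
    return hashlib.sha256(plaintext.encode()).hexdigest()

def verify(commitment, side, amount, min_rate, salt):
    """After the round closes, check a revealed order against its commitment."""
    return commit(side, amount, min_rate, salt) == commitment

def clear_round(x_orders, y_orders):
    """Clear one batch at a single rate.

    x_orders: list of (amount_x, min_y_per_x) from holders of X.
    y_orders: list of (amount_y, min_x_per_y) from holders of Y.
    Amounts are assumed positive. Returns (rate_y_per_x, kept_x, kept_y),
    or None if no set of mutually compatible offers remains.
    """
    xs, ys = list(x_orders), list(y_orders)
    while xs and ys:
        # Pooled rate: total Y offered per total X offered.
        rate = sum(a for a, _ in ys) / sum(a for a, _ in xs)
        kept_xs = [(a, m) for a, m in xs if m <= rate]
        kept_ys = [(a, m) for a, m in ys if m <= 1 / rate]
        if len(kept_xs) == len(xs) and len(kept_ys) == len(ys):
            return rate, xs, ys
        xs, ys = kept_xs, kept_ys
    return None

# Commit phase: a trader registers only the hash.
salt = secrets.token_hex(16)
commitment = commit("X", 10, 1.5, salt)

# Reveal phase: the contract checks the plaintext against the hash.
print(verify(commitment, "X", 10, 1.5, salt))

# Clearing with hypothetical revealed orders.
print(clear_round([(10, 1.5), (5, 3.0)], [(20, 0.4), (10, 0.9)]))
```

Because nobody can read an order before the reveal and every compatible order in the round gets the same rate, being a few microseconds faster buys nothing within a round.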

mfalcon|5 years ago

I've never tried the AI trading path but I imagine that you can't get huge gains with public data, unless you find a way to extract "hidden" information by processing real time news.

I wonder nevertheless if there's a sweet spot where you can build a simple AI trading algorithm and get modest earnings from it.

nv-vn|5 years ago

I think the answer is yes & no. If you come up with a sufficiently clever strategy using public data that other people haven't thought to use it's definitely doable. For example, someone with a good understanding of meteorology would've had a significant advantage a few decades ago (though trading firms have since caught on). You wouldn't need a perfect data set if the strategy isn't being used.

In terms of strategies based purely on market data, you are definitely correct. Any publicly (freely/cheaply) available market data is low resolution, lacking the full data from any point in time, and generally based on poor approximations of the actual data (elsewhere in this thread someone mentions that IEX's data is based on trades that get routed through the IEX exchange, which obviously misses any data you could get from the markets that make up 99% of the volume, dark pools, etc.).

I think the "sweet spot" is simply coming up with a strategy that nobody else has thought about, or else executing a better-known strategy more effectively than other market participants. Both are hard, but somewhat in the realm of possibility. The problem is that many people think there's free money to be made without either of these.

pinouchon|5 years ago

For the last year or so I have been working on a ML-based trading system in the domain of crypto with two friends. I made more in 2 months than I used to in a year. This is after thousands of "full positions swings" and millions of trades (short and long). We are now experimenting with different classes of trading strategies to reduce risk.

We would like to find 1 or 2 more people to work on this project. We need people who can tolerate risk and are skilled at data engineering: data pipelines, psql, pandas, numpy, data visualisation, setting up servers. Ideally also skilled at machine learning / deep learning and who has tried his hand at trading systems. If interested, my email is in my about info.

thedudeabides5|5 years ago

"Actually, many months my PnL graph looked something like this: (this is generated to get a point across, but my real data looked extremely similar):"

I'd love to see the actual data

star-trek-fleet|5 years ago

What's the real performance of the system so far?

nv-vn|5 years ago

Probably not very good. Voleon does all ML-based trading and what I've seen of their returns does not give me any confidence in ML-based trading having alpha. I would estimate that at best in a good year returns would be like 5% y/y in the long term, much less than the sustained ~7% that index funds offer especially when adjusting for risk. Just speculation but there's a lot of firms with much more capital, better tools, and teams of extremely intelligent people who have pretty poor returns because of how good the competition already is.

mraza007|5 years ago

Just curious to know: have financial firms implemented something similar to this?

nv-vn|5 years ago

Modern finance is built on top of this type of technology. There are hundreds if not thousands of firms participating in "quantitative finance," attempting to use computers/statistics to predict markets. The vast majority of trades go through major practitioners of this exact idea.

MichaelRazum|5 years ago

To be honest, I don’t believe a word about the performance using AI. Especially if the article doesn’t present the features and the NN architecture. It's always the question: would a super simple model perform the same way? And very often the answer is yes.

rawoke083600|5 years ago

Lol this.. I once built a model (btc) that assumes this "we can't know the direction of market so we might as well guess", after spending months reading papers and trying to be "clever"

It starts off picking a random market direction (up/down) and places a bid (sorry, I mean makes a trade). Then, based on lots of tuning/backtesting, it decides how long to be in position and what the stoploss is.. Think in the end the most "profitable settings" were something like:

$proft_size = 0.38%

$stop_loss_size = 0.35%

Win-Continue-Direction = 3 rounds (after winning/losing, do we change direction?)

So in the end it was probably a Markov model with a random start, if we had to label it :)

Oh and for fun it would also "martingale for x rounds" :P

Worked quite well for 3-4 days and was fun implementing it while watching "Billions" on TV in the background :D
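A back-of-the-envelope check on those settings, as a sketch only. It assumes the price is a driftless random walk in small steps and ignores the direction-switching and martingale rules; under that assumption the chance of hitting the profit target before the stop is stop / (profit + stop). The `cost` parameter is a hypothetical per-trade fee in the same percent units.

```python
def prob_hit_profit_first(profit_size, stop_size):
    """P(reach +profit_size before -stop_size) for a driftless random walk."""
    return stop_size / (profit_size + stop_size)

def expected_return_per_trade(profit_size, stop_size, cost=0.0):
    """Expected return per trade, in the same units as the sizes (here: percent)."""
    p = prob_hit_profit_first(profit_size, stop_size)
    return p * profit_size - (1 - p) * stop_size - cost

p_win = prob_hit_profit_first(0.38, 0.35)
print(f"win probability: {p_win:.4f}")
print(f"expected return, no costs: {expected_return_per_trade(0.38, 0.35):.6f}%")
print(f"expected return, 0.1% cost: {expected_return_per_trade(0.38, 0.35, cost=0.1):.6f}%")
```

Under these assumptions the slightly larger profit target is exactly offset by a slightly lower win rate, so the expected return is zero before costs and negative after them. A few good days are what you would expect from variance alone.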

linus_torvalds|5 years ago

"Then, profits started decreasing and I decided to move on to other things and I lacked the motivation to go back into it."

Is this post about the one with decreasing profits, or a new one that is profitable?

known|5 years ago

trading != investing

dilandau|5 years ago

In some markets it is necessary to put the same length of fiber-optic cable between the colocated servers, so that being closer to the exchange's cabinet doesn't translate into an advantage. So obviously we're talking about extremely low latency, high-frequency trading. This carries a huge number of prerequisites to even get started.

Not only that, but there are many different order types besides "buy at market price, sell at market price". Then there's options, short sales, and more.

It goes deep. People devote 30 years of their career to this. Read the author's experience as a kind of warning, if you will.

nv-vn|5 years ago

Agree to an extent, but not all money in quant finance is generated through HFT. Notably, I don't think funds like Rentec are really doing much to get low-latency [1]. Latency obviously does matter for any kind of quant trading, but to my understanding a good enough strategy + slippage models & the likes can overcome this.

[1] You can find a list of NYSE broker/dealers here: https://www.nyse.com/publicdocs/nyse/markets/nyse/members/NY... -- any firm where latency matters will need to be on this list to colo on the exchange.

103 comments