Big Data and Deep Learning: a technology revolution in trading or yet more hype?


  • BigData and DeepLearning are popular buzzwords nowadays. But the number of genuine success stories is relatively small.
  • In trading, BigData technology is mostly associated with automatic analysis of news and of sentiment in social networks. But unless you are Google or Reuters, you will never be the one who gets the news first. Additionally, the market reaction to both news and sentiment is often vague and amorphous.
  • Large deep neural networks resemble a human brain in that both have a lot of neurons, interconnected in many layers. But this does not mean a breakthrough to real artificial intelligence: all that glitters is not gold.
  • On the positive side: trading is only a part of the financial world. BigData + DeepLearning likely has high potential in adjacent areas like risk profiling and creditworthiness analysis.

We hear more and more about the big data revolution. There is a little less (but still a lot of) hype about deep learning, because journalists and the "strategists with vision" can grasp Big Data more easily. (Indeed, it is easy to imagine and communicate a huge server cluster of a Big Brother watching you, but in order to grasp the idea of deep learning one should know at least the basic principles of neural networks.) Anyway, both BigData and DeepLearning are popular buzzwords.

Before I start my critical review, let me briefly describe my background. I am not a specialist in deep neural networks. But I am an experienced data analyst, an IT specialist and a successful portfolio manager. In particular, I built my own Hadoop cluster and can beat the DAX.

Let us begin with deep neural networks. There is one obviously genuine success story: image recognition and classification by convolutional neural networks. But can it be applied (straightforwardly) to trading? Well, chart pattern recognition is the only case that occurs to me. However, a big question is whether the patterns of technical analysis really work. Lo et al. (2000) affirm that they are statistically significant (which does not yet mean that they are applicable in practice), but when I tried to reproduce their results with more recent data I obtained no statistical significance, let alone practical applicability.

Less well known but also successful are recurrent neural networks, in particular the LSTM (long short-term memory) variant. They are used for speech recognition and text processing and are notable for their long memory: "LSTM can learn 'Very Deep Learning' tasks that require memories of events that happened thousands or even millions of discrete time steps ago. Problem-specific LSTM-like topologies can be evolved. LSTM works even when there are long delays, and it can handle signals that have a mix of low and high frequency components". Thus they may be applied to the analysis of (financial) time series, which (in my opinion) do have a long memory and obviously have high and low frequency components (short-term speculators and long-term investors). However, to my knowledge, there is still no prominent scientific research on this topic. Probably this is due to a well-known dilemma: in trading, those who know do not speak (and those who speak often do not know).

A special case is AlphaGo, which won against the Go champion Lee Sedol. AlphaGo used a complicated mixture of algorithms, among them convolutional neural networks (in a sense analogous to image recognition). However, AlphaGo is tuned exclusively for playing Go and can hardly be re-adjusted easily for other kinds of problems.
As to BigData, there are more real success stories. Obvious cases are the search engines and social networks. Netflix is also renowned for its success.
As to finance, I myself consider the creditworthiness analysis of retail borrowers in developing countries very promising. Many of them have no credit history, but their behavior in social networks may very likely tell something about their creditworthiness.
In high-frequency algorithmic trading, too, there is no way around a big data infrastructure. However, the real question is whether there are any genuine signals in tick-level data! For example, a trend (signal) can dominate over volatility (noise) only over relatively long horizons; on shorter time frames the situation is reversed. I know a couple of cases in which high-frequency traders deliberately aggregated the data to a minute or even daily frame because the signals were more visible on this scale (not to mention that it drastically reduced the computational costs)...
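The trend-versus-noise argument can be made concrete with a small sketch. It assumes (this is my simplification, not something established above) that returns follow a random walk with constant drift, so the expected drift grows linearly with the horizon while the volatility grows only with its square root. The 8% annual drift, the 20% annual volatility and the trading calendar are hypothetical numbers chosen purely for illustration.

```python
import math

def signal_to_noise(mu_annual, sigma_annual, horizon_years):
    """Expected drift divided by the standard deviation of the return
    over the horizon, for a random walk with constant drift (assumption)."""
    drift = mu_annual * horizon_years               # grows like T
    noise = sigma_annual * math.sqrt(horizon_years) # grows like sqrt(T)
    return drift / noise

# Hypothetical parameters, for illustration only
MU, SIGMA = 0.08, 0.20
# Assumed calendar: 252 trading days of 8.5 hours each
horizons = {
    "1 minute": 1.0 / (252 * 8.5 * 60),
    "1 day": 1.0 / 252,
    "1 year": 1.0,
    "10 years": 10.0,
}

snr = {name: signal_to_noise(MU, SIGMA, t) for name, t in horizons.items()}
for name, value in snr.items():
    print(f"{name:>9}: signal/noise = {value:.4f}")
```

Under these assumptions the ratio scales with the square root of the horizon: it is tiny at the minute scale and exceeds one only after several years. Real tick data is of course not a plain random walk, so this shows only the scaling, not the size of any actual signal.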

Last but not least, there is a great deal of hype about news and sentiment analysis (which involves both BigData and Deep Learning). I immediately see at least two problems with it.
First, you never know how the market will react to the news. Here are just some examples of a counterintuitive market reaction: the news was (very) positive but Infineon and Daimler stocks fell (links are in German)!
Second, the sentiment in social networks is actually the sentiment of the crowd. And it is an eternal question whether one should invest with the crowd or be a contrarian.

To sum up, I can give only one universal piece of advice: if you hear about "a breakthrough in technology that can predict the market" or something like this, don't get too excited, even if it is based on DeepLearning and BigData. First of all, have a closer look at its historical track record, then observe it live for at least several months. Only if it keeps performing well may there be something genuine.

Like this post and wanna learn more? Have a look at Knowledge rather than Hope: A Book for Retail Investors and Mathematical Finance Students

Author: Vasily Nekrasov

Founder of