“Prediction is very difficult, especially if it's about the future.”
— Niels Bohr, Nobel Prize-winning physicist
Watch out, investment professionals — machine learning is coming to a company like yours. This subset of artificial intelligence isn't just for programming self-driving cars or sorting cat pictures. It's entering the investment management space, and its disruptive potential is only beginning to emerge.
From Siri and Alexa to Amazon and IBM's Watson, computer programs driven by artificial intelligence draw on massive amounts of data to solve previously intractable problems. Machine learning gives computers the additional ability to learn without being explicitly programmed. This type of AI enables computers to change — to learn — when exposed to new data.
The technology behind machine learning is being propelled by major algorithmic innovations that allow machines to synthesize extremely large data sets and reveal patterns, trends and associations that are relevant to prediction problems. And the increasing ubiquity of inexpensive parallel computation is making this technology accessible to even lean startups.
The technology already has transformed many industries, from the medical to the automotive. In addition, machine learning is widely seen as a leading driver of revenue at Google, Facebook and Amazon. However, its adoption in investment management so far has been limited. With the exception of a few leading hedge funds, the industry has failed to recognize machine learning's potential to drive investment decisions.
Algorithms that continuously improve
ML automates the discovery of predictive algorithms that are able to continuously improve as they get access to more data. Recently, the focus has been on automating many of the tasks traditionally performed by data scientists, including data cleaning, model selection, data clustering, automatic feature generation and dimensionality reduction.
One technique, deep learning, has been responsible for many recent breakthroughs, including learning to play the game of Go well enough to beat the world's third-ranked player. Deep learning is enabling image recognition that is on par with human abilities and is significantly improving speech recognition and language translation; it is also permitting better story and ad targeting at places like Google and Facebook. Part of what makes deep learning so powerful is that it can organize and aggregate large unlabeled data sets into abstracted forms, which are more useful for prediction. The results have been stunning, both in speed and accuracy.
What does this mean for investment management?
We believe that machine learning will transform the way investment strategies are administered by all types of managers. Even the most fundamental, non-quantitative managers will be generating ideas from data that originally was sourced and synthesized via ML. For example, deep learning's ability to create structured data could be used to extract topic and sentiment from text sources such as earnings calls, SEC filings and social media; or for the analysis of satellite imagery for parking lot or crop data; or to evaluate location data from mobile phones.
While quant managers will certainly use these new data sources, they will also be able to use machine learning to conquer the classic hazard of overfitting. Overfitting tempts investment committees, or their portfolio managers, to believe in hindsight that they could have discovered a data-driven relationship earlier than was actually possible, or to mistake spurious correlation for causation.
Overfitting often starts when data scientists take their favorite methodology, set up a number of input features and put in place the parameters of their method and a utility function. Then they run their algorithm on one part of their data (the training set) and look at results on another part (the hold-out set). When, invariably, things don't work as predicted, the researchers will tweak their parameters, or the features of their data, or the algorithm being used. Over time, the results become overfitted, because each new test contaminates the hold-out set and erodes its value as an independent check.
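To make the contamination effect concrete, here is a minimal Python sketch under stated assumptions; it is not drawn from any particular firm's process. The data is pure noise, so no rule can truly predict it, and each "tweak" is modeled simply as trying another random feature against the same hold-out set. The function name `simulate_holdout_reuse` and all of its parameters are hypothetical.

```python
import random

def simulate_holdout_reuse(n_features=200, n_holdout=100, n_fresh=1000, seed=0):
    """Pick the best of many noise features on a reused hold-out set,
    then score that same feature on fresh, never-inspected data."""
    rng = random.Random(seed)

    def make_rows(n):
        # Each row: (random binary features, random binary label). Pure noise.
        return [
            ([rng.randint(0, 1) for _ in range(n_features)], rng.randint(0, 1))
            for _ in range(n)
        ]

    holdout = make_rows(n_holdout)
    fresh = make_rows(n_fresh)

    def hit_rate(feature, rows):
        # Fraction of rows where the feature happens to equal the label.
        return sum(x[feature] == y for x, y in rows) / len(rows)

    # Every "tweak" is one more look at the same hold-out set.
    best = max(range(n_features), key=lambda f: hit_rate(f, holdout))
    return hit_rate(best, holdout), hit_rate(best, fresh)

holdout_score, fresh_score = simulate_holdout_reuse()
print(f"hold-out hit rate: {holdout_score:.2f}, fresh-data hit rate: {fresh_score:.2f}")
```

Because the winning feature was selected by looking at the hold-out set many times, its hold-out hit rate tends to sit well above 50%, while its hit rate on fresh data stays near a coin flip.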
A solution to overfitting
With machine learning, we can minimize overfitting by restricting the role of the human to setting the overall investment framework. This framework will include the stock universe, trading frequency, performance benchmarks, data sources for signals and risk constraints. We leave the discovery of the specific formulation of the investment strategy to the system.
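One way to picture such a framework is as a small, fixed specification that the human writes once and the system is not allowed to change. The Python sketch below is purely illustrative: the class name `InvestmentFramework`, its field names and the example values are all assumptions, and a single position-weight cap stands in for what would be a fuller set of risk constraints.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)  # frozen: the search process cannot alter the framework
class InvestmentFramework:
    """Human-specified boundaries; every field name here is illustrative."""
    stock_universe: str
    trading_frequency: str
    performance_benchmark: str
    signal_data_sources: Tuple[str, ...]
    max_position_weight: float  # one example of a risk constraint

    def __post_init__(self):
        # Reject an obviously invalid risk limit at construction time.
        if not 0.0 < self.max_position_weight <= 1.0:
            raise ValueError("max_position_weight must be in (0, 1]")

framework = InvestmentFramework(
    stock_universe="U.S. small-cap value",
    trading_frequency="daily",
    performance_benchmark="small-cap value index",
    signal_data_sources=("earnings calls", "SEC filings"),
    max_position_weight=0.05,
)
```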
The framework will also specify the types of ML techniques that the machine will have access to as it discovers new strategies and decides which ones to trade. Essentially, the role of the quant will move to a higher-level function. The more routine steps of testing, scoring and tuning different structures will be handled by the machine, which will iterate through history in much the same way as humans do in real time (i.e., without knowledge of the future).
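Iterating through history without knowledge of the future is often called walk-forward evaluation. The Python sketch below shows one simple form of it, assuming the candidate strategies' per-period returns are already known: at each period the system picks the candidate with the best average return over earlier periods only, then records what that choice earned next. The function name `walk_forward`, the `min_history` parameter and the selection rule are assumptions chosen for illustration.

```python
def walk_forward(returns_by_strategy, min_history=3):
    """Choose a strategy each period using only earlier periods.

    returns_by_strategy: dict mapping a strategy name to a list of
    per-period returns; all lists are assumed to have equal length.
    """
    names = list(returns_by_strategy)
    n_periods = len(returns_by_strategy[names[0]])
    choices, realized = [], []

    for t in range(min_history, n_periods):
        def past_mean(name):
            # The slice [:t] excludes period t and everything after it.
            history = returns_by_strategy[name][:t]
            return sum(history) / len(history)

        chosen = max(names, key=past_mean)
        choices.append(chosen)
        # Only now is the return for period t revealed.
        realized.append(returns_by_strategy[chosen][t])

    return choices, realized

example_returns = {
    "A": [0.01, 0.01, 0.01, -0.05, -0.05, -0.05],
    "B": [0.00, 0.00, 0.00, 0.02, 0.02, 0.02],
}
choices, realized = walk_forward(example_returns)
```

In this toy data, strategy A looks better early and is chosen first, suffers a loss, and only then does the process switch to B. A backtest that could see the whole history would have picked B from the start and overstated the result.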
Advances in machine learning will allow further automation of tasks, including feature discovery, algorithm selection and even the optimization of trading code that implements a signal. With these newer methods, humans can spend their time creating frameworks and obtaining new data sets. Automated methods also reduce the number of quants needed to run a firm, which, given the high cost of salaries, is increasingly important as well.
The tools that enable automated strategy discoveries will also enable customized solutions. This is because the criteria for a successful strategy can be specified upfront in the framework. For instance, one approach could search for a strategy with a specified maximum deviation from a classic index, such as U.S. small-cap value, with as much additional alpha as possible. This type of customization could lead to a new class of retail investment products.
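As a simplified sketch of that kind of search, the Python below filters candidate strategies by how far they stray from a benchmark and keeps the one with the highest excess return. It makes several assumptions: "deviation" is measured as tracking error (the standard deviation of returns relative to the benchmark), "alpha" is approximated by the mean excess return rather than a risk-adjusted estimate, and the function name `pick_strategy` and the sample numbers are hypothetical.

```python
from statistics import mean, pstdev

def pick_strategy(candidates, benchmark, max_tracking_error):
    """Return (name, alpha) of the feasible candidate with the highest alpha.

    candidates: dict mapping a strategy name to a list of per-period returns.
    benchmark: list of per-period benchmark returns of the same length.
    """
    best_name, best_alpha = None, float("-inf")
    for name, returns in candidates.items():
        active = [r - b for r, b in zip(returns, benchmark)]
        tracking_error = pstdev(active)  # deviation from the index
        alpha = mean(active)             # simplified stand-in for alpha
        if tracking_error <= max_tracking_error and alpha > best_alpha:
            best_name, best_alpha = name, alpha
    return best_name, best_alpha

benchmark = [0.01, -0.02, 0.03, 0.00]
candidates = {
    "close": [0.012, -0.018, 0.032, 0.002],  # hugs the index
    "wild": [0.10, -0.10, 0.12, -0.02],      # higher return, large deviation
}
name, alpha = pick_strategy(candidates, benchmark, max_tracking_error=0.01)
```

With the tight limit shown, the high-return but erratic candidate is excluded and the index-hugging one is selected; loosening `max_tracking_error` changes the answer, which is the sense in which the criteria set upfront customize the result.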
We believe machine learning will become increasingly important for asset management and that most firms will be using machine learning tools, ML-derived data, or both within five years. Human involvement will still be critical for risk management and framework selection, but increasingly the strategy innovation process will be automated.
David Andre is CEO and director, and Conrad Gann is chief operating officer, at Cerebellum Capital, San Francisco. This content represents the views of the authors. It was submitted and edited under P&I guidelines, but is not a product of P&I's editorial team.