September 28, 2020

ML misconceptions (3): NNs come in many architectures

by Sam Sandqvist

There are a lot of misconceptions regarding neural networks. This article, and subsequent ones on the topic, will present the major ones one needs to take into account.(1)


There are many different neural network architectures, and the performance of any neural network is a function of its architecture and weights. Many modern-day advances in the field of machine learning do not come from rethinking the way that perceptrons and optimisation algorithms work, but rather from being creative about how these components fit together.

Let us review some of the most common NN architectures, starting with one that most people regard as the canonical NN.

Recurrent Neural Networks


Recurrent neural networks, or RNNs, are networks in which some or all connections flow backwards, meaning that feedback loops exist in the network. These networks are believed to perform better on time series data.

As such, they may be particularly relevant in the context of the financial markets. Some examples of this architecture are shown below.
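In code, the feedback loop amounts to carrying a state vector from one time step to the next. The following is a minimal sketch in NumPy, not a production model: the function name `rnn_forward`, the weight names, the `tanh` activation and the layer sizes are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np


def rnn_forward(inputs, w_in, w_rec, bias):
    """Run a minimal recurrent layer over a sequence (illustrative only).

    The state h is fed back into the layer at every step through w_rec,
    which is the feedback loop that makes the network recurrent.
    """
    h = np.zeros(w_rec.shape[0])
    states = []
    for x_t in inputs:
        # New state depends on the current input AND the previous state.
        h = np.tanh(w_in @ x_t + w_rec @ h + bias)
        states.append(h)
    return np.array(states)


# Hypothetical sizes and random weights, chosen only for the example.
rng = np.random.default_rng(0)
n_features, n_units, n_steps = 3, 4, 5
w_in = rng.normal(scale=0.5, size=(n_units, n_features))
w_rec = rng.normal(scale=0.5, size=(n_units, n_units))
bias = np.zeros(n_units)

series = rng.normal(size=(n_steps, n_features))  # stand-in for a time series
states = rnn_forward(series, w_in, w_rec, bias)
print(states.shape)
```

Setting `w_rec` to zero removes the feedback, and each output then depends on the current input alone, which is one way to see what the recurrent connections contribute.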

Note that these simple examples do not really have hidden layers. Networks with multiple hidden layers are called deep.

Deep neural networks


These are neural networks with multiple hidden layers. Deep neural networks have become extremely popular in recent years due to their unparalleled success in image and voice recognition problems.
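What "multiple hidden layers" means in practice can be sketched as a forward pass through a stack of layers. This is a rough illustration under stated assumptions: the helper names `init_layers` and `forward`, the ReLU activation, the initialisation scale and the layer sizes are all hypothetical choices, and the weights are untrained.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def init_layers(sizes, rng):
    """Create one (weights, bias) pair per connection between layers."""
    return [
        (rng.normal(scale=np.sqrt(2.0 / n_in), size=(n_out, n_in)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(x, layers):
    """Pass x through every hidden layer, then a linear output layer."""
    a = x
    for w, b in layers[:-1]:
        a = relu(w @ a + b)  # hidden layer
    w, b = layers[-1]
    return w @ a + b  # output layer, no activation


rng = np.random.default_rng(0)
sizes = [8, 16, 16, 16, 2]  # 8 inputs, three hidden layers of 16, 2 outputs
layers = init_layers(sizes, rng)
n_hidden_layers = len(sizes) - 2

output = forward(rng.normal(size=sizes[0]), layers)
print(n_hidden_layers, output.shape)
```

Changing the `sizes` list is all it takes to make the sketch deeper or shallower, which hints at how large the space of possible architectures is.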

The number of deep neural network architectures is growing quite quickly but some of the most popular architectures include deep belief networks, convolutional neural networks (CNNs, very popular for image recognition), deep restricted Boltzmann machines, stacked auto-encoders, and many more.

An example of a deep CNN architecture is shown below.

One of the biggest problems with deep neural networks, especially in the context of financial markets, which are non-stationary, is overfitting. Overfitting typically arises when the network has more free parameters than the data presented to it in the training phase can constrain. Consequently, it will ‘learn’ all the data, noise included, and produce perfect outputs for the training data set. However, its outputs are then very poor for inputs outside that set, because the net has not learnt to generalise, something that is crucial for proper use of a NN on unknown data.
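The effect is easy to reproduce on a toy problem. The sketch below swaps in a polynomial for the neural network, purely to keep the example short: a degree-9 polynomial has ten free coefficients, so it can pass exactly through ten noisy training points. The target function, the noise level and the random seed are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)


def true_fn(x):
    # Hypothetical underlying signal for the toy problem.
    return np.sin(np.pi * x)


# Ten noisy training observations.
x_train = np.linspace(-1.0, 1.0, 10)
y_train = true_fn(x_train) + rng.normal(scale=0.3, size=x_train.size)

# Degree 9 means 10 coefficients: as many free parameters as data points.
coeffs = np.polyfit(x_train, y_train, deg=9)

# Error on the data the model was fitted to.
train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)

# Error against the noise-free signal on inputs the model has not seen.
x_test = np.linspace(-1.0, 1.0, 200)
test_mse = np.mean((np.polyval(coeffs, x_test) - true_fn(x_test)) ** 2)

print(f"train MSE: {train_mse:.3e}")
print(f"test MSE:  {test_mse:.3e}")
```

The training error is zero up to rounding because the model has memorised the noise along with the signal, while the error on unseen inputs is much larger. Holding out data that the model never sees during fitting is the usual way to detect this.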

Adaptive neural networks

Adaptive neural networks are networks that simultaneously adapt and optimise their architectures whilst learning.

This is done by either growing the architecture (adding more hidden neurons) or shrinking it (pruning unnecessary hidden neurons).
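Growing and pruning can be sketched for a single hidden layer as follows. This is only one possible scheme under stated assumptions, not the method of any particular adaptive architecture: the helper names `prune_hidden` and `grow_hidden`, the norm-based importance measure and the threshold are hypothetical, and the decision of when to grow or prune during training is left out.

```python
import numpy as np


def forward(x, w_in, w_out):
    """One hidden layer: w_in is (hidden, inputs), w_out is (outputs, hidden)."""
    return w_out @ np.tanh(w_in @ x)


def prune_hidden(w_in, w_out, threshold):
    """Remove hidden neurons whose outgoing weights are negligible."""
    importance = np.linalg.norm(w_out, axis=0)  # one value per hidden neuron
    keep = importance > threshold
    return w_in[keep], w_out[:, keep]


def grow_hidden(w_in, w_out, n_new, rng, scale=0.01):
    """Append n_new hidden neurons.

    New outgoing weights start at zero, so the network's output is
    unchanged at the moment of growth; training can then adjust them.
    """
    new_in = rng.normal(scale=scale, size=(n_new, w_in.shape[1]))
    new_out = np.zeros((w_out.shape[0], n_new))
    return np.vstack([w_in, new_in]), np.hstack([w_out, new_out])


rng = np.random.default_rng(0)
w_in = rng.normal(size=(5, 3))   # 5 hidden neurons, 3 inputs
w_out = rng.normal(size=(2, 5))  # 2 outputs
w_out[:, 1] = 0.0                # pretend neuron 1 has become useless

x = rng.normal(size=3)
p_in, p_out = prune_hidden(w_in, w_out, threshold=1e-8)
g_in, g_out = grow_hidden(p_in, p_out, n_new=2, rng=rng)
print(w_in.shape[0], p_in.shape[0], g_in.shape[0])
```

In this sketch neither operation changes the network's output at the moment it is applied, so the architecture can be altered without discarding what has already been learnt.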

Adaptive neural networks are most appropriate for financial markets because markets are non-stationary. This is because the features extracted by the neural network may strengthen or weaken over time depending on market dynamics.

The implication of this is that any architecture which worked optimally in the past would need to be altered to work optimally today. Examples of adaptive nets are shown below.

Cascade neural network and self organizing map

In summary, many hundreds of neural network architectures exist and the performance of one neural network can be significantly superior to another.

As such, quantitative analysts interested in using neural networks should probably test multiple neural network architectures and consider combining their outputs together in an ensemble to maximise their investment performance.
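Combining outputs in an ensemble can be as simple as averaging them. The sketch below shows plain and weighted averaging; the function name `ensemble_average` is hypothetical, the forecast numbers are made up for illustration, and averaging is only one of several ways to combine models.

```python
import numpy as np


def ensemble_average(predictions, weights=None):
    """Combine per-model predictions into one forecast.

    predictions has shape (n_models, n_samples). With no weights, every
    model counts equally; otherwise np.average normalises the weights.
    """
    predictions = np.asarray(predictions, dtype=float)
    if weights is None:
        return predictions.mean(axis=0)
    return np.average(predictions, axis=0, weights=np.asarray(weights, dtype=float))


# Made-up return forecasts from three hypothetical architectures.
predictions = [
    [0.01, -0.02, 0.03],  # e.g. a recurrent network
    [0.03, 0.00, 0.01],   # e.g. a deep network
    [0.02, -0.01, 0.02],  # e.g. an adaptive network
]

equal = ensemble_average(predictions)
weighted = ensemble_average(predictions, weights=[2.0, 1.0, 1.0])
print(equal)
print(weighted)
```

Weights would normally be chosen from out-of-sample performance rather than fixed by hand as they are here.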

As is so often said, all your models are wrong, but some are useful…

(1) The inspiration for the misconceptions is adapted from an article by Stuart Reid from 8 May 2014 available at

Sam Sandqvist


Dr Sam Sandqvist is our in-house Artificial Intelligence Guru. He holds a Dr. Sc. in Artificial Intelligence and is a published author. He specializes in AI Theory, AI Models and Simulations. He also has industry experience in FinServ, Sales and Marketing.