November 16, 2020

ML misconceptions (10): neural networks are not hard to implement

by Sam Sandqvist

There are a lot of misconceptions regarding neural networks. This article, and subsequent ones on the topic, will present the major ones you need to take into account.(1)

Speaking from experience, neural networks are quite challenging to code from scratch.
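To give a rough sense of what "from scratch" involves, here is a minimal sketch of a tiny feed-forward network trained with backpropagation on the XOR problem, using only the Python standard library. The function names, the network size (4 hidden units), the learning rate and the number of epochs are illustrative assumptions, not taken from any particular package. Even this toy version needs hand-written code for the forward pass, the gradients and the weight updates.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_in, n_hidden, rng):
    # Each neuron stores its input weights followed by a bias as the last element.
    hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(net, x):
    hidden, output = net
    h = [sigmoid(sum(w * xi for w, xi in zip(ws[:-1], x)) + ws[-1]) for ws in hidden]
    y = sigmoid(sum(w * hi for w, hi in zip(output[:-1], h)) + output[-1])
    return h, y


def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in data) / len(data)


def train(net, data, epochs=5000, lr=0.5):
    hidden, output = net
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(net, x)
            # Gradient of 0.5 * (y - t)^2 with respect to the output pre-activation.
            delta_out = (y - t) * y * (1 - y)
            # Hidden deltas are computed before the output weights are changed.
            delta_hidden = [delta_out * output[j] * h[j] * (1 - h[j]) for j in range(len(h))]
            for j in range(len(h)):
                output[j] -= lr * delta_out * h[j]
            output[-1] -= lr * delta_out
            for j, ws in enumerate(hidden):
                for i in range(len(x)):
                    ws[i] -= lr * delta_hidden[j] * x[i]
                ws[-1] -= lr * delta_hidden[j]
    return net


rng = random.Random(0)  # fixed seed so repeated runs give the same result
xor_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
net = init_network(2, 4, rng)
loss_before = mean_squared_error(net, xor_data)
train(net, xor_data)
loss_after = mean_squared_error(net, xor_data)
print(f"loss before training: {loss_before:.4f}, after: {loss_after:.4f}")
```

The sketch leaves out everything a real implementation would need, such as vectorised arithmetic, multiple layers, other activation functions, regularisation, and numerical safeguards.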

Luckily there are now hundreds of open source and proprietary packages which make working with neural networks a lot easier. They are available for virtually any computer language, from the ever-popular Java and Python to C# and C++, as well as statistics-oriented packages like R and MATLAB.

Below is a list of packages which you may find useful for diverse applications. Many are intended as learning tools, but some are robust, industrial-strength implementations.

The list is NOT exhaustive.

Name

Source

Description

Caffe

 http://caffe.berkeleyvision.org/

Caffe is a deep learning framework made with expression, speed, and modularity in mind.

Encog

 http://www.heatonresearch.com/encog/

Encog is an advanced machine learning framework that supports a variety of advanced algorithms, as well as support classes to normalize and process data. Machine learning algorithms such as Support Vector Machines, Artificial Neural Networks, Genetic Programming, Bayesian Networks, Hidden Markov Models and Genetic Algorithms are supported. Most Encog training algorithms are multi-threaded and scale well to multicore hardware. Encog can also make use of a GPU to further reduce processing time. A GUI-based workbench is also provided to help model and train machine learning algorithms.

TensorFlow

 http://www.tensorflow.org

TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code.

DMTK

 http://www.dmtk.io/

DMTK includes the following projects: the DMTK framework (Multiverso), a parameter server framework for distributed machine learning; LightLDA, a scalable, fast and lightweight system for large-scale topic modeling; Distributed word embedding, a distributed algorithm for word embedding; and Distributed skipgram mixture, a distributed algorithm for multi-sense word embedding.

Azure Machine Learning

 https://azure.microsoft.com/en-us/services/machine-learning

The machine learning / predictive analytics platform in Microsoft Azure is a fully managed cloud service that enables you to easily build, deploy, and share predictive analytics solutions. It allows you to drag and drop pre-built components (including machine learning models) and custom-built components which manipulate data sets into a process. This flow chart is then compiled into a program and can be deployed as a web service. It is similar to the older SAS Enterprise Miner solution except that it is more modern, more functional, supports deep learning models, and exposes clients for Python and R.

Theano

 http://deeplearning.net/software/theano/

Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It can use GPUs and perform efficient symbolic differentiation. Theano is more broadly applicable than just neural networks. It is a framework for implementing existing machine learning models, or creating new ones, using off-the-shelf data structures and algorithms.

Torch (PyTorch)

 http://www.torch.ch

Torch is a scientific computing framework with wide support for machine learning algorithms ... Its core features include an N-dimensional array; routines for indexing, slicing and transposing; an interface to C via LuaJIT; linear algebra routines; neural network and energy-based models; numeric optimization routines; fast and efficient GPU support; and embeddability, with ports to iOS, Android and FPGA. Like TensorFlow and Theano, Torch is more broadly applicable than just neural networks. It is a framework for implementing existing machine learning models, or creating new ones, using off-the-shelf data structures and algorithms.

SciKit Learn

 http://scikit-learn.org/stable/

SciKit Learn is a very popular package for doing machine learning in Python. It is built on NumPy, SciPy, and matplotlib. It is open source and exposes implementations of various machine learning models for classification, regression, clustering, dimensionality reduction, model selection, and data preprocessing.

Neurolab

https://pythonhosted.org/neurolab/index.html#

A library of basic neural network algorithms with flexible network configurations and learning algorithms for Python. To simplify use of the library, its interface is similar to that of the Neural Network Toolbox (NNT) for MATLAB. The library is based on the numpy package, and some learning algorithms use scipy.optimize.

My personal favourites? TensorFlow and SciKit Learn. These provide a robust platform, as well as a host of additional tools for the data science practitioner.
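As a rough illustration of how much work such a package takes off your hands, the following sketch trains a small neural network classifier with SciKit Learn on the Iris data set that ships with the library. The choice of data set, the single hidden layer of 16 units, the train/test split and the random seed are assumptions made purely for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small built-in data set and hold out a quarter of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale the inputs, then fit a multi-layer perceptron with one hidden layer.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

The package handles weight initialisation, backpropagation and the optimiser internally. What remains for the practitioner is preparing the data and choosing the model settings.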

(1) The misconceptions are adapted from an article by Stuart Reid, dated 8 May 2014, available at http://www.turingfinance.com/misconceptions-about-neural-networks/.

Sam Sandqvist

Dr Sam Sandqvist is our in-house Artificial Intelligence Guru. He holds a Dr. Sc. in Artificial Intelligence and is a published author. He specializes in AI theory, AI models and simulations. He also has industry experience in financial services, sales and marketing.