Machine Learning Toolkits

Machine learning is often used in dialogue systems, for example to recognize or select a dialogue act given an utterance or other dialogue state information. A wide variety of techniques and tools is available. This page lists some of the more common and popular tools, but there are many others.
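As a rough illustration of dialogue act recognition from an utterance, here is a minimal sketch of a Naive Bayes classifier in plain Python. It does not use any of the toolkits listed below; the training data, the act labels, and the function names `train_naive_bayes` and `predict_act` are all made up for this example. A real system would normally use one of the toolkits and far more data.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data: (utterance, dialogue act) pairs.
TRAIN = [
    ("hello there", "greeting"),
    ("hi good morning", "greeting"),
    ("what time does the train leave", "question"),
    ("where is the station", "question"),
    ("thanks a lot", "thanks"),
    ("thank you very much", "thanks"),
]


def train_naive_bayes(examples):
    """Count words per dialogue act and how often each act occurs."""
    word_counts = defaultdict(Counter)
    act_counts = Counter()
    vocab = set()
    for utterance, act in examples:
        tokens = utterance.lower().split()
        act_counts[act] += 1
        word_counts[act].update(tokens)
        vocab.update(tokens)
    return word_counts, act_counts, vocab


def predict_act(utterance, model):
    """Return the act with the highest log-probability under the model."""
    word_counts, act_counts, vocab = model
    total = sum(act_counts.values())
    best_act, best_score = None, -math.inf
    for act, count in act_counts.items():
        # Log prior of the act, estimated from the training labels.
        score = math.log(count / total)
        denom = sum(word_counts[act].values()) + len(vocab)
        for token in utterance.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[act][token] + 1) / denom)
        if score > best_score:
            best_act, best_score = act, score
    return best_act


model = train_naive_bayes(TRAIN)
print(predict_act("where does the train leave", model))  # prints: question
```

The bag-of-words features and add-one smoothing are chosen only to keep the sketch short; the toolkits in the table offer stronger classifiers and better feature extraction.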

Name Language License Overview
Caffe (external) C++ BSD 2-Clause license A high-performance neural network library from the Berkeley Vision and Learning Center. It is particularly well suited to the convolutional networks that are common in image processing tasks.
Deeplearning4j (external) Java Apache v2.0 A high-level neural network library written in Java that supports both CPU and GPU backends. It sees little if any use in academia, but it is actively developed and looks promising.
JSAT (external) Java GPL v3 A good general machine learning toolkit, although it has little documentation. If you are familiar with other toolkits, it is not hard to use. It has more up-to-date algorithms than Weka and usually faster implementations. While the algorithms don't appear buggy, other aspects of the toolkit seem less polished (e.g., there are some serialization problems).
Keras (external) Python MIT A high-level wrapper around lower-level neural network libraries. It started out as a wrapper for Theano only, but it now also supports TensorFlow, which has become its most common backend.
Lasagne (external) Python MIT Another high-level wrapper around Theano for designing and working with neural networks.
Mahout (external) Java Apache v2.0 A collection of traditional supervised and unsupervised machine learning algorithms that are designed to be highly scalable and to run on large clusters.
LIBSVM (external) C++/Java Modified BSD One of the most commonly used Support Vector Machine libraries, with bindings for many languages.
Mallet (external) Java Common Public License 1.0 A traditional machine learning toolkit especially good for graphical models and conditional random fields.
Scikit-learn (external) Python BSD An easy-to-use traditional machine learning toolkit with a large number of algorithms and great documentation.
Stanford Classifier (external) Java GPL v2.0 (commercial also available) A standard maximum entropy classifier.
Theano (external) Python BSD A machine learning framework for constructing computational graphs, well suited to developing neural network algorithms. It is relatively low-level (compared to Keras), so it is easy to customize but harder to learn. It is especially good at recurrent neural networks, at least relative to Caffe and TensorFlow. It is one of the more popular frameworks, so many tutorials and a lot of documentation are available for it.
Torch (external) Lua/C BSD Another machine learning framework oriented towards neural networks and deep learning. It is highly recommended by many prominent deep learning researchers.
Weka (external) Java GPL v3 A standard toolkit that is used in many courses and has a lot of formal and informal documentation. It has a good GUI for running exploratory experiments, although the API is clunky and not as easy to use as Scikit-learn's. It also tends to be outdated, with few modern classifiers, and many of the implementations are not particularly efficient. It is better as a learning tool than as a toolkit for practical use.
PyTorch (external) Python/Java/C++ BSD PyTorch is an open-source machine learning library based on the Torch library that is used for applications such as computer vision and natural language processing. It is primarily developed by Facebook's AI Research lab (FAIR). It is free and open-source software released under the Modified BSD license.
TensorFlow 2 (external) C++/Python Apache v2.0 TensorFlow 2 removes redundant APIs, makes APIs more consistent (unified RNNs, unified optimizers), and integrates better with the Python runtime through eager execution.
TensorFlow (external) C++/Python Apache v2.0 A free and open-source software library for dataflow and differentiable programming across a range of tasks. It is a symbolic math library that is also used for machine learning applications such as neural networks.
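
Several entries in the table (Theano, TensorFlow) are described in terms of computational graphs and differentiable programming. The following is a minimal sketch of that idea in plain Python, not the API of any listed library: the `Node` class and the `backward` function are hypothetical names invented for this illustration. Each operation records its inputs and local derivatives, and a backward pass applies the chain rule through the recorded graph.

```python
class Node:
    """A value in a computational graph plus the local gradients to its parents."""

    def __init__(self, value, parents=()):
        self.value = value
        # Sequence of (parent_node, d(self)/d(parent)) pairs.
        self.parents = parents
        self.grad = 0.0

    def __add__(self, other):
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node in the graph."""
    # Post-order traversal: every node is appended after all of its parents.
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Reversed post-order processes each node before any of its parents,
    # so node.grad is complete by the time it is propagated further.
    for node in reversed(order):
        for parent, local_grad in node.parents:
            parent.grad += node.grad * local_grad


x = Node(3.0)
w = Node(2.0)
b = Node(1.0)
y = w * x + b  # builds the graph as a side effect of the arithmetic
backward(y)
print(y.value, w.grad, x.grad, b.grad)  # prints: 7.0 3.0 2.0 1.0
```

The frameworks in the table do the same bookkeeping at much larger scale, with tensor-valued nodes, many more operations, and GPU execution.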