Kaldi is a toolkit for speech recognition written in C++ and licensed under the Apache License v2.0. Kaldi is intended for use by speech recognition researchers. For a more detailed history and a list of contributors, see History of the Kaldi project.


Kaldi is similar in aims and scope to HTK. The goal is to have modern and flexible code, written in C++, that is easy to modify and extend. Important features include:

  • Code-level integration with Finite State Transducers (FSTs)
    • Compiles against the OpenFst toolkit (using it as a library).
  • Extensive linear algebra support
    • Includes a matrix library that wraps standard BLAS and LAPACK routines.
  • Extensible design
    • Wherever possible, algorithms are provided in their most generic form. For instance, decoders are templated on an object that provides a score indexed by a (frame, fst-input-symbol) tuple. This means the decoder can work from any suitable source of scores, such as a neural net.
  • Open license
    • The code is licensed under Apache 2.0, which is one of the least restrictive licenses available.
  • Complete recipes
    • The goal is to make available complete recipes for building speech recognition systems that work from widely available databases, such as those provided by the Linguistic Data Consortium (LDC).
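
The "extensible design" point above can be illustrated with a small sketch of a function templated on its score source. This is not Kaldi's actual decoder or interface: `TableScorer`, `GreedyBestSymbols`, and the method names `NumFrames`, `NumSymbols`, and `Score` are hypothetical names chosen for illustration, and the per-frame argmax stands in for a real FST-based search. The only idea taken from the text is that the algorithm sees scores through a (frame, symbol) lookup, so any type offering that lookup can be plugged in.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical score source backed by a dense table: scores[frame][symbol].
// A neural net wrapper could expose the same three methods instead.
class TableScorer {
 public:
  explicit TableScorer(std::vector<std::vector<float>> scores)
      : scores_(std::move(scores)) {}

  std::size_t NumFrames() const { return scores_.size(); }

  std::size_t NumSymbols() const {
    return scores_.empty() ? 0 : scores_[0].size();
  }

  // Score for a (frame, symbol) pair; higher is better in this sketch.
  float Score(std::size_t frame, std::size_t symbol) const {
    assert(frame < scores_.size() && symbol < scores_[frame].size());
    return scores_[frame][symbol];
  }

 private:
  std::vector<std::vector<float>> scores_;
};

// Toy stand-in for a decoder: picks the best-scoring symbol in each frame.
// It depends only on the Scorer's NumFrames/NumSymbols/Score methods, so it
// works with any type that provides them (an assumption of this sketch).
template <typename Scorer>
std::vector<std::size_t> GreedyBestSymbols(const Scorer &scorer) {
  std::vector<std::size_t> best;
  best.reserve(scorer.NumFrames());
  for (std::size_t frame = 0; frame < scorer.NumFrames(); ++frame) {
    std::size_t arg_best = 0;
    for (std::size_t symbol = 1; symbol < scorer.NumSymbols(); ++symbol) {
      if (scorer.Score(frame, symbol) > scorer.Score(frame, arg_best)) {
        arg_best = symbol;
      }
    }
    best.push_back(arg_best);
  }
  return best;
}
```

Calling `GreedyBestSymbols` on a `TableScorer` built from a small table returns one symbol index per frame, with ties resolved in favour of the lower index. Swapping in a different score source would require no change to the template, which is the design property the bullet describes.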