{"id":48,"date":"2016-06-09T20:44:36","date_gmt":"2016-06-09T20:44:36","guid":{"rendered":"http:\/\/dialport.ict.usc.edu\/?page_id=48"},"modified":"2018-03-05T22:54:45","modified_gmt":"2018-03-05T22:54:45","slug":"speech-recognizersasrs","status":"publish","type":"page","link":"https:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/","title":{"rendered":"Speech Recognizers\/ASRs"},"content":{"rendered":"<p>Automatic Speech Recognizers (ASRs) are tools that transcribe spoken words and sentences into text. This can make it easier to communicate with a dialogue system\u00a0by allowing the user to speak instead of typing\u00a0out utterances.<\/p>\n<p>[table sort=&#8221;asc&#8221;]<br \/>\nName[attr style=&#8221;width: 250px;&#8221;], Level, Location, Cost, License, Overview<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/google-asr\/\">Google Speech<\/a>, High, Cloud, <a href=\"https:\/\/cloud.google.com\/pricing\/\">Paid<\/a>, Proprietary, A powerful neural-network-based cloud service that recognizes over 80 languages<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/bing-speech-api\/\">Bing Speech<\/a>, High, Cloud, <a href=\"https:\/\/azure.microsoft.com\/en-us\/pricing\/details\/cognitive-services\/speech-api\/\">Paid<\/a>, Proprietary, Also adds formatting to text (e.g. 
punctuation; capitalization; masking profanity; etc.)<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/api-ai-asr\/\">API.AI<\/a>, High, Cloud, Free, <a href=\"https:\/\/creativecommons.org\/licenses\/by-sa\/4.0\/\">CC BY-SA 4.0<\/a>, Open-source model based on Kaldi; open-source scripts to run it are also available<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/alexa-voice-service\/\">Alexa Voice<\/a>, High, Both, Free, <a href=\"https:\/\/developer.amazon.com\/public\/support\/pml.html\">Custom<\/a>, Amazon&#8217;s Alexa Voice Service\u00a0includes a full set of NLP tools including ASR<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/wit-speech-api\/\">Wit Speech<\/a>, High, Cloud, Free, <a href=\"https:\/\/wit.ai\/terms\">Custom<\/a>, Facebook&#8217;s wit.ai includes a &#8220;Speech to JSON&#8221; feature<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/watson-speech-to-text\/\">IBM Watson<\/a>, High, Cloud, <a href=\"https:\/\/www.ibm.com\/watson\/developercloud\/speech-to-text.html#pricing-block\">Paid<\/a>, Proprietary, IBM offers the first thousand minutes of speech-to-text\u00a0free each month<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/htk\/\">HTK<\/a>, Low, Local, Free, <a href=\"http:\/\/htk.eng.cam.ac.uk\/docs\/license.shtml\">Custom<\/a>, The\u00a0Hidden Markov Model\u00a0Toolkit works with HMMs; it is\u00a0geared toward speech recognition but is\u00a0flexible<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/cmu-sphinx\/\">CMU Sphinx<\/a>, Low, Local, Free, <a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/cmu-sphinx\/cmu-sphinx-license\/\">BSD-style<\/a>, Low-resource speech recognition that can even run on mobile devices<\/p>\n<p><a 
href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/kaldi\/\">Kaldi<\/a>, Low, Local, Free, <a href=\"https:\/\/www.apache.org\/licenses\/LICENSE-2.0\">Apache 2.0<\/a>, Open-source project designed to be as flexible as possible<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/julius\/\">Julius<\/a>, Low, Local, Free, <a href=\"http:\/\/julius.osdn.jp\/LICENSE.txt\">Custom<\/a>, High-performance open-source large-vocabulary continuous speech recognition software<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/speechmatics\/\">Speechmatics<\/a>, High, Cloud, <a href=\"https:\/\/www.speechmatics.com\/pricing\/\">Paid<\/a>, Proprietary, Recurrent Neural Network based speech recognition and text-video time alignment<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/vocapia-speech-to-text-api\/\">Vocapia<\/a>, High, Cloud, <a href=\"http:\/\/www.vocapia.com\/speech-to-text-api.html\">Paid<\/a>, Proprietary, Offers cloud-based speech recognition and other features including language recognition<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/simon\/\">Simon<\/a>, High, Local, Free, <a href=\"https:\/\/userbase.kde.org\/KDE_UserBase_Wiki:Copyrights\">GNU 1.2<\/a>, Wrapper around low-level tools including <a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/cmu-sphinx\/\">CMU Sphinx<\/a>\u00a0<a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/julius\/\">Julius<\/a>\u00a0and <a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/htk\/\">HTK<\/a><\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/jasper\/\">Jasper<\/a>, High, Local, Free, <a href=\"http:\/\/jasperproject.github.io\/documentation\/license\/\">MIT<\/a>, Speech recognition 
designed for easy use with Raspberry Pi<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/openears-ios\/\">OpenEars<\/a> (iOS), High, Local, Free, <a href=\"https:\/\/www.politepix.com\/openears\/support\/#Q_What_license_does_OpenEars_use\">Politepix<\/a>, Uses\u00a0<a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/cmu-sphinx\/\">CMU Sphinx<\/a>\u00a0in a free-to-use mobile framework for iOS<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/resources\/speech-recognizersasrs\/apple-diction\/\">Apple Dictation<\/a>, High, Local, Free, <a href=\"https:\/\/www.apple.com\/legal\/sla\/\">Proprietary<\/a>, The\u00a0built-in dictation feature of OS X<\/p>\n<p><a href=\"http:\/\/dialport.ict.usc.edu\/index.php\/microsoft-speech-platform\/\">Microsoft Speech Recognition<\/a>, High, Local, <a href=\"https:\/\/azure.microsoft.com\/en-us\/pricing\/\">Paid<\/a>, <a href=\"https:\/\/github.com\/Azure-Samples\/Cognitive-Speech-STT-Windows\/blob\/master\/LICENSE.md\">MIT License<\/a>, Microsoft&#8217;s speech recognizer<\/p>\n<p>&nbsp;<\/p>\n<p>[\/table]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Automatic Speech Recognizers (ASRs) are tools that transcribe spoken words and sentences into text. This can make it easier to communicate with a dialogue system\u00a0by allowing the user to speak instead of typing\u00a0out utterances. 
[table sort=&#8221;asc&#8221;] Name[attr style=&#8221;width: 250px;&#8221;], Level, Location, Cost, License, Overview Google Speech, High, Cloud, Paid, Proprietary, A [&hellip;]<\/p>\n","protected":false},"author":19,"featured_media":0,"parent":19,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-48","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/dialport.ict.usc.edu\/index.php\/wp-json\/wp\/v2\/pages\/48","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dialport.ict.usc.edu\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/dialport.ict.usc.edu\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/dialport.ict.usc.edu\/index.php\/wp-json\/wp\/v2\/users\/19"}],"replies":[{"embeddable":true,"href":"https:\/\/dialport.ict.usc.edu\/index.php\/wp-json\/wp\/v2\/comments?post=48"}],"version-history":[{"count":0,"href":"https:\/\/dialport.ict.usc.edu\/index.php\/wp-json\/wp\/v2\/pages\/48\/revisions"}],"up":[{"embeddable":true,"href":"https:\/\/dialport.ict.usc.edu\/index.php\/wp-json\/wp\/v2\/pages\/19"}],"wp:attachment":[{"href":"https:\/\/dialport.ict.usc.edu\/index.php\/wp-json\/wp\/v2\/media?parent=48"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}