AT&T Watson

AT&T WATSON℠ converts between different communication modalities, allowing humans and devices to interact more readily. It consists of a general-purpose engine and a collection of plugins, each of which performs a conversion or analysis task. These tasks, many involving speech and language, can be combined in various ways, depending on what information is being communicated.
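The engine-and-plugins arrangement can be pictured with a small sketch. This is not the WATSON API: the `Engine` class, the plugin names `asr` and `nlu`, and both stand-in functions are hypothetical, chosen only to show how independent tasks might be registered once and then combined in different orders.

```python
from typing import Any, Callable, Dict, List

# Hypothetical plugin type: takes any input and returns any output.
Plugin = Callable[[Any], Any]


class Engine:
    """Toy general-purpose engine that runs registered plugins in sequence."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Plugin] = {}

    def register(self, name: str, plugin: Plugin) -> None:
        self._plugins[name] = plugin

    def run(self, names: List[str], data: Any) -> Any:
        # Feed the output of each plugin into the next one.
        for name in names:
            data = self._plugins[name](data)
        return data


def fake_recognizer(audio: bytes) -> str:
    # Stand-in for speech recognition: pretends the audio bytes are UTF-8 text.
    return audio.decode("utf-8")


def fake_intent_parser(text: str) -> Dict[str, str]:
    # Stand-in for language understanding: the first word is the action.
    action, _, target = text.partition(" ")
    return {"action": action.lower(), "target": target}


engine = Engine()
engine.register("asr", fake_recognizer)
engine.register("nlu", fake_intent_parser)

# Same engine, two different task combinations.
transcript = engine.run(["asr"], b"Call home")
intent = engine.run(["asr", "nlu"], b"Call home")
```

Running only `asr` yields plain text, while chaining `asr` and `nlu` yields a structured intent, which mirrors the idea that the combination of tasks depends on what needs to be communicated.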

One common use of WATSON is to convert human speech to text that can be readily interpreted by a device or other machine. In this case, the output might be simple text, or WATSON can perform the additional step of parsing the text so the human’s intent can be determined and communicated to the device. It works the other way, too; WATSON can take content generated by a machine and convert it to speech or text for humans to understand.

Essentially, WATSON takes some input, analyzes it, performs one or more services, and returns a result, all in real time.

WATSON not only converts speech to text but can also combine speech with other modalities, such as a touch-screen tap (“show me the closest Starbucks, here”) or other gesture, and send the combined information to a device. WATSON also converts speech to speech to perform translation, even across multiple languages: speech input in one language is converted to text in real time, the text is translated (with little delay), and the translated sentence is spoken once the speaker reaches the end of the sentence.
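To make the speech-plus-tap case concrete, here is a minimal sketch of one way such inputs could be fused. Everything in it is an assumption made for illustration: the `Tap` type, the `fuse` function, the list of pointing words, and the output format are hypothetical and do not describe how WATSON itself combines modalities.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Tap:
    # Hypothetical touch event: map coordinates of the tap.
    lat: float
    lon: float


# Assumed set of words that point at a location supplied by another modality.
DEICTIC_WORDS = {"here", "there"}


def fuse(utterance: str, tap: Optional[Tap]) -> Dict[str, object]:
    """Combine recognized text with an optional tap into one request."""
    words = [w.strip(",.?!").lower() for w in utterance.split()]
    needs_location = any(w in DEICTIC_WORDS for w in words)
    location: Optional[Tuple[float, float]] = None
    if needs_location and tap is not None:
        location = (tap.lat, tap.lon)
    # Drop the pointing words; what is left is the query for the device.
    query = " ".join(w for w in words if w and w not in DEICTIC_WORDS)
    return {"query": query, "location": location}


request = fuse("show me the closest Starbucks, here", Tap(40.7128, -74.0060))
```

In this sketch the spoken "here" carries no location on its own; the tap supplies the coordinates, and the device receives a single request containing both the query and the place it refers to.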