Dragon Professional Speech Recognition software logo.

For many years, the only speech-to-text applications available were costly proprietary apps such as Dragon Speech-to-Text software. Speech recognition is now built into many apps, both online and offline, and most of these services are free.

1. Terminology

Speech recognition (or voice recognition) software is any software able to recognise human speech. It can be used to operate software and/or a device.

Speech-to-Text software types out the words that the user speaks into the microphone. It does the job a typist traditionally did for businesspeople who needed something typed out: the person would either sit with the typist, who typed while they dictated (spoke), or make a recording on a dictaphone for the typist to transcribe (type out) later. If you watch legal series or movies on TV, you will have seen the person in court who types out everything as it is spoken to create a legal record.

Transcribe software works like Speech-to-Text software, except that you upload an audio file and the software returns the text.

2. Speech recognition

The following software can recognise and respond to spoken commands:

  • Siri, Apple’s virtual personal assistant.
  • Cortana, Microsoft’s personal assistant, included in Windows 10.
  • Google Voice Search, which lets users search Google by speaking.
  • Google Assistant, Google’s virtual assistant.
  • Alexa, Amazon’s personal assistant.
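
To illustrate what "recognise and respond to spoken commands" means in practice, here is a minimal sketch in Python. It assumes the speech has already been converted to text and only shows the "respond" step. The command phrases and replies are hypothetical and chosen purely for illustration; none of the assistants listed above is claimed to work this way, and real assistants use language understanding rather than exact phrase matching.

```python
import re
from typing import Callable, Dict


def normalise(utterance: str) -> str:
    """Lower-case the text and strip punctuation so matching is forgiving."""
    return re.sub(r"[^a-z0-9 ]+", "", utterance.lower()).strip()


# Hypothetical commands, invented for this example only.
COMMANDS: Dict[str, Callable[[], str]] = {
    "turn on the lights": lambda: "Lights on",
    "what time is it": lambda: "Telling the time",
}


def respond(utterance: str) -> str:
    """Look up the recognised text and run the matching action, if any."""
    action = COMMANDS.get(normalise(utterance))
    return action() if action else "Sorry, I did not understand that."


if __name__ == "__main__":
    print(respond("Turn on the lights!"))
    print(respond("Sing me a song"))
```

The point of the sketch is the two-stage idea: recognition produces text, and a separate step decides what to do with it.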

3. Speech-to-Text

Speech-to-Text software recognises speech and outputs a text version of the spoken words.

3.1 Word Dictate

3.2 Google Docs Voice typing

Open a Google Doc. Select the Tools menu and then select the Voice typing option. You will need to grant permission for the browser to access your microphone.

The words you speak into the microphone will be typed in the Doc.

4. Transcribe

4.1 Word Transcribe

Upload a file that contains audio, and Word returns the spoken words as text.

Microsoft Word 365 Transcribe feature.
Uploading an audio file to the Microsoft Word 365 Transcribe feature.

4.2 Cloud Speech-to-Text

When you select the option to upload a file, the following file types are accepted:

*.opus, *.flac, *.webm, *.weba, *.wav, *.ogg, *.m4a, *.oga, *.mid, *.mp3, *.aiff, *.wma, *.au, *.ogm, *.wmv, *.mpg, *.ogv, *.mov, *.asx, *.mpeg, *.mp4, *.m4v, *.avi
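
If you have a folder of recordings, you may want to check which files have an accepted extension before you try to upload them. The sketch below is one way to do that in Python. The function name is hypothetical, and the check only looks at the file name, using the list above; it does not confirm that the service can actually process the audio inside the file.

```python
from pathlib import Path

# Extensions copied from the list above (duplicates removed).
SUPPORTED_EXTENSIONS = frozenset({
    ".opus", ".flac", ".webm", ".weba", ".wav", ".ogg", ".m4a", ".oga",
    ".mid", ".mp3", ".aiff", ".wma", ".au", ".ogm", ".wmv", ".mpg",
    ".ogv", ".mov", ".asx", ".mpeg", ".mp4", ".m4v", ".avi",
})


def is_supported_upload(filename: str) -> bool:
    """Return True if the file name ends in one of the listed extensions."""
    # Compare in lower case so "talk.MP3" and "talk.mp3" are treated alike.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["interview.MP3", "lecture.wav", "notes.txt"]:
        print(name, is_supported_upload(name))
```

A file with no extension at all is treated as unsupported, since there is nothing to compare against the list.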


References:

  1. Google Assistant (2023) Wikipedia. Wikimedia Foundation. Available at: https://en.wikipedia.org/wiki/Google_Assistant (Accessed: 12 November 2023).
  2. Google Voice Search (2023) Wikipedia. Wikimedia Foundation. Available at: https://en.wikipedia.org/wiki/Google_Voice_Search (Accessed: 12 November 2023).
  3. Cortana (virtual assistant) (2023) Wikipedia. Wikimedia Foundation. Available at: https://en.wikipedia.org/wiki/Cortana_(virtual_assistant) (Accessed: 12 November 2023).
  4. Speech-to-Text: Automatic Speech Recognition | Google Cloud (no date) Google. Google. Available at: https://cloud.google.com/speech-to-text/ (Accessed: 12 November 2023).

By MisterFoxOnline

Mister Fox AKA @MisterFoxOnline is an ICT, IT and CAT Teacher. He has a passion for technology and loves to find solutions to problems using the skills he has learned in the course of his IT career.
