Auto-Transcription for Ease of Authoring

By Chandna Mitra

Are you an author or a translator in a software company, a law firm, a media house, or any other industry? Do you have an impending deadline that gives you sleepless nights? Or does long typing wear you out?

Well… if your answer to any or all of these questions is YES, you must be wondering how you can deliver content that is both fast and accurate. It’s possible. Yes, you can produce text faster, without pauses, to meet your deadlines and keep your customers happy.

As an author, you must have torn up bundles of pages or rewritten a piece of content any number of times. You will agree that thinking out loud is less grueling than writing. Have you ever thought of combining the two?

Dictation is nothing new. Is it? Carl Sagan would speak into an audio recorder to be transcribed later by his assistant. With time, we found personal assistants in speech recognition tools that offer automated transcription. Initially these tools were not that great, but they have come a long way and are now convenient, affordable, and close to accurate. You can casually use them as an alternative approach for generating protodrafts.

Voice-to-text software writes down every single thought you have, or to be precise… every single word that you utter, and can roughly triple your output. For example, if you type 40-45 words a minute, you can comfortably speak 100-150 words a minute.
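As a rough check on these figures, the short Python sketch below divides the speaking range by the typing range. The function name is illustrative, and the speeds are simply the ones quoted above, not measurements.

```python
def speedup_range(typing_wpm, speaking_wpm):
    """Return the (lowest, highest) ratio of speaking speed to typing speed.

    Both arguments are (slowest, fastest) pairs in words per minute.
    """
    type_slow, type_fast = typing_wpm
    speak_slow, speak_fast = speaking_wpm
    # Worst case: slow speaker compared with a fast typist.
    lowest = speak_slow / type_fast
    # Best case: fast speaker compared with a slow typist.
    highest = speak_fast / type_slow
    return lowest, highest


low, high = speedup_range((40, 45), (100, 150))
print(f"Speaking is roughly {low:.2f}x to {high:.2f}x as fast as typing")
```

The result brackets the "about three times" figure, and it ignores the time you later spend correcting the transcript.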

Tons of open-source and licensed speech recognition tools are available for Mac, Windows, mobile devices, and as web apps. The long list includes, but is not limited to, Dragon Dictate, iListen, ViaVoice, Windows Speech Recognition, Sonix, Speechmatics, Microsoft Cortana, and SILVIA.

The following video demonstrates how IBM’s Watson Speech to Text can help you transcribe your voice or audio.

Watson Speech to Text

These tools use artificial intelligence for transcription, but most of them are either inefficient or recognize only good quality audio. It’s tough to identify the efficient ones, but I have found five hidden gems for you.

  • I rank Vocalmatic in the fifth position. Upload your audio file, and depending on the size of your file, the transcript is ready in 5-10 minutes. Editing and downloading the edited transcript is very convenient and simple.
  • YouTube Closed Captioning comes in fourth place in my ranking. It works with video files and helps you create captions and subtitles. Editing the captions and subtitles is very simple.
  • SpokenData comes in the third position. It goes beyond YouTube with additional features such as “Tracking Speakers” and many more.
  • I place Trint in the second position. 
  • Sonix is my first choice because it classifies the transcribed text into slightly, fairly, and very confident categories. This classification helps you decide where the least and the most editing is needed.
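To see how such a confidence classification can guide editing, here is a minimal Python sketch. It is not Sonix's actual algorithm: the function names, the 0-to-1 score scale, and the two thresholds are all assumptions made for illustration.

```python
def confidence_bucket(score, fairly_from=0.60, very_from=0.85):
    """Map a recognizer confidence score (0.0 to 1.0) to a label.

    The thresholds are illustrative assumptions, not values used by any tool.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= very_from:
        return "very confident"
    if score >= fairly_from:
        return "fairly confident"
    return "slightly confident"


def words_to_review(scored_words):
    """Return the words that deserve the closest look while editing."""
    return [
        word
        for word, score in scored_words
        if confidence_bucket(score) == "slightly confident"
    ]


# Hypothetical transcript fragment: (word, confidence) pairs.
sample = [("auto", 0.97), ("transcription", 0.72), ("protodrafts", 0.41)]
print(words_to_review(sample))
```

An editor could skim the "very confident" words and spend most of the time on the short list this returns.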

Dragon NaturallySpeaking is, however, the most popular choice worldwide.


Have you been chewing on a topic for weeks? Try an online or an offline transcription tool as your stenographer. The tool types what you say as you say it. You can also record your voice and then upload or play the audio file for the tool to convert into text.

Direct Speech

Keep speaking what you want to type and leave the rest to the tool.

  1. Start the voice recognition tool.
  2. Open a Word file and start speaking.
  3. Give commands in between, such as Go to Sleep and Start Listening, or commands to make some text bold or italic, to overwrite one word with another, to make a list, and the like.
  4. At the end, command the tool to save your file or to publish it in your desired format.
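The steps above mix dictated text with spoken commands. The toy Python sketch below shows one way a tool could tell them apart. It does not reflect how any real product works: the command phrases in the table (including the hypothetical "save file") and the function name are assumptions.

```python
# Spoken phrases treated as commands rather than as text to type.
# "save file" is a made-up phrase for this sketch.
COMMANDS = {
    "go to sleep": "sleep",
    "start listening": "wake",
    "save file": "save",
}


def process_utterances(utterances):
    """Return (typed_text, saved) after handling a sequence of utterances."""
    listening = True
    typed = []
    saved = False
    for utterance in utterances:
        phrase = utterance.strip()
        action = COMMANDS.get(phrase.lower())
        if action == "wake":
            listening = True
        elif not listening:
            continue  # while asleep, everything except the wake command is ignored
        elif action == "sleep":
            listening = False
        elif action == "save":
            saved = True
        else:
            typed.append(phrase)  # anything else is ordinary dictation
    return " ".join(typed), saved


session = [
    "Hello world",
    "go to sleep",
    "this is ignored",
    "start listening",
    "second sentence",
    "save file",
]
print(process_utterances(session))
```

Real tools also handle formatting commands such as bold and italic, which would need the notion of a selected span of text and are left out here.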

These speech recognition tools let you not only create content but also dictate and edit text in web applications such as Gmail or Yahoo Mail.

Video Source: YouTube

Audio File

If you are not comfortable dictating live, you can record your voice on a recorder and upload the resulting audio file to the transcriber.

  • Speak into a voice recorder about all the possibilities and complexities that come to mind. Don’t overthink; just go on speaking.
  • Tug on as many threads as you come across and follow them as far as they go, twists and bends included. You may discover new ideas along the way.
  • Feel free to leave remarks as you go, such as “Maybe this is more relevant for the UNIX platform”. These will come in handy later.

To get a quality transcript, mind the following tips:

  • Use a good quality recorder. A bad or incomplete recording is frustrating and may distract you from your train of thought.
  • Stay close to the mic and pronounce the words correctly to avoid unnecessary edits.
  • Find a comfortable space and avoid any background noise to preserve accuracy.
  • Go for a stroll. Walking keeps your blood flowing, which keeps your body active and your mind fresh.


After the recording, pick an automatic speech recognition (ASR) tool and import your audio file into it. The tool uses machine learning and spits back a text transcript in a couple of minutes. Although this transcript isn’t perfect, you can make it great with small edits.

You may want to export the transcript as a Word doc and revamp from there. If you feel that the content needs to be polished more and that editing the text is time-consuming, you may fire up your voice recorder again.

Tools such as Trint and Vocalmatic can transcribe your audio/video in multiple languages and accents to fully optimize your content for the globalized world.

Action Commands

Here is an example of how you can command the Windows Speech Recognition tool to perform various actions other than just typing.

Video Source: YouTube

What Else?

If you think this is all, let me tell you that speech recognition is also available in Python.

Video Source: YouTube
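For readers who want to try this in Python, here is a minimal sketch. It assumes the third-party SpeechRecognition package (installed with pip install SpeechRecognition), which may not be the library shown in the video. Its recognize_google call sends the audio to Google's web speech service, so it needs an internet connection. The helper names and the file name notes.wav are made up for this example.

```python
from pathlib import Path

# File types that SpeechRecognition's AudioFile reader accepts.
SUPPORTED_SUFFIXES = {".wav", ".aiff", ".aif", ".flac"}


def is_supported_audio(path):
    """Check the file extension before handing the file to the recognizer."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES


def transcribe_file(path):
    """Return the transcript of an audio file, or "" if nothing was understood."""
    if not is_supported_audio(path):
        raise ValueError(f"Convert {path} to WAV, AIFF, or FLAC first")

    # Imported here so the helper above works even without the package.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(path)) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return ""  # the service could not make out any speech


# Example call (needs the package, a real recording, and network access):
# print(transcribe_file("notes.wav"))
```

A recording in MP3 or another format would have to be converted first, which is why the sketch checks the extension up front.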

Now you know how you can use auto-transcription for fast and convenient delivery of your content. Initially you may use it alongside your keyboard and mouse. Even if you do not want to spend on paid software, you can start with an open-source tool, which will still make your work faster than usual. Later, when you become adept at using voice commands to get good quality output, you can always go for a paid license.

About the Author

Chandna Mitra

Currently working as a Senior Information Engineer at CA Technologies, Chandna Mitra has 13 years of experience in technical writing. Before coming to technical writing, Chandna copy edited international scientific journals and contributed news stories and features to the renowned national newspaper The Times of India and the news magazine The Sahara Time. As a journalist, Chandna interviewed eminent personalities of India: Madan Lal Khurana, Jagdish Tytler, Prakash Javadekar, and Jairam Ramesh, to name a few.