Dragon NaturallySpeaking Versus Windows 7 Voice Recognition

In this blog post I am trying out Dragon NaturallySpeaking version 11.5. In the past I have used the speech recognition built into Windows 7, but I was never satisfied with the quality of the recognition I got out of that free, bundled software.

Ultimately, my goal is to be able to take audio from a video course and produce a reasonable transcription, which I can then use as the foundation for editing and producing an associated book.

I can already tell this is much, much better than the built-in Windows 7 speech recognition. I am not surprised by this; obviously, an external product focused on one particular task can be heavily optimized for that task. I am actually quite pleased with the accuracy of this speech recognition.

So far in this session Dragon NaturallySpeaking has not made a single mistake. One thing the Windows 7 software does a little bit better is recognizing where punctuation belongs and inserting it automatically. That may be available as an option within Dragon NaturallySpeaking, but at this point I am having to specify every punctuation mark manually. For example, to insert a period at the end of the line I have to actually say the word “period.” (I had to edit that last sentence in order to fix the punctuation that Dragon NaturallySpeaking tried to insert.)

However, this is amazing. Dragon NaturallySpeaking is doing so much better than the previous voice recognition software I was using. I am finding that this may well be a very usable tool for producing text quickly and accurately. I’ll read up a little bit more about how to use it properly, but even as it is, this is starting to be really great.

There is one difference between the versions of Dragon NaturallySpeaking to be aware of: the Home edition only accepts input from a microphone, while the other editions also accept input from files, such as prerecorded WAV files.

Since my goal is to take a prerecorded video course and produce solid text from it, I did need a version that would handle input from files rather than only speech dictated directly into a microphone. No big thing, just something to be aware of.

Okay, my first impressions are that this software is very powerful, will be very helpful, and is much, much better than the speech recognition built directly into Windows 7. I had a very small amount of cleanup to do on punctuation and capitalization for this post, but the accuracy of the transcription was far better.

Update: I found the option to turn on automatic commas and periods. Not good at all; I have turned it off again. The Win7 speech recognition does a better job of adding punctuation and understanding grammatical context, at least for me right now. If I am dictating, that is not a big deal: I can say “period”, “comma”, “new line”, etc. But if I am using it to automatically transcribe video, it means I’ll have a bit more editing to do on the back end. Not a deal killer, but I was hoping Dragon would be as good as Win7 in that area.

2 comments so far

  1. Walter Busby
    4 years ago

    I am severely hearing impaired and want to use the software as a caption tool so my friends can talk and I can read what they say on my computer. My concern is the speed. My caption phone has a delay of 2-3 seconds, which is not perfect but manageable. Any longer than that would be awkward. Would any of the Dragon programs have instant transcription?

    • Leo Wadsworth
      4 years ago

      The speed of Dragon NaturallySpeaking depends on the computer it is run on and on what other software is running. However, any reasonably decent system running DNS should respond faster than your phone. The big caveat is that I don’t know how well it would work as a caption tool for you: it really needs to be trained for a single voice to work well. It doesn’t do the “recognize all possible voices” task; it does the “recognize a single voice well” task.