Can Voice Recognition Technologies Make Transcription Services Redundant?

Many businesses need to convert recorded speech to text and have long looked for ways to do it quickly and inexpensively. Transcribing medical dictation is a prime example. Some years ago, when voice recognition software became commercially available, most people expected that the solution had finally arrived. Businesses looked forward to cutting transcription costs, and everyone who hated typing looked forward to getting rid of their keyboard.

Unfortunately, the reality turned out to be rather different. Voice-to-text technology has been a big letdown so far. The fact is, voice recognition software is easily thrown off track by many different factors. If you don't speak clearly and distinctly, it may not give you the right output. If you try using it in a noisy place, it will fail more often than not. If you have an accent, it may not understand you. Even a bad cold can be enough to produce incorrect results! In other words, voice recognition software works reasonably well under ideal, laboratory conditions, but not in a typical home or business setting.

Healthcare professionals who attempted to use voice recognition technologies to eliminate transcription services found that they needed to "train" the software before it functioned well. That takes a long time and a lot of work. Most wound up continuing to outsource their medical transcription work.

Of course, there are many other situations where transcription is needed. Examples include recordings of seminars, teleconferences, interviews and classes that need to be converted to text. In natural speech, people tend to use lots of "aahs" and "umms" as well as unnecessary phrases like "you know". Current voice recognition technology is just not capable of filtering out such irrelevant sounds or words. In addition, people often string together several sentences using "and". The software can't break up such speech into meaningful sentences. Nor can it break up speech into meaningful paragraph units the way a transcriptionist can. And if the recording is filled with background noise, or if more than one person is talking at the same time, the software will not function reliably and consistently.

Maybe sometime in the future someone will invent voice recognition technology that can handle all of the above issues. Until then, businesses will need to use transcription services, particularly for work like medical transcription, where accuracy is critical.