In a year’s time, a new app will be launched that is going to be the death of ALL transcribers. In a year’s time we are all going to be out of a job. Sad, but true. Here’s why.
A couple of years ago I contacted Baynes, a Ukrainian mathematics graduate student at Kiev University who moonlights as an app creator on Elance, to create an interactive transcript app. Unfortunately, mainly due to some technological limitations, we were not able to complete the app. I did not recognize the omen: unfinished projects usually come back to haunt you!
We kept in touch, however; we shared a common love for Sherlock Holmes, especially the new BBC series, which is awesome. But I digress. I got in touch with Baynes a month ago to see how he was doing in light of the political turmoil in his country, and he was doing fine, despite the fact that they had been invaded by Russia!
Anyway, he told me that he had been working on a new speech recognition app built on the IBM Watson API. That made me prick up my ears, because I have a vested interest in speech recognition technology. Watson is a “cognitive technology that processes information more like a human than a computer—by understanding natural language.”
If you are a fan of Jeopardy!, you will know Watson as the IBM supercomputer that crushed two of the best human Jeopardy! contestants. After being humiliated by the “thinking machine”, one of the contestants made this prophetic statement: “‘Quiz show contestant’ may be the first job made redundant by Watson, but I’m sure it won’t be the last.”
On November 14th, IBM announced that it was going to give app developers access to Watson in the cloud. New Watson-driven apps are targeted to enter the market in 2014. The end is nigh.
You might have used speech or voice recognition on your mobile device, to search the internet, or on YouTube. It’s become ubiquitous. Windows Vista, 7, and 8 all ship with built-in speech recognition software. It’s not a stretch of the imagination to envision software that can transcribe audio into text.
In fact, there are a number of programs on the market that do exactly that, Dragon being one that comes to mind. But they all have a major drawback: they are very inaccurate, especially when an audio file has more than one speaker, as in a research interview. And even when there is only one speaker in the recording, you need to train the software (minimum 12 hours of training) to recognize the variations in the individual’s speech pattern.
I’ve always been amused at YouTube’s attempts at speech recognition. Google’s speech engine spews out garbage, funny garbage though. You’ve got to watch this:
How accurate is speech recognition software? About 80% with clear, one-speaker audio (remember, you need to train the software). Throw in some background noise, accents, or multiple speakers and that accuracy rate goes down to 50% to 60%. In contrast, a professional transcriber transcribes audio with at least 98% accuracy.
A Death Foretold
Back to Baynes. Using the natural language capabilities of IBM’s Watson, coupled with the university’s supercomputer, Baynes has been able to get Watson to transcribe audio to text with 98% accuracy. At least that’s what he told me a week ago. My reaction: un-be-liev-able! His response: “I can prove it.”
So I sent Baynes a one-hour, one-on-one Skype interview. The audio quality was not that great and the line dropped a few times during the interview. I had struggled to transcribe this particular interview myself. Ten minutes later, he sent me a transcript of the interview. I compared his transcript to mine using the compare feature in MS Word. His transcript was 98.32% accurate! And it only took him 10 minutes to transcribe the audio. It took me 6 hours.
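For the curious, here is roughly how you could put a number on "accuracy" when comparing two transcripts. This is not what MS Word's compare feature does under the hood; it is a minimal sketch of a common alternative, word-level edit distance (the idea behind word error rate), assuming my transcript is the reference. The function names and the sample sentences are made up for illustration.

```python
import re


def words(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # word missing from hyp
                            curr[j - 1] + 1,      # extra word in hyp
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Return 1 minus the word error rate, measured against the reference."""
    ref, hyp = words(reference), words(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    return 1.0 - edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical snippets, not taken from the actual interview.
    mine = "So tell me about your research on reading habits."
    machine = "So tell me about your research on reading rabbits."
    print(f"{word_accuracy(mine, machine):.2%}")
```

One wrong word out of nine already drops the score below 90%, which gives you a feel for how demanding a 98% target is over a one-hour interview.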
I was stunned. I had a moment of derealization. I felt like Santiago Nasar reading Gabriel García Márquez’s Chronicle of a Death Foretold. I realized that my days as an academic transcriber are numbered. What to do?
Yesterday I got in touch with Baynes, after a few days of brooding. I had one question for him: how long before the app hits the market? “A year at most,” was what he told me. So I have until April 1st, 2015 to learn a new skill. And I need to learn something that’s Watson-proof. Any ideas?