[Originally appeared on Slaw.ca, December 29, 2011]
Voice recognition has been getting lots of press recently, thanks to the release of the Siri software with the latest Apple iPhones. It has been a mainstay of discussions around legal technology for more than a decade and yet continues to be a point of uncertainty for lawyers. In particular, how do they use voice recognition in their practices? Ben Schorr discussed some of the challenges of using voice recognition earlier this year. I also received a question about voice recognition at a recent seminar, so I thought I would take another look at it.
Speech to Text Conversion Software
Let’s look at the technology first. Nuance’s Dragon Dictate is probably the best known speech-to-text conversion software available for lawyers. Nuance has absorbed other speech-to-text software companies, such as Jott, which allowed you to leave a voice mail for yourself and have it converted into text. In fact, other than Siri and Dragon Dictate, the voice recognition field remains relatively uncluttered. You can encounter voice recognition in a variety of situations, whether on telephone support lines, on your mobile phone, or in certain locations on the Web such as Google’s Web search.
Dragon Dictate is regularly reviewed for the legal community (see Jim Calloway’s article here, for instance). I decided to try the Windows 7 voice recognition accessibility feature as a free entry point into testing voice recognition. This feature is also in Windows Vista, and the Mac OS has Speakable Items. Voice recognition works by converting your speech into text. The first thing you have to do with most voice recognition systems is to train the software to understand how you speak.
Windows 7 has two short scripts that you read aloud and, after about 30 minutes of reading, you are ready to proceed. In reality, you can start without the training. However, I noticed significant improvement after spending just a bit of time reading to the computer. The speech recognition software will also review documents and e-mail on your computer to try to improve recognition.
The need for training has remained constant over the years with voice recognition. Another frequent discussion focuses on the quality of the microphone. I selected a cheap Staples brand headset microphone and it worked fine. I didn’t notice any problems with voice recognition learning the way I speak and the words that I use.
We Don’t Need No Stinking Training
Of course, the holy grail of voice recognition is being able to speak into a device without having to train it. The Siri application represents that next step, where you can immediately start speaking and the device will respond. The words that you speak are transmitted to a central server, where commands from all other Siri users are aggregated. The software learns based on usage by all iPhone users. The mobile applications that handle voice recognition tend to focus on completing an action, whether it is creating an appointment, dialing a number, or opening a web page.
That really sums up the state of voice recognition. While the Windows speech recognition application, like Dragon Dictate, can do more than smaller applications, it requires an increasing investment of time and money: time spent training the software, and money potentially spent on better microphones and on applications with more features. The more you rely on it, the greater your investment should probably be in software and hardware that leverages every aspect of your computer. Free, assistive technology is good but it has its limits.
Your Mileage May Vary
Like so much practice technology, though, it really depends on how you practice and how the technology can fit in. The Windows voice recognition feature works well for the way that I work. Since I mostly use Microsoft software, the integration with Windows speech recognition is strong. I can very quickly open or switch to Microsoft Word to create and edit documents. It is easy to use the Office 2007 ribbon by saying “show numbers” and selecting the function I want to use. However, if you do not regularly create documents, you may find voice recognition less useful in your practice.
As I got further away from Microsoft software, whether using Lotus Notes for e-mail or Google Chrome for the Web, voice recognition became more and more difficult to use. In my case, Lotus Notes freezes as soon as it opens, so I am unable to do e-mail with speech-to-text conversion. Google Chrome works as well as Microsoft’s Internet Explorer, but it requires you to understand that browsers function differently. For example, I can place the cursor in the address bar of Microsoft Internet Explorer by saying “address.” But since Google Chrome does not call it the address bar, I have to know to say “location” in order to place the cursor in Google Chrome’s location bar. Not only do you need to understand how voice recognition software can fit into your practice, you also need to be aware of which of the programs you use will or will not work well with voice recognition.
One advantage I had was that I use AutoHotkey, which enables you to assign keyboard and mouse shortcuts to particular functions. I could access my hot keys by saying “press” followed by the keyboard combination. Similarly, if you take advantage of the Microsoft Word auto-correct function as an efficiency tool (see Vivian Manning’s examples), you can use your voice to activate that feature.
The rule of thumb for me is that the smaller the device or the smaller the window in which to input text, the less useful speech-to-text conversion will be. Document creation works fine in a desktop word processor, but voice recognition in web browsers, in spreadsheets, and on mobile devices will be limited. However, if your practice keeps you out of the office most of the time and involves a lot of appointment setting, the mobile voice recognition apps could be really valuable.
Better Speech a Law Practice Skill
There is another benefit to voice recognition, however: it makes you speak more clearly. A litigator once said to me that, although the information I shared was interesting, I spoke too quickly and I should learn to speak like someone on the radio. The point was not to get the deep voice but to focus on the speed, which can seem surprisingly slow. In fact, when you train the Windows speech recognition software, it makes the same suggestion. Lawyers who use voice recognition will find that they need to speak clearly in order to be successful. This can be beneficial in other parts of your practice, particularly if you don’t regularly find yourself speaking in a courtroom or in a client presentation, because it gives you additional practice in speaking clearly and slowly so that the words are easy to recognize. If the computer can understand and convert what you say, your audience will probably understand too.
I have now trained both my desktop and laptop computers to support Windows 7 speech recognition. It didn’t require any special computer hardware, and the microphone was a nominal addition; it met the threshold for dabbling! I expect that it will become a part of how I do my work, particularly when I have long documents to draft. As you might expect, this was written with voice recognition, and both the computer and I learned a lot during its creation. (Stop listening)