Madison, Wis. – To its advocates, unified communications is the answer to a number of business communication issues. The integration of all communications, including voice and data, over the Internet is gaining wider adoption as organizations pursue it for cost savings, business process transformation, enhanced collaboration, and even “green” benefits.
In business, IP telephony has reached about 25 percent of the global market, and many organizations are considering wider deployment, according to Technology Futures, Inc. However, an e-discovery concern that still is somewhat under the radar could slow adoption as companies learn that the move away from traditional phone service includes the conversion of voice mails into e-mail in the form of wave (audio) files.
In the event of litigation, voice mails are discoverable in either case, but in electronic form these audio files must be converted to text, and the enabling voice-recognition technologies, while improving, may or may not meet varying court standards for accuracy. Technological limitations make this a difficult issue to address, and the growing complexity of e-discovery has left most companies unprepared and vulnerable in terms of e-storage.
“I sense they [businesses] are probably sloppy in this area,” said attorney Erik Phelps, a partner in the law firm Michael Best and Friedrich.
In this look at unified communications and its impact on e-discovery, WTN examines the business case for new technology, and the importance of making sure wave files are part of a business process for the proper handling of documents.
Wave files are difficult to search because they are audio files, so there is no actual text that a search can use to pull up the document. The challenge is not simply to convert the audio file to text, but to recognize the voice on the file in a highly accurate way.
Traditional voice recognition performs very poorly in what is called the voice-independent scenario, meaning it cannot identify the caller, explained Paul Hager, co-owner of VoVision, a Madison-based voice transcription service. “There are lots of different ways to try to improve that process, the most common of which is just trying to identify the caller and handle it in a voice-dependent manner,” he said, “but again, that can’t be done with just voice mail, so we get these subpar accuracy rates that are frustrating.”
According to Chris Jurkiewicz, vice president of electronic discovery technology for ONSITE3, an Arlington, Va.-based litigation support company, there are three known methods of converting audio into text:
Phonetic search – One method involves a simple phonetic search, a technology that is used by telecommunications companies. It basically analyzes the wave patterns of voice, attempts to match them to known libraries of information, and pulls out words that are, phonetically speaking, close to those wave patterns. The main downside of this method is that users get a lot of “false hits” when conducting a search. Another problem is that users don’t get to see text – it doesn’t actually show them any kind of text information that can be fed into another program that reads text – like another search engine or index. “It’s also a developing technology and still in a refinement phase where vendors are constantly tweaking it to meet customer needs and demands,” Jurkiewicz said.
Manual transcription – Another method is to transcribe the audio file word-for-word through outsourcing or offshoring. Under this model, voice mails are sent to another company or overseas to be transcribed into text, and the text information of each voice mail is returned via e-mail. In addition to the security concerns associated with transmitting e-mails with sensitive information, the negatives include the cost of transcription for what could amount to hundreds or thousands of files. “You’re dealing with labor at that point,” Jurkiewicz noted.
Automated transcription – A third method is offered by vendors who have been working to improve on the accuracy rates of voice recognition technology. ONSITE3 has developed voice-recognition software that acts similarly to phonetic search, in that it tries to convert sound into text. It uses the known libraries that exist, but it also tailors additional libraries to different industries. In the pharmaceutical area, for example, there are a lot of uncommon drug names and different ingredients that go into drugs that have to be spoken many times into the system. The spoken word and the typed word are entered into the system so that the software can recognize them.
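The “false hits” that Jurkiewicz describes for phonetic search can be illustrated with a small sketch. This is not the technology any vendor in this article uses: real phonetic search works on the audio wave patterns themselves, while the example below applies the classic Soundex code to spelled-out words as a text-only stand-in. The function names and the sample words are assumptions made for illustration.

```python
# Toy stand-in for phonetic matching: Soundex codes on spelled words.
# Real phonetic search compares audio wave patterns, not spellings.

# Consonant groups that Soundex treats as sounding alike.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the four-character Soundex code for a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate repeated codes
        code = CODES.get(ch, "")  # vowels map to "" and reset prev
        if code and code != prev:
            encoded += code
        prev = code
    return (encoded + "000")[:4]


def phonetic_search(query, words):
    """Return every word whose Soundex code matches the query's code."""
    target = soundex(query)
    return [w for w in words if soundex(w) == target]


if __name__ == "__main__":
    # Hypothetical words "heard" in a voice mail.
    heard = ["merger", "marker", "major", "meeting"]
    print(phonetic_search("merger", heard))
```

Searching for “merger” in this sample also returns “marker,” because both words reduce to the same code. That is the kind of near-match a reviewer would then have to listen to and rule out by hand.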
VoVision tries to build on the accuracy of voice recognition from another angle – the users. As part of its voice-independent model, it uses a shared training database in which users, who train themselves within the network, can correct anything the system gets wrong. Like an anti-virus update that occurs overnight, those corrections are shared among all users, increasing the system's accuracy each time. According to Hager, VoVision is able to see accuracy push into the 80 to 90 percent range, which he called “the Holy Grail” of voice recognition.
“There is a challenge because people call from all over the place with varying voice inflections, with varying call quality, from varying devices, and so we feel the only way to approach that properly is this Wikipedia of voice recognition type of approach, where everybody submits their corrections when they can through a web interface or through their e-mail program, and the voice model adapts over time.”
The downside here, too, is that automated technologies still are developing. In ONSITE3’s case, the accuracy often depends on the clarity of the recording. For recordings of broadcast quality, it ranges from 85 percent to the high 90s, Jurkiewicz said. If it involves the conversion of an actual voice mail recording, the accuracy drops significantly – to between 40 and 60 percent.
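The article does not say how either vendor measures accuracy. One common convention, assumed here purely for illustration, is word-level accuracy: count the word substitutions, insertions, and deletions needed to turn the automated transcript into a human-verified one, then divide by the length of the verified transcript. A minimal sketch, with hypothetical function names and sample sentences:

```python
def word_edit_distance(reference, hypothesis):
    """Minimum number of word substitutions, insertions, and deletions
    needed to turn the hypothesis word list into the reference."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference_text, hypothesis_text):
    """Word accuracy as 1 minus the word error rate, floored at zero."""
    reference = reference_text.lower().split()
    hypothesis = hypothesis_text.lower().split()
    if not reference:
        raise ValueError("reference transcript is empty")
    errors = word_edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    # Hypothetical human-verified transcript versus automated output.
    verified = "please call me back about the merger today"
    automated = "please call me back about the marker today"
    print(f"{word_accuracy(verified, automated):.1%}")  # prints "87.5%"
```

Under this measure, one wrong word in an eight-word message already pulls accuracy down to 87.5 percent, which gives a sense of how many words in a typical voice mail are wrong when accuracy sits between 40 and 60 percent.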
With VoVision, potential clients have been reluctant to use a third-party service, so the company is looking to partner with a large telecommunications firm that would incorporate its technology into, ironically, a unified communications platform.
Part of the business case for addressing this issue is increased compliance, especially in regulated industries like insurance and finance, but the benefits extend beyond that. The touted business benefits start with the potential cost savings of narrowing the scope of the messages that law offices must listen to. In the case of e-discovery, cost savings also include cost avoidance now that courts are imposing tough sanctions – fines, disadvantageous instructions to the jury, or both – on companies that don’t fully comply with an e-discovery request or that don’t produce files in a text format.
“The savings, I know, are pretty easy to figure out,” Jurkiewicz said. “In the past, clients would have to listen to every single wave file that was in their document set. Nowadays, with unified messaging, you’re going to have your e-mail riddled with these wave documents anytime a voice mail message is left. So you’re going to have tons of attorney hours there listening to messages.”
Best practices, best protection
Failing a completely accurate technology option, the best protection is a sound business process for the handling of voice mails in any form. Attorneys and vendors alike recommend making the handling of voice mails stored as wave files part of your document retention (and deletion) policy.
Like any document, voice mails can provide the smoking gun that determines the outcome of a lawsuit, so it’s important to store, handle, and delete them in the proper way.
Since traditional voice mail already is discoverable, Phelps doesn’t believe there is an increased legal risk in going to unified communications; in fact, organizations are storing more voice mail as a result of the conversion to unified communications. He noted that converting voice mail to electronic form makes files more accessible to more people, and makes them easier to store and move around, if not search. With increased accessibility comes the need to train employees to handle these files in accordance with document retention policies and legal obligations.
The decision on what files to store and what and when to delete should boil down to the business value of the information contained in them, Phelps said, but businesses do have to establish legally sound policies. Some wave files are of the simple “return-my-call” variety and can be deleted as soon as a call is returned, but employees also must be trained as to what is potentially discoverable.
“They have to think broadly about e-storage,” Phelps stated, “and make sure policies and procedures for producing electronic information run that broadly.”
Since voice recognition technology is relatively new and immature, and because there is a limited number of vendors in this space, there has not been a large number of production requests to convert electronic audio files to text. That means there still is time for companies to prepare.
“It’s still not heavily asked about in a lot of cases,” Jurkiewicz said, “but there have been a few that we’ve worked on.”