WSIL-TV (ABC Network affiliate) has been using AUDIMUS.MEDIA software since July 2016 to produce captions automatically for its live programs. A year after the acquisition, WSIL’s Chief Transmitter Engineer, Earl Flanigan, shares his opinion of VoiceInteraction’s solution.

THE CHALLENGE

“For roughly the past year, we have been using a little-known software package created by VoiceInteraction to caption our newscasts and special reports in real time to satisfy the FCC requirements for captioning. We spent a considerable amount of time researching what would work without being a financial drain. We found that many voice captioning packages were inflexible, requiring either a dedicated announcer or exhaustive ‘teaching’ of the software for each voice to be captioned. Then there were the horrendously expensive off-site captioning services, which were far beyond our budget.

The solution came quite unexpectedly in the form of VoiceInteraction, a startup new to the North American market that had already made considerable inroads in Europe and South America. Our Chief secured a demo version and tossed it to me: ‘Try out this captioning software for me.’ Personally, I groaned inwardly, as all of my past experiences with captioning software were on the level of a root canal without the benefit of painkillers.”

THE SOLUTION

“The cynic in me evaporated within minutes of starting it up. With just off-air audio running into the system and several different voices in a single show, the accuracy was in the neighborhood of 70%. It was then time to requisition better hardware than what I was using (a WinXP Pentium D box with a no-name sound card). After obtaining a better computer and a broadcast-grade sound card, the accuracy approached 80%. Why only 80%? Here’s why.

The speech database in VoiceInteraction’s software wasn’t wholly friendly to “Southern Illinoisan-ese”. More than a few words spoken here do not follow the standard “Oxford Dictionary” pronunciations. In this area, Cairo is pronounced KAY-row, New Madrid is pronounced New MAH-drid; you get the idea. The other factor is sound that is not speech, such as music. Music will absolutely make the software go catatonic until it is removed or lowered to the point where the software can make sense of what is being spoken. The fix for the first problem was a simple matter of sending VoiceInteraction a video file showing the person speaking along with the erroneous captioning, so they could update the database. The fix for the second was simply to make certain the voices were well above any music bed used, which in the end was not an issue, as we simply fed all the sources into the captioner on a separate audio bus, ahead of any down-mixes. After this was done, accuracy approached 90-95%, which is, in my opinion, on par with what one would see on a network newscast.

The final outcome is that we have a reliable captioning system that is easy to use, costs a fraction of what off-site captioning would, and, for the time being, satisfies FCC captioning requirements.”

Read the full testimonial on LinkedIn.

ABOUT WSIL-TV

WSIL-TV is the ABC Network affiliate serving southern Illinois, western Kentucky, and southeast Missouri. WSIL became the area’s first television station in December 1953, operating out of studios in Harrisburg.

In October 2010, WSIL became the first station in the area to broadcast local newscasts in high definition, and it continues to provide local news, weather, sports video, and commercials in high definition.

WSIL’s commitment to local news coverage and community involvement has earned it several prestigious honors in recent years, including the Illinois Broadcasters Association Medium Market Station of the Year Award and Associated Press awards for Outstanding News Operation and Best Newscasts.