Watson Generated Captions for Video
Closed captions have become an important part of the video experience. They make video more accessible to a diverse viewer base, including viewers who are deaf or hard of hearing and non-native speakers, and they also serve viewers who prefer to watch content with captions and no sound. With the new generated caption feature, powered by IBM Watson Speech to Text technology, you can quickly and easily generate captions for the existing videos in your IBM Watson Media account.
How do I enable Watson Speech to Text on my existing videos?
Before captions can be generated for your videos, you must first set the language of your content from inside your account. There are three places where you can do this:
- If you’re creating your channel for the first time, you will be prompted to enter the channel title and to select your language from a drop-down menu.
- If you would like to apply the feature to an existing channel, click on the Channel drop-down menu in the left sidebar, then click on Caption Settings. On the Caption Settings page, click the Change button to select your channel’s language. If you check the “Auto-publish generated captions” box, your generated captions will be published automatically and made available to your viewers without your intervention.
- If you would like to apply the feature to specific videos only, click on Videos from your Dashboard, then select the video you’d like to caption and click Edit. On the Edit screen, select the appropriate language from the drop-down menu.
How do I download and publish generated captions?
Once you activate generated captions for your videos or channel, you will want to check on the status of the generation. To do this, click on Videos from your Dashboard, then click Edit on the video whose status you would like to check. Then select the Closed Captions tab.
(Note: the amount of time required to generate captions depends on the length of the video. For example, captions for a 45-minute video will typically take 45 minutes to generate.)
You’ll see the caption status here. Once the captions have finished generating, you will be notified via email and can immediately take the following actions:
- Download captions as a .VTT file
- PUBLISH or UNPUBLISH captions to control whether viewers can see them
- Use Settings to enable automatic publishing for videos
How can I ensure high-quality caption generation?
While Watson Speech to Text uses advanced cognitive processes to generate captions, the accuracy of the captions is directly affected by the audio quality of the video. The best results come from audio with a single native speaker talking at a normal pace, with good audio quality and no background noise or soundtrack. Multiple speakers can be very challenging to caption, as can brand names and technical terms.
To fix any glaring issues, download the generated captions and open the .VTT file in a plain text editor or caption file editor. Once the misinterpreted words are corrected, re-upload the captions to the video by clicking the Add Captions button in the Closed Captions tab on the Edit screen of your video.
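If the same word is misinterpreted many times, the manual correction described above can be scripted before re-uploading. The following is a minimal sketch in Python, not an IBM Watson Media tool: the function name `correct_vtt`, the sample captions, and the corrections table are all hypothetical. It assumes a simple .VTT file and treats every line other than the `WEBVTT` header and the timing lines (the ones containing `-->`) as caption text, so files with cue identifiers or NOTE blocks would need extra handling.

```python
import re


def correct_vtt(vtt_text, corrections):
    """Replace misheard words in caption text, leaving header and timing lines untouched.

    `corrections` maps a misinterpreted phrase to its intended spelling.
    Matching is case-insensitive and limited to whole words.
    """
    fixed_lines = []
    for line in vtt_text.splitlines():
        # Skip the file header and cue timing lines such as
        # "00:00:01.000 --> 00:00:04.000".
        if line.startswith("WEBVTT") or "-->" in line:
            fixed_lines.append(line)
            continue
        for wrong, right in corrections.items():
            pattern = r"\b" + re.escape(wrong) + r"\b"
            # A function replacement keeps `right` literal (no backslash escapes).
            line = re.sub(pattern, lambda m, r=right: r, line, flags=re.IGNORECASE)
        fixed_lines.append(line)
    return "\n".join(fixed_lines) + "\n"


# Hypothetical generated captions containing a misheard brand name.
sample = """WEBVTT

00:00:01.000 --> 00:00:04.000
Welcome to IBM what son media.

00:00:04.500 --> 00:00:07.000
What son generates captions.
"""

fixed = correct_vtt(sample, {"what son": "Watson"})
print(fixed)
```

In practice you would read the downloaded .VTT file, pass its contents through a function like this, write the result back to disk, and then upload it with the Add Captions button. Review the output before uploading, since automatic replacement can also change phrases that were transcribed correctly.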
What languages are available for Watson generated captions for videos?
Channels and/or videos that have been set to one of the following languages will automatically receive captions:
- Brazilian Portuguese
- Chinese (Mandarin)
- English (United Kingdom)
- English (United States)
- Spanish (Castilian)
If you select a language option that is NOT one of the above languages, generated captions may not appear correctly. We are currently expanding our list of supported languages, and more will become available in the future.
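If you keep your own records of which language each video or channel is set to, a small lookup can flag the ones that fall outside the list above. This is only an illustrative sketch: the names `SUPPORTED_CAPTION_LANGUAGES` and `will_receive_captions` are hypothetical and not part of any IBM Watson Media API, and the language labels are copied from the list in this article, so they would need updating as more languages are added.

```python
# Languages listed in this article as receiving generated captions.
SUPPORTED_CAPTION_LANGUAGES = {
    "Brazilian Portuguese",
    "Chinese (Mandarin)",
    "English (United Kingdom)",
    "English (United States)",
    "Spanish (Castilian)",
}


def will_receive_captions(language):
    """Return True if the given language label is in the supported list."""
    return language.strip() in SUPPORTED_CAPTION_LANGUAGES


# Hypothetical record of videos and their configured languages.
video_languages = {
    "Product launch": "English (United States)",
    "Town hall": "French",
}

unsupported = [
    title
    for title, language in video_languages.items()
    if not will_receive_captions(language)
]
print(unsupported)
```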
The following default languages will always appear towards the top of the language selector.