Audio and Video Related Content

Table of Contents

1 Overview
2 Audio Only and Video Only Content (prerecorded)
3 Video Content with Audio (prerecorded)
4 Provide a Description of the Visual Content in Videos (prerecorded)
5 Live Audio and Real Time Presentations
6 Provide Synchronized Spoken Description of the Visual Content in Videos (Prerecorded)
7 Flashes
8 How to Obtain and Edit Text Transcripts
9 Provide Audio Controls
10 WCAG Related Guidelines

Overview

Make audio and video content fully accessible to users who are blind, have low vision, are deaf, or are hard of hearing.

Audio Only and Video Only Content (prerecorded)

Make information conveyed by prerecorded audio-only and prerecorded video-only content available to all users.

Text-based alternatives for time-based media make information accessible because text can be rendered through any sensory modality (for example, visual, auditory or tactile) to match the needs of the user. In the future, text could also be translated into symbols, sign language or simpler forms of the language.

An example of prerecorded video with no audio information or user interaction is a silent movie.

HOW TO COMPLY WITH THIS GUIDELINE

Alternative One

Provide a text-based transcript of the audio-only or video-only content.

Alternative Two

For prerecorded video-only content, authors have the option to provide an audio track instead. The purpose of the audio alternative is to be an equivalent to the video. This makes it possible for users with and without vision impairment to review content simultaneously. The approach can also make it easier for those with cognitive, language and learning disabilities to understand the content because it provides a parallel presentation.

Video Content with Audio (prerecorded)

Users who are deaf or hard of hearing, or who have trouble understanding the audio information, must be provided with a text-based alternative.

Captions

Captioning should be provided for the audio portion of a video.

The captioning must be synchronized with the audio so that someone reading the captions could also watch the speaker and associate relevant body language with the speech.

Captions (called “subtitles” in some areas) provide content to people who are deaf and hard-of-hearing. Captions are a text version of the speech and non-speech audio information needed to understand the content. They are synchronized with the audio and usually shown in a media player when users turn them on.
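As an illustration of what "synchronized" means in practice, the sketch below writes timed caption cues in the WebVTT text format, which many web media players can load. WebVTT is not named in this guideline, so treat the format choice as an assumption. The function names and the cue timings are hypothetical and only serve the example.

```python
from typing import Iterable, Tuple

# One caption cue: (start_seconds, end_seconds, caption_text).
Cue = Tuple[float, float, str]


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues: Iterable[Cue]) -> str:
    """Return the text of a WebVTT caption file for the given cues."""
    lines = ["WEBVTT", ""]  # required header followed by a blank line
    for start, end, text in cues:
        if end <= start:
            raise ValueError("a cue must end after it starts")
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


# Hypothetical timings for two spoken sentences and one non-speech sound.
example = build_webvtt([
    (0.0, 4.0, "Broil on high for 4 to 5 minutes."),
    (4.0, 7.5, "You should not preheat the oven."),
    (7.5, 9.0, "[timer beeps]"),
])
```

The last cue shows how non-speech audio that matters for understanding can be captioned alongside speech. The timings still have to be checked against the actual audio by a person.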

Automatically generated captions do not meet user needs or accessibility requirements unless they are confirmed to be fully accurate. They usually need significant editing.

There are tools that use speech recognition technology to turn a soundtrack into a timed caption file. For example, some common video websites provide automatic captions.

However, often the automatic caption text is wrong and does not match the spoken audio — sometimes in ways that change the meaning (or are embarrassing). For example, missing just one word such as “not” can make the captions contradict the actual audio content.

Example of Bad Automatic Captioning:

Spoken audio:
"Broil on high for 4 to 5 minutes. You should not preheat the oven."
Automatic caption:
"Broil on high for 45 minutes. You should know to preheat the oven."
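One possible way to spot errors like these during review is to compare the automatic caption text word by word against a transcript that a person has verified. The sketch below uses Python's standard difflib module for that comparison. The function name is hypothetical, and the approach assumes a verified transcript already exists, so it supports human review rather than replacing it.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def caption_differences(verified: str, automatic: str) -> List[Tuple[str, str]]:
    """List (verified_words, automatic_words) pairs where the two texts differ."""
    verified_words = verified.split()
    automatic_words = automatic.split()
    matcher = SequenceMatcher(None, verified_words, automatic_words, autojunk=False)
    differences = []
    for op, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if op != "equal":
            differences.append((
                " ".join(verified_words[a_start:a_end]),
                " ".join(automatic_words[b_start:b_end]),
            ))
    return differences


spoken = "Broil on high for 4 to 5 minutes. You should not preheat the oven."
caption = "Broil on high for 45 minutes. You should know to preheat the oven."

for expected, found in caption_differences(spoken, caption):
    print(f"verified: {expected!r}  automatic: {found!r}")
```

Each reported pair is a place where an editor should listen to the audio again and correct the caption.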

Automatic captions can be used as a starting point for developing accurate captions and transcripts.

Text Transcript

Provide a text-based transcript for any prerecorded video content with audio.

Please see more information and examples on providing captioning.

Provide a Description of the Visual Content in Videos (prerecorded)

Provide users who are blind or have low vision with access to the visual information in videos.

People who are blind or who cannot understand the visual content can have it described.

Alternative One

One approach is to provide audio description of the video content. The audio description augments the audio portion of the presentation with the information needed when the video portion is not available. During existing pauses in dialogue, audio description provides information about actions, characters, scene changes, and on-screen text that are important and are not described or spoken in the main sound track.

Alternative Two

The second approach involves providing all of the information in the synchronized media (both visual and auditory) in text form. An alternative for time-based media provides a running description of all that is going on in the synchronized media content.

Live Audio and Real Time Presentations

Provide users who are deaf or hard of hearing with synchronized captions for audio content in real-time videos.

Enable people who are deaf or hard of hearing to watch real-time presentations. Captions provide the part of the content available via the audio track. Captions not only include dialogue, but also identify who is speaking and notate sound effects and other significant audio.

People who are deaf or have a hearing loss can access the auditory information in the synchronized media content through captions.

Provide Synchronized Spoken Description of the Visual Content in Videos (Prerecorded)

Provide people who are blind or visually impaired access to the visual information in a synchronized media presentation.

The audio description augments the audio portion of the presentation with the information needed when the video portion is not available.

During existing pauses in dialogue, audio description provides information about actions, characters, scene changes, and on-screen text that are important and are not described or spoken in the main sound track.
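Because audio description is placed in existing pauses in dialogue, it can help to list those pauses before writing the descriptions. The sketch below is one possible approach: it assumes caption cue timings are already available as (start, end) pairs in seconds and reports the gaps between them. The function name and the 2-second minimum gap are hypothetical choices, not requirements from this guideline.

```python
from typing import List, Tuple

# One caption cue as (start_seconds, end_seconds).
Cue = Tuple[float, float]


def find_dialogue_pauses(cues: List[Cue], min_gap: float = 2.0) -> List[Cue]:
    """Return (pause_start, pause_end) gaps between cues of at least min_gap seconds."""
    pauses = []
    latest_end = None  # latest end time seen so far, to cope with overlapping cues
    for start, end in sorted(cues):
        if latest_end is not None and start - latest_end >= min_gap:
            pauses.append((latest_end, start))
        latest_end = end if latest_end is None else max(latest_end, end)
    return pauses


# Hypothetical cue timings: dialogue stops at 6.0s and resumes at 9.0s.
pauses = find_dialogue_pauses([(0.0, 3.0), (3.5, 6.0), (9.0, 12.0)])
```

The gaps found this way are only candidates. A describer still has to decide which visual details are important and whether each description fits in the available time.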

Provide users who are blind or have low vision with the ability to play videos with audio descriptions.
Provide a synchronized spoken description of the visual content in videos.
People who cannot see or understand the visual content can hear about it while playing videos.

Flashes

Ensure that video-related content does not contain anything that flashes more than three times in any one-second period, or that the flash is below the general flash and red flash thresholds.

Allow users to access the full content of a site without inducing seizures due to photosensitivity.

Individuals who have photosensitive seizure disorders can have a seizure triggered by content that flashes at certain frequencies for more than a few flashes.
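The counting part of this rule can be sketched in code. The example below assumes the times at which flashes occur have already been identified by some separate review of the video, and it only checks whether more than three of them fall within any one-second period. It does not measure the general flash or red flash thresholds, and the function name and the choice to treat the period as strictly shorter than one second are assumptions.

```python
from typing import Sequence


def exceeds_three_flashes(flash_times: Sequence[float],
                          window: float = 1.0,
                          limit: int = 3) -> bool:
    """Return True if more than `limit` flashes fall within any `window` seconds."""
    times = sorted(flash_times)
    # Compare each flash with the one `limit` positions later: if they are
    # less than `window` seconds apart, limit + 1 flashes share one period.
    for i in range(len(times) - limit):
        if times[i + limit] - times[i] < window:
            return True
    return False


# Hypothetical flash times in seconds.
too_fast = exceeds_three_flashes([0.0, 0.2, 0.4, 0.6])
slow_enough = exceeds_three_flashes([0.0, 0.5, 1.0, 1.5])
```

A True result means the content needs to be changed or checked against the flash thresholds with a dedicated analysis tool.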

How to Obtain and Edit Text Transcripts

The following are step-by-step instructions on how to obtain, edit and incorporate text transcripts for video content.

1. Place the Transcript Next to the Video

Here is an example of how we have incorporated this on http://sf.gov:

https://www.sf.gov/shared-spaces-public-service-announcements

2. Provide a Link to the Transcript

Provide a link to the transcript and place it in close proximity to the video (example: "View video transcript").

This link could either expand and show the transcript on the page or link to a new page.

3. Obtaining a Transcript from YouTube

Here are the steps involved with getting the video transcript from YouTube.

How to Get the Transcript of a YouTube Video

4. Editing the Automated Transcript

Because the automatic transcript only transcribes the audio, you will need to review and edit it manually.

Fix typos and errors.
Remove extra line breaks and create logical paragraphs.
Add any text that is displayed visually in your video but not spoken, such as title screens.
Indicate who is talking, especially if there is more than one speaker in the video.
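The mechanical part of this cleanup, removing extra line breaks, can be partly automated. The sketch below assumes the raw transcript is plain text in which short caption-style lines are grouped into blocks separated by blank lines, and it joins each block into one paragraph. The function name is hypothetical. Fixing errors, adding on-screen text, and identifying speakers still require a person.

```python
import re


def tidy_transcript(raw: str) -> str:
    """Join short caption-style lines into paragraphs separated by one blank line."""
    # One or more blank lines mark a paragraph break in the raw text.
    chunks = re.split(r"\n\s*\n", raw.strip())
    paragraphs = []
    for chunk in chunks:
        # Join the lines of the chunk and collapse repeated whitespace.
        text = " ".join(chunk.split())
        if text:
            paragraphs.append(text)
    return "\n\n".join(paragraphs)


raw_text = "broil on high\nfor 4 to 5 minutes\n\n\nyou should not\npreheat the oven\n"
cleaned = tidy_transcript(raw_text)
```

If the raw transcript has no blank lines at all, the result is a single paragraph, and the logical paragraph breaks have to be added by hand.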

Provide Audio Controls

If any audio on a Web page plays automatically for more than 3 seconds, either a mechanism is available to pause or stop the audio, or a mechanism is available to control audio volume independently from the overall system volume level.

Sound distracts some people, and also interferes with screen readers.

A page that plays music or sounds should not disrupt people.

If you play audio content automatically, let people turn it down or off.
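A rough first check for this requirement can be run against a page's HTML. The sketch below uses Python's standard html.parser module to flag audio and video elements that have the autoplay attribute but no controls attribute. It is a heuristic under stated assumptions: the class and function names are hypothetical, muted elements are treated as exempt, the markup alone cannot show whether the audio lasts more than 3 seconds, and audio started by scripts is not detected.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class AutoplayAudit(HTMLParser):
    """Collect media elements that autoplay without visible player controls."""

    def __init__(self) -> None:
        super().__init__()
        self.findings: List[Tuple[str, Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("audio", "video"):
            return
        attributes = dict(attrs)  # boolean attributes map to None
        if ("autoplay" in attributes
                and "controls" not in attributes
                and "muted" not in attributes):
            self.findings.append((tag, attributes.get("src")))


def find_uncontrolled_autoplay(html: str) -> List[Tuple[str, Optional[str]]]:
    """Return (tag, src) for each autoplaying media element lacking controls."""
    audit = AutoplayAudit()
    audit.feed(html)
    audit.close()
    return audit.findings


# Hypothetical page fragment: only the first element should be flagged.
page = (
    '<audio src="intro.mp3" autoplay></audio>'
    '<audio src="news.mp3" autoplay controls></audio>'
    '<video src="loop.mp4" autoplay muted></video>'
)
flagged = find_uncontrolled_autoplay(page)
```

Any element flagged this way should be reviewed by hand to confirm that people can pause, stop, or turn down the audio.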

WCAG Related Guidelines

1.2 Time-based Media (Level A, Level AA)

1.2.1 Audio-only and Video-only (Prerecorded) (Level A)

1.2.2 Captions (Prerecorded) (Level A)

1.2.3 Audio Description or Media Alternative (Prerecorded) (Level A)

1.2.4 Captions (Live) (Level AA)

1.2.5 Audio Description (Prerecorded) (Level AA)

1.4.2 Audio Control (Level A)

2.2.2 Pause, Stop, Hide (Level A)

2.3.1 Three Flashes or Below Threshold (Level A)

Information and Examples on Captions