Audio and Video Synchronization

Follow these guidelines to ensure correct audio-video synchronization for your media on Fire OS devices.


Precise audio and video synchronization is one of the key performance measurements for media playback. In general, audio and video that are recorded at the same time on the recording device need to be played back at the same time on playback devices (for example, on TVs and monitors). A small margin is allowed, so audio and video can skew slightly without being perceived as out of sync.

In prevailing standards, AV sync is considered achieved if audio leads video by less than 30 milliseconds or lags video by less than 50 milliseconds. The reason audio can lag video longer without being noticed is that sound travels through air at roughly 1 foot per millisecond. When people talk to each other from several feet away, the sound naturally arrives some milliseconds after the lip movement, so humans are accustomed to audio slightly lagging video in a face-to-face conversation.
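As a simple illustration, that tolerance can be expressed as a check on the audio-to-video offset. The sketch below assumes a convention (not defined by any standard cited here) where avOffsetMs is the audio presentation time minus the video presentation time for the same content position, so a negative value means audio leads video:

// Sketch: checking whether a measured AV offset is within the commonly cited tolerance.
// avOffsetMs = audio presentation time - video presentation time (assumed convention);
// negative means audio leads video, positive means audio lags video.
static final long MAX_AUDIO_LEAD_MS = 30;
static final long MAX_AUDIO_LAG_MS = 50;

static boolean isAvSyncAcceptable(long avOffsetMs) {
  return avOffsetMs >= -MAX_AUDIO_LEAD_MS && avOffsetMs <= MAX_AUDIO_LAG_MS;
}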

General AV Sync Solutions

In practice, there are three AV sync solutions available. In Fire OS, solution 1 below is strongly recommended. Solution 2 is allowed, especially for video-only content playback. Solution 3 is strongly discouraged. Throughout the rest of this article, it is assumed that solution 1 is used whenever possible; a minimal sketch of solution 1 appears after the list below.

  1. Use the audio playback position as the master time reference, and match video playback to audio.
  2. Use a system time as the reference, and match both audio and video playback to that system time.
  3. Use video playback as the reference, and let audio match video.
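The sketch below illustrates solution 1: the audio playback position acts as the master clock, and each decoded video frame is rendered or dropped based on how far its timestamp is from that clock. The helper names (audioPositionUs, renderFrame, dropFrame, sleepUs) and the thresholds are hypothetical placeholders, not Fire OS APIs:

// Sketch of solution 1: audio is the master clock; video frames are scheduled against it.
void scheduleVideoFrame(long videoFramePtsUs) {
  long audioUs = audioPositionUs();          // current audio playback position (master clock)
  long earlyUs = videoFramePtsUs - audioUs;  // > 0: frame is early, < 0: frame is late

  if (earlyUs > 10_000) {
    sleepUs(earlyUs);   // frame is early: wait, then render it in sync with audio
    renderFrame();
  } else if (earlyUs < -30_000) {
    dropFrame();        // frame is far too late: drop it so video catches up with audio
  } else {
    renderFrame();      // close enough: render immediately
  }
}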

Maintain AV Sync in Fire OS Apps

When an app is developed to run on Fire OS devices, the app developer is responsible for the AV sync performance of the app. Fire OS has several ways to ensure an accurate lock between the audio and video data flows. If this guidance is properly followed, there should not be any obvious AV sync problems. Below are several approaches to maintain AV sync; they serve as best practice guidelines.

1. If the App Has a Custom-made Media Player

This means the app has full control of the audio and video data flows. The app knows how long it takes to decode audio and video packets, and it has the freedom to increase or decrease buffered video data so that video frames are played back in time with the matching audio frames. The app developer should also target Android SDK version 19 or later in order to use the APIs mentioned here.

1.1 Using the getTimestamp() Method Defined in Android SDK AudioTrack Class

If the audio route supports timestamps and timestamps are available consistently, this approach can be used by itself to determine audio playback latency. If a timestamp is available, the AudioTimestamp instance is filled in with a position in frame units, together with the estimated time when that frame was presented or is committed to be presented. This information (position in frame units and presentation time) can be used to control video frame playback so that it matches the audio.

Note the following:

  • Polling the timestamp once every 10 seconds to once per minute is recommended. Because audio/video position skew does not build up dramatically, there is no need to poll the timestamp more often.
  • The app should trust the timestamp result if the return value is true. Adding a custom time offset based on subjective experimental results is strongly discouraged, as that could break AV sync once the audio route changes.

See the getTimestamp() method in the Android documentation for full details, including the Parameters and Returns tables for this method.
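As a rough sketch of how the timestamp can be used, the helper below polls getTimestamp() and converts the reported frame position to a current audio playback position in microseconds. The helper name and fallback return value are illustrative, and audioTrack and sampleRate are assumed to belong to your player:

// Sketch: estimating the current audio playback position from AudioTrack.getTimestamp().
// Poll this infrequently (see the notes above) and cache the result between polls.
private final AudioTimestamp audioTimestamp = new AudioTimestamp();

long getAudioPositionUs(AudioTrack audioTrack, int sampleRate) {
  if (audioTrack.getTimestamp(audioTimestamp)) {
    // Position (in frames) of the frame that was presented at audioTimestamp.nanoTime.
    long positionUs = (audioTimestamp.framePosition * 1_000_000L) / sampleRate;
    // Extrapolate to "now" using the time elapsed since that frame was presented.
    long elapsedUs = (System.nanoTime() - audioTimestamp.nanoTime) / 1000;
    return positionUs + elapsedUs;
  }
  return -1; // No timestamp available on this audio route; fall back to section 1.2.
}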

1.2 Using getPlaybackHeadPosition() Together with getLatencyMethod()

If an audio route does not support timestamps, the method in section 1.1 is not applicable and an alternative approach is needed. The app needs to determine how much time an audio frame takes to go through the whole device, from the moment the audio frame is received/read until the moment it is presented. This is the audio latency. If the audio latency is known, the app just needs to control the video playback path so that it has the same latency as audio; then AV sync can be achieved.

The key is how to calculate the audio latency when timestamps are not available. The same Android SDK class AudioTrack has another method called getPlaybackHeadPosition(). See the Android documentation for full details about this position, including what it returns.

For a track in MODE_STREAM, getPlaybackHeadPosition() returns the current offset within the buffer, which reflects the current audio latency in frames inside the AudioTrack buffer. However, this is only part of the audio latency in the device. To get the audio latency after audio leaves the AudioTrack buffer, another method needs to be called to obtain the audio latency from the native layer. This method is not a direct member of the AudioTrack class; however, it can be obtained through reflection, the same way ExoPlayer does.

Method getLatencyMethod:

if (Util.SDK_INT >= 18) {
  try {
    getLatencyMethod = android.media.AudioTrack.class.getMethod("getLatency", (Class<?>[]) null);
  } catch (NoSuchMethodException e) {
    // There's no guarantee this method exists. Do nothing.
  }
}

The method referenced by getLatencyMethod can then be used to get the audio latency farther down the pipeline, from the audio track all the way to the audio sink. The following usage obtains the audio latency after the AudioTrack, in microseconds:

latencyUs = (Integer) getLatencyMethod.invoke(audioTrack, (Object[]) null) * 1000L - bufferSizeUs;

Be aware that the AudioTrack buffer size bufferSizeUs has to be subtracted, because bufferSizeUs is the latency of the whole buffer. Only a portion of the buffer is actually in use, and that portion is what getPlaybackHeadPosition() returns. When you combine getPlaybackHeadPosition() with getLatencyMethod, the audio latency in microseconds for a certain audio route, starting from the AudioTrack, can be calculated as follows:

audioLatencyUs = (Integer) getLatencyMethod.invoke(audioTrack, (Object[]) null) * 1000L - bufferSizeUs + framesToDurationUs(audioTrack.getPlaybackHeadPosition());

where framesToDurationUs() is defined as:

private long framesToDurationUs(long frameCount) {
        return (frameCount * C.MICROS_PER_SECOND) / sampleRate;
}

After the total actual audio latency audioLatencyUs is calculated with the above approach, this number can be passed to the video processing module, so the video latency can be checked and adjusted to match the audio latency.
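A minimal sketch of that hand-off is shown below. Here videoPipelineLatencyUs is an assumed estimate of your renderer's decode-plus-display latency, and renderFrameAtUs() is a hypothetical hook that releases a frame at a given system time:

// Sketch: delay each video frame so the video path matches the audio path's latency.
void releaseVideoFrame(long framePtsUs, long audioLatencyUs, long videoPipelineLatencyUs) {
  // Extra delay needed so the frame reaches the screen when the matching audio reaches the speaker.
  long extraDelayUs = Math.max(0, audioLatencyUs - videoPipelineLatencyUs);
  renderFrameAtUs(framePtsUs, System.nanoTime() / 1000 + extraDelayUs);
}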

An app should have both approaches from sections 1.1 and 1.2 implemented. However, the two approaches should not be used at the same time: only if the approach in 1.1 fails to get a timestamp should the approach from 1.2 kick in.
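The selection between the two approaches can be as simple as the sketch below, where syncVideoToTimestamp(), syncVideoToLatency(), and computeAudioLatencyUs() are hypothetical wrappers around the logic described in sections 1.1 and 1.2:

// Sketch: implement both approaches, but use only one at a time.
void updateAvSync() {
  if (audioTrack.getTimestamp(audioTimestamp)) {
    syncVideoToTimestamp(audioTimestamp);        // section 1.1: trust the timestamp
  } else {
    syncVideoToLatency(computeAudioLatencyUs()); // section 1.2: head position + getLatency()
  }
}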

A good example to follow for an AV sync implementation is the Amazon port of ExoPlayer. The details are in its AudioTrack Java class.

2. If the App Uses ExoPlayer as Its Media Player

It is highly recommended that ExoPlayer is used for Fire OS apps. Amazon has a port of ExoPlayer that is compatible with all generations of Fire TV. The Amazon port of ExoPlayer provides many fixes, workarounds, and other patches to make ExoPlayer work consistently on old and new Amazon devices.

In terms of AV sync, when either the default ExoPlayer or the Amazon port of ExoPlayer is used as the app-layer media player, it automatically performs the synchronization. The app developer does not need to manually adjust timestamps for latency.
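For reference, a minimal sketch of handing playback to ExoPlayer looks like the following; it assumes the ExoPlayer 2.x API (com.google.android.exoplayer2) with placeholder context, playerView, and videoUri values:

// Minimal sketch: let ExoPlayer own the audio and video pipelines, including AV sync.
SimpleExoPlayer player = new SimpleExoPlayer.Builder(context).build();
playerView.setPlayer(player);                      // PlayerView from the ExoPlayer UI module
player.setMediaItem(MediaItem.fromUri(videoUri));  // setMediaItem/MediaItem require ExoPlayer 2.12+
player.prepare();
player.play();
// No manual timestamp or latency adjustment is needed; ExoPlayer's renderers keep AV sync.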

3. If the App Uses the Standard Android MediaPlayer

The standard Android MediaPlayer classes that handle audio and video playback are supported on Fire OS. These media classes can handle basic media playback with AV sync requirements. However, for more robust media needs, the Amazon port of ExoPlayer (or one of the paid media player options) is recommended. Refer to Android Media Player.

For AV sync, the app should rely on the Android MediaPlayer and the Fire OS kernel. The app developer does not need to, and is discouraged from, manually adjusting timestamps, audio latency, buffering, and so on.
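For reference, a minimal sketch of basic playback with the standard MediaPlayer looks like the following, with placeholder context, videoUri, and surfaceHolder values; the framework and Fire OS take care of AV sync:

// Minimal sketch: basic playback with android.media.MediaPlayer.
MediaPlayer mediaPlayer = new MediaPlayer();
try {
  mediaPlayer.setDataSource(context, videoUri);
} catch (IOException e) {
  // Handle the error appropriately for your app.
}
mediaPlayer.setDisplay(surfaceHolder);                // SurfaceHolder of the video SurfaceView
mediaPlayer.setOnPreparedListener(mp -> mp.start());  // Start playback once prepared
mediaPlayer.prepareAsync();                           // Prepare asynchronously off the UI thread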

4. If the App Uses the OpenSL ES Framework That Comes with the Android NDK

When OpenSL ES queries audio latency through its audio latency API, it only obtains the audio hardware latency reported by AudioFlinger. It cannot get the audio software latency, which is mainly incurred by AudioTrack buffering. This is an Android limitation. In order to get the accurate audio latency, including both hardware and software audio delays, the OpenSL ES API android_audioPlayer_getConfig() was expanded in Fire OS 6 to report the appropriate audio latency.

Below are code samples for getting the proper audio latency value, which includes both software and hardware audio latencies.

4.1 If Your Audio Player Object Is CAudioPlayer Type

// Sample code for getting Fire OS software+hardware audio latency when using the OpenSL ES framework.
SLuint32 audioLatency = 0, valueSize = 0;
// Assume ap is your audio player object with type of CAudioPlayer.
if (android_audioPlayer_getConfig((CAudioPlayer *) &ap, (const SLchar *) "androidGetAudioLatency",
        (SLuint32 *) &valueSize, (void *) &audioLatency) == SL_RESULT_SUCCESS) {
  // The hardware+software audio latency is filled into the SLuint32 variable audioLatency.
} else {
  // Call your current get_audio_latency API. You will only get the hardware audio latency value.
}

4.2 If Your Audio Player Object Is Created by SLEngineItf Interface API CreateAudioPlayer()

If your audio player was created in the following way:

result = (*engine)->CreateAudioPlayer(engine, &playerObject, &audioSrc, &audioSink, NUM_INTERFACES, ids, req);

you can use the sample code below to get the total audio latency for the created audio player. Note that the variable playerObject has to be the same one that was passed to CreateAudioPlayer().

// Sample code for getting Fire OS software+hardware audio latency when using the OpenSL ES framework.
SLAndroidConfigurationItf playerConfig;
SLresult result;
SLuint32 audioLatency = 0;
SLuint32 valueSize = sizeof(SLuint32);

// Get the audio player's configuration interface.
result = (*playerObject)->GetInterface(playerObject, SL_IID_ANDROIDCONFIGURATION, &playerConfig);
if (result != SL_RESULT_SUCCESS) {
  ALOGE("config GetInterface failed with result %d", result);
} else {
  // Get the audio player's latency.
  result = (*playerConfig)->GetConfiguration(playerConfig,
      (const SLchar *) "androidGetAudioLatency", &valueSize, &audioLatency);
  if (result == SL_RESULT_SUCCESS) {
    // The hardware+software audio latency is filled into the SLuint32 variable audioLatency.
  } else {
    // Call your current get_audio_latency API. You will only get the hardware audio latency value.
  }
}