MRM PlaybackDelegate Class


The PlaybackDelegate class manages playback of PCM audio.

PlaybackDelegate Class Declaration

This is a sample declaration of the PlaybackDelegate class:

class PlaybackDelegate {
public:
    virtual int getPlaybackRate() = 0;
    virtual int start(AudioFormat format, int channels, int rate) = 0;
    virtual int stop(StopBehavior behavior) = 0;
    virtual int write(int64_t pts, const void *buffer, size_t *frames) = 0;
};

How MRM Uses Your Implementation of PlaybackDelegate

On startup, the MRM SDK calls getPlaybackRate() to get the frame rate supported by the device. For each song played, the MRM SDK calls start(), write(), and stop().

getPlaybackRate() is called from the WHA constructor before the rendering thread starts. The start(), write(), and stop() functions are guaranteed to be called from a single rendering thread.

The MRM SDK outputs audio at the rate specified by the client and performs all necessary resampling internally.
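The lifecycle above can be sketched as a minimal, non-functional skeleton. The enums and the abstract class match the declarations in this document; SkeletonDelegate and its fixed 48 kHz rate are illustrative assumptions, and a real implementation would open and write to an actual audio device.

```cpp
#include <cstddef>
#include <cstdint>

// Enums as declared later in this document.
enum AudioFormat { AUDIO_FORMAT_S16 };
enum StopBehavior { STOP_DRAIN, STOP_DROP };

class PlaybackDelegate {
public:
    virtual ~PlaybackDelegate() = default;
    virtual int getPlaybackRate() = 0;
    virtual int start(AudioFormat format, int channels, int rate) = 0;
    virtual int stop(StopBehavior behavior) = 0;
    virtual int write(int64_t pts, const void *buffer, size_t *frames) = 0;
};

// Hypothetical skeleton: reports a fixed 48 kHz rate and "consumes"
// every frame it is given.
class SkeletonDelegate : public PlaybackDelegate {
public:
    int getPlaybackRate() override { return 48000; }

    int start(AudioFormat /*format*/, int channels, int /*rate*/) override {
        channels_ = channels;   // remember the layout for write()
        started_ = true;
        return 0;               // zero = success
    }

    int stop(StopBehavior /*behavior*/) override {
        started_ = false;
        return 0;
    }

    int write(int64_t /*pts*/, const void * /*buffer*/, size_t *frames) override {
        if (!started_) return -1;   // non-zero = error
        (void)frames;               // leave *frames unchanged: whole buffer consumed
        return 0;
    }

private:
    int channels_ = 0;
    bool started_ = false;
};
```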

Pause/Resume

The MRM SDK calls stop() when the user pauses playback and start() when the user resumes playback. After resume, the MRM SDK writes completely new data with a new presentation timestamp (PTS). This is therefore not a classic pause/resume case, where the hardware stops on pause with its buffers kept intact and, on resume, continues from exactly the point where it stopped. It is a start/stop case in which audio positioning must be performed again upon resume.

To prevent playing back stale data on resume, the client must drain all buffered data on stop(). Consequently, client-side buffers should not be longer than 200 ms; otherwise, pausing playback takes a noticeable amount of time.

Handling Underruns

If there is no data available in WHA, write() is called with frames set to 0.

Depending on the underlying platform code, a typical implementation either writes some silence to prevent a restart of the audio pipeline when buffers are close to empty, or waits for some predefined amount of time. The ALSA implementation in the sample app writes one period of silence.
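The silence-writing strategy can be sketched as follows. writeToDevice() is a hypothetical stand-in for the platform's write call (for example, ALSA's snd_pcm_writei()), and the 128-frame period size is an illustrative assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Test shim: records what was last handed to the "device".
static std::vector<int16_t> g_lastDeviceWrite;

// Hypothetical stand-in for the platform write call.
static void writeToDevice(const int16_t *samples, size_t count) {
    g_lastDeviceWrite.assign(samples, samples + count);
}

int writeWithUnderrunHandling(int64_t /*pts*/, const void *buffer,
                              size_t *frames, int channels) {
    const size_t kSilenceFrames = 128;  // one small "period" (assumed size)
    if (*frames == 0) {
        // Underrun: no data available in WHA. Feed silence so the
        // audio pipeline does not restart.
        std::vector<int16_t> silence(kSilenceFrames * channels, 0);
        writeToDevice(silence.data(), silence.size());
        return 0;                       // *frames stays 0
    }
    // Normal path: hand the whole buffer to the device.
    writeToDevice(static_cast<const int16_t *>(buffer), *frames * channels);
    return 0;                           // *frames unchanged = all consumed
}
```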

As with an ordinary write(), the client returns a non-zero value on error. If an error is reported, the MRM SDK stops playback by calling stop() and reports PlaybackFailed to the Alexa Cloud.

getPlaybackRate()

Implement this function to return the playback frame rate (in Hz) supported by the device, or a negative value on error. The function is called when the client instantiates the WHA class.

virtual int getPlaybackRate() = 0;

start()

Implement this function to handle a request from the MRM to start playing content. (A typical implementation is to open the hardware device and prepare it for writing.) Return zero on success and non-zero otherwise.

When the MRM SDK has content to play, it calls start(). The MRM SDK calls start() at the beginning of every song and calls stop() at the end of every song.

virtual int start(AudioFormat format, int channels, int rate) = 0;
Parameter            Description
AudioFormat format   The audio format.
int channels         The channel count of the audio data provided via write().
int rate             The rate of playback in Hz.
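A minimal sketch of a start() body, assuming the only supported configuration is the S16 format defined in this document: validate the negotiated parameters and return non-zero for anything unsupported. A real implementation would open and configure the hardware device here.

```cpp
// AudioFormat as declared in this document.
enum AudioFormat { AUDIO_FORMAT_S16 };

// Hypothetical start() body (the name startSketch is illustrative).
int startSketch(AudioFormat format, int channels, int rate) {
    if (format != AUDIO_FORMAT_S16) return -1;  // only S16 is defined
    if (channels <= 0 || rate <= 0) return -1;  // nonsensical parameters
    // ... open the device for interleaved S16 output with `channels`
    //     channels at `rate` Hz and prepare it for writing ...
    return 0;                                   // zero = success
}
```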

stop()

Implement this function to respond to a request from the MRM SDK to stop, where the handling of buffered data depends on the value of behavior. Return 0 for success and non-zero otherwise.

virtual int stop(StopBehavior behavior) = 0;
Parameter              Description
StopBehavior behavior  Whether to play all buffered data before stopping or to clear the buffer and stop.
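The two StopBehavior cases can be sketched with plain vectors standing in for client-side buffers and device output; the names here are illustrative, and a real implementation would call the platform equivalents (for example, ALSA's snd_pcm_drain() and snd_pcm_drop()).

```cpp
#include <cstdint>
#include <vector>

// StopBehavior as declared in this document.
enum StopBehavior { STOP_DRAIN, STOP_DROP };

// `pending` stands in for samples still buffered on the client side;
// `played` stands in for what actually reaches the speaker.
int stopSketch(StopBehavior behavior, std::vector<int16_t> &pending,
               std::vector<int16_t> &played) {
    if (behavior == STOP_DRAIN) {
        // Play everything that is still buffered, then stop.
        played.insert(played.end(), pending.begin(), pending.end());
    }
    // For STOP_DROP, buffered samples are simply discarded.
    pending.clear();
    return 0;  // zero = success
}
```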

write()

Implement this function to present audio data (perform audio positioning) at the local time specified in pts. Before returning, set frames to the number of frames consumed. Return 0 for success and non-zero otherwise.

The client can consume fewer frames than specified in frames. In that case, the MRM SDK immediately calls write() again with a recalculated pts and the rest of the buffer.

The audio playback API is an example of a push model in which processing speed is controlled by the client. If more data is available upon return from write(), the MRM SDK immediately calls write() again. If no data is available, write() is called with frames set to 0.

If the client's buffers are full, the client shall block until it can consume more data.

An approach to audio positioning:

  • If the buffer is in the future, output some amount of silence, set frames to 0, and return 0. The MRM SDK calls write() again with the same pts and buffer.

  • If the buffer is entirely in the past, drop the whole buffer, leave frames unchanged, and return 0. Unchanged frames indicate that the whole buffer was consumed, so the MRM SDK calls write() with the next pts and buffer.

  • If part of the buffer is in the past, trim that part, write the rest, leave frames unchanged, and return 0. Unchanged frames indicate that the whole buffer was consumed, so the MRM SDK calls write() with the next pts and buffer.

  • Otherwise, write the whole buffer, leave frames unchanged, and return 0. Unchanged frames indicate that the whole buffer was consumed, so the MRM SDK calls write() with the next pts and buffer.
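The positioning cases above can be sketched as a small decision helper. Everything here is illustrative: the WriteAction names, the `now` parameter (the current local clock in nanoseconds, on the same timebase as pts), and the integer arithmetic are assumptions, not part of the SDK.

```cpp
#include <cstddef>
#include <cstdint>

// Which of the four positioning cases applies (hypothetical names).
enum class WriteAction { WaitSilence, DropAll, TrimThenWrite, WriteAll };

// Duration of `frames` frames at `rate` Hz, in nanoseconds.
int64_t framesToNs(size_t frames, int rate) {
    return static_cast<int64_t>(frames) * 1000000000LL / rate;
}

// Classify a buffer of `frames` frames presented at `pts` against the
// current local time `now`. On TrimThenWrite, *framesToTrim is the
// number of stale leading frames to discard.
WriteAction classify(int64_t pts, int64_t now, size_t frames, int rate,
                     size_t *framesToTrim) {
    *framesToTrim = 0;
    int64_t end = pts + framesToNs(frames, rate);
    if (pts > now)  return WriteAction::WaitSilence;  // entirely in the future
    if (end <= now) return WriteAction::DropAll;      // entirely in the past
    // Buffer straddles `now`: trim the part that is already in the past.
    *framesToTrim = static_cast<size_t>((now - pts) * rate / 1000000000LL);
    return *framesToTrim > 0 ? WriteAction::TrimThenWrite
                             : WriteAction::WriteAll;
}
```

For example, a 480-frame buffer at 48 kHz spans 10 ms; if its pts is 5 ms in the past, the first 240 frames are trimmed and the rest is written.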

virtual int write(int64_t pts, const void *buffer, size_t *frames) = 0;
Parameter           Description
int64_t pts         The local time, in nanoseconds, at which the audio data is to be presented.
const void *buffer  The audio data.
size_t *frames      The number of audio frames to play. Just before returning, set this to the number of audio frames consumed by the client. One frame is one sample of each of the channels. Channels are in canonical order (2 channels = L, R; 5.1 channels = TBD). Audio samples are in native platform endianness.

AudioFormat Enum

enum AudioFormat {
    // Signed 16-bit integer, interleaved
    AUDIO_FORMAT_S16
};

StopBehavior Enum

enum StopBehavior {
    // Drain (play) all the samples and then stop
    STOP_DRAIN,
    // Drop all the samples and stop
    STOP_DROP
};

Last updated: Nov 27, 2023