Alexa.RTCSessionController Interface

The Alexa.RTCSessionController interface describes the messages used by Alexa to interact with endpoints capable of real-time communication (RTC). When you use the RTCSessionController interface in your applications, Alexa users can communicate remotely, for example with a visitor at their front door. Users can communicate from any Echo device, such as an Echo Dot, Echo Plus, Echo Show, or Echo Spot.

The RTCSessionController interface supports 1-way (half duplex) or 2-way (full duplex) communication. For an audio-only scenario, such as an Echo Plus connecting to a front door intercom, communication must be 2-way. For an audio and video scenario, such as an Echo Show connecting to a front door camera, only 1-way video communication is supported, and 1-way or 2-way audio communication is supported.

For the list of locales that are supported for the RTCSessionController interface, see List of Capability Interfaces and Supported Locales.

Utterances

When you use the Alexa.RTCSessionController interface, the voice interaction model is already built for you. Users can start communication with a person next to a real-time communication device by talking to their Alexa-enabled device (for example, an Echo Show or Echo Spot) or by using the microphone icon when they are in live streaming mode.

Users can start conversations by using one of the following utterances:

Alexa, answer the front door.
Alexa, talk to the front door.
Alexa, talk to the backyard camera.
Alexa, talk to the baby monitor.
Alexa, get the call going with the front door.
Alexa, please call front door.
Alexa, respond to the front door.
Alexa, speak to the front door.
Alexa, talk to my front door camera.
Alexa, talk to the person at the main door.

Users can end conversations by using one of the following utterances:

Alexa, go home.
Alexa, stop.

After the user says one of these utterances, Alexa sends a corresponding directive to your skill.

Overview

Supported Communication Types

  • 1-way (half duplex) communication allows users to communicate in two directions, but not simultaneously. For example:
    • A walkie-talkie
    • A push-to-talk door intercom
  • 2-way (full duplex) communication allows users to communicate in two directions simultaneously. For example:
    • A telephone
    • A telephone door intercom

Signaling Diagram

The RTCSessionController communication is shown in the following signaling diagram.

Diagram showing order of directives and events for RTCSessionController communication

Prerequisites and SLA Requirements

To use the RTCSessionController API, you must meet the following requirements:

  • The timeout must be at least one minute.

  • For any offer sent to your skill, you must generate an answer within six seconds.

  • Your device or platform must be WebRTC compliant, or must support the suite of protocols used by WebRTC and all of the resiliency mechanisms used in WebRTC, specifically:

    • Negative acknowledgement (NACK)
    • Picture loss indication (PLI)
    • Full intra request (FIR)
    • Receiver estimated maximum bitrate (REMB)
  • To conserve resources, you must support bundling and rtcp-mux. Bundling sends audio and video over the same connection, which reduces the number of open sockets.

  • To support full-duplex communication, your device must employ effective algorithms for acoustic echo cancellation (AEC) and noise suppression.

  • To support half-duplex communication, you can use the Push to Talk feature through the typical live view scenario. Declare isFullDuplexAudioSupported as false in the discovery response.

  • To support video, you must use the following video codec:

    • H264 (up to profile high, level 4.1)
  • To support audio, you must use one of the following audio codecs:

    • Opus (preferred codec)
    • PCMU/G.711
    • AAC-LC, HE-AAC
  • For Interactive Connectivity Establishment (ICE) candidates, you can use either UDP or TCP, but you must use IPv4.
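
Some of the requirements above can be checked mechanically against an SDP description before you send it. The following Python sketch is illustrative only: the function name `check_sdp_prerequisites` and the wording of the problems it reports are assumptions, not part of the Alexa API. It covers only bundling, rtcp-mux, and the ICE candidate rules (UDP or TCP, IPv4), and it expects real IP addresses rather than masked placeholders.

```python
import ipaddress


def check_sdp_prerequisites(sdp: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    lines = [line.strip() for line in sdp.splitlines() if line.strip()]

    # Bundling: audio and video share one connection.
    if not any(line.startswith("a=group:BUNDLE") for line in lines):
        problems.append("missing a=group:BUNDLE")

    # Split the description into media sections, each starting at an m= line.
    sections = []
    for line in lines:
        if line.startswith("m="):
            sections.append([line])
        elif sections:
            sections[-1].append(line)

    for section in sections:
        kind = section[0][2:].split()[0]  # "audio" or "video"
        if "a=rtcp-mux" not in section:
            problems.append(f"{kind}: missing a=rtcp-mux")
        for line in section:
            if not line.startswith("a=candidate:"):
                continue
            # a=candidate:<foundation> <component> <transport> <priority> <address> <port> ...
            fields = line.split()
            transport, address = fields[2].upper(), fields[4]
            if transport not in ("UDP", "TCP"):
                problems.append(f"{kind}: unsupported transport {transport}")
            try:
                if ipaddress.ip_address(address).version != 4:
                    problems.append(f"{kind}: candidate {address} is not IPv4")
            except ValueError:
                problems.append(f"{kind}: candidate address {address} is not an IP literal")
    return problems
```

A check like this is a convenience for catching configuration mistakes early; it does not verify codec support, the feedback mechanisms, or echo cancellation.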

Properties

The Alexa.RTCSessionController interface does not define any reportable properties.

Discovery

You describe endpoints that support Alexa.RTCSessionController using the standard discovery mechanism described in Alexa.Discovery. In addition, indicate whether full-duplex audio is supported in the configuration of the Alexa.RTCSessionController capability.

Use CAMERA or DOORBELL for the display category. For the full list of display categories, see display categories.

In addition to the usual discovery response fields, for the RTCSessionController entry in the capabilities array, include a configuration field that contains the following properties.

Field | Description | Type
isFullDuplexAudioSupported | True if the device supports 2-way (full duplex) communication. False if the device supports 1-way (half duplex) communication. The default is false. | Boolean

Discover response example

{
  "event": {
    "header": {
      "namespace":"Alexa.Discovery",
      "name":"Discover.Response",
      "payloadVersion": "3",
      "messageId": "<message id>"
    },
    "payload":{
      "endpoints":[
        {
          "endpointId": "<unique ID of the endpoint>",
          "manufacturerName": "<the manufacturer name of the endpoint>",
          "description": "<a description that is shown in the Alexa app>",
          "friendlyName": "My front door camera",
          "displayCategories": [ "CAMERA" ],
          "cookie": {},
          "capabilities": [
            {
              "type": "AlexaInterface",
              "interface": "Alexa.RTCSessionController",
              "version": "3",
              "configuration": {
                "isFullDuplexAudioSupported": true
              }
            },
            {
              "type": "AlexaInterface",
              "interface": "Alexa",
              "version": "3"
            }
          ]
        }
      ]
    }
  }
}
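
As a rough illustration of how the endpoint entry in the example above might be assembled in code, the following Python sketch builds the same structure. The helper names `rtc_capability` and `build_endpoint` and their parameters are hypothetical; only the field names and values come from the example.

```python
def rtc_capability(full_duplex: bool) -> dict:
    """Build the RTCSessionController entry for the capabilities array."""
    return {
        "type": "AlexaInterface",
        "interface": "Alexa.RTCSessionController",
        "version": "3",
        "configuration": {"isFullDuplexAudioSupported": full_duplex},
    }


def build_endpoint(endpoint_id: str, friendly_name: str, manufacturer: str,
                   description: str, full_duplex: bool,
                   category: str = "CAMERA") -> dict:
    """Build one entry for the endpoints array of a Discover.Response."""
    # The Discovery section names CAMERA and DOORBELL as the display categories to use.
    if category not in ("CAMERA", "DOORBELL"):
        raise ValueError("display category must be CAMERA or DOORBELL")
    return {
        "endpointId": endpoint_id,
        "manufacturerName": manufacturer,
        "description": description,
        "friendlyName": friendly_name,
        "displayCategories": [category],
        "cookie": {},
        "capabilities": [
            rtc_capability(full_duplex),
            {"type": "AlexaInterface", "interface": "Alexa", "version": "3"},
        ],
    }
```

For a push-to-talk intercom, you would pass `full_duplex=False`, which matches the half-duplex guidance in the prerequisites.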

Directives

InitiateSessionWithOffer Directive

Support the InitiateSessionWithOffer directive so that users can initiate a real-time communication session with a front door device.

The following example shows a user utterance:

Alexa, talk to my front door camera

InitiateSessionWithOffer directive payload details

Field | Description | Type
sessionId | The identifier of the session that wants to connect. | A Version 4 UUID
offer | An SDP offer. In the directive, offer is an object whose value field holds the SDP string. | String

InitiateSessionWithOffer directive example

{
  "directive": {
    "header": {
      "namespace": "Alexa.RTCSessionController",
      "name": "InitiateSessionWithOffer",
      "messageId": "<message id>",
      "correlationToken": "<an opaque correlation token>",
      "payloadVersion": "3"
    },
    "endpoint": {
      "scope": {
        "type": "BearerToken",
        "token": "<an OAuth2 bearer token>"
      },
      "endpointId": "<endpoint id>",
      "cookie": {}
    },
    "payload": {
      "sessionId" : "<the session identifier>",
      "offer": {
        "format" : "SDP",
        "value" : "<an SDP offer value>"
      }
    }
  }
}
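
A skill typically needs to pull a few fields out of this directive before it can generate an answer. The following Python sketch shows one way to do that, assuming the directive arrives as an already-parsed dictionary shaped like the example above. The `OfferRequest` type and the `parse_initiate_session` function are hypothetical names introduced for illustration.

```python
import uuid
from dataclasses import dataclass


@dataclass
class OfferRequest:
    session_id: str
    endpoint_id: str
    correlation_token: str
    sdp_offer: str


def parse_initiate_session(request: dict) -> OfferRequest:
    """Extract the fields needed to answer an InitiateSessionWithOffer directive."""
    directive = request["directive"]
    header = directive["header"]
    if (header["namespace"], header["name"]) != (
        "Alexa.RTCSessionController",
        "InitiateSessionWithOffer",
    ):
        raise ValueError("not an InitiateSessionWithOffer directive")

    payload = directive["payload"]
    offer = payload["offer"]
    if offer["format"] != "SDP":
        raise ValueError(f"unexpected offer format: {offer['format']}")

    # The payload table describes sessionId as a Version 4 UUID.
    # uuid.UUID raises ValueError if the string is not a UUID at all.
    if uuid.UUID(payload["sessionId"]).version != 4:
        raise ValueError("sessionId is not a Version 4 UUID")

    return OfferRequest(
        session_id=payload["sessionId"],
        endpoint_id=directive["endpoint"]["endpointId"],
        correlation_token=header["correlationToken"],
        sdp_offer=offer["value"],
    )
```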

InitiateSessionWithOffer response event

If you handle an InitiateSessionWithOffer directive successfully, respond with an AnswerGeneratedForSession event. You can respond synchronously or asynchronously. If you respond asynchronously, include a correlation token and a scope with an authorization token.

AnswerGeneratedForSession response event payload details

Field | Description | Type
answer | An SDP answer. In the event, answer is an object whose value field holds the SDP string. | String

AnswerGeneratedForSession response event example

{
  "event": {
    "header": {
      "namespace": "Alexa.RTCSessionController",
      "name": "AnswerGeneratedForSession",
      "messageId": "<message id>",
      "correlationToken": "<an opaque correlation token>",
      "payloadVersion": "3"
    },
    "endpoint": {
      "scope": {
        "type": "BearerToken",
        "token": "<an OAuth2 bearer token>"
      },
      "endpointId": "<endpoint id>"
    },
    "payload": {
      "answer": {
        "format" : "SDP",
        "value" : "<an SDP answer value>"
      }
    }
  }
}
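
The response event mirrors several fields of the directive it answers. The following Python sketch builds an event shaped like the example above from the incoming directive and an SDP answer string. The function name `build_answer_event` is hypothetical, and generating the messageId with `uuid.uuid4()` is an assumption made for this sketch; how the SDP answer itself is produced depends on your WebRTC stack and is out of scope here.

```python
import uuid


def build_answer_event(directive_request: dict, sdp_answer: str) -> dict:
    """Build an AnswerGeneratedForSession event for the given directive."""
    directive = directive_request["directive"]
    endpoint = directive["endpoint"]
    return {
        "event": {
            "header": {
                "namespace": "Alexa.RTCSessionController",
                "name": "AnswerGeneratedForSession",
                # Assumption: a fresh random UUID is used as the message id.
                "messageId": str(uuid.uuid4()),
                # Echo the token from the directive so the response can be matched to it.
                "correlationToken": directive["header"]["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {
                "scope": endpoint["scope"],
                "endpointId": endpoint["endpointId"],
            },
            "payload": {"answer": {"format": "SDP", "value": sdp_answer}},
        }
    }
```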

InitiateSessionWithOffer directive error handling

If you can't handle an InitiateSessionWithOffer directive successfully, respond with an Alexa.ErrorResponse event.
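
The following Python sketch combines this error path with the six-second answer requirement from the prerequisites. It is a minimal sketch under several assumptions: the helper names are hypothetical, the ErrorResponse layout (namespace Alexa, name ErrorResponse, payload with type and message) follows the Alexa.ErrorResponse interface as generally documented, and the choice of ENDPOINT_UNREACHABLE for a timeout and INTERNAL_ERROR for other failures is illustrative. Check the Alexa.ErrorResponse reference for the authoritative fields and error types.

```python
import concurrent.futures
import uuid

ANSWER_DEADLINE_SECONDS = 6.0  # from the prerequisites: answer within six seconds


def build_error_response(directive_request: dict, error_type: str, message: str) -> dict:
    """Build an Alexa.ErrorResponse event (layout assumed, see lead-in)."""
    directive = directive_request["directive"]
    return {
        "event": {
            "header": {
                "namespace": "Alexa",
                "name": "ErrorResponse",
                "messageId": str(uuid.uuid4()),
                "correlationToken": directive["header"]["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {"endpointId": directive["endpoint"]["endpointId"]},
            "payload": {"type": error_type, "message": message},
        }
    }


def answer_or_error(directive_request: dict, generate_answer,
                    deadline: float = ANSWER_DEADLINE_SECONDS) -> dict:
    """Run generate_answer(directive_request); fall back to an error event."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(generate_answer, directive_request)
    try:
        return future.result(timeout=deadline)
    except concurrent.futures.TimeoutError:
        return build_error_response(
            directive_request, "ENDPOINT_UNREACHABLE",
            "No answer was generated before the deadline.")
    except Exception as exc:
        return build_error_response(directive_request, "INTERNAL_ERROR", str(exc))
    finally:
        # Do not block on a worker that is still running after the deadline.
        executor.shutdown(wait=False)
```

Here `generate_answer` is any callable you supply that returns the finished response event. In practice you would likely set the deadline somewhat below six seconds to leave room for network latency.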

SessionConnected Directive

The SessionConnected directive notifies you that your RTC session is connected.

SessionConnected directive payload details

Field | Description | Type
sessionId | The identifier for the session from the original InitiateSessionWithOffer directive. | A Version 4 UUID

SessionConnected directive example

{
  "directive": {
    "header": {
      "namespace": "Alexa.RTCSessionController",
      "name": "SessionConnected",
      "messageId": "<message id>",
      "correlationToken": "<an opaque correlation token>",
      "payloadVersion": "3"
    },
    "endpoint": {
      "scope": {
        "type": "BearerToken",
        "token": "<an OAuth2 bearer token>"
      },
      "endpointId": "<endpoint id>",
      "cookie": {}
    },
    "payload": {
      "sessionId" : "<the session identifier>"
    }
  }
}

SessionConnected response event

If you handle a SessionConnected directive successfully, respond with a SessionConnected event. You can respond synchronously or asynchronously. If you respond asynchronously, include a correlation token and a scope with an authorization token.

SessionConnected response event payload details

Field | Description | Type
sessionId | The identifier for the session from the original InitiateSessionWithOffer directive. | A Version 4 UUID

SessionConnected response event example

{
  "event": {
    "header": {
      "namespace": "Alexa.RTCSessionController",
      "name": "SessionConnected",
      "messageId": "<message id>",
      "correlationToken": "<an opaque correlation token>",
      "payloadVersion": "3"
    },
    "endpoint": {
      "scope": {
        "type": "BearerToken",
        "token": "<an OAuth2 bearer token>"
      },
      "endpointId": "<endpoint id>"
    },
    "payload": {
      "sessionId" : "<the session identifier>"
    }
  }
}
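
Because the response event reuses the directive's name and session identifier, it can be derived directly from the directive. The following Python sketch does that for an already-parsed directive shaped like the examples in this document. The function name `build_session_response` is hypothetical, and generating the messageId with `uuid.uuid4()` is an assumption of this sketch.

```python
import uuid

# Directives whose response event has the same name and carries only sessionId.
SESSION_STATUS_NAMES = {"SessionConnected", "SessionDisconnected"}


def build_session_response(directive_request: dict) -> dict:
    """Build the response event that echoes a session status directive."""
    directive = directive_request["directive"]
    header = directive["header"]
    if (header["namespace"] != "Alexa.RTCSessionController"
            or header["name"] not in SESSION_STATUS_NAMES):
        raise ValueError("not a session status directive")
    endpoint = directive["endpoint"]
    return {
        "event": {
            "header": {
                "namespace": "Alexa.RTCSessionController",
                "name": header["name"],
                "messageId": str(uuid.uuid4()),  # assumption: fresh random id
                "correlationToken": header["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {
                "scope": endpoint["scope"],
                "endpointId": endpoint["endpointId"],
            },
            # Echo the session identifier from the directive unchanged.
            "payload": {"sessionId": directive["payload"]["sessionId"]},
        }
    }
```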

SessionConnected directive error handling

If you can't handle a SessionConnected directive successfully, respond with an Alexa.ErrorResponse event.

SessionDisconnected Directive

The SessionDisconnected directive notifies you that your RTC session is disconnected.

SessionDisconnected directive payload details

Field | Description | Type
sessionId | The identifier for the session from the original InitiateSessionWithOffer directive. | A Version 4 UUID

SessionDisconnected directive example

{
  "directive": {
    "header": {
      "namespace": "Alexa.RTCSessionController",
      "name": "SessionDisconnected",
      "messageId": "<message id>",
      "correlationToken": "<an opaque correlation token>",
      "payloadVersion": "3"
    },
    "endpoint": {
      "scope": {
        "type": "BearerToken",
        "token": "<an OAuth2 bearer token>"
      },
      "endpointId": "<endpoint id>",
      "cookie": {}
    },
    "payload": {
      "sessionId" : "<the session identifier>"
    }
  }
}

SessionDisconnected response event

If you handle a SessionDisconnected directive successfully, respond with a SessionDisconnected event. You can respond synchronously or asynchronously. If you respond asynchronously, include a correlation token and a scope with an authorization token.

SessionDisconnected response event payload details

Field | Description | Type
sessionId | The identifier for the session from the original InitiateSessionWithOffer directive. | A Version 4 UUID

SessionDisconnected response event example

{
  "event": {
    "header": {
      "namespace": "Alexa.RTCSessionController",
      "name": "SessionDisconnected",
      "messageId": "<message id>",
      "correlationToken": "<an opaque correlation token>",
      "payloadVersion": "3"
    },
    "endpoint": {
      "scope": {
        "type": "BearerToken",
        "token": "<an OAuth2 bearer token>"
      },
      "endpointId": "<endpoint id>"
    },
    "payload": {
      "sessionId" : "<the session identifier>"
    }
  }
}

SessionDisconnected directive error handling

If you can't handle a SessionDisconnected directive successfully, respond with an Alexa.ErrorResponse event.
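
The three directives share a sessionId, so a skill may want to track each session from offer to disconnect. The following Python sketch shows one possible in-memory approach. The `SessionRegistry` class, its state names, and the decision to treat a disconnect for an unknown session as a no-op are all assumptions for illustration; the interface itself does not prescribe how you store session state.

```python
from enum import Enum


class SessionState(Enum):
    OFFERED = "offered"      # InitiateSessionWithOffer received
    CONNECTED = "connected"  # SessionConnected received


class SessionRegistry:
    """Track RTC sessions by sessionId across the three directives."""

    def __init__(self):
        self._sessions = {}

    def handle(self, directive_request: dict) -> None:
        directive = directive_request["directive"]
        name = directive["header"]["name"]
        session_id = directive["payload"]["sessionId"]
        if name == "InitiateSessionWithOffer":
            self._sessions[session_id] = SessionState.OFFERED
        elif name == "SessionConnected":
            if session_id not in self._sessions:
                raise KeyError(f"unknown session {session_id}")
            self._sessions[session_id] = SessionState.CONNECTED
        elif name == "SessionDisconnected":
            # A repeated or unexpected disconnect is harmless: just forget the session.
            self._sessions.pop(session_id, None)
        else:
            raise ValueError(f"unsupported directive {name}")

    def state(self, session_id: str):
        """Return the SessionState, or None if the session is not tracked."""
        return self._sessions.get(session_id)
```

A production skill would likely keep this state in a shared store rather than in process memory, because successive directives may be handled by different instances.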

Session Description Protocol Offer/Answer Format

The RTCSessionController interface uses the Session Description Protocol (SDP). For more information, see Session Description Protocol (SDP).

Offer/answer exchange example

v=0
o=- 3747690900 3747690900 IN IP4 0.0.0.0
s=a 2 z
c=IN IP4 0.0.0.0
t=0 0
a=group:BUNDLE audio0 video0
m=audio 1 RTP/SAVPF 96 0
a=candidate:1 1 UDP 2013266430 xxx.xxx.xxx.xxx 8620 typ host
a=candidate:2 1 TCP 1010827775 xxx.xxx.xxx.xxx 45351 typ host tcptype passive
a=candidate:3 2 UDP 2013266429 xxx.xxx.xxx.xxx 50066 typ host
a=candidate:4 2 TCP 1010827774 xxx.xxx.xxx.xxx 65157 typ host tcptype passive
a=candidate:5 2 TCP 1015022078 xxx.xxx.xxx.xxx 9 typ host tcptype active
a=candidate:6 1 TCP 1015022079 xxx.xxx.xxx.xxx 9 typ host tcptype active
a=setup:actpass
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=rtpmap:96 opus/48000/2
a=rtcp:9 IN IP4 0.0.0.0
a=rtcp-mux
a=sendrecv
a=mid:audio0
a=ssrc:118039096 cname:user2571875795@host-433aaf59
a=ice-ufrag:AGVf
a=ice-pwd:h3JAYGhIaQ/Nvyaz9dLoz9
a=fingerprint:sha-256 34:D4:54:17:0C:95:2A:79:FF:72:10:21:E9:6E:F3:77:86:2F:8D:6C:33:45:BA:14:1D:43:01:D7:CD:0A:1A:84
m=video 1 RTP/SAVPF 99
a=candidate:4 1 UDP 2013266430 xxx.xxx.xxx.xxx 8620 typ host
a=candidate:5 1 TCP 1015022079 xxx.xxx.xxx.xxx 9 typ host tcptype active
a=candidate:4 2 UDP 2013266429 xxx.xxx.xxx.xxx 50066 typ host
a=candidate:6 1 TCP 1010827775 xxx.xxx.xxx.xxx 45351 typ host tcptype passive
a=candidate:5 2 TCP 1015022078 xxx.xxx.xxx.xxx 9 typ host tcptype active
a=candidate:6 2 TCP 1010827774 xxx.xxx.xxx.xxx 65157 typ host tcptype passive
b=AS:500
a=setup:actpass
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=rtpmap:99 H264/90000
a=rtcp:9 IN IP4 0.0.0.0
a=rtcp-mux
a=sendrecv
a=mid:video0
a=rtcp-fb:99 nack
a=rtcp-fb:99 nack pli
a=rtcp-fb:99 ccm fir
a=ssrc:3643559644 cname:user2571875795@host-433aaf59
a=ice-ufrag:AGVf
a=ice-pwd:h3JAYGhIaQ/Nvyaz9dLoz9
a=fingerprint:sha-256 34:D4:54:17:0C:95:2A:79:FF:72:10:21:E9:6E:F3:77:86:2F:8D:6C:33:45:BA:14:1D:43:01:D7:CD:0A:1A:84
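
To make the structure of the example above easier to see, the following Python sketch summarizes an SDP description by media section: the BUNDLE group, each section's payload types, codecs from the rtpmap lines, RTCP feedback from the rtcp-fb lines, and direction. The function name `summarize_sdp` and the shape of the returned dictionary are assumptions for illustration. This is not a full SDP parser, and a real implementation would normally rely on its WebRTC library instead.

```python
def summarize_sdp(sdp: str) -> dict:
    """Collect the BUNDLE group and a short summary of each m= section."""
    summary = {"bundle": [], "media": []}
    current = None  # the media section being filled in, if any
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("a=group:BUNDLE"):
            # "a=group:BUNDLE audio0 video0" -> ["audio0", "video0"]
            summary["bundle"] = line.split()[1:]
        elif line.startswith("m="):
            # m=<media> <port> <proto> <payload types...>
            kind, _port, proto, *payload_types = line[2:].split()
            current = {
                "kind": kind,
                "proto": proto,
                "payload_types": payload_types,
                "mid": None,
                "codecs": {},
                "feedback": [],
                "direction": None,
            }
            summary["media"].append(current)
        elif current is not None:
            if line.startswith("a=mid:"):
                current["mid"] = line[len("a=mid:"):]
            elif line.startswith("a=rtpmap:"):
                payload_type, encoding = line[len("a=rtpmap:"):].split(" ", 1)
                current["codecs"][payload_type] = encoding
            elif line.startswith("a=rtcp-fb:"):
                _payload_type, feedback = line[len("a=rtcp-fb:"):].split(" ", 1)
                current["feedback"].append(feedback)
            elif line in ("a=sendrecv", "a=sendonly", "a=recvonly", "a=inactive"):
                current["direction"] = line[2:]
    return summary
```

Applied to the example, the video section would list `nack`, `nack pli`, and `ccm fir` as feedback, which correspond to the NACK, PLI, and FIR mechanisms named in the prerequisites.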