Alexa.RTCSessionController Interface

The Alexa.RTCSessionController interface describes the messages that Alexa uses to interact with endpoints capable of real-time communication (RTC). The RTCSessionController interface supports 1-way (half duplex) or 2-way (full duplex) communication over audio and video. When you implement the RTCSessionController interface, Alexa customers can communicate with a visitor at their front door through their camera and intercom. For more information, see Announcing 2-Way Communication APIs.

For the list of locales that are supported for the RTCSessionController interface, see List of Capability Interfaces and Supported Locales.

Utterances

When you use the Alexa.RTCSessionController interface, the voice interaction model is already built for you. The following examples show some customer utterances:

Alexa, answer the front door.
Alexa, respond to the front door.
Alexa, talk to my front door camera.
Alexa, talk to the person at the main door.

After the customer says one of these utterances, Alexa sends a corresponding directive to your skill.

Overview

Supported Communication Types

  • 1-way (half duplex) communication allows customers to communicate in two directions, but not simultaneously. For example:
    • A walkie-talkie
    • A push-to-talk door intercom
  • 2-way (full duplex) communication allows customers to communicate in two directions simultaneously. For example:
    • A telephone
    • A telephone door intercom

Utterances

Customers can start communication with a person next to a real-time communication device by talking to their Alexa-enabled device (for example, an Echo Show or Echo Spot) or by using the microphone icon when they are in live streaming mode.

Customers can start conversations by saying one of the following:

User: Alexa, answer the front door
User: Alexa, get the call going with the front door
User: Alexa, please call front door
User: Alexa, respond to the front door
User: Alexa, speak to the front door
User: Alexa, talk to my front door camera
User: Alexa, talk to the front door
User: Alexa, talk to the person at the main door

Customers can end conversations by saying one of the following:

User: Alexa, go home
User: Alexa, stop

Prerequisites and SLA Requirements

To use the RTCSessionController API, you must meet the following requirements:

  • A minimum timeout of one minute is required.

  • For any offer sent to your skill, you must generate an answer within six seconds.

  • Your device or platform must be WebRTC compliant, or support the suite of protocols used by WebRTC and all resiliency mechanisms used in WebRTC, specifically:

    • Negative acknowledgement (NACK)
    • Picture loss indication (PLI)
    • Full intra request (FIR)
    • Receiver estimated maximum bitrate (REMB)
  • For resource considerations, you must support bundling and rtcp-mux. You use a bundle to send audio and video over the same connection to reduce the number of open sockets.

  • To support full-duplex communication, your device must employ effective algorithms for acoustic echo cancellation (AEC) and noise suppression.

  • To support half-duplex communication, you can use the Push to Talk feature through the typical live view scenario. Declare isFullDuplexAudioSupported as false in the discovery response.

  • To support video, you must use one of the following video codecs:

    • H264 (up to profile high, level 4.1)
  • To support audio, you must use one of the following audio codecs:

    • Opus (preferred codec)
    • PCMU/G.711
    • AAC-LC, HE-AAC
  • For Interactive Connectivity Establishment (ICE) candidates, you can use either UDP or TCP but you must use IPv4.
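
Some of the requirements above can be checked mechanically against an SDP string before you send it. The following is a minimal Python sketch, not part of the interface: the function name `check_sdp_requirements` and the problem messages are hypothetical, and the check only covers bundling, rtcp-mux, the NACK/PLI/FIR feedback attributes, and IPv4 candidates.

```python
# Feedback mechanisms from the requirements list, written the way they appear
# in "a=rtcp-fb:" attributes. This set is an assumption based on the sample SDP
# on this page; REMB is intentionally not checked here.
REQUIRED_VIDEO_FEEDBACK = {"nack", "nack pli", "ccm fir"}


def check_sdp_requirements(sdp: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    lines = [line.strip() for line in sdp.splitlines() if line.strip()]
    problems = []

    # Bundling: audio and video share one transport.
    if not any(line.startswith("a=group:BUNDLE") for line in lines):
        problems.append("missing a=group:BUNDLE")

    # Split the description into media sections; each starts at an "m=" line.
    sections = []
    for line in lines:
        if line.startswith("m="):
            sections.append([line])
        elif sections:
            sections[-1].append(line)

    for section in sections:
        kind = section[0][2:].split()[0]  # "audio" or "video"
        if "a=rtcp-mux" not in section:
            problems.append(f"{kind}: missing a=rtcp-mux")
        if kind == "video":
            feedback = {
                line.split(" ", 1)[1]
                for line in section
                if line.startswith("a=rtcp-fb:") and " " in line
            }
            for mechanism in sorted(REQUIRED_VIDEO_FEEDBACK - feedback):
                problems.append(f"video: missing rtcp-fb {mechanism}")
        for line in section:
            if line.startswith("a=candidate:"):
                parts = line.split()
                # Field 5 of a candidate line is the connection address;
                # a colon in it indicates an IPv6 address.
                if len(parts) > 4 and ":" in parts[4]:
                    problems.append(f"{kind}: non-IPv4 candidate {parts[4]}")
    return problems
```

A check like this is only a pre-flight sanity test. It does not prove that the device actually implements the mechanisms it advertises, and it does not validate codecs or profiles.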

Signaling Diagram

The RTCSessionController communication is shown in the following signaling diagram.

Diagram showing order of directives and events for RTCSessionController communication

Discovery

You describe endpoints that support Alexa.RTCSessionController using the standard discovery mechanism described in Alexa.Discovery. In addition, indicate whether full-duplex audio is supported in the configuration of the Alexa.RTCSessionController capability.

Use CAMERA or DOORBELL for the display category. For the full list of display categories, see display categories.

Discover response example

{
    "event": {
      "header": {
        "namespace":"Alexa.Discovery",
        "name":"Discover.Response",
        "payloadVersion": "3",
        "messageId": "<message id>"
      },
      "payload":{
        "endpoints":[
          {
              "endpointId": "<unique ID of the endpoint>",
              "manufacturerName": "<the manufacturer name of the endpoint>",
              "modelName": "<the model name of the endpoint>",
              "description": "<a description that is shown in the Alexa app>",
              "friendlyName": "My front door camera",
              "displayCategories": [ "CAMERA" ],
              "cookie": {
              },
              "capabilities": [
              {
                "type": "AlexaInterface",
                "interface": "Alexa.RTCSessionController",
                "version": "3",
                "configuration": {
                  "isFullDuplexAudioSupported": true
                }
              }
            ]
          }
        ]
      }
    }
}

Payload details

| Field | Description | Type | Required |
|---|---|---|---|
| isFullDuplexAudioSupported | True if the device supports 2-way (full duplex) communication. False if the device supports 1-way (half duplex) communication. The default is false. | boolean | No |
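
As an illustration of the configuration above, the capability entry can be generated from a single flag. This is a minimal Python sketch; the helper names `rtc_capability` and `rtc_endpoint` are hypothetical, and the endpoint fields mirror the discover response example rather than a complete endpoint definition.

```python
def rtc_capability(full_duplex: bool = False) -> dict:
    """Build the Alexa.RTCSessionController capability entry for discovery."""
    return {
        "type": "AlexaInterface",
        "interface": "Alexa.RTCSessionController",
        "version": "3",
        "configuration": {
            # False (the default) declares a half-duplex, push-to-talk device.
            "isFullDuplexAudioSupported": full_duplex,
        },
    }


def rtc_endpoint(endpoint_id: str, friendly_name: str,
                 full_duplex: bool = False, category: str = "CAMERA") -> dict:
    """Build a partial endpoint entry; other discovery fields are omitted here."""
    if category not in ("CAMERA", "DOORBELL"):
        raise ValueError("use CAMERA or DOORBELL for the display category")
    return {
        "endpointId": endpoint_id,
        "friendlyName": friendly_name,
        "displayCategories": [category],
        "cookie": {},
        "capabilities": [rtc_capability(full_duplex)],
    }
```

For example, `rtc_endpoint("device-001", "My front door camera", full_duplex=True)` yields an entry whose capability matches the one in the discover response example.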

Directives

InitiateSessionWithOffer Directive

Initiate a real-time communication session with a front door device.

User: Alexa, talk to my front door camera

InitiateSessionWithOffer directive example

{
    "directive": {
        "header": {
          "namespace": "Alexa.RTCSessionController",
          "name": "InitiateSessionWithOffer",
          "messageId": "d1ba3aa7-bff7-4406-9425-f25f04ec8d68",
          "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
          "payloadVersion": "3"
        },
        "endpoint": {
          "scope": {
            "type": "BearerToken",
            "token": "access-token-from-skill"
          },
          "endpointId": "device-001",
          "cookie": {
            "keys": "key/value pairs received during discovery"
          }
        },
        "payload": {
          "sessionId" : "the session identifier",
          "offer": {
             "format" : "SDP",
             "value" : "<SDP offer value>"
          }
        }
    }
}

Payload details

| Field | Description | Type | Required |
|---|---|---|---|
| sessionId | The identifier of the session that wants to connect. | A Version 4 UUID | Yes |
| offer | An SDP offer. | string | Yes |
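
A skill handler typically needs to pull a few fields out of this directive before it can negotiate a session. The following Python sketch shows one way to do that, based only on the example above; the `OfferRequest` type and the function name `parse_initiate_session` are hypothetical.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class OfferRequest:
    """The fields of an InitiateSessionWithOffer directive a handler needs."""
    session_id: str
    correlation_token: str
    endpoint_id: str
    sdp_offer: str


def parse_initiate_session(request: dict) -> OfferRequest:
    """Extract and validate the fields of an InitiateSessionWithOffer directive."""
    directive = request["directive"]
    header = directive["header"]
    if (header["namespace"] != "Alexa.RTCSessionController"
            or header["name"] != "InitiateSessionWithOffer"):
        raise ValueError("not an InitiateSessionWithOffer directive")

    payload = directive["payload"]
    session_id = payload["sessionId"]
    # uuid.UUID raises ValueError for malformed strings.
    if uuid.UUID(session_id).version != 4:
        raise ValueError("sessionId is not a version 4 UUID")

    offer = payload["offer"]
    if offer["format"] != "SDP":
        raise ValueError("unsupported offer format")

    return OfferRequest(
        session_id=session_id,
        correlation_token=header["correlationToken"],
        endpoint_id=directive["endpoint"]["endpointId"],
        sdp_offer=offer["value"],
    )
```

Keeping the correlation token and session identifier together makes it easier to echo them back in the events that follow.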

SessionConnected Directive

The directive to connect an RTC session. The payload for this message contains the identifier for the RTC session, received from the original InitiateSessionWithOffer directive.

SessionConnected directive example

{
    "directive": {
        "header": {
          "namespace": "Alexa.RTCSessionController",
          "name": "SessionConnected",
          "messageId": "d1ba3aa7-bff7-4406-9425-f25f04ec8d68",
          "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
          "payloadVersion": "3"
        },
        "endpoint": {
          "scope": {
            "type": "BearerToken",
            "token": "access-token-from-skill"
          },
          "endpointId": "device-001",
          "cookie": {
            "keys": "key/value pairs received during discovery"
          }
        },
        "payload": {
             "sessionId" : "session identifier"
         }
    }
}

Payload details

| Field | Description | Type | Required |
|---|---|---|---|
| sessionId | The identifier of the session that wants to connect. | A Version 4 UUID | Yes |

SessionDisconnected Directive

The directive to disconnect an RTC session. The payload for this message contains the identifier for the RTC session, received from the original InitiateSessionWithOffer directive.

SessionDisconnected directive example

{
    "directive": {
        "header": {
          "namespace": "Alexa.RTCSessionController",
          "name": "SessionDisconnected",
          "messageId": "d1ba3aa7-bff7-4406-9425-f25f04ec8d68",
          "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
          "payloadVersion": "3"
        },
        "endpoint": {
          "scope": {
            "type": "BearerToken",
            "token": "access-token-from-skill"
          },
          "endpointId": "device-001",
          "cookie": {
            "keys": "key/value pairs received during discovery"
          }
        },
        "payload": {
            "sessionId" : "session identifier"
        }
    }
}

Payload details

| Field | Description | Type | Required |
|---|---|---|---|
| sessionId | The identifier of the session that wants to disconnect. | A Version 4 UUID | Yes |

Properties and Events

Properties

There are no reportable properties currently defined for this interface.

AnswerGeneratedForSession Event

If the InitiateSessionWithOffer directive was successfully handled, you should respond with an AnswerGeneratedForSession event. The payload for this message contains an SDP answer.

AnswerGeneratedForSession event example

{
    "event": {
        "header": {
            "namespace": "Alexa.RTCSessionController",
            "name": "AnswerGeneratedForSession",
            "messageId": "30d2cd1a-ce4f-4542-aa5e-04bd0a6492d5",
            "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
            "payloadVersion": "3"
        },
        "endpoint": {
            "endpointId": "device-001"
        },
        "payload": {
            "answer": {
                "format" : "SDP",
                "value" : "<SDP answer value>"
            }
        }
    }
}

Payload details

| Field | Description | Type | Required |
|---|---|---|---|
| answer | An SDP answer. | string | Yes |
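
To make the relationship between the directive and this event concrete, the sketch below builds an AnswerGeneratedForSession event from an incoming InitiateSessionWithOffer directive. It is a minimal Python illustration that follows the example above; the function name `answer_generated_event` is hypothetical, and producing the SDP answer itself is left to your WebRTC stack.

```python
import uuid


def answer_generated_event(directive_request: dict, sdp_answer: str) -> dict:
    """Build an AnswerGeneratedForSession event for the given directive."""
    directive = directive_request["directive"]
    return {
        "event": {
            "header": {
                "namespace": "Alexa.RTCSessionController",
                "name": "AnswerGeneratedForSession",
                # Each message gets its own identifier.
                "messageId": str(uuid.uuid4()),
                # Echo the token from the directive being answered.
                "correlationToken": directive["header"]["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {
                "endpointId": directive["endpoint"]["endpointId"],
            },
            "payload": {
                "answer": {"format": "SDP", "value": sdp_answer},
            },
        }
    }
```

Because the answer must be generated within six seconds of the offer, it may be worth measuring how long the WebRTC stack takes to produce `sdp_answer` before this function is called.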

SessionConnected Event

If the SessionConnected directive was successfully handled, you should respond with a SessionConnected event. The payload for this message contains the identifier for the RTC session, received from the original InitiateSessionWithOffer directive.

SessionConnected event example

{
  "event": {
    "header": {
      "namespace": "Alexa.RTCSessionController",
      "name": "SessionConnected",
      "messageId": "30d2cd1a-ce4f-4542-aa5e-04bd0a6492d5",
      "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
      "payloadVersion": "3"
    },
    "endpoint": {
        "endpointId": "device-001"
    },
    "payload": {
        "sessionId" : "session identifier"
    }
  }
}

Payload details

| Field | Description | Type | Required |
|---|---|---|---|
| sessionId | The identifier of the session that was connected. | A Version 4 UUID | Yes |

SessionDisconnected Event

If the SessionDisconnected directive was successfully handled, you should respond with a SessionDisconnected event. The payload for this message contains the identifier for the RTC session, received from the original InitiateSessionWithOffer directive.

SessionDisconnected event example

{
  "event": {
    "header": {
      "namespace": "Alexa.RTCSessionController",
      "name": "SessionDisconnected",
      "messageId": "30d2cd1a-ce4f-4542-aa5e-04bd0a6492d5",
      "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
      "payloadVersion": "3"
    },
    "endpoint": {
       "endpointId" : "device-001"
    },
    "payload": {
        "sessionId" : "session identifier"
    }
  }
}

Payload details

| Field | Description | Type | Required |
|---|---|---|---|
| sessionId | The identifier of the session that was disconnected. | A Version 4 UUID | Yes |
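
The SessionConnected and SessionDisconnected events share a shape: each one echoes the name and session identifier of the directive it responds to. Assuming that symmetry holds as shown in the examples, a single helper can build both. The function name `session_event` is hypothetical.

```python
import uuid

SESSION_DIRECTIVES = ("SessionConnected", "SessionDisconnected")


def session_event(directive_request: dict) -> dict:
    """Build the event that acknowledges a SessionConnected or
    SessionDisconnected directive."""
    directive = directive_request["directive"]
    header = directive["header"]
    if header["name"] not in SESSION_DIRECTIVES:
        raise ValueError(f"unexpected directive: {header['name']}")
    return {
        "event": {
            "header": {
                "namespace": "Alexa.RTCSessionController",
                # The event carries the same name as the directive.
                "name": header["name"],
                "messageId": str(uuid.uuid4()),
                "correlationToken": header["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {
                "endpointId": directive["endpoint"]["endpointId"],
            },
            "payload": {
                # The identifier originally received in InitiateSessionWithOffer.
                "sessionId": directive["payload"]["sessionId"],
            },
        }
    }
```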

Session Description Protocol Offer/Answer Format

The RTCSessionController interface uses the Session Description Protocol (SDP). For more information, see Session Description Protocol (SDP).

Offer/answer exchange example

v=0
o=- 3747690900 3747690900 IN IP4 0.0.0.0
s=a 2 z
c=IN IP4 0.0.0.0
t=0 0
a=group:BUNDLE audio0 video0
m=audio 1 RTP/SAVPF 96 0
a=candidate:1 1 UDP 2013266430 xxx.xxx.xxx.xxx 8620 typ host
a=candidate:2 1 TCP 1010827775 xxx.xxx.xxx.xxx 45351 typ host tcptype passive
a=candidate:3 2 UDP 2013266429 xxx.xxx.xxx.xxx 50066 typ host
a=candidate:4 2 TCP 1010827774 xxx.xxx.xxx.xxx 65157 typ host tcptype passive
a=candidate:5 2 TCP 1015022078 xxx.xxx.xxx.xxx 9 typ host tcptype active
a=candidate:6 1 TCP 1015022079 xxx.xxx.xxx.xxx 9 typ host tcptype active
a=setup:actpass
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=rtpmap:96 opus/48000/2
a=rtcp:9 IN IP4 0.0.0.0
a=rtcp-mux
a=sendrecv
a=mid:audio0
a=ssrc:118039096 cname:user2571875795@host-433aaf59
a=ice-ufrag:AGVf
a=ice-pwd:h3JAYGhIaQ/Nvyaz9dLoz9
a=fingerprint:sha-256 34:D4:54:17:0C:95:2A:79:FF:72:10:21:E9:6E:F3:77:86:2F:8D:6C:33:45:BA:14:1D:43:01:D7:CD:0A:1A:84
m=video 1 RTP/SAVPF 99
a=candidate:4 1 UDP 2013266430 xxx.xxx.xxx.xxx 8620 typ host
a=candidate:5 1 TCP 1015022079 xxx.xxx.xxx.xxx 9 typ host tcptype active
a=candidate:4 2 UDP 2013266429 xxx.xxx.xxx.xxx 50066 typ host
a=candidate:6 1 TCP 1010827775 xxx.xxx.xxx.xxx 45351 typ host tcptype passive
a=candidate:5 2 TCP 1015022078 xxx.xxx.xxx.xxx 9 typ host tcptype active
a=candidate:6 2 TCP 1010827774 xxx.xxx.xxx.xxx 65157 typ host tcptype passive
b=AS:500
a=setup:actpass
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=rtpmap:99 H264/90000
a=rtcp:9 IN IP4 0.0.0.0
a=rtcp-mux
a=sendrecv
a=mid:video0
a=rtcp-fb:99 nack
a=rtcp-fb:99 nack pli
a=rtcp-fb:99 ccm fir
a=ssrc:3643559644 cname:user2571875795@host-433aaf59
a=ice-ufrag:AGVf
a=ice-pwd:h3JAYGhIaQ/Nvyaz9dLoz9
a=fingerprint:sha-256 34:D4:54:17:0C:95:2A:79:FF:72:10:21:E9:6E:F3:77:86:2F:8D:6C:33:45:BA:14:1D:43:01:D7:CD:0A:1A:84
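
When debugging an exchange like the one above, it can help to summarize each media section instead of reading the raw SDP. The following Python sketch extracts the media type, payload types, codec mappings, mid, and direction of each section. The function name `parse_media_sections` and the returned dictionary layout are hypothetical, and the parser handles only the attributes shown in this example.

```python
DIRECTIONS = ("a=sendrecv", "a=sendonly", "a=recvonly", "a=inactive")


def parse_media_sections(sdp: str) -> list[dict]:
    """Summarize each "m=" section of an SDP description."""
    sections = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # Format: m=<media> <port> <proto> <payload types...>
            kind, _port, proto, *payload_types = line[2:].split()
            sections.append({
                "kind": kind,
                "protocol": proto,
                "payload_types": payload_types,
                "codecs": {},
                "mid": None,
                "direction": None,
            })
        elif sections:
            current = sections[-1]
            if line.startswith("a=rtpmap:"):
                # Format: a=rtpmap:<payload type> <encoding>/<clock rate>[/<channels>]
                payload_type, encoding = line[len("a=rtpmap:"):].split(" ", 1)
                current["codecs"][payload_type] = encoding
            elif line.startswith("a=mid:"):
                current["mid"] = line[len("a=mid:"):]
            elif line in DIRECTIONS:
                current["direction"] = line[2:]
    return sections
```

Applied to the example above, this yields one audio section with mid `audio0` that maps payload type 96 to `opus/48000/2`, and one video section with mid `video0` that maps payload type 99 to `H264/90000`.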

Error Handling

You should reply with an error if you cannot complete the customer request for some reason. For more details, see Alexa.ErrorResponse.
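
As a rough illustration, the sketch below builds an error reply for a directive that could not be handled. The message shape (namespace `Alexa`, name `ErrorResponse`, and a payload with `type` and `message`) and the `INTERNAL_ERROR` type are assumptions based on the Alexa.ErrorResponse interface; check that reference for the authoritative format and the list of error types. The function name `error_response` is hypothetical.

```python
import uuid


def error_response(directive_request: dict,
                   error_type: str = "INTERNAL_ERROR",
                   message: str = "Unable to complete the request.") -> dict:
    """Build an error reply for a directive that could not be handled.

    The structure here is assumed from Alexa.ErrorResponse, not defined
    by the RTCSessionController interface itself.
    """
    directive = directive_request["directive"]
    return {
        "event": {
            "header": {
                "namespace": "Alexa",
                "name": "ErrorResponse",
                "messageId": str(uuid.uuid4()),
                "correlationToken": directive["header"]["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {
                "endpointId": directive["endpoint"]["endpointId"],
            },
            "payload": {
                "type": error_type,
                "message": message,
            },
        }
    }
```

A handler could return this, for example, when the device cannot be reached or when an SDP answer cannot be generated in time.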

Related Interfaces

| Interface | Description |
|---|---|
| Alexa.CameraStreamController | Describes the messages used to retrieve camera streams from camera endpoints. |
| Alexa.DoorbellEventSource | Describes an endpoint that is capable of raising doorbell events. |
| Alexa.MotionSensor | Describes an endpoint that senses physical movement in an area. |