Alexa.CameraStreamController Interface 3


Implement the Alexa.CameraStreamController interface in your Alexa skill so that users can view the feed from their security cameras. If your security camera device is capable of real-time communication (RTC), implement the Alexa.RTCSessionController interface instead. For more details about security skills, see Smart Home Security Overview.

For the list of languages that the CameraStreamController interface supports, see List of Alexa Interfaces and Supported Languages. For the definitions of the message properties, see Alexa Interface Message and Property Reference.

Utterances

When you use the Alexa.CameraStreamController interface, the Alexa service provides the voice interaction model for you. After the user says one of these utterances, Alexa sends a corresponding directive or report state request to your skill.

The following examples show some user utterances:

Alexa, show the front door camera.
Alexa, stop camera.
Alexa, hide camera.

Alexa, montre la caméra de la porte d'entrée.
Alexa, arrête la caméra.
Alexa, cache la caméra.

Alexa, zeige die Haustür-Kamera.
Alexa, stoppe Kamera.
Alexa, blende Kamera aus.

Alexa, सामने के दरवाज़े का कैमरा दिखाओ
Alexa, कैमरा बंद करो
Alexa, कैमरा छिपाओ

Alexa, mostra la videocamera della porta d'ingresso.
Alexa, chiudi la videocamera.
Alexa, nascondi la videocamera.

アレクサ、玄関のカメラを見せて
アレクサ、カメラを止めて
アレクサ、カメラを隠して

Alexa, mostra a câmera da porta da frente.
Alexa, oculte a câmera.
Alexa, esconda a câmera.

Alexa, muestra la cámara de la puerta principal.
Alexa, deten la cámara.
Alexa, oculta la cámara.
Alexa, esconde la cámara.

Real time streaming protocol usage

You can build an Alexa skill for cloud-enabled cameras that stream video and audio by using the Real Time Streaming Protocol (RTSP).

Prerequisites and SLA requirements

Low latency is critical to an optimal user experience. To use the CameraStreamController API, you need the following:

  • RTSP + RTP streaming protocol.

  • Interleaved TCP on port 443 (for both RTP and RTSP).

  • TCP socket encryption on port 443 using TLS 1.2.

  • Amazon requires the RTSP commands DESCRIBE, SETUP, PLAY, and TEARDOWN, and recommends a fully RFC-compliant implementation.

  • All RTSP URI responses must occur within six seconds after you receive the request.

  • Your Lambda skill must respond to requests within six seconds. For the best user experience, respond within one second. Perform operations, such as waking a camera to begin streaming, asynchronously as background tasks.

  • Network conditions should dictate the frame rate. If network conditions are good, stream at a higher frame rate. If network conditions are poor, stream at a lower frame rate. Lower frame rates might introduce motion blur, but still allow streaming to occur.

  • Under good network conditions, the first frame should render on a device within six seconds of the TLS handshake completing. Optimize startup latency by adjusting key frame rates and buffer times of the stream.

RTSP URI support

The RTSP URI identifies the camera stream. Amazon recommends that you return a URI that's accessible from anywhere with an internet connection. The URI must meet all RTSP requirements, including TLS 1.2. If you return a URI that's only accessible on a local network, users won't be able to view their camera feed on all Alexa-enabled devices.
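
As a rough pre-flight check, a skill could inspect a URI before returning it. The following Python sketch is not part of the Alexa API. The function name check_rtsp_uri is hypothetical, and the sketch only looks at the scheme, the port, and whether the host is a literal private IP address. It can't confirm TLS 1.2 support or that the stream is actually reachable.

```python
import ipaddress
from typing import List
from urllib.parse import urlparse


def check_rtsp_uri(uri: str) -> List[str]:
    """Return a list of problems found; an empty list means these basic checks passed."""
    problems = []
    parsed = urlparse(uri)

    if parsed.scheme.lower() != "rtsp":
        problems.append("scheme is not rtsp")

    # urlparse raises ValueError when the port isn't a valid number.
    try:
        port = parsed.port
    except ValueError:
        port = None
    if port != 443:
        problems.append("port is not 443")

    host = parsed.hostname
    if not host:
        problems.append("host is missing")
    else:
        try:
            if ipaddress.ip_address(host).is_private:
                problems.append("host is a private address")
        except ValueError:
            pass  # Host is a DNS name, not an IP literal; this sketch doesn't resolve it.

    return problems
```

An empty result means only that these surface checks passed. A DNS name that resolves to a local address would still slip through.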

Supported streaming protocols

You can use the following supported protocol value: RTSP.

Supported authorization types

You can use the following supported authorization type values: BASIC, DIGEST, NONE.

Supported video codecs

You can use the following supported video codec values: H264, MPEG2, MJPEG, JPG.

Supported audio codecs

You can use the following supported audio codec values: G711, AAC, NONE.

Supported resolutions

The supported resolutions are 480–1080 pixels.

Properties and objects

CameraStream object

The CameraStream object defines a video and audio camera stream.

The following example shows a CameraStream object.

{
  "uri": "rtsp://username:password@link.to.video:443/feed1.mp4",
  "expirationTime": "2017-02-03T16:20:50.52Z",
  "idleTimeoutSeconds": 30,
  "protocol": "RTSP",
  "resolution": {
    "width": 1920,
    "height": 1080
  },
  "authorizationType": "BASIC",
  "videoCodec": "H264",
  "audioCodec": "AAC"
}

The CameraStream object includes the following properties.

| Property | Description | Type |
|---|---|---|
| uri | Identifies the camera stream. Set to an RTSP URL. For a temporary URI, specify an expiration time in the expirationTime field. If the URI expires and an error occurs, Alexa sends InitializeCameraStreams again to get a new, unexpired URI. | String |
| expirationTime | Time that the stream expires. Defined in ISO 8601 format, YYYY-MM-DDThh:mm:ssZ. | String |
| idleTimeoutSeconds | Number of seconds of inactivity after which the camera stream times out. | Integer |
| protocol | Protocol for the stream. Valid value: One of the supported protocol values. | String |
| resolution | Resolution of the stream. | Resolution object |
| authorizationType | Authorization type for accessing the stream. Valid value: One of the supported authorization type values. | String |
| videoCodec | Video codec for the stream. Valid value: One of the supported video codec values. | String |
| audioCodec | Audio codec for the stream. Valid value: One of the supported audio codec values. | String |
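
For a temporary URI, the expirationTime value has to be computed when the stream is created. The following Python sketch shows one way to assemble a CameraStream object with an expiration a few minutes in the future. The function name build_camera_stream, the ttl_seconds parameter, and the fixed protocol, authorization, and codec values are assumptions for illustration, not part of the interface.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def build_camera_stream(uri: str, width: int, height: int,
                        ttl_seconds: int = 300,
                        now: Optional[datetime] = None) -> dict:
    """Assemble a CameraStream object whose URI expires ttl_seconds from now.

    `now` must be timezone-aware; it defaults to the current UTC time.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    expires = now + timedelta(seconds=ttl_seconds)
    return {
        "uri": uri,
        # ISO 8601 in UTC, matching the YYYY-MM-DDThh:mm:ssZ pattern.
        "expirationTime": expires.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "idleTimeoutSeconds": 30,
        # Illustrative fixed values; use whatever your camera actually provides.
        "protocol": "RTSP",
        "resolution": {"width": width, "height": height},
        "authorizationType": "BASIC",
        "videoCodec": "H264",
        "audioCodec": "AAC",
    }
```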

CameraStreamConfiguration object

The CameraStreamConfiguration object represents the configurations that your camera supports. You identify the configurations in your discovery response.

The following example shows a CameraStreamConfiguration object.

{
  "protocols": ["RTSP"],
  "resolutions": [{"width":1920, "height":1080}, {"width":1280, "height":720}],
  "authorizationTypes": ["BASIC"],
  "videoCodecs": ["H264", "MPEG2"],
  "audioCodecs": ["G711"]
}

The CameraStreamConfiguration object includes the following properties.

| Property | Description | Type |
|---|---|---|
| protocols | Protocols that your endpoint supports. | Array of supported protocol strings |
| resolutions | Resolutions that your endpoint supports. | Array of Resolution objects |
| authorizationTypes | Authorization types that your endpoint supports. | Array of supported authorization type strings |
| videoCodecs | Video codecs that your endpoint supports. | Array of supported video codec strings |
| audioCodecs | Audio codecs that your endpoint supports. | Array of supported audio codec strings |
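
Before you send a discovery response, it can help to check each configuration against the supported values listed earlier. The following Python sketch is one possible check. The function name validate_stream_configuration is hypothetical, the sketch accepts only RTSP as a protocol (the protocol that the prerequisites describe), and it verifies only that each resolution has integer width and height values, without enforcing a pixel range.

```python
from typing import List

# Values copied from the "Supported ..." sections of this document.
ALLOWED_VALUES = {
    "protocols": {"RTSP"},  # This sketch accepts only RTSP.
    "authorizationTypes": {"BASIC", "DIGEST", "NONE"},
    "videoCodecs": {"H264", "MPEG2", "MJPEG", "JPG"},
    "audioCodecs": {"G711", "AAC", "NONE"},
}


def validate_stream_configuration(config: dict) -> List[str]:
    """Return a list of problems; an empty list means the configuration passed."""
    problems = []

    for field, allowed in ALLOWED_VALUES.items():
        values = config.get(field)
        if not values:
            problems.append(f"{field} is missing or empty")
            continue
        for value in values:
            if value not in allowed:
                problems.append(f"{field} has unsupported value {value!r}")

    resolutions = config.get("resolutions")
    if not resolutions:
        problems.append("resolutions is missing or empty")
    else:
        for resolution in resolutions:
            width, height = resolution.get("width"), resolution.get("height")
            if not (isinstance(width, int) and isinstance(height, int)):
                problems.append(f"resolution {resolution!r} needs integer width and height")

    return problems
```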

Resolution object

The Resolution object represents the height and width of a camera stream in pixels. The supported resolutions are 480–1080 pixels.

The Resolution object includes the following properties.

| Property | Description | Type |
|---|---|---|
| height | Height of the camera stream in pixels. | Integer |
| width | Width of the camera stream in pixels. | Integer |

Discovery

You describe endpoints that support Alexa.CameraStreamController using the standard discovery mechanism described in Alexa.Discovery.

Use CAMERA for the display category. For the full list of display categories, see display categories.

To let Alexa know the health of your device, also implement the Alexa.EndpointHealth interface.

Capabilities array

In addition to the usual discovery response fields, for the CameraStreamController entry in the capabilities array, include the following fields.

| Property | Description | Type | Required |
|---|---|---|---|
| cameraStreamConfigurations | Streaming configurations that your camera supports. | Array of CameraStreamConfiguration objects | Yes |

Discover response example

The following example shows a Discover.Response message for a security camera that supports the Alexa.CameraStreamController and Alexa.EndpointHealth interfaces.

{
  "event": {
    "header": {
      "namespace":"Alexa.Discovery",
      "name":"Discover.Response",
      "payloadVersion": "3",
      "messageId": "Unique identifier, preferably a version 4 UUID"
    },
    "payload":{
      "endpoints":[
        {
          "endpointId": "Unique ID of the endpoint",
          "manufacturerName": "Manufacturer of the endpoint",
          "description": "Description to be shown in the Alexa app",
          "friendlyName": "Front door camera",
          "displayCategories": ["CAMERA"],
          "cookie": {},
          "capabilities": [
            {
              "type": "AlexaInterface",
              "interface": "Alexa.CameraStreamController",
              "version": "3",
              "cameraStreamConfigurations" : [
                  {
                    "protocols": ["RTSP"],
                    "resolutions": [{"width":1920, "height":1080}, {"width":1280, "height":720}],
                    "authorizationTypes": ["BASIC"],
                    "videoCodecs": ["H264", "MPEG2"],
                    "audioCodecs": ["G711"]
                  },
                  {
                    "protocols": ["RTSP"],
                    "resolutions": [{"width":1920, "height":1080}, {"width":1280, "height":720}],
                    "authorizationTypes": ["NONE"],
                    "videoCodecs": ["H264"],
                    "audioCodecs": ["AAC"]
                  }
              ]
            },
            {
              "type": "AlexaInterface",
              "interface": "Alexa.EndpointHealth",
              "version": "3",
              "properties": {
                "supported": [
                  {
                    "name":"connectivity"
                  }
                ],
                "proactivelyReported": true,
                "retrievable": true
              }
            },
            {
              "type": "AlexaInterface",
              "interface": "Alexa",
              "version": "3"
            }
          ]
        }
      ]
    }
  }
}

Directives

Alexa sends the following Alexa.CameraStreamController interface directives to your skill.

InitializeCameraStreams directive

Support the InitializeCameraStreams directive so that users can ask to see the feed from their security camera.

The following examples show user utterances:

Alexa, show the front door camera.

Alexa, montre la caméra de la porte d'entrée.

Alexa, zeige die Haustür-Kamera.

Alexa, सामने के दरवाज़े का कैमरा दिखाओ

Alexa, mostra la videocamera della porta d'ingresso.

アレクサ、玄関のカメラを見せて

Alexa, mostra a câmera da porta da frente.

Alexa, muestra la cámara de la puerta principal.

InitializeCameraStreams directive example

The following example shows an InitializeCameraStreams directive that Alexa sends to your skill.

{
  "directive": {
    "header": {
      "namespace": "Alexa.CameraStreamController",
      "name": "InitializeCameraStreams",
      "messageId": "Unique version 4 UUID",
      "correlationToken": "Opaque correlation token",
      "payloadVersion": "3"
    },
    "endpoint": {
      "scope": {
        "type": "BearerToken",
        "token": "OAuth2.0 bearer token"
      },
      "endpointId": "Endpoint ID",
      "cookie": {}
    },
    "payload": {
      "cameraStreams": [
        {
          "protocol": "RTSP",
          "resolution": {"width": 1920, "height": 1080},
          "authorizationType": "BASIC",
          "videoCodec": "H264",
          "audioCodec": "AAC"
        },
        {
          "protocol": "RTSP",
          "resolution": {"width": 1280, "height": 720},
          "authorizationType": "NONE",
          "videoCodec": "MPEG2",
          "audioCodec": "G711"
        }
      ]
    }
  }
}

InitializeCameraStreams directive payload

| Field | Description | Type |
|---|---|---|
| cameraStreams | List of video and audio stream configurations to choose from. | Array of CameraStream objects |

InitializeCameraStreams response event

If you handle an InitializeCameraStreams directive successfully, respond with an Alexa.CameraStreamController.Response event. In the context object, include the values of all relevant properties.

The following example shows an Alexa.CameraStreamController response.

{
  "event": {
    "header": {
      "namespace": "Alexa.CameraStreamController",
      "name": "Response",
      "messageId": "Unique identifier, preferably a version 4 UUID",
      "correlationToken": "Opaque correlation token that matches the request",
      "payloadVersion": "3"
    },
    "endpoint": {
      "scope": {
        "type": "BearerToken",
        "token": "OAuth2.0 bearer token"
      },
      "endpointId": "Endpoint ID"
    },
    "payload": {
      "cameraStreams": [
        {
          "uri": "rtsp://username:password@link.to.video:443/feed1.mp4",
          "expirationTime": "2017-02-03T16:20:50.52Z",
          "idleTimeoutSeconds": 30,
          "protocol": "RTSP",
          "resolution": {"width": 1920, "height": 1080},
          "authorizationType": "BASIC",
          "videoCodec": "H264",
          "audioCodec": "AAC"
        }
      ],
      "imageUri": "https://example.com/image.jpg"
    }
  },
  "context": {
    "properties": [
      {
        "namespace": "Alexa.EndpointHealth",
        "name": "connectivity",
        "value": {
          "value": "OK"
        },
        "timeOfSample": "2017-02-03T16:20:50.52Z",
        "uncertaintyInMilliseconds": 0
      }
    ]
  }
}

InitializeCameraStreams response payload details

| Property | Description | Type | Required |
|---|---|---|---|
| cameraStreams | Provides information about the video and audio stream. | Array of CameraStream objects | Yes |
| imageUri | URI to a static image from a previous feed of the camera specified in the request. | String | Yes |
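
The following Python sketch shows one way a skill might turn an InitializeCameraStreams directive into a Response event. The function name build_initialize_response, the available_streams argument (CameraStream objects your camera cloud can serve right now), and the exact-match selection rule are assumptions for illustration. Your skill might choose streams differently.

```python
import uuid
from typing import List, Optional

# Fields compared between a requested configuration and an available stream.
MATCH_KEYS = ("protocol", "resolution", "authorizationType", "videoCodec", "audioCodec")


def build_initialize_response(request: dict, available_streams: List[dict],
                              image_uri: str,
                              context_properties: Optional[List[dict]] = None) -> dict:
    """Build a Response event containing the available streams that Alexa asked for."""
    directive = request["directive"]
    requested = directive["payload"]["cameraStreams"]

    # Keep every available stream that exactly matches a requested configuration.
    chosen = [
        stream for stream in available_streams
        if any(all(stream.get(key) == wanted.get(key) for key in MATCH_KEYS)
               for wanted in requested)
    ]
    if not chosen:
        # The caller is expected to send an Alexa.ErrorResponse event instead.
        raise LookupError("no available stream matches the requested configurations")

    return {
        "event": {
            "header": {
                "namespace": "Alexa.CameraStreamController",
                "name": "Response",
                "messageId": str(uuid.uuid4()),
                "correlationToken": directive["header"]["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {
                "scope": directive["endpoint"]["scope"],
                "endpointId": directive["endpoint"]["endpointId"],
            },
            "payload": {"cameraStreams": chosen, "imageUri": image_uri},
        },
        "context": {"properties": context_properties or []},
    }
```

Pass the current Alexa.EndpointHealth connectivity property in context_properties so that the context object reflects the state of the endpoint.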

InitializeCameraStreams directive error handling

If you can't handle an InitializeCameraStreams directive successfully, respond with an Alexa.ErrorResponse event. If the customer needs to configure the camera, return the NOT_SUPPORTED_IN_CURRENT_MODE error type and include the currentDeviceMode field with a value of NOT_PROVISIONED.
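
The following Python sketch shows what such an error event might look like for a camera that isn't set up yet. The function name build_not_provisioned_error and the message text are hypothetical. The header uses the Alexa namespace with the name ErrorResponse, and the other header and endpoint fields mirror the events shown elsewhere in this document. Check the Alexa.ErrorResponse reference for the authoritative field list.

```python
import uuid


def build_not_provisioned_error(request: dict) -> dict:
    """Build an ErrorResponse event for a camera the customer hasn't configured yet."""
    directive = request["directive"]
    return {
        "event": {
            "header": {
                "namespace": "Alexa",
                "name": "ErrorResponse",
                "messageId": str(uuid.uuid4()),
                "correlationToken": directive["header"]["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {"endpointId": directive["endpoint"]["endpointId"]},
            "payload": {
                "type": "NOT_SUPPORTED_IN_CURRENT_MODE",
                # Hypothetical message text, intended for logs rather than for the user.
                "message": "The camera must be set up before it can stream.",
                "currentDeviceMode": "NOT_PROVISIONED",
            },
        }
    }
```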

State reporting

Alexa sends a ReportState directive to request information about the state of an endpoint. When Alexa sends a ReportState directive, you send a StateReport event in response. The response contains the current state of all the retrievable properties in the context object. You identify your retrievable properties in your discovery response. For details about state reports, see Understand State and Change Reporting.

StateReport response example

{
  "event": {
    "header": {
      "namespace": "Alexa",
      "name": "StateReport",
      "messageId": "Unique identifier, preferably a version 4 UUID",
      "correlationToken": "Opaque correlation token that matches the request",
      "payloadVersion": "3"
    },
    "endpoint": {
      "scope": {
        "type": "BearerToken",
        "token": "OAuth2.0 bearer token"
      },
      "endpointId": "Endpoint ID"
    },
    "payload": {}
  },
  "context": {
    "properties": [
      {
        "namespace": "Alexa.EndpointHealth",
        "name": "connectivity",
        "value": {
          "value": "OK"
        },
        "timeOfSample": "2017-02-03T16:20:50.52Z",
        "uncertaintyInMilliseconds": 0
      }
    ]
  }
}

Change reporting

You send a ChangeReport event to report changes proactively in the state of an endpoint. You identify the properties that you proactively report in your discovery response. For details about change reports, see Understand State and Change Reporting.

ChangeReport event example

{
  "event": {
    "header": {
      "namespace": "Alexa",
      "name": "ChangeReport",
      "messageId": "Unique identifier, preferably a version 4 UUID",
      "payloadVersion": "3"
    },
    "endpoint": {
      "scope": {
        "type": "BearerToken",
        "token": "OAuth2.0 bearer token"
      },
      "endpointId": "Endpoint ID"
    },
    "payload": {
      "change": {
        "cause": {
          "type": "PERIODIC_POLL"
        },
        "properties": [
          {
            "namespace": "Alexa.EndpointHealth",
            "name": "connectivity",
            "value": {
              "value": "UNREACHABLE"
            },
            "timeOfSample": "2017-02-03T16:20:50.52Z",
            "uncertaintyInMilliseconds": 0
          }
        ]
      }
    }
  },
  "context": {
    "properties": [
    ]
  }
}
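
The following Python sketch builds a ChangeReport event like the preceding example when a camera's connectivity changes. The function name build_connectivity_change_report and its parameters are assumptions for illustration. The sketch only assembles the message and doesn't send it to Alexa.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def build_connectivity_change_report(endpoint_id: str, bearer_token: str,
                                     connectivity: str,
                                     cause_type: str = "PERIODIC_POLL",
                                     sampled_at: Optional[datetime] = None) -> dict:
    """Build a ChangeReport event for the Alexa.EndpointHealth connectivity property.

    `sampled_at` must be timezone-aware; it defaults to the current UTC time.
    """
    sampled_at = (sampled_at or datetime.now(timezone.utc)).astimezone(timezone.utc)
    changed_property = {
        "namespace": "Alexa.EndpointHealth",
        "name": "connectivity",
        "value": {"value": connectivity},  # For example "OK" or "UNREACHABLE".
        "timeOfSample": sampled_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "uncertaintyInMilliseconds": 0,
    }
    return {
        "event": {
            "header": {
                "namespace": "Alexa",
                "name": "ChangeReport",
                "messageId": str(uuid.uuid4()),
                "payloadVersion": "3",
            },
            "endpoint": {
                "scope": {"type": "BearerToken", "token": bearer_token},
                "endpointId": endpoint_id,
            },
            "payload": {
                "change": {
                    "cause": {"type": cause_type},
                    "properties": [changed_property],
                }
            },
        },
        # No unchanged properties to report in this sketch.
        "context": {"properties": []},
    }
```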

Last updated: Jan 26, 2024