Alexa.CameraStreamController Interface

Implement the Alexa.CameraStreamController interface in your Alexa skill so that users can view the feed from their security cameras. If your security camera device is capable of real-time communication (RTC), implement the Alexa.RTCSessionController interface instead. For more information about security skills, see Smart Home Security Overview.

If you store media recordings from your camera so that customers can view past recordings, also implement the MediaMetadata interface.

For the list of languages that the CameraStreamController interface supports, see List of Alexa Interfaces and Supported Languages.

Utterances

When you use the Alexa.CameraStreamController interface, the voice interaction model is already built for you. The following examples show some customer utterances:

Alexa, show the front door camera.

Alexa, zeige die frontkamera.

After the customer says one of these utterances, Alexa sends a corresponding directive to your skill.

Overview

You can build an Alexa skill for cloud-enabled cameras that stream video and audio by using the Real Time Streaming Protocol (RTSP).

Prerequisites and SLA requirements

Low latency is critical to an optimal user experience. To use the CameraStreamController API, you need the following:

  • RTSP + RTP streaming protocol.

  • Interleaved TCP on port 443 (for both RTP and RTSP).

  • TCP socket encryption on port 443 using TLS 1.2.

  • The RTSP commands DESCRIBE, SETUP, PLAY, and TEARDOWN are required, although we recommend a fully RFC-compliant implementation.

  • All RTSP URI responses must occur within six seconds from when the request is received.

  • Your Lambda skill must respond to requests within six seconds. For the best user experience, respond within one second. Operations like waking a camera to begin streaming should be done asynchronously as background tasks.

  • Network conditions should dictate the frame rate. If network conditions are good, stream at a higher frame rate. If network conditions are poor, stream at a lower frame rate. Lower frame rates might introduce motion blur, but still allow streaming to occur.

  • Under good network conditions, the first frame should render on a device within six seconds from when the TLS handshake completes. Optimize startup latency by adjusting key frame rates and buffer times of the stream.
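As a rough illustration of the transport requirements above, the following Python sketch checks that a stream URI uses the rtsp scheme and names port 443 explicitly. The function name check_rtsp_uri is hypothetical. The check covers only what can be read from the URI itself; it doesn't verify TLS 1.2, interleaved TCP, or response times.

```python
from urllib.parse import urlsplit

REQUIRED_PORT = 443  # taken from the prerequisites list above


def check_rtsp_uri(uri):
    """Return a list of problems found in a stream URI (empty if none)."""
    problems = []
    parts = urlsplit(uri)
    if parts.scheme.lower() != "rtsp":
        problems.append(f"expected scheme 'rtsp', got '{parts.scheme}'")
    if not parts.hostname:
        problems.append("missing host")
    try:
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        problems.append("port is not a valid number")
    else:
        # An absent port is reported too, because it would not imply 443.
        if port != REQUIRED_PORT:
            problems.append(f"expected port {REQUIRED_PORT}, got {port}")
    return problems
```

A skill could run a check like this before returning a URI, so that a misconfigured camera is caught in the skill rather than on the customer's device.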

Local and remote execution recommendations

You can return a local URI on the same network as your device, or you can return a remote URI accessible from anywhere with an Internet connection. You should return a URI that makes the most sense for your device cloud configuration. Whether you return a local or remote URI, you must meet all requirements including the use of TLS 1.2.

In general, a URI isn't reachable both locally and remotely by default. You can make the URI accessible locally and remotely through domain purchasing or port forwarding. These solutions are technically challenging, so provide them only if your customers need both local and remote URI access.

Supported streaming protocols

You can use the following supported protocol values: RTSP, HLS.

Supported authorization types

You can use the following supported authorization type values: BASIC, DIGEST, NONE.

Supported video codecs

You can use the following supported video codec values: H264, MPEG2, MJPEG, JPG.

Supported audio codecs

You can use the following supported audio codec values: G711, AAC, NONE.

Supported resolutions

The supported resolutions range from 480p to 1080p.

Properties and objects

The cameraStream object

The Alexa.CameraStreamController interface uses the cameraStream object to represent a camera stream.

CameraStream object example

{
  "uri": "rtsp://username:password@link.to.video:443/feed1.mp4",
  "expirationTime": "2017-02-03T16:20:50.52Z",
  "idleTimeoutSeconds": 30,
  "protocol": "RTSP",
  "resolution": {
    "width": 1920,
    "height": 1080
  },
  "authorizationType": "BASIC",
  "videoCodec": "H264",
  "audioCodec": "AAC"
}

CameraStream object details

| Field | Description | Type |
|---|---|---|
| uri | The URI for the camera stream. For a temporary URI, specify an expiration time in the expirationTime field. If the URI expires, and an error occurs, Alexa calls InitializeCameraStreams again to get a new, unexpired URI. | String |
| expirationTime | The time that the stream expires, specified in UTC. | A string in ISO 8601 format, YYYY-MM-DDThh:mm:ssZ. |
| idleTimeoutSeconds | The number of seconds of inactivity after which the camera stream times out. | Integer |
| protocol | The protocol for the stream; one of the supported protocol values. | String |
| resolution | The resolution of the stream. | A resolution object. |
| authorizationType | The authorization type for accessing the stream; one of the supported authorization type values. | String |
| videoCodec | The video codec for the stream; one of the supported video codec values. | String |
| audioCodec | The audio codec for the stream; one of the supported audio codec values. | String |
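The following Python sketch shows one way to assemble a cameraStream object with an expirationTime in the ISO 8601 UTC format described above. The helper name make_camera_stream and its ttl_seconds and now parameters are hypothetical. The fixed values for idleTimeoutSeconds, protocol, authorizationType, and the codecs are copied from the example object and would differ per camera.

```python
from datetime import datetime, timedelta, timezone


def make_camera_stream(uri, width, height, ttl_seconds=300, now=None):
    """Build a cameraStream dict whose URI expires ttl_seconds from now."""
    # `now` is injectable so the result can be tested deterministically.
    now = now or datetime.now(timezone.utc)
    expires = now.astimezone(timezone.utc) + timedelta(seconds=ttl_seconds)
    return {
        "uri": uri,
        # ISO 8601 in UTC, YYYY-MM-DDThh:mm:ssZ
        "expirationTime": expires.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "idleTimeoutSeconds": 30,      # assumed value, from the example
        "protocol": "RTSP",
        "resolution": {"width": width, "height": height},
        "authorizationType": "BASIC",  # assumed value, from the example
        "videoCodec": "H264",
        "audioCodec": "AAC",
    }
```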

The cameraStreamConfiguration object

The cameraStreamConfiguration object represents the configurations that your camera supports. You identify the configurations that your camera supports in your discovery response.

CameraStreamConfiguration object example

{
  "protocols": ["RTSP"],
  "resolutions": [{"width":1920, "height":1080}, {"width":1280, "height":720}],
  "authorizationTypes": ["BASIC"],
  "videoCodecs": ["H264", "MPEG2"],
  "audioCodecs": ["G711"]
}

CameraStreamConfiguration object details

| Field | Description | Type |
|---|---|---|
| protocols | The protocols that you support. An array of supported protocol values. | Array |
| resolutions | The resolutions that you support. An array of resolution objects. | Array |
| authorizationTypes | The authorization types that you support. An array of supported authorization type values. | Array |
| videoCodecs | The video codecs that you support. An array of supported video codec values. | Array |
| audioCodecs | The audio codecs that you support. An array of supported audio codec values. | Array |
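A configuration can be checked against the supported values listed earlier before it goes into a discovery response. The Python sketch below is one possible check; the function name find_unsupported is hypothetical, and reading "480p to 1080p" as a height between 480 and 1080 pixels is an assumption.

```python
# Supported values as listed in the sections above.
SUPPORTED = {
    "protocols": {"RTSP", "HLS"},
    "authorizationTypes": {"BASIC", "DIGEST", "NONE"},
    "videoCodecs": {"H264", "MPEG2", "MJPEG", "JPG"},
    "audioCodecs": {"G711", "AAC", "NONE"},
}


def find_unsupported(config):
    """Return descriptions of values in a cameraStreamConfiguration
    that are not in the supported lists (empty list if all are fine)."""
    problems = []
    for field, allowed in SUPPORTED.items():
        for value in config.get(field, []):
            if value not in allowed:
                problems.append(f"{field}: {value}")
    for res in config.get("resolutions", []):
        # Assumption: 480p to 1080p means 480 <= height <= 1080.
        if not 480 <= res["height"] <= 1080:
            problems.append(f"resolutions: {res['width']}x{res['height']}")
    return problems
```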

The resolution object

The resolution object represents the height and width of a camera stream.

Resolution object example

{
  "height": 720,
  "width": 1280
}

Discovery

You describe endpoints that support Alexa.CameraStreamController using the standard discovery mechanism described in Alexa.Discovery.

Use CAMERA for the display category. For the full list of display categories, see display categories.

In addition to the usual discovery response fields, for the CameraStreamController entry in the capabilities array, include the following fields.

| Field | Description | Type |
|---|---|---|
| cameraStreamConfigurations | The configurations that your camera supports. | An array of cameraStreamConfiguration objects. |

Discover response example

The following example shows a Discover.Response message for a security camera that supports the Alexa.CameraStreamController, MediaMetadata, and EndpointHealth interfaces.

{
  "event": {
    "header": {
      "namespace":"Alexa.Discovery",
      "name":"Discover.Response",
      "payloadVersion": "3",
      "messageId": "<message id>"
    },
    "payload":{
      "endpoints":[
        {
          "endpointId": "<unique ID of the endpoint>",
          "manufacturerName": "<the manufacturer name of the endpoint>",
          "description": "<a description that appears in the Alexa app>",
          "friendlyName": "Front door camera",
          "displayCategories": ["CAMERA"],
          "cookie": {},
          "capabilities": [
            {
              "type": "AlexaInterface",
              "interface": "Alexa.CameraStreamController",
              "version": "3",
              "cameraStreamConfigurations" : [
                  {
                    "protocols": ["RTSP"],
                    "resolutions": [{"width":1920, "height":1080}, {"width":1280, "height":720}],
                    "authorizationTypes": ["BASIC"],
                    "videoCodecs": ["H264", "MPEG2"],
                    "audioCodecs": ["G711"]
                  },
                  {
                    "protocols": ["RTSP"],
                    "resolutions": [{"width":1920, "height":1080}, {"width":1280, "height":720}],
                    "authorizationTypes": ["NONE"],
                    "videoCodecs": ["H264"],
                    "audioCodecs": ["AAC"]
                  }
              ]
            },
            {
              "type": "AlexaInterface",
              "interface": "Alexa.MediaMetadata",
              "version": "3",
              "proactivelyReported": true
            },
            {
              "type": "AlexaInterface",
              "interface": "Alexa.EndpointHealth",
              "version": "3",
              "properties": {
                "supported": [
                  {
                    "name":"connectivity"
                  }
                ],
                "proactivelyReported": true,
                "retrievable": true
              }
            },
            {
              "type": "AlexaInterface",
              "interface": "Alexa",
              "version": "3"
            }
          ]
        }
      ]
    }
  }
}
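In a skill that generates its discovery response in code, the CameraStreamController entry of the capabilities array might be produced as in the following Python sketch. The function name camera_capability is hypothetical; the field names and values mirror the example above.

```python
def camera_capability(configurations):
    """Build the CameraStreamController entry for the capabilities array."""
    return {
        "type": "AlexaInterface",
        "interface": "Alexa.CameraStreamController",
        "version": "3",
        # One cameraStreamConfiguration object per supported combination.
        "cameraStreamConfigurations": list(configurations),
    }


capability = camera_capability([
    {
        "protocols": ["RTSP"],
        "resolutions": [{"width": 1280, "height": 720}],
        "authorizationTypes": ["NONE"],
        "videoCodecs": ["H264"],
        "audioCodecs": ["AAC"],
    }
])
```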

Directives

InitializeCameraStreams directive

Support the InitializeCameraStreams directive so that users can ask to see the feed from their security camera.

The following examples show user utterances:

Alexa, show the front door camera.

Alexa, zeige die frontkamera.

InitializeCameraStreams directive payload details

| Field | Description | Type |
|---|---|---|
| cameraStreams | An array of cameraStream objects that provide information about the stream. | Array |

InitializeCameraStreams directive example

The following example illustrates an InitializeCameraStreams directive that Alexa sends to your skill.

{
  "directive": {
    "header": {
      "namespace": "Alexa.CameraStreamController",
      "name": "InitializeCameraStreams",
      "messageId": "<message id>",
      "correlationToken": "<an opaque correlation token>",
      "payloadVersion": "3"
    },
    "endpoint": {
      "endpointId": "<endpoint id>",
      "cookie": {}
    },
    "payload": {
      "cameraStreams": [
        {
          "protocol": "RTSP",
          "resolution": {"width": 1920, "height": 1080},
          "authorizationType": "BASIC",
          "videoCodec": "H264",
          "audioCodec": "AAC"
        },
        {
          "protocol": "RTSP",
          "resolution": {"width": 1280, "height": 720},
          "authorizationType": "NONE",
          "videoCodec": "MPEG2",
          "audioCodec": "G711"
        }
      ]
    }
  }
}

InitializeCameraStreams response event

If you handle an InitializeCameraStreams directive successfully, respond with an Alexa.CameraStreamController.Response event. In the context object, include the values of all relevant properties.

Alexa.CameraStreamController.Response event payload details

| Field | Description | Type | Required |
|---|---|---|---|
| cameraStreams | An array of cameraStream objects that provide information about the stream. | Array | Yes |
| imageUri | The URI to a static image from a previous feed of the camera specified in the request. | String | Yes |

Alexa.CameraStreamController.Response event example

{
  "event": {
    "header": {
      "namespace": "Alexa.CameraStreamController",
      "name": "Response",
      "messageId": "<message id>",
      "correlationToken": "<an opaque correlation token>",
      "payloadVersion": "3"
    },
    "endpoint": {
      "endpointId": "<endpoint id>"
    },
    "payload": {
      "cameraStreams": [
        {
          "uri": "rtsp://username:password@link.to.video:443/feed1.mp4",
          "expirationTime": "2017-02-03T16:20:50.52Z",
          "idleTimeoutSeconds": 30,
          "protocol": "RTSP",
          "resolution": {"width": 1920, "height": 1080},
          "authorizationType": "BASIC",
          "videoCodec": "H264",
          "audioCodec": "AAC"
        }
      ],
      "imageUri": "https://username:password@link.to.image/image.jpg"
    }
  },
  "context": {
    "properties": [
      {
        "namespace": "Alexa.EndpointHealth",
        "name": "connectivity",
        "value": {
          "value": "OK"
        },
        "timeOfSample": "2017-02-03T16:20:50.52Z",
        "uncertaintyInMilliseconds": 0
      }
    ]
  }
}
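The request and response shapes above can be tied together in a handler. The Python sketch below is one plausible approach, not a prescribed one: it picks the first stream the camera can serve that matches a requested combination and wraps it in a Response event. The helper names pick_stream and build_response are hypothetical, and exact matching on every field is an assumption about how a skill might choose a stream.

```python
import uuid

MATCH_KEYS = ("protocol", "resolution", "authorizationType",
              "videoCodec", "audioCodec")


def pick_stream(requested, available):
    """Return the first available stream that matches a requested one,
    or None if nothing matches."""
    for want in requested:
        for have in available:
            if all(have.get(k) == want.get(k) for k in MATCH_KEYS):
                return have
    return None


def build_response(directive, streams, image_uri, context_properties=()):
    """Wrap cameraStream objects in a CameraStreamController Response."""
    request = directive["directive"]
    return {
        "event": {
            "header": {
                "namespace": "Alexa.CameraStreamController",
                "name": "Response",
                "messageId": str(uuid.uuid4()),
                # Echo the token from the directive being answered.
                "correlationToken": request["header"]["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {"endpointId": request["endpoint"]["endpointId"]},
            "payload": {"cameraStreams": streams, "imageUri": image_uri},
        },
        # A real skill would pass property values here, for example
        # the EndpointHealth connectivity property.
        "context": {"properties": list(context_properties)},
    }
```

If pick_stream returns None, the skill has no stream to offer and would respond with an error event rather than an empty cameraStreams array.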

InitializeCameraStreams directive error handling

If you can't handle an InitializeCameraStreams directive successfully, respond with an Alexa.ErrorResponse event. If the customer needs to configure the camera, return the NOT_SUPPORTED_IN_CURRENT_MODE error type and include the currentDeviceMode field with a value of NOT_PROVISIONED.
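The Python sketch below builds such an error event for a camera that still needs to be set up. The function name not_provisioned_error and the default message text are hypothetical. The envelope (namespace Alexa, name ErrorResponse, and a payload with type and message) follows the general Alexa.ErrorResponse shape, which this page doesn't show, so check it against that interface's reference.

```python
import uuid


def not_provisioned_error(directive,
                          message="The camera needs to be set up."):
    """Build an ErrorResponse for a camera the customer must configure."""
    request = directive["directive"]
    return {
        "event": {
            "header": {
                "namespace": "Alexa",
                "name": "ErrorResponse",
                "messageId": str(uuid.uuid4()),
                "correlationToken": request["header"]["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {"endpointId": request["endpoint"]["endpointId"]},
            "payload": {
                "type": "NOT_SUPPORTED_IN_CURRENT_MODE",
                "message": message,
                "currentDeviceMode": "NOT_PROVISIONED",
            },
        }
    }
```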

State reporting

Alexa sends a ReportState directive to request information about the state of an endpoint. When Alexa sends a ReportState directive, you send a StateReport event in response. The response contains the current state of all of the retrievable properties in the context object. You identify your retrievable properties in your discovery response. For more information about state reports, see Understand State Reporting.

StateReport response event example

{
  "event": {
    "header": {
      "namespace": "Alexa",
      "name": "StateReport",
      "messageId": "<message id>",
      "correlationToken": "<an opaque correlation token>",
      "payloadVersion": "3"
    },
    "endpoint": {
      "endpointId": "<endpoint id>"
    },
    "payload": {}
  },
  "context": {
    "properties": [
      {
        "namespace": "Alexa.EndpointHealth",
        "name": "connectivity",
        "value": {
          "value": "OK"
        },
        "timeOfSample": "2017-02-03T16:20:50.52Z",
        "uncertaintyInMilliseconds": 0
      }
    ]
  }
}
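A StateReport like the one above could be assembled as in the following Python sketch. The helper names connectivity_property and state_report are hypothetical, and uncertaintyInMilliseconds is fixed at 0 only because the example uses that value.

```python
import uuid
from datetime import datetime, timezone


def connectivity_property(value, sampled_at):
    """Build an EndpointHealth connectivity property for the context."""
    return {
        "namespace": "Alexa.EndpointHealth",
        "name": "connectivity",
        "value": {"value": value},
        "timeOfSample": sampled_at.astimezone(timezone.utc)
                                  .strftime("%Y-%m-%dT%H:%M:%SZ"),
        "uncertaintyInMilliseconds": 0,  # assumed, as in the example
    }


def state_report(endpoint_id, correlation_token, properties):
    """Build a StateReport carrying the retrievable properties."""
    return {
        "event": {
            "header": {
                "namespace": "Alexa",
                "name": "StateReport",
                "messageId": str(uuid.uuid4()),
                "correlationToken": correlation_token,
                "payloadVersion": "3",
            },
            "endpoint": {"endpointId": endpoint_id},
            "payload": {},
        },
        "context": {"properties": list(properties)},
    }
```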

Change reporting

You send a ChangeReport event to proactively report changes in the state of an endpoint. You identify the properties that you proactively report in your discovery response. For more information about change reports, see Understand State Reporting.

ChangeReport event example

{
  "event": {
    "header": {
      "namespace": "Alexa",
      "name": "ChangeReport",
      "messageId": "<message id>",
      "payloadVersion": "3"
    },
    "endpoint": {
      "scope": {
        "type": "BearerToken",
        "token": "<an OAuth2 bearer token>"
      },
      "endpointId": "<endpoint id>"
    },
    "payload": {
      "change": {
        "cause": {
          "type": "PERIODIC_POLL"
        },
        "properties": [
          {
            "namespace": "Alexa.EndpointHealth",
            "name": "connectivity",
            "value": {
              "value": "UNREACHABLE"
            },
            "timeOfSample": "2017-02-03T16:20:50.52Z",
            "uncertaintyInMilliseconds": 0
          }
        ]
      }
    }
  },
  "context": {
    "properties": [
    ]
  }
}
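The ChangeReport structure above differs from a StateReport in two ways visible in the example: the endpoint carries a bearer-token scope, and the changed properties sit under payload.change. The Python sketch below mirrors that layout; the function name change_report and its parameters are hypothetical.

```python
import uuid


def change_report(endpoint_id, bearer_token, changed, unchanged=(),
                  cause="PERIODIC_POLL"):
    """Build a ChangeReport event.

    changed   -- property objects whose values changed
    unchanged -- property objects reported in the context object
    """
    return {
        "event": {
            "header": {
                "namespace": "Alexa",
                "name": "ChangeReport",
                "messageId": str(uuid.uuid4()),
                "payloadVersion": "3",
            },
            "endpoint": {
                "scope": {"type": "BearerToken", "token": bearer_token},
                "endpointId": endpoint_id,
            },
            "payload": {
                "change": {
                    "cause": {"type": cause},
                    "properties": list(changed),
                }
            },
        },
        "context": {"properties": list(unchanged)},
    }
```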