Are you a skill developer looking to add visuals to your skill? Does your skill already implement APL, but you want to enhance it? If yes, please continue reading.
This blog post will walk you through the creation of a fully-fledged slideshow project using APL. During the process I will be explaining how to implement data-binding, how to declare and use transformers (required for SpeakList and SpeakItem commands), and how to create custom APL components.
By the end of this blog post, you will have an APL document that will generate a dynamic slideshow with different pages according to the content of the following sample input:
[
  {
    "caption": "First Caption",
    "image": "https://myServer/image1.png"
  },
  {
    "caption": "Second Caption",
    "image": "https://myServer/image2.png"
  },
  {
    "caption": "Third Caption",
    "image": "https://myServer/image3.png"
  }
]
Let's start!
To build our layout, the first thing to do is to open the APL Authoring Tool.
Select "Start from scratch" and you will see a page like this:
As the base of our document, add a Container by pressing the + button, assign it an id, set its width to 100vw and its height to 100vh.
vw and vh refer to percentages of the device viewport's width and height, respectively: 100vw spans the full width of the screen, and 100vh the full height.
Then, it is time to add our core component: click on the base Container you have just added, then press the + button to add a Pager right below it. Again, assign it an id and its width / height dimensions.
Repeat this step and add another Container under the Pager: this will hold the pages of our slideshow.
The current component hierarchy should look like this:
Keep in mind that the Pager will generate a page for every child component we add under it.
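If you open the APL tab at this point, the mainTemplate should contain roughly the following structure (mainPager and myPageContainer are the ids used throughout this post; the id of the base Container is up to you, slideshowRoot below is just a placeholder):
{
  "type": "Container",
  "id": "slideshowRoot",
  "width": "100vw",
  "height": "100vh",
  "items": [
    {
      "type": "Pager",
      "id": "mainPager",
      "width": "100vw",
      "height": "100vh",
      "items": [
        {
          "type": "Container",
          "id": "myPageContainer",
          "width": "100vw",
          "height": "100vh"
        }
      ]
    }
  ]
}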
Our project requires that a slideshow page has a picture and a caption, so let's add them! Go to the last Container you added, press the + button to add an Image component, and then a Text component. They should both be under the Container.
Since we want our page to take 100% of the screen, we will reserve 90% of it for the Image, and the remaining 10% of it for the Text.
So make sure you set the Image's width to 100vw and its height to 90vh, and the Text's width to 100vw and its height to 10vh.
Data-binding is the process of filling the data property of a native parent component (the Pager in our case) with an input array. This defines two things: how many child components are generated, and which portion of the data each child can access.
Example: If our data array contains 3 elements, the first child of the Pager will be shown 3 times. In our particular case, 3 pages will be generated.
Click on the Pager in the Authoring Tool, and fill its data property with ${payload.data.properties.values}. You will see that this refers to the content of the datasources shown below.
Let's test it! The current JSON representation of the Pager in our document should look like this:
{
  "type": "Pager",
  "id": "mainPager",
  "width": "100vw",
  "height": "100vh",
  "data": "${payload.data.properties.values}",
  "items": [
    {
      "type": "Container",
      "id": "myPageContainer",
      "width": "100vw",
      "height": "100vh",
      "items": [
        {
          "type": "Image",
          "width": "100vw",
          "height": "90vh",
          "source": "https://public-us-east-1.s3.amazonaws.com/amazon-logo.png"
        },
        {
          "type": "Text",
          "width": "100vw",
          "height": "10vh",
          "text": "This is a page"
        }
      ]
    }
  ]
}
Now, consider the following datasources:
{
  "data": {
    "properties": {
      "values": ["amazon", "alexa", "skill"]
    }
  }
}
Copy them into the "DATA" section of the Authoring Tool, and send the layout to a device with the View on... button (or follow the "Render the document from a skill" step at the end of the post).
You will see that the document contains 3 pages, even though only one Container (myPageContainer) is present. This is because the length of the array ${payload.data.properties.values} is 3.
Having trouble pasting? Download the full importable layout from here
The length of the array determines the number of generated children, but how do we access the actual data and display it to the user?
The input array is accessible from all the children of the component you bound your data to, with the following expression (look at the text property):
{
  "type": "Text",
  "width": "100vw",
  "height": "10vh",
  "text": "${data}"
}
What should I expect after setting it?
Every page of the Pager will access its relative portion of data, based on its position. Refer to the following table for further details:
| page | assigned array element | resulting text value |
|------|------------------------|----------------------|
| 0 | data[0] | "amazon" |
| 1 | data[1] | "alexa" |
| 2 | data[2] | "skill" |
You can send the layout back to the device after setting the text property to ${data} to see that every page is now showing a different caption.
Now that we have explained how data-binding works, it is time to fill our datasources with meaningful data for our project.
Go back to the Authoring Tool, and replace the whole datasources object with the following JSON:
{
  "data": {
    "properties": {
      "values": [
        {
          "name": "Page 1",
          "caption": "First Caption",
          "image": "https://myServer/image1.png"
        },
        {
          "name": "Page 2",
          "caption": "Second Caption",
          "image": "https://myServer/image2.png"
        },
        {
          "name": "Page 3",
          "caption": "Third Caption",
          "image": "https://myServer/image3.png"
        }
      ]
    },
    "transformers": [
      {
        "inputPath": "values[*].caption",
        "outputName": "transformerOutput",
        "transformer": "textToSpeech"
      }
    ]
  }
}
We have just pasted the sample input from the introduction into the data.properties.values array and added a textToSpeech transformer in our datasources object.
What is a transformer? And why is it required for the SpeakItem command?
In a nutshell, transformers are used to convert data present in the datasources into alternative representations. Since our goal is to read the captions out loud, and they are provided as plain strings, we need a textToSpeech transformer.
Implementing a transformer is also a requirement because, to let Alexa read the captions, we will be using the SpeakItem command. This command relies on the speech property of components, which can only be set with either an audio URL or the output of a transformer.
Let's have a closer look on how the transformer has been implemented here:
| property | value | notes |
|----------|-------|-------|
| inputPath | values[*].caption | contains the input data to transform |
| outputName | transformerOutput | output name needed for the speech property |
| transformer | textToSpeech | type of transformer used |
Keep all this information in mind, since it is required for the next steps.
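To make this more concrete: once the document is rendered, the transformer stores its output under transformerOutput inside each element it processed, so a component's speech property can reference it with an expression like the one in this sketch (we will wire this up properly inside the custom layout in the next step):
{
  "type": "Text",
  "text": "First Caption",
  "speech": "${payload.data.properties.values[0].transformerOutput}"
}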
So far our page is made of a Container, an Image, and a Text component; the whole component hierarchy should look like the following at this stage:
The next step is to turn these three components into a single custom one.
Go on the APL tab of the Authoring Tool, locate the layouts key, and replace it with the following:
"layouts": {
"myPage": {
"parameters": [
"internalIndex",
"internalCaption",
"internalImageUrl"
],
"item": {
"type": "Container",
"id": "myPageContainer",
"width": "100vw",
"height": "100vh",
"items": [
{
"type": "Image",
"width": "100vw",
"height": "90vh",
"source": "${internalImageUrl}"
},
{
"type": "Text",
"speech": "${payload.data.properties.values[internalIndex].transformerOutput}",
"id": "pageText_${internalIndex}",
"width": "100vw",
"height": "10vh",
"text": "${internalCaption}"
}
]
}
}
}
We have just declared a custom layout called myPage, moved myPageContainer under it, and declared 3 parameters (internalIndex, internalCaption, and internalImageUrl).
Are you wondering how to do this manually from the Authoring Tool? Click here
As you might have guessed from their names, these parameters are private and can only be accessed from within the internal components of myPage. In fact, they are used to set some properties of the internal Image and Text components.
Let's focus on some of the Text properties, which are needed for the whole SpeakItem mechanism to work:
| property | value | notes |
|----------|-------|-------|
| id | pageText_${internalIndex} | pageText_0, pageText_1, and so on, based on the input array length |
| speech | ${payload.data.properties.values[internalIndex].transformerOutput} | references the output of the textToSpeech transformer |
From now on, it is possible to add a new component called myPage, which will only expose the 3 parameters mentioned above. Let's do it!
Delete the second Container of the document, click on the Pager, press the + button and add our new myPage component.
Set all the properties as follows:
| property | value | notes |
|----------|-------|-------|
| internalIndex | ${index} | index of the current page, inherited from the input array |
| internalCaption | ${data.caption} | caption property of the input array |
| internalImageUrl | ${data.image} | image property of the input array |
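If you prefer working in the APL tab directly, the Pager's items array should now contain a single myPage child with those three parameters bound roughly like this:
"items": [
  {
    "type": "myPage",
    "internalIndex": "${index}",
    "internalCaption": "${data.caption}",
    "internalImageUrl": "${data.image}"
  }
]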
At this point, our component hierarchy should look like this:
Now that our custom component is all set, let's proceed to the final step.
We want our project to read the caption after every swipe, so we need to declare the action that the Pager executes on every page change.
To accomplish that, we are going to declare that the onPageChanged event will execute the SpeakItem command against the current page.
Click on the APL tab of the Authoring Tool, and add the following property to the Pager:
"onPageChanged":
[
{
"type": "SpeakItem",
"componentId": "pageText_${event.source.value}"
}
]
This means that every time the page changes, the SpeakItem command will be executed against the Text component inside myPage, and since ${event.source.value} contains the index of the page after the swipe, Alexa will always read the caption that is displayed on the screen at that time.
More information about the onPageChanged event can be found here.
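Putting all the pieces together, the Pager in the final document should look roughly like this (assembled from the snippets above):
{
  "type": "Pager",
  "id": "mainPager",
  "width": "100vw",
  "height": "100vh",
  "data": "${payload.data.properties.values}",
  "onPageChanged": [
    {
      "type": "SpeakItem",
      "componentId": "pageText_${event.source.value}"
    }
  ],
  "items": [
    {
      "type": "myPage",
      "internalIndex": "${index}",
      "internalCaption": "${data.caption}",
      "internalImageUrl": "${data.image}"
    }
  ]
}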
Make sure that your input array contains all the images and captions you want to show, and send the document to the device!
You will see that with every swipe, the screen changes the picture and Alexa reads the caption out loud.
Make sure to export the document from the Authoring Tool by pressing the download button in the upper-right corner, and make the file available to your backend.
From the endpoint code, send the Alexa.Presentation.APL.RenderDocument directive referencing the file you just downloaded:
Node.js:
// from the LaunchRequest handler:
let speakOutput = 'Here is your slideshow!';
// the exported file contains both the document and the datasources
let aplDocument = require('./mySlideshow.json');
return handlerInput.responseBuilder
    .speak(speakOutput)
    .addDirective({
        type: 'Alexa.Presentation.APL.RenderDocument',
        token: 'documentToken',
        document: aplDocument.document,
        datasources: aplDocument.datasources,
    })
    .getResponse();
Python:
# imports needed by the helper and the directive (ASK SDK for Python):
import json
from typing import Dict, Any
from ask_sdk_model.interfaces.alexa.presentation.apl import RenderDocumentDirective

# function declaration:
def _load_apl_document(file_path):
    # type: (str) -> Dict[str, Any]
    """Load the APL JSON document at the path into a dict object."""
    with open(file_path) as f:
        return json.load(f)

# from the LaunchRequest handler:
speakOutput = 'Here is your slideshow!'
aplDocument = _load_apl_document('./mySlideshow.json')
handler_input.response_builder.speak(speakOutput).add_directive(
    RenderDocumentDirective(
        token='documentToken',
        document=aplDocument['document'],
        datasources=aplDocument['datasources']
    )
)
return handler_input.response_builder.response
If you are looking for additional support, post your question on the Alexa Developer Forums, or contact us. Also feel free to reach out to me on Twitter at @ugaetano_.
Other Links: