Alexa Web API debugging and performance tips

Staff Writer Mar 29, 2023
Alexa Skills

In a previous blog post we saw that the Alexa Web API for Games is a great way to build rich games for Alexa, but the wide open versatility of creating a web app to run your game comes with a lot of open questions. In this post we’ll look at a few common debugging and performance related ones.

How do I see my console logging?

There are three broad categories of logs we’ll divide this question into: game, Alexa, and metrics logging.

When developing your game, you’ll get the best iteration time by running your game in a browser on your development machine, emulating the input from Alexa. This way, reloads after a tweak will be immediate. In this case rely on the tools in your web browser, including its debugger and console, to inspect the game state and do things like debug state transitions, presentation timing, and so on. The web view on most Alexa devices is based on Chromium, so you can expect any similar desktop browser to behave the same way.

When debugging your Alexa related code, or looking for performance bottlenecks, you’ll need to be running on a target device. The simplest way to get logging in that situation is to run your skill on a Fire TV, and then connect to it using the ADB tool. On Echo Show devices, you cannot connect directly. On all devices, you may still want to mix and match several logging strategies:

  1. Add a div on top of your HTML application, and then append critical events to it. This is very effective for quickly visualizing messages sent to and from your endpoint, or to surface error conditions.

  2. Add a “heads up display” div on top of your HTML to monitor constantly changing data. A good example of this is the stats.js library that provides a mini HUD displaying framerate data.

  3. Use a socket based JavaScript debugging tool that can echo your console over the internet. This can be handy if you need to inspect a large amount of logging from a specific device. It also gives you the ability to execute new code in the running app, making it possible to experiment live.

  4. Echo your logging to a cloud service for later inspection. You can take advantage of the Alexa SDK’s secure connection to your skill endpoint to send your logs as a message, and then log them there as appropriate. This is useful if you need to collect detailed logging from a set of devices over time, e.g. if you’re collecting performance data while beta testing.

In this interactive fiction skill, a debug div is enabled/disabled by tapping the title area at the top 6 times. A central “pushDebug” function is called from game code to log events and messages exchanged with the skill backend.
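As a rough illustration of the debug div idea, here is a minimal sketch of a central logging function that keeps only the most recent lines and writes them into an overlay. The names `createDebugLog` and `pushDebug`, the line limit, and the styling are assumptions made for this example, not the code used in the skill above.

```javascript
// Hypothetical helper: keeps the most recent lines and hands them to a
// render callback. Nothing here is part of an Alexa API.
function createDebugLog(maxLines, render) {
  const lines = [];
  return function pushDebug(label, payload) {
    const text = payload === undefined
      ? label
      : `${label}: ${JSON.stringify(payload)}`;
    lines.push(text);
    // Drop the oldest entries so the overlay never grows without bound.
    while (lines.length > maxLines) {
      lines.shift();
    }
    render(lines.join('\n'));
    return lines.length;
  };
}

// Default to a no-op renderer so the file also loads outside a browser.
let pushDebug = createDebugLog(20, () => {});

// In the web view, render into a fixed div on top of the game.
// This assumes the script runs after <body> exists.
if (typeof document !== 'undefined' && document.body) {
  const overlay = document.createElement('div');
  overlay.style.cssText =
    'position:fixed;top:0;left:0;right:0;max-height:40vh;overflow:hidden;' +
    'font:12px monospace;white-space:pre;color:#0f0;' +
    'background:rgba(0,0,0,0.6);pointer-events:none;z-index:9999';
  document.body.appendChild(overlay);
  pushDebug = createDebugLog(20, (text) => { overlay.textContent = text; });
}
```

Game code would then call something like `pushDebug('skill message', payload)` wherever a message is sent or received. Setting `pointer-events:none` keeps the overlay from swallowing touches meant for the game.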

This fantasy strategy game is built with PlayCanvas, which offers its own built-in mini profiler HUD that makes it easy to see how different events and settings affect the game’s performance.

When looking for larger scale metrics insights, it’s best to use a dedicated metrics collection service. If you’re developing your skill endpoint in AWS Lambda, a quick way to get started is to log data in the Embedded Metrics Format. CloudWatch will automatically detect these and let you graph them over time, as well as analyze the data in bulk using CloudWatch Logs Insights.
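To give a sense of what an Embedded Metric Format log line looks like, here is a small sketch that builds one record and prints it as JSON. The helper names, the namespace, and the dimension and metric names a caller passes in are all placeholders chosen for this example.

```javascript
// Builds one Embedded Metric Format record. `dimensions` maps dimension
// names to string values; `metrics` maps metric names to { value, unit }.
function buildEmfRecord(namespace, dimensions, metrics, timestamp = Date.now()) {
  const record = {
    _aws: {
      Timestamp: timestamp, // milliseconds since the epoch
      CloudWatchMetrics: [{
        Namespace: namespace,
        Dimensions: [Object.keys(dimensions)],
        Metrics: Object.keys(metrics).map((name) => ({
          Name: name,
          Unit: metrics[name].unit,
        })),
      }],
    },
  };
  // Dimension and metric values live at the top level of the record.
  for (const [name, value] of Object.entries(dimensions)) {
    record[name] = value;
  }
  for (const [name, metric] of Object.entries(metrics)) {
    record[name] = metric.value;
  }
  return record;
}

// In a Lambda handler, printing one JSON line per record is enough for
// CloudWatch to pick the metrics up.
function logMetrics(namespace, dimensions, metrics) {
  console.log(JSON.stringify(buildEmfRecord(namespace, dimensions, metrics)));
}
```

A call might look like `logMetrics('MyGame', { DeviceClass: 'slow' }, { LoadTimeMs: { value: 1200, unit: 'Milliseconds' } })`, where every name is made up for illustration.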

Given the diversity of Alexa devices, you’ll get a better picture of performance metrics if you partition them by device. The best way to bucket similar devices into performance appropriate groups is to look at two dimensions: the screen resolution and device chipset. Use the WEBGL_debug_renderer_info extension to discover the latter.
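One way to collect those two dimensions is sketched below. The function name `describeDevice` and the shape of its return value are assumptions for this example; the extension and parameter names are the standard WebGL ones.

```javascript
// Returns a small descriptor for bucketing metrics by device.
// `gl` is a WebGL rendering context and `win` is the window object.
function describeDevice(gl, win) {
  let renderer = 'unknown';
  const ext = gl.getExtension('WEBGL_debug_renderer_info');
  if (ext) {
    // The extension may be missing or masked, so keep the fallback above.
    renderer = gl.getParameter(ext.UNMASKED_RENDERER_WEBGL) || renderer;
  }
  // Convert the logical page size to physical pixels.
  const width = Math.round(win.innerWidth * win.devicePixelRatio);
  const height = Math.round(win.innerHeight * win.devicePixelRatio);
  return { renderer, resolution: `${width}x${height}` };
}
```

The result can be sent along with any performance data you report, so that your graphs can be split by renderer string and resolution.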

How do I speed up WebGL Rendering?

There are as many performance considerations when using WebGL as there are variations on renderers you could write using it, but we can look at a few general truisms that apply across Alexa devices.

Alexa devices vary greatly in processing power, including their GPUs. In the general case, it is unlikely that a modern PBR shading pipeline will run at high speed across all Alexa devices. Make sure shader complexity and texture reads are kept to a minimum in your games. Terms for built-in support for this vary between frameworks and engines, but look for “mobile”, “simple”, or “phong” material types, and avoid “physically based” or “physically correct” ones. Avoid area or soft lights and shadow casters, and if available, experiment with vertex lighting.

Use power of two textures exclusively, as non power of two sizes will be significantly slower on some devices. Support for up to 2048x2048 texture sizes is available across all Web API compatible Alexa devices. Expect to render to a traditional 24 or 32 bit LDR back buffer to conserve bandwidth. Similarly, while post effects are possible, you’ll want to pack your post processing into a single full screen pass as bandwidth will be limited. Consider skipping post processing on the slowest devices. Hardware MSAA is available on most devices, and devices that do not support it will ignore the setting.

Chillout Checkers was written using the ThreeJS framework, replacing its standard lighting pipeline with a customized low spec one, consisting of a uniform ambient, a single directional light, and optional cube-mapped specular.

As usual, keeping draw calls to a minimum is key to optimizing CPU utilization. The ANGLE_instanced_arrays WebGL extension can help drastically reduce draw calls for certain kinds of applications, and is available across all Web API compatible Alexa devices. If you’re using an HTML game engine, look for support for this under feature names like “geometry instancing.”

By default the web view used on Alexa devices will not initialize to native resolution; instead a factor will be applied to the logical size of the HTML page that scales common HTML text elements and font metrics for the best legibility. Underneath, images and fonts are still rasterized at native resolution, but usually report being smaller on the page. You can discover the real native resolution by multiplying the window’s innerWidth and innerHeight by window.devicePixelRatio. Once you have this, you can initialize your WebGL buffer to match, for the highest possible clarity. If you encounter fill-rate related issues (see a description of this below), consider intentionally choosing a smaller resolution, and letting HTML scale your rendered image back up to full screen. Exactly halving the height or width will provide the fewest undesirable artifacts, but you can experiment with your game to find the best balance between image quality and speed. Depending on the style of your game, you may find the best result by mixing and matching lower resolution dynamic WebGL content, with higher resolution more static HTML content, around or on top of it.
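The resolution calculation described above can be sketched as follows. The function name `sizeCanvas` and its `scale` parameter are assumptions for this example; a `scale` of 1 targets native resolution, and 0.5 halves each dimension.

```javascript
// Sizes a canvas backing store in physical pixels while keeping it
// full screen in CSS pixels. `win` is the window object.
function sizeCanvas(canvas, win, scale = 1) {
  const cssWidth = win.innerWidth;
  const cssHeight = win.innerHeight;
  // Backing store size: logical size times the device pixel ratio.
  canvas.width = Math.max(1, Math.round(cssWidth * win.devicePixelRatio * scale));
  canvas.height = Math.max(1, Math.round(cssHeight * win.devicePixelRatio * scale));
  // Displayed size: the browser stretches the rendered image to fit.
  canvas.style.width = `${cssWidth}px`;
  canvas.style.height = `${cssHeight}px`;
  return { width: canvas.width, height: canvas.height };
}
```

After resizing, remember to update the WebGL viewport to match, for instance with `gl.viewport(0, 0, canvas.width, canvas.height)`.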

In a complex game, determining where a graphics performance bottleneck is can be a tricky problem. Try the following steps:

  1. Replace all your textures with a single pixel texture. If rendering speed drastically improves, you’re likely consuming too much texture bandwidth. You may have too many textures, your textures may be too large, or you may not have mip mapping enabled. Consider reducing their dimensions, or compressing them into hardware formats to save bandwidth.

  2. Initialize your canvas to a quarter of the resolution you’d normally use. If rendering performance drastically improves, then your scene is fill-rate bound: it cannot execute the fragment shaders quickly enough. You may be experiencing one of two common problems:

    • You have too many overlapping transparent pixels in your scene, possibly because many smaller triangles overlap in one area, or you have several large transparent polygons covering significant portions of the screen. Verify this by temporarily disabling all transparency in your materials. If this is the problem, then you’ll need to redesign some elements to reduce the polygon sizes, or the number of them. If you have many quads that have significant transparent portions in their textures, consider creating more detailed polygons that cut out those transparent areas. A tool like Sprite UV can help generate these automatically.

    • Your shaders are just doing too much calculation per pixel. You’ll need to simplify your fragment shaders to win back performance. You can verify this by changing your experiment back to a full size canvas, but replacing your fragment shader with one that just returns a solid color. If performance improves in a similar way, begin reintroducing elements of your fragment shader until you isolate the culprit.

  3. Try modifying your vertex shaders to scale all points to zero. This will cause most of the pipeline to execute, but obviate all actual drawing costs. If your performance does not improve drastically, you are likely doing too much work on the CPU, and starving the GPU. You’ll need to optimize your main game loop to win back performance.
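If you record an average frame time for each of the experiments above, the reasoning can be written down as a small decision helper. This is only a sketch: the function name, the field names, and especially the `threshold` that defines “drastically” are assumptions you would tune for your own game.

```javascript
// All inputs are average frame times in milliseconds, measured by you.
// `threshold` is an assumed cut-off for "drastically": 0.5 means an
// experiment must at least halve the frame time to count as improved.
function diagnoseBottleneck(timings, threshold = 0.5) {
  const improved = (ms) => ms <= timings.baseline * threshold;
  const findings = [];
  // Step 1: single pixel textures.
  if (improved(timings.onePixelTextures)) {
    findings.push('texture bandwidth');
  }
  // Step 2: quarter resolution canvas, then the two follow-up checks.
  if (improved(timings.quarterResolution)) {
    const overdraw = improved(timings.noTransparency);
    const shaderCost = improved(timings.solidColorShader);
    if (overdraw) findings.push('fill rate: overlapping transparency');
    if (shaderCost) findings.push('fill rate: fragment shader cost');
    if (!overdraw && !shaderCost) findings.push('fill rate: cause unclear');
  }
  // Step 3: vertices scaled to zero. No improvement points at the CPU.
  if (!improved(timings.zeroScaleVertices)) {
    findings.push('CPU bound');
  }
  return findings;
}
```

Treat the output as a starting point for investigation rather than a verdict, since real scenes are often limited by more than one factor at once.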

Most Web API compatible Alexa devices support the WebGL 2 interface, but some older Fire TV models do not. In practice, WebGL 2 contexts are not usually significantly faster, but they offer novel features you may want to use. If you choose not to support WebGL 1, make sure to include a message in your skill explaining that to your players.
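A common way to handle this is to request a WebGL 2 context first and fall back to WebGL 1 when it is unavailable. The sketch below assumes a helper named `createGlContext`; the `attributes` argument is passed straight through to `getContext`, so it can carry options such as `antialias`.

```javascript
// Tries WebGL 2 first, then WebGL 1. Returns the context along with the
// version that was obtained, or a null context if neither is available.
function createGlContext(canvas, attributes = {}) {
  const gl2 = canvas.getContext('webgl2', attributes);
  if (gl2) {
    return { gl: gl2, version: 2 };
  }
  const gl1 = canvas.getContext('webgl', attributes);
  if (gl1) {
    return { gl: gl1, version: 1 };
  }
  return { gl: null, version: 0 };
}
```

Checking `version` at startup gives you one place to decide whether to enable WebGL 2 only features, or to show a message when the device cannot run the game.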

What should I know about using HTML?

Depending on your design, creating your game in dynamic HTML may be desirable, especially if it involves a lot of text, or would greatly benefit from complex flex box layouts. All of the usual responsive web design principles can be applied here, including designing with screen relative measurements like the CSS vh (viewport height) and vw (viewport width) units.

If you wish to animate HTML elements, bear in mind that it is much faster to animate CSS transform than any of the layout properties, including left, top, right, or bottom properties of an element. You can further improve performance by telling the web view ahead of time that you will change CSS properties with the will-change property. Use this judiciously as many large elements on screen will quickly allocate a lot of video memory. As usual, apply this to portions of your application and then test performance to verify that the setting is worth it, before continuing.
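As a small sketch of this advice, the helpers below move an element with a transform instead of its layout properties, and set the will-change hint only around the animation. The three function names are assumptions made for this example.

```javascript
// Call shortly before the animation starts, so the web view can prepare.
function prepareForAnimation(element) {
  element.style.willChange = 'transform';
}

// Moves the element without touching left/top, which would force layout.
function moveTo(element, x, y) {
  element.style.transform = `translate(${x}px, ${y}px)`;
}

// Call once the animation is over, so any memory held for the hint
// can be released.
function releaseAfterAnimation(element) {
  element.style.willChange = 'auto';
}
```

In a game loop you would typically call `moveTo` from a `requestAnimationFrame` callback, and bracket the whole animation with the other two helpers.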

Using SVG graphics is a great way to make your content scale up to higher resolution screens, making games look great on Fire TV. One caveat to look out for is that SVG graphics are always rasterized at the native resolution, and so using a great deal of them can quickly eat up your available memory. You can modify this behavior by rasterizing your SVG graphics into 2D Canvas objects before use, where you can intentionally set the desired canvas resolution beforehand.

What should I know about playing Audio?

CPU time is constrained on lower end Alexa devices, so more processor intensive WebAudio operations like higher quality reverbs or ScriptProcessorNodes may struggle alongside an otherwise already busy game loop. Otherwise, simply mixing several voices in WebAudio with gain nodes is a great way to add depth to your presentation.

One thing to keep in mind is that decoding audio data may be processor intensive, depending on the device CPU and audio format. This may cause discernible hiccups in your game, or long delays before you can play a particular sound. To avoid this, rather than using WebAudio’s AudioBufferSourceNode, prefer to play longer form audio like music or ambient beds using the <audio/> HTML element, which supports streaming and can begin playing back partially downloaded sources. Remember that you can mix <audio/> sources into your WebAudio graph using the MediaElementAudioSourceNode.
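Routing a streaming audio element through a WebAudio graph could look like the sketch below. The function name `connectMusic` and the default volume are assumptions for this example; the WebAudio calls are the standard ones.

```javascript
// `audioContext` is an AudioContext and `audioElement` an <audio> element.
// The element keeps streaming and decoding on its own; the graph only
// controls how it is mixed.
function connectMusic(audioContext, audioElement, volume = 0.5) {
  const source = audioContext.createMediaElementSource(audioElement);
  const gain = audioContext.createGain();
  gain.gain.value = volume;
  source.connect(gain);
  gain.connect(audioContext.destination);
  return { source, gain };
}
```

Keep the returned nodes around: an element can only be wrapped in a media element source once, and the gain node is what you would adjust later to duck the music under speech or effects.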

Why does my touch input seem laggy?

Touch input on Echo Show devices behaves like touch input on mobile devices, in that the browser “context click” input, what on a laptop might be a right click, is performed using a long press action. This necessitates a ~200ms hesitation on tap inputs to disambiguate between a tap and a hold. Most games do not need this behavior and can bypass it by listening to the touch family of HTML events rather than a click event.
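A minimal sketch of reacting to touch events directly is shown below. The helper name `onTap` is an assumption for this example, and it reacts on `touchstart`, which suits buttons that should respond the moment a finger lands.

```javascript
// Calls `callback(x, y)` as soon as a finger touches the element,
// without waiting for the browser to decide between a tap and a hold.
// Returns a function that removes the listener again.
function onTap(element, callback) {
  const handler = (event) => {
    const touch = event.changedTouches[0];
    callback(touch.clientX, touch.clientY);
  };
  element.addEventListener('touchstart', handler, { passive: true });
  return () => element.removeEventListener('touchstart', handler);
}
```

If your game also needs to run with a mouse during desktop development, you may want to register a separate `click` or `mousedown` listener there.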

Why am I running out of memory?

Web API games operate within a web view, which itself sits in a garbage-collected environment. This makes pinning down the total amount of reliably available memory somewhat tricky. The best practice is to be as conservative as possible with your memory usage, being conscious to release objects you no longer need as soon as possible. Bear in mind that the browser cache will retain recently downloaded files, within the limits of respecting your server’s cache headers, so dropping an asset from memory that you won’t need for a while will not necessarily incur a full redownload penalty later.

When considering how much memory you’re using, bear in mind that most web formats are compressed, like a .jpg image or an .ogg sound, but that these will be decompressed in memory for use. Think of your images as raw 32-bit images, and your audio as raw PCM data. To get smaller textures in memory, consider creating hardware compressed textures. To reduce your audio memory usage, consider whether some samples could be mono instead of stereo, or have lower bit-depths and sample rates.
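A quick back-of-the-envelope calculation makes the decompressed sizes concrete. The helper names below are assumptions for this example, and the default of 4 bytes per audio sample assumes the sound is decoded into a WebAudio buffer of 32-bit floats; pass a different value if your audio path stores samples differently.

```javascript
// Decompressed size of an image, assuming 4 bytes (32 bits) per pixel.
function textureBytes(width, height, bytesPerPixel = 4) {
  return width * height * bytesPerPixel;
}

// Decompressed size of a sound as raw PCM.
function audioBytes(seconds, sampleRate, channels, bytesPerSample = 4) {
  return Math.ceil(seconds * sampleRate) * channels * bytesPerSample;
}

// 2048 * 2048 * 4 = 16777216 bytes (16 MiB) for one full size texture.
const largeTexture = textureBytes(2048, 2048);

// 10 seconds of 44.1 kHz stereo: 441000 * 2 * 4 = 3528000 bytes.
const stereoClip = audioBytes(10, 44100, 2);

// The same clip as 22.05 kHz mono: 220500 * 1 * 4 = 882000 bytes.
const monoClip = audioBytes(10, 22050, 1);
```

These numbers ignore mip maps and any copies the browser keeps internally, so treat them as a lower bound.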

Why is my skill’s endpoint getting so much traffic?

It is surprisingly easy to overwhelm your skill endpoint with message chatter from an active Alexa Web API game. One simple way to automatically manage this is by coalescing multiple events into a single message. You can do this by collecting an array of messages to send, and then keeping a timer to measure a minimum duration before you dispatch the next batch. One caveat to watch for: you should not let your total message size get larger than ~25 KB, or it may not be deliverable. The simplest form of this could look something like:

let lastSendTime = 0;
let pendingMessages = [];
let messageSize = 0;

function queueMessage(message) {
  // if we believe a single message might push
  // us over the edge, we'd need to handle testing
  // and flushing the buffer when it overflows here
  pendingMessages.push(message);
  messageSize = JSON.stringify(pendingMessages).length;
}

function sendPendingMessages() {
  if ( pendingMessages.length > 0 ) {
    let timeSince = Date.now() - lastSendTime;
    // Check that it's been at least 3 seconds
    // or we're running out of message space.
    // Depending on your message structure, this
    // may need to be more complex.
    if ( timeSince > 3000 || messageSize > 20 * 1024 ) {
      // alexaClient is assumed to be the client object
      // you received from Alexa.create() at startup
      alexaClient.skill.sendMessage(pendingMessages);
      pendingMessages = [];
      messageSize = 0;
      lastSendTime = Date.now();
    }
  }
  // We try to only send messages once a display frame.
  // This helps coalesce messages that occur
  // in the same game frame, for instance, all the
  // consequences of a single player interaction.
  // In your game this might better live at the end
  // of your main loop function
  requestAnimationFrame(sendPendingMessages);
}

// Start the send loop
requestAnimationFrame(sendPendingMessages);

Why does my game load so slowly?

Alexa games tend to perform well with customers when they’re quick to drop in and out of. Whether your game is snackable for a couple of minutes, or engaging for hours, it’s critical that when the customer asks for your game to come up, it reacts as quickly as possible. Optimizing for this across all Alexa devices means reducing your endpoint latency as much as possible, which we’ve covered before. When using Alexa Web API, you have the added challenge of quickly spinning up your web app. Here are a few ideas to help:

  1. Consider the concept of “time to first paint.” Customers will wait longer to get to your game’s main content if there’s something for them to see/hear along the way. The contents of your initial HTML file will generally be presented to the customer as soon as possible, but several features may delay this. See this article on First Contentful Paint for a deep dive.

  2. Consider packing some initial assets into your HTML page. If you have content that will always be shown immediately in your game, consider inlining that into the HTML. Ideally, you can display at least your game’s loading screen without requesting any additional files from your website. You can inline style and JavaScript content directly as <style/> and <script/> tags. You can also inline images and sounds by converting them to base64 and then also including them in a <script/> tag, or directly as a url() object in CSS. The only caveat is to stay conscious of the total size of the HTML file: the bigger it is, the longer it’ll take to load!

  3. Conversely, try to avoid including anything you don’t immediately need in the initial HTML file, and instead use JavaScript to load more code and assets later. 

  4. Customers will stop and resume your game skill fairly often, and expect your game to pick up where they left off. You can use the ASK SDK’s persistent attributes support to keep track of where you should be resuming. Make use of the data parameter of your HTML.Start directive to communicate that information to your game, and then figure out how to skip your game’s cold start behavior, jump directly to where you are resuming, and load only the assets you need for that state.

  5. If you find that you still need significant time to load your web app, consider saying something welcoming and brief with outputSpeech or APLA alongside your HTML.Start directive. The audio will start to play as soon as possible, possibly even before your web app starts to load, and will continue to play out over the top of it.
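The last two ideas can be combined in the skill’s launch handler. The sketch below assumes the ASK SDK for Node.js with a persistence adapter already configured; the function names, the `checkpoint` attribute, the shape of the `data` payload, the timeout value, and the spoken text are all placeholders for this example.

```javascript
// Builds the directive that launches the web app. `resumeState` is
// whatever your game needs to skip its cold start; its shape is up to you.
function buildStartDirective(gameUri, resumeState) {
  return {
    type: 'Alexa.Presentation.HTML.Start',
    request: { uri: gameUri, method: 'GET' },
    configuration: { timeoutInSeconds: 300 },
    data: { resume: resumeState || null },
  };
}

// Body of a launch handler. It assumes an earlier session saved a
// hypothetical `checkpoint` value into the persistent attributes.
async function handleLaunch(handlerInput, gameUri) {
  const saved = await handlerInput.attributesManager.getPersistentAttributes();
  return handlerInput.responseBuilder
    .speak('Welcome back. Loading your game now.')
    .addDirective(buildStartDirective(gameUri, saved.checkpoint))
    .getResponse();
}
```

On the web app side, the `data` payload is what your startup code would inspect to decide which state to jump to and which assets to load first.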


There are a lot of aspects to optimizing performance for games in general, and Alexa Web API games in particular, with many specific topics to discuss for each of the popular frameworks and engines one might use to build an HTML5 game. Far more than we can cover in this article! What questions would you like us to answer next? Join the official Alexa Slack to continue the discussion, and exchange ideas with the broader Alexa developer community.

Tune in to Office Hours every week at 9 a.m. PT to unlock even more insights and get your questions answered, plus check out our library of past episodes.
