Using Location Services To Estimate Supermarket Queues in Italy

Andrea Muttoni Apr 22, 2020


In this post you will learn how I implemented Location services in a skill. We will cover the Geolocation (Dynamic Location) API and the Device Address API by adapting a real-world geo-aware application to Alexa: Fila Indiana, an Italian website that crowdsources supermarket waiting times.

The Context

I'm based in Italy. At the time of this writing, the ongoing quarantine is causing large queues at supermarkets as entrance is limited to avoid crowding. This prompted the team from Wiseair to create a crowd-sourced portal called Filaindiana.it (in Italian "fila indiana" means "single file"). The portal geo-locates you and gives the estimated waiting time for supermarkets around you. Here's an example of the Fila Indiana UI:

If you click on a store on the map, a popup shows how many people are waiting and when the data was last updated, and lets you contribute an update yourself.

It was a massive hit. They went from 0 to 800K users in less than a week, despite only initially covering Lombardy. Thankfully they were fully serverless and hosted on AWS so...lots of room to scale! Anyways, as soon as I saw the service I wanted it to be available on Alexa. I therefore reached out to the founders of Wiseair and they loved the idea. However they were quite busy expanding their web platform so they offered to work on the backend API endpoints if I was willing to help them out on the Alexa-side of the integration. I had some free time in the evenings, so I replied: "of course".

The User Experience

What I wanted as a basic experience was to be able to say "Alexa, open Fila Indiana" and immediately get the wait times for the supermarkets closest to me. I also wanted to be able to filter by supermarket name. For example: "Alexa, ask Fila Indiana what the queue is at Carrefour". To keep the backend code simple, all intents point to the same handler, which is in charge of getting the estimated queue times and tailoring the response to the provided parameters (e.g. if only the location is available, the handler returns the top supermarkets close by; if the desired supermarket is also specified, it only returns results matching that name).

Here's an example script of the user journey (translated):

User Journey #1

User: Alexa, open Fila Indiana
Alexa: Welcome to Fila Indiana. Here's the estimated wait time of supermarkets near you:
- Carrefour in Via Spiga has an estimated wait time of 23 minutes
- Esselunga in Viale Piave has an estimated wait time of 10 minutes
- (add a third and last option if available)

User Journey #2

User: Alexa, ask Fila Indiana how long the queue is at Carrefour
Alexa: At Carrefour, the queue is currently 23 minutes

Excluding edge cases (that we'll cover later), that's the basic functionality I wanted to have. I also wanted to provide some visuals but that's just a progressive enhancement. The core functionality would work with or without a visual UI.
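The "one handler for every entry point" idea described above can be sketched roughly as follows. This is only an illustration: the intent name `QueueIntent` and the slot name `supermarket` are hypothetical (this post doesn't name the real ones), and plain objects stand in for the SDK's `handlerInput.requestEnvelope`.

```javascript
// Hypothetical names: the real skill's intent and slot names are not given in this post.
const QUEUE_INTENT = 'QueueIntent';
const BRAND_SLOT = 'supermarket';

// True for both entry points: "open Fila Indiana" and "ask Fila Indiana ...".
function isQueueRequest(requestEnvelope) {
  const { request } = requestEnvelope;
  if (request.type === 'LaunchRequest') return true;
  return request.type === 'IntentRequest' && request.intent.name === QUEUE_INTENT;
}

// Returns the supermarket name the user asked for (lowercased), or null if none was given.
function getRequestedBrand(requestEnvelope) {
  const { request } = requestEnvelope;
  if (request.type !== 'IntentRequest') return null;
  const slots = request.intent.slots || {};
  const slot = slots[BRAND_SLOT];
  return slot && slot.value ? slot.value.toLowerCase() : null;
}
```

In a real handler, `isQueueRequest` would sit in `canHandle`, and `getRequestedBrand` would decide whether the response lists the nearest supermarkets or only the matching one.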

Locating the User

Once the interaction was sketched and the basic intents implemented, the first step was to get the user's location in order to find the supermarkets closest to them. There are two options: use the dynamic user location (geolocation) or the device address. The dynamic user location is only supported by "on-the-go" devices such as the Alexa App, the Echo Auto or the Echo Buds. The device address is supported for all stationary devices, such as a regular Echo, and is entered through the Alexa App in the device’s settings pane.

I opted to go for both methods: if the user's device supports dynamic geolocation, that gives a better overall experience. If it's not supported, I fall back to checking the device address.
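As a minimal sketch of that decision (the function name and the two return labels are made up for illustration), the choice can be read off the `supportedInterfaces` object that arrives with every request:

```javascript
// Decide which location source to use for this request.
// `requestEnvelope` mirrors the shape of handlerInput.requestEnvelope.
function pickLocationSource(requestEnvelope) {
  const device = requestEnvelope.context.System.device || {};
  const interfaces = device.supportedInterfaces || {};
  // Devices that support dynamic location advertise a Geolocation interface.
  return interfaces.Geolocation ? 'geolocation' : 'deviceAddress';
}
```

The handler can then branch on the returned label: try the dynamic location first, and only call the Device Address API when the label is `'deviceAddress'`.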

Regardless of which option you choose for your skill, it is important to understand that getting any user information requires explicit permission from the user. That means we must check for it in our skill code. You can find easy getting-started templates in our Alexa Cookbook for both the Dynamic Location and the Device Address.

Skill Permissions

First, we need to enable these service APIs in our skill in the form of permissions. Note: these aren't user-side permissions, but skill-side permissions. That means that it doesn't grant us automatic access to the information, it just lets Alexa know we will be potentially querying this scope of information. In order to actually get the user's dynamic location or device address we will need to explicitly ask them for permission, by letting them know they need to check the app and grant us permission. We'll see how to do that in the code in the later sections.

So, let's enable permissions for our skill to use the Dynamic Location & Device Address API in the Developer Console for our skill. You can do this by going to "Permissions" and enabling both "Device Address" (ensuring that the "Full Address" is selected) as well as "Location Services". The latter is the permission required for the dynamic location.

Further down the same page you will find "Location Services":

If you prefer to use the ASK Command Line Interface (ASK CLI) and would like to know how to do this step from your local skill package, it’s even easier! All you need to do is update your skill.json with a “permissions” array, like so:

  "manifest": {
    ...
    "permissions": [
      {
        "name": "alexa::devices:all:address:full:read"
      },
      {
        "name": "alexa::devices:all:geolocation:read"
      }
    ],
    ...
  }

Dynamic Location

Before we get started, a little spoiler: to integrate with the Fila Indiana API, I needed to get latitude and longitude of the user. In the case of Dynamic Location, the sample code is easy because that's exactly what we get back from the Geolocation service. That is, if the user has a supported device and has explicitly granted us permission we simply receive the coordinates along with a lot of other useful information (altitude, bearing, etc). Let's still look at a simplified piece of sample code to get an idea of how it works:

function getDynamicLocation(handlerInput) {

    const { context } = handlerInput.requestEnvelope;
    const isGeoSupported = context.System.device.supportedInterfaces.Geolocation;

    // check if dynamic location is supported by the device
    if (isGeoSupported) {
        const geoObject = context.Geolocation;
        const ACCURACY_THRESHOLD = 500; // example accuracy of 500 meters

        // check if we actually received coordinates
        if (!geoObject || !geoObject.coordinate) {
            // check that we have permissions (the permissions object can be missing altogether)
            const permissions = context.System.user.permissions;
            const geoScope = permissions && permissions.scopes
                && permissions.scopes['alexa::devices:all:geolocation:read'];
            const skillPermissionGranted = Boolean(geoScope) && geoScope.status === "GRANTED";
            if (!skillPermissionGranted) {
              // in this case we don't have permission. Let's ask for it
              return handlerInput.responseBuilder
                .speak("Let user know they need to grant us permission...")
                // let's also send a card for them to do this easily from the app
                .withAskForPermissionsConsentCard(['alexa::devices:all:geolocation:read'])
                .getResponse();
            } else {
                // context.Geolocation itself may be missing here, so guard before reading it
                const locationServices = geoObject && geoObject.locationServices;
                if (locationServices && locationServices.access !== 'ENABLED') {
                    // the user needs to allow location services on their device
                    return handlerInput.responseBuilder
                        .speak("Let user know they need to enable localization...")
                        .getResponse();
                }
                if (locationServices && locationServices.status !== 'RUNNING') {
                    // the location is not being picked up for some reason
                    return handlerInput.responseBuilder
                        .speak("Let user know location is not being picked up...")
                        .getResponse();
                }
                // there was an error
                return handlerInput.responseBuilder
                    .speak("Let user know there was an error...")
                    .getResponse();
            }
        }

        // Here we have the location!
        if (geoObject.coordinate.accuracyInMeters < ACCURACY_THRESHOLD) {

            // let's save the lat and lon
            const lat = geoObject.coordinate.latitudeInDegrees;
            const lon = geoObject.coordinate.longitudeInDegrees;
            // note: the geoObject contains a lot of other useful parameters

            // and let's return it
            return {
                lat : lat,
                lon : lon
            };
        }

        // the reading is too imprecise to be useful: treat it like an unsupported device
        return false;
    } else {
        // dynamic location is not supported, handle it however you want!
        return false;
    }
}

As you can see, it's relatively straightforward. I've added some comments along the way so you shouldn't have trouble understanding the code above. At a high level, this is the process: we check that we have geolocation permissions. If we don't have permissions, we ask for them by sending a card together with our response informing the user that they can then enable permissions from the app. If anything else goes wrong along the way, we inform the user. Once we get the lat/lon coordinates, we save and return them. If the dynamic location is not supported, we handle it however we want. In my case, I fall back to using the device address.
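If you'd rather keep the decision logic separate from the response building (which also makes it easy to unit test), the same checks can be sketched as a pure function over the request `context`. The status labels below are invented for this sketch and are not part of the Alexa API; the 500-meter default simply mirrors the example threshold above.

```javascript
const GEO_SCOPE = 'alexa::devices:all:geolocation:read';

// Classify the geolocation state of a request context into a single label.
// The labels are made up for this sketch; they are not Alexa API values.
function classifyGeolocation(context, accuracyThreshold = 500) {
  const device = context.System.device || {};
  const interfaces = device.supportedInterfaces || {};
  if (!interfaces.Geolocation) return 'UNSUPPORTED';

  const geo = context.Geolocation;
  if (geo && geo.coordinate) {
    return geo.coordinate.accuracyInMeters < accuracyThreshold ? 'OK' : 'INACCURATE';
  }

  // No coordinates: work out why, starting with the user's permission.
  const user = context.System.user || {};
  const scopes = (user.permissions && user.permissions.scopes) || {};
  const scope = scopes[GEO_SCOPE];
  if (!scope || scope.status !== 'GRANTED') return 'NEEDS_PERMISSION';

  const services = geo && geo.locationServices;
  if (services && services.access !== 'ENABLED') return 'LOCATION_SHARING_DISABLED';
  if (services && services.status !== 'RUNNING') return 'LOCATION_NOT_RUNNING';
  return 'UNKNOWN_ERROR';
}
```

A handler could then `switch` on the label to pick the right spoken message, send the permissions card only for `'NEEDS_PERMISSION'`, and fall back to the device address for `'UNSUPPORTED'` or `'INACCURATE'`.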

Device Address

As just mentioned, I first check if the user's device supports geolocation, and otherwise fall back to the Device Address. However, the Device Address API only gives us the address in written form, e.g. street address, city, zip code and country. We'll see how to convert this into lat/lon further down!

For the device address, the sample is also quite self-explanatory and the experience is similar. We first check that we have the necessary "consentToken". If we don't, we notify the user in two ways: we let them know in the "speakOutput" that we need their consent before accessing the device address, and to facilitate this we also send a permission consent card in our response. This card shows up in their app, where they can quickly toggle permissions with a simple tap.

const PERMISSIONS = ['read::alexa:device:all:address'];

// inside our handler:
async handle(handlerInput) {
    const { requestEnvelope, serviceClientFactory, responseBuilder } = handlerInput;

    const permissions = requestEnvelope.context.System.user.permissions;
    
    if (!permissions || !permissions.consentToken){
      // we don't have permission
      return responseBuilder
        .speak("...inform the user we don't have permissions...") 
        .withAskForPermissionsConsentCard(PERMISSIONS) // we send a permissions card
        .getResponse();
    }
// ...

Once we have permissions, we use the "getDeviceAddressServiceClient" provided by the SDK and use its "getFullAddress" method to get an address object containing all of the fields that the user is able to populate. Here's an example snippet of how we would do this:

// ...

const deviceId = requestEnvelope.context.System.device.deviceId;
const deviceAddressServiceClient = serviceClientFactory.getDeviceAddressServiceClient();
const address = await deviceAddressServiceClient.getFullAddress(deviceId);

if (address.addressLine1 === null && address.stateOrRegion === null) {
    // if the address is not populated in the app, we let them know.
    return responseBuilder
        .speak("...inform the user to populate the address for the device in the app...")
        .getResponse();

} else {
    // the 'address' object is available and populated.
    return responseBuilder
        .speak("Location received successfully!")
        .getResponse();
}

// ...

Below is an example of the "address" object when we request the full address. Just a reminder, this is the information the user has set on their device settings from the Alexa App. Again, important to emphasize that we won't receive this information unless the user has granted us permission (via our skill permissions card for example).

{
"addressLine1" : "Viale Monte Grappa 3",
"addressLine2" : "",
"addressLine3" : "",
"stateOrRegion" : "",
"city" : "Milan",
"districtOrCounty" : "Milan",
"countryCode" : "IT",
"postalCode" : "20124"
}

Now that we have the address, we need to turn it into latitude and longitude, as mentioned earlier. This process of converting a text-based address into geographic coordinates is called geocoding.

There are various ways to do this. I tried two:

- the manual (not recommended) way, which is manually interfacing with the Open Street Map Nominatim API,
- the easy (recommended) way, which is to use an npm module (if you're using NodeJS) called node-geocoder. Node Geocoder supports most location-based mapping APIs, including Open Street Map (OSM). It has a bunch of features and you can easily swap out mapping providers. For example, after some colleagues beta tested the skill, I was unsatisfied with the results the OSM API was returning, so I switched to HERE, a mapping and location data provider. Let's see how I did that:

const NodeGeocoder = require('node-geocoder');
 
// instantiate node-geocoder with the right options
const options = {
  provider: 'here', // you can use any provider here (no pun intended)
  country: 'Italy', // this helps restrict the search area and make the API faster
  apiKey: '......'  // insert your API key here
};
const geocoder = NodeGeocoder(options);

// send the geocoding request
const location = await geocoder.geocode({
  address: "Viale Monte Grappa 3",
  countryCode: "IT",
  zipcode: "20124",
  limit: 1 // lets you decide how many results you want, in my case 1
});

console.log(location[0]); // geocode() resolves to an array of results
/* The first (and here only) result of our query:
  { 
      latitude: 45.4803627, // <-- BOOM!
      longitude: 9.1871467, // <-- THIS IS WHAT WE NEED!
      formattedAddress: 'Viale Monte Grappa 3',
      country: 'Italia',
      countryCode: 'IT',
      state: 'Lombardia',
      county: 'Milano',
      city: 'Milano',
      zipcode: '20124',
      ...
  }
*/
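To connect the two pieces, the "address" object from the Device Address API has to be mapped onto the query passed to `geocoder.geocode()`, and the coordinates have to be pulled out of the result. Here is a rough sketch of both steps. The helper names are made up, and the sketch assumes (as the node-geocoder documentation describes) that `geocode()` resolves to an array of matches:

```javascript
// Map a Device Address API object to the query shape used with geocoder.geocode() above.
function toGeocodeQuery(address) {
  // Join the non-empty address lines into a single street string.
  const street = [address.addressLine1, address.addressLine2, address.addressLine3]
    .filter(Boolean)
    .join(', ');
  return {
    address: street,
    countryCode: address.countryCode,
    zipcode: address.postalCode,
    limit: 1
  };
}

// Pull lat/lon out of the geocoder results, or return null when nothing matched.
function toCoordinates(results) {
  if (!Array.isArray(results) || results.length === 0) return null;
  const { latitude, longitude } = results[0];
  return { lat: latitude, lon: longitude };
}
```

Inside the handler this would read something like `const coords = toCoordinates(await geocoder.geocode(toGeocodeQuery(address)));`, with a `null` result meaning the user should be asked to double-check the address they set in the Alexa App.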

Integrating with Fila Indiana

Now that we have two methods of retrieving the latitude and longitude (whether by direct Geolocation or by converting the device address into lat/lon), I worked with the Fila Indiana / Wiseair folks to interface with their service using a dedicated API. I asked them to simply expect a "POST" request containing a JSON body with latitude and longitude, and optionally a supermarket brand name. After just a couple of days, they provided the API endpoint and I queried that using "node-fetch". Here's a simplified example:

const fetch = require('node-fetch'); // node-fetch v2 (CommonJS)

async function getClosestSupermarkets(lat, lon, brand) {
  const body = {
    lat : lat,
    long: lon
  };
  // optional! if no brand is specified, the top 5 closest supermarkets are returned.
  if (brand) {
    body.brand = brand; // e.g. 'carrefour'
  }

  const response = await fetch("https://the-fila-indiana-api/prod/", { 
    method: 'post',
    body: JSON.stringify(body),
    headers: {'Content-Type': 'application/json'}
  });

  return await response.json(); // Get the result, parse the JSON and done!
}

And here is an example output of the Fila Indiana API response:

[
    {
        "supermarket": "Eataly",
        "wait": 25,
        "distance": 79.47722555961701,
        "address": "Piazza Venticinque Aprile, 10",
        "logo": "https://filaindiana.it/brands/Eataly.png"
    },
    {
        "supermarket": "esselunga",
        "wait": 75,
        "distance": 343.5930224231655,
        "address": "Viale Luigi Sturzo, 13, Italia",
        "logo": "https://filaindiana.it/brands/esselunga.png"
    },
    {
        "supermarket": "carrefour",
        "wait": 35,
        "distance": 464.91401414358444,
        "address": "Via della Moscova, 30, Italia",
        "logo": "https://filaindiana.it/brands/carrefour.png"
    },
    ...
]
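From a response like this, the spoken part of the answer can be put together with a few lines. The following is only a sketch: the function name is made up, the wording is an English stand-in for the Italian prompts, and the field names are taken from the sample output above.

```javascript
// Build the spoken summary from the API response shown above.
// Field names (supermarket, wait, distance, address) follow that sample output.
function buildSpeech(results, maxSpoken = 3) {
  if (!Array.isArray(results) || results.length === 0) {
    return "I couldn't find any supermarkets with queue data near you.";
  }
  const closest = [...results]
    .sort((a, b) => a.distance - b.distance) // nearest first
    .slice(0, maxSpoken);
  const lines = closest.map(
    (s) => `${s.supermarket} in ${s.address} has an estimated wait time of ${s.wait} minutes.`
  );
  return `Here's the estimated wait time of supermarkets near you: ${lines.join(' ')}`;
}
```

Sorting a copy of the array keeps the original response untouched, so the same data can still be handed to a visual template afterwards.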

What about APL?

Great, so now that we have the data, we can complete the skill and read out how long the wait is for the supermarkets nearby. However, it would be nice to also provide visual feedback if users have a multi-modal (screen) device, such as an Echo Show, Echo Spot or Fire TV. I started from a standard ListTemplate1 and modified it a bit. In the APL screen I display wait time, supermarket logo, supermarket name, street address, and distance in meters. I also display up to 5 results. Via voice, however, only the 3 closest supermarkets are read out. Here’s an example, both in rectangular aspect ratio (Echo Show) as well as circular aspect ratio (Echo Spot).

Since this post is already getting long and its focus is user position, I’ll leave the APL details for a future blog post. In the meantime, there are other great resources and blog posts on using APL.
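Without going into the APL document itself, the "5 results on screen, 3 by voice" split can be sketched like this. The helper names are made up, and the sketch assumes the results are already ordered by distance, as in the sample API response above:

```javascript
// True when the requesting device advertises the APL interface.
function supportsAPL(requestEnvelope) {
  const device = requestEnvelope.context.System.device || {};
  const interfaces = device.supportedInterfaces || {};
  return Boolean(interfaces['Alexa.Presentation.APL']);
}

// Split the results: the 3 closest are spoken, up to 5 are shown on screen devices.
function splitForOutput(results, requestEnvelope) {
  return {
    spoken: results.slice(0, 3),
    displayed: supportsAPL(requestEnvelope) ? results.slice(0, 5) : []
  };
}
```

On a device without a screen, `displayed` stays empty and the skill simply skips the visual part, which keeps the voice experience identical everywhere.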

Conclusions

That's how you can get going relatively quickly with skills that make use of the device address and the dynamic geolocation of the user. While the API itself is quite easy to use, the complexity (and elegance) lies in how you handle edge cases (e.g. the user has not granted permissions). Make sure you give the user clear instructions around this: let them know they need to check their app and that you also sent them a permissions card from which to grant permissions. In the case of a missing device address, also let them know they should set an address on their device using the Alexa App.

Fun fact: this was completely developed using Alexa-Hosted skills as I wanted to stress test the developer experience compared to developing using my regular fully-fledged AWS account. I have to say I really enjoyed the experience and while there are some adjustments to make in the workflow, overall it's really smooth. See this link on how to get started.

As we are nearing the end of the post, you may be wondering: where is the skill now? At the time of this writing, it's awaiting certification on the Italian skill catalog. Although I've poured hours into this project, let's hope it becomes obsolete very quickly. Regardless of how quickly the supermarket congestion situation resolves, it was a useful learning experience for me, and hopefully for you too.

A final parting word: if you see some cool services out there that you think would work perfectly on Alexa, make your voice heard! Reach out to the owners and let them know. If you have some spare time, why not offer to work together on creating a cool voice experience? In many cases the original service owners may not even think an Alexa skill is possible (or even know what it is). It's your opportunity, as an Alexa Dev, to bring the world of voice to all the services you care about.

Happy coding, and stay safe!

About & Links

You can reach out to Andrea Muttoni on Twitter at: twitter.com/muttonia

Other Links:

Wiseair
Filaindiana.it
Code samples mentioned:
- Dynamic Location
- Device Address
