Getting Started with Catalog Ingestion
Catalog ingestion is the process of submitting your media to Amazon so that it can be surfaced to users. You first define all the metadata about your media (movies, TV shows) in a catalog file that conforms to a specific XML schema (the Catalog Data Format, or CDF).
You then upload this catalog file into an S3 bucket for Amazon to ingest. After Amazon ingests your catalog, Amazon runs some de-duplication processes and consolidates your media metadata into one master database of content that can be surfaced to users in unified ways across multiple devices.
- How Devices Interface with Ingested Catalog Metadata
- Requirements for Catalog Ingestion
- Catalog Ingestion Process Overview
- Update Interval Requirements for Catalog Files
- Refresh Intervals for Catalog Content
- Getting Started FAQ
- Next Steps
How Devices Interface with Ingested Catalog Metadata
Catalog ingestion is not tied to delivery on a specific device. Fire TV, Echo Show, and Echo Spot (as well as other potential Alexa-enabled entertainment devices with screens) can interface with your ingested catalog's content in different ways to play media. Not all devices interface with catalog data in the same way.
For example, Fire TV uniquely uses a Universal Search and Browse feature to enable content discovery that includes catalog-ingested TV shows and movies that are matched against IMDb or Amazon video. The main advantage of Universal Search and Browse is content discovery: the search is not app-specific but "universal" across all apps that have ingested their catalogs into Amazon.
The Video Skill API, also used on Fire TV, lets users interface with the media that is immediately playable by apps that the user already has installed and authorized (e.g., signed in to). When users say an utterance such as "Play House of Cards," Fire TV immediately opens Netflix and begins to play House of Cards. If the user doesn't have Netflix or isn't signed in to Netflix, Fire TV won't play the media. Hence, the Video Skill API is for user engagement rather than content discovery.
Echo Show and Echo Spot have the Video Skill API but not Universal Search and Browse. The Video Skill API lets users play video content from skills (rather than apps) that the user has enabled and authorized. In both cases, the catalog content you ingest can be surfaced to the various devices. Overall, different devices can interact with your ingested catalog in different ways, but the process for catalog ingestion remains the same.
Requirements for Catalog Ingestion
Before starting the catalog ingestion process, make sure that you have or can easily obtain the following requirements:
- Integration with IMDb: Catalog ingestion is restricted to apps that have movies and TV shows that are significant enough to be integrated in IMDb. See IMDb New Title Submission FAQs for details about the media that qualifies for IMDb.
- Easy access to your media metadata: You'll need to export your media's metadata from a database; if you cannot export your metadata from a database, you will need to create your catalog file by hand. After exporting your metadata, you'll need to create a catalog file that structures the information according to the Catalog Data Format schema.
- An Amazon Web Services (AWS) account: You or someone in your organization will need an AWS account and either familiarity with the AWS S3 tools or a willingness to learn them. (Details are explained in Set Up Your AWS Account.) You will need to execute several AWS commands using the Command Line Interface (CLI) to upload your catalog file to AWS, so that Fire TV can obtain your catalog for integration. Amazon recommends using the AWS SDK to automate this process.
- Onboarding process and setup: It takes two weeks to set up our systems to provide you with a sandbox to test the catalog integration. Your Amazon business contact will also need to onboard you with the integration process. If you do not know who your Amazon business contact is, contact us.
- Minimal fields for matching: To provide metadata for matching, you must supply the required Title field plus at least 2 (ideally 3 or more) additional fields. The additional fields can include the following: ReleaseYear, Credits (Actor and Director), RuntimeMinutes, SeasonInShow or SeasonID (the SeasonID must reference a valid season), and ShowTitle or ShowID.
- Scripts to automate the process: The catalog creation and upload process requires multiple steps that can be simplified using automation. Consider allocating a developer resource to help script and set up a cron job to automate and simplify this process for your organization.
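The minimal-fields rule in the list above can be checked before a catalog file is generated. The following is a minimal sketch, not Amazon's matching logic. It assumes each work is held as a plain Python dict keyed by the field names listed above, and it assumes that "SeasonInShow or SeasonID" and "ShowTitle or ShowID" each count as a single additional field. The function names are hypothetical.

```python
# Field groups taken from the list above. Treating each "A or B" pair as one
# additional field is an assumption of this sketch.
ADDITIONAL_FIELD_GROUPS = [
    ("ReleaseYear",),
    ("Credits",),
    ("RuntimeMinutes",),
    ("SeasonInShow", "SeasonID"),
    ("ShowTitle", "ShowID"),
]


def _present(value):
    """Return True if a field holds a usable (non-empty) value."""
    return value is not None and value != "" and value != []


def count_additional_fields(work):
    """Count how many additional-field groups have at least one value."""
    return sum(
        1
        for group in ADDITIONAL_FIELD_GROUPS
        if any(_present(work.get(name)) for name in group)
    )


def has_minimal_matching_fields(work, minimum=2):
    """Title is required, plus at least `minimum` additional fields."""
    return _present(work.get("Title")) and count_additional_fields(work) >= minimum


if __name__ == "__main__":
    movie = {"Title": "Example Movie", "ReleaseYear": 2015, "RuntimeMinutes": 98}
    print(has_minimal_matching_fields(movie))
```

A check like this could run as part of the export script so that works with too little metadata are flagged before the catalog file is built.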
Catalog Ingestion Process Overview
The catalog ingestion process involves the following high-level steps:
Here's some more detail about each of these steps:
- Step 1: Create Your Catalog File: Create an XML file in our Catalog Data Format (CDF) for your media metadata.
- Step 2: Validate Your Catalog File: Validate your catalog file against the CDF schema (XSD) file.
- Step 3: Set Up Your AWS Account: Set up an AWS account with appropriate permissions to enable you to upload to a specific S3 bucket using the command line.
- Step 4: Upload Your Catalog File: Upload your catalog file to the S3 catalog bucket that Amazon creates for you.
- Step 5: Verify Your Uploaded Catalog File: Review the Amazon-generated ingestion report to confirm that your catalog file was successfully ingested.
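As a rough illustration of how steps 2 and 4 might be scripted, the sketch below checks that a catalog file is well-formed XML and then builds an `aws s3 cp` command for the upload. Several things here are assumptions: the well-formedness check is only a cheap pre-check and is not a substitute for validating against the CDF schema (XSD); the bucket name, object key, and XML element names in the example are placeholders; and the function names are hypothetical.

```python
import subprocess
import xml.etree.ElementTree as ET
from typing import List


def is_well_formed(xml_text: str) -> bool:
    """Cheap pre-check: does the text parse as XML at all?

    This does NOT validate against the CDF schema (XSD).
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def build_upload_command(catalog_path: str, bucket: str, key: str) -> List[str]:
    """Build the AWS CLI command that copies a local file to S3.

    `bucket` and `key` are placeholders; use the catalog bucket that
    Amazon creates for you.
    """
    return ["aws", "s3", "cp", catalog_path, f"s3://{bucket}/{key}"]


def upload_catalog(catalog_path: str, bucket: str, key: str) -> None:
    """Run the upload; raises CalledProcessError if the CLI exits non-zero."""
    subprocess.run(build_upload_command(catalog_path, bucket, key), check=True)


if __name__ == "__main__":
    # Placeholder XML, not a real CDF document.
    print(is_well_formed("<Catalog><Title>Example</Title></Catalog>"))
    print(build_upload_command("catalog.xml", "example-bucket", "catalog.xml"))
```

Keeping the command construction separate from the call that runs it makes the script easy to test without AWS credentials.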
Guidelines and Expectations for Catalog Uploading
As a content provider participating in catalog ingestion, you should ensure that your catalog update and upload processes meet the following guidelines, and set your expectations accordingly with regard to the various wait times after uploading a new catalog.
Update Interval Requirements for Catalog Files
To keep the catalog data current, Amazon requires the following update intervals for your catalog:
- Amazon expects your catalog to be uploaded at least once per week, regardless of whether the catalog has changed. Your upload process should be scripted or otherwise automated so that the current catalog is uploaded on at least a weekly interval.
- If your most recent successfully ingested catalog file is more than three weeks old, Amazon will disable catalog integration for your app. (In other words, if your catalog is stale for more than three weeks, or fails catalog ingestion for more than three weeks, Amazon disables the catalog ingestion.)
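An automated upload job could track these two thresholds with a small helper. The sketch below is one possible way to do that, assuming you record the timestamp of your most recent successfully ingested catalog yourself. The status labels and function name are hypothetical.

```python
from datetime import datetime, timedelta

UPLOAD_INTERVAL = timedelta(weeks=1)  # upload at least once per week
STALE_AFTER = timedelta(weeks=3)      # integration is disabled beyond this age


def catalog_status(last_successful_ingestion: datetime, now: datetime) -> str:
    """Classify the catalog by the age of its last successful ingestion."""
    age = now - last_successful_ingestion
    if age > STALE_AFTER:
        return "stale"        # more than three weeks old
    if age >= UPLOAD_INTERVAL:
        return "upload-due"   # a weekly upload is due or overdue
    return "current"


if __name__ == "__main__":
    now = datetime(2024, 1, 29)
    print(catalog_status(datetime(2024, 1, 25), now))  # 4 days old
    print(catalog_status(datetime(2024, 1, 20), now))  # 9 days old
    print(catalog_status(datetime(2024, 1, 1), now))   # 28 days old
```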
Refresh Intervals for Catalog Content
When you update your catalog content, the new updates are not immediately available to viewers. Instead, the updates will appear on devices after the following:
- Amazon ingests your catalog: Amazon will validate and ingest newly uploaded catalog files at regularly defined intervals. For Fire TV, the ingestion interval is every four hours. Thus, after uploading a new catalog, for Fire TV you might have to wait up to four hours to see if the ingestion was successful.
- The on-device cache expires: Devices have an on-device cache that can persist longer than the catalog refresh interval. For example, the on-device caches for shows and seasons on Fire TV can persist for up to 10 hours. As a result, some content could have up to a 14-hour delay in availability to viewers (4 hours for the ingestion interval, and 10 hours for the on-device cache) after catalog upload.
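The worst-case delay described above is simple arithmetic, sketched here for Fire TV using the 4-hour ingestion interval and the 10-hour on-device cache from the list. The constant and function names are hypothetical, and the result is an upper bound rather than a prediction of when content will actually appear.

```python
from datetime import datetime, timedelta

INGESTION_INTERVAL = timedelta(hours=4)   # Fire TV ingestion interval
DEVICE_CACHE_TTL = timedelta(hours=10)    # on-device cache for shows and seasons
WORST_CASE_DELAY = INGESTION_INTERVAL + DEVICE_CACHE_TTL


def latest_visible_time(uploaded_at: datetime) -> datetime:
    """Latest time by which an uploaded change should reach viewers."""
    return uploaded_at + WORST_CASE_DELAY


if __name__ == "__main__":
    uploaded = datetime(2024, 1, 1, 8, 0)
    print(WORST_CASE_DELAY)
    print(latest_visible_time(uploaded))
```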
Getting Started FAQ
- Q: What is catalog ingestion?
- A: Catalog ingestion allows Amazon to surface your content on entertainment devices such as Fire TV, Echo Show, and Echo Spot.
- Q: At a high level, what's the typical process for catalog ingestion?
- A: Basically, you create a catalog file that contains your media metadata. The catalog file follows Amazon's Catalog Data Format (CDF) schema. After validating your catalog file, you then upload it to Amazon's AWS S3 service, and then verify that there were no errors during the ingestion process.
- Q: How often should we go through this process of updating our catalog?
- A: As a best practice, Amazon recommends assigning an engineer or developer to automate the catalog export and upload process. The catalog needs to be uploaded at least once a week, but you can upload your catalog file as frequently as your ingestion interval allows (on Fire TV, this ingestion interval is 4 hours). Note: Regardless of whether you actually have changes or updates to your catalog, your most recent successfully ingested catalog file must never be more than three weeks old. Catalog files older than 3 weeks become stale, and the catalog content is removed. To avoid the risk of deactivation due to stale catalog content, set your upload process to run weekly, regardless of whether there have been any catalog updates.
- Q: Some of this sounds pretty technical. Who in my org should be handling this process?
- A: Preferably, an engineer or IT professional should create and upload your catalog file. Amazon highly recommends having an engineer automate this process. If an engineer is unavailable, Amazon recommends that the person creating and uploading the catalog file be comfortable working with XML and executing commands using a command line interface.
Next Steps
Ready to get started? Go to Step 1: Create Your Catalog File.