Amazonbot is Amazon's web crawler used to improve our services, such as enabling Alexa to answer even more questions for customers. Amazonbot respects standard robots.txt rules.
In the user-agent string, you'll see “Amazonbot” together with additional agent information. An example looks like this:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)
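If you want to spot this token in your server logs, a minimal Python sketch might look like the following. The function name amazonbot_version and the regular expression are illustrative assumptions, not part of any Amazon tooling, and the version format is inferred only from the example string above. Because a user-agent string can be spoofed, treat a match as a hint and confirm it with the DNS verification described later on this page.

```python
import re

# Assumption: the token has the form "Amazonbot/<dotted version>", as in the
# example user-agent string (e.g. "Amazonbot/0.1").
AMAZONBOT_TOKEN = re.compile(r"\bAmazonbot/(\d+(?:\.\d+)*)", re.IGNORECASE)


def amazonbot_version(user_agent):
    """Return the Amazonbot version found in user_agent, or None if absent."""
    match = AMAZONBOT_TOKEN.search(user_agent or "")
    return match.group(1) if match else None


example = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 "
    "(KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 "
    "(Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)"
)
print(amazonbot_version(example))
```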
Robots.txt: Amazonbot respects the robots.txt directives User-agent and Disallow. In the example below, Amazonbot won't crawl documents that are under /do-not-crawl/ or /not-allowed/:
User-agent: Amazonbot # Amazon's user agent
Disallow: /do-not-crawl/ # disallow this directory

User-agent: * # any robot
Disallow: /not-allowed/ # disallow this directory
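To sanity-check rules like these before you publish them, you could run them through Python's standard-library robots.txt parser. This is only a rough sketch: the helper build_parser, the agent name ExampleBot, and the page paths are made up for illustration. The standard-library parser applies only the group that matches a given agent, so it is not a model of Amazonbot's own handling of /not-allowed/ described above.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Amazonbot   # Amazon's user agent
Disallow: /do-not-crawl/

User-agent: *           # any robot
Disallow: /not-allowed/
"""


def build_parser(robots_txt):
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical paths used only to exercise the two rules.
amazonbot_blocked = not parser.can_fetch("Amazonbot", "/do-not-crawl/page.html")
other_bot_blocked = not parser.can_fetch("ExampleBot", "/not-allowed/page.html")

print("Amazonbot blocked from /do-not-crawl/:", amazonbot_blocked)
print("ExampleBot blocked from /not-allowed/:", other_bot_blocked)
```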
Amazonbot does not support the crawl-delay directive in robots.txt, nor robots meta tags on HTML pages such as “nofollow” and “noindex”.
Link-Level Rel Parameter: Amazonbot supports the link-level rel=nofollow directive. Include it in your HTML like this to keep Amazonbot from following and crawling a particular link on your website:
<a href="signin.php" rel=nofollow>Sign in </a> ...
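To audit which links on a page carry this attribute, a small Python sketch using the standard-library HTML parser could look like this. The class LinkCollector, the function split_links, and the about.html link are hypothetical names chosen for the example; the code shows one way to read the rel attribute and makes no claim about how Amazonbot parses pages.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Sort <a href> targets by whether their rel attribute has "nofollow"."""

    def __init__(self):
        super().__init__()
        self.followable = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followable.append(href)


def split_links(html):
    """Return (followable_hrefs, nofollow_hrefs) for the given HTML text."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.followable, collector.nofollow


# The second link is a made-up example of a link without rel=nofollow.
page = '<a href="signin.php" rel=nofollow>Sign in </a> <a href="about.html">About</a>'
print(split_links(page))
```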
Verify that a crawler accessing your server is the official Amazonbot crawler by using DNS lookups. This helps you identify other bots or malicious agents that may be accessing your site while claiming to be Amazonbot.
You can use command line tools to verify Amazonbot by following these steps:
Use the host command to run a reverse DNS lookup on the IP address
Use the host command to run a forward DNS lookup on the retrieved domain name
$ host 12.34.56.789
789.56.34.12.in-addr.arpa domain name pointer 12-34-56-789.crawl.amazonbot.amazon.
$ host 12-34-56-789.crawl.amazonbot.amazon
12-34-56-789.crawl.amazonbot.amazon has address 12.34.56.789
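The lookup steps above can also be scripted. The following Python sketch is one possible approach, not an official tool: the function names are hypothetical, and the required hostname suffix .crawl.amazonbot.amazon is an assumption inferred from the example output rather than a documented guarantee. The lookup functions are passed in as parameters so you can test the logic without network access.

```python
import socket

# Assumption: genuine hostnames end with this suffix, as in the example output.
AMAZONBOT_SUFFIX = ".crawl.amazonbot.amazon"


def reverse_lookup(ip):
    """Return the hostname from a reverse DNS lookup, or None on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """Return the IPv4 addresses a hostname resolves to (empty on failure)."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def is_verified_amazonbot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Check an IP with a reverse lookup followed by a forward lookup."""
    hostname = reverse(ip)
    if not hostname:
        return False
    # Drop the trailing dot that DNS tools print after fully qualified names.
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(AMAZONBOT_SUFFIX):
        return False
    # The forward lookup must lead back to the address that contacted you.
    return ip in forward(hostname)
```

Call is_verified_amazonbot with an IP address taken from your server logs. A False result means either that a lookup failed or that the address does not round-trip to a matching hostname.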
If you have questions or concerns, please contact us. If you are a content owner, please always include any relevant domain names in your message.