Amazon provides a Product Advertising API, but like most APIs, it does not expose all the information that Amazon shows on a product page. In particular, the API does not provide exact data on a product's quantity, price, reviews, or large images. How can you fetch an Amazon product page's HTML/XML and extract that information from it? The way to get all of the product information is to use a web parser.

Parsing

Parsing is the syntactic analysis of an HTML document's structure. It is the basis for extracting data from web pages formatted in HTML and XHTML.

A parsing module downloads many HTML pages of the same type over HTTP and extracts information from them.

A non-strict HTML parser accepts invalid HTML, as browsers do. Invalid HTML includes unclosed tags, < and > characters inside <script></script> tags, unquoted attribute values, and so on. Most of the HTML found on the Internet is invalid to some extent.
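To illustrate what tolerant parsing means in practice, here is a minimal sketch using the html.parser module from the Python standard library. The class name LenientCollector and the sample markup are invented for this illustration and are not taken from Amazon or from the extension. The fragment deliberately contains an unquoted attribute value, unclosed tags, and < > characters inside a script element.

```python
from html.parser import HTMLParser


class LenientCollector(HTMLParser):
    """Hypothetical helper: records attributes and text for each tag it sees."""

    def __init__(self):
        super().__init__()
        self.current = None   # most recently opened tag
        self.attrs = {}       # tag name -> dict of its attributes
        self.text = {}        # tag name -> text that followed the opening tag

    def handle_starttag(self, tag, attrs):
        self.current = tag
        self.attrs[tag] = dict(attrs)
        self.text.setdefault(tag, "")

    def handle_endtag(self, tag):
        self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.text[self.current] += data


# Invalid on purpose: unquoted attribute, unclosed <div> and <p>,
# and comparison operators inside <script>.
broken_html = "<div class=price>19.99<p>In stock<script>if (a < b && b > c) {}</script>"

collector = LenientCollector()
collector.feed(broken_html)
collector.close()

print(collector.attrs["div"])   # attributes of the unclosed <div>
print(collector.text["p"])      # text after the unclosed <p>
print(collector.text["script"]) # script body, kept as plain text
```

The parser does not raise an error on any of these defects. It reports the tags and text it can recognize, which is the behavior the paragraph above describes.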

HTML is usually converted to a DOM tree in order to transform it or extract data from it. The DOM (Document Object Model) is the API that browsers expose to JavaScript for manipulating an HTML document. It can be thought of as a tree-like data structure that represents the structure of the HTML document.
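The idea of a tree that mirrors the document can be sketched in a few lines. The example below builds a simplified tree with Python's standard html.parser module. It is not a real DOM implementation, and the Node and TreeBuilder classes, as well as the sample markup, are assumptions made up for this illustration.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical tree node: a tag with attributes, text, and children."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = ""

    def find_all(self, tag):
        """Depth-first search for descendants with the given tag name."""
        found = []
        for child in self.children:
            if child.tag == tag:
                found.append(child)
            found.extend(child.find_all(tag))
        return found


class TreeBuilder(HTMLParser):
    """Builds a Node tree. Void tags such as <br> or <img> are not handled."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, parent=self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this name, if there is one.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data.strip()


# Invented sample markup, not the structure of a real product page.
sample = """
<div id="product">
  <span class="title">Example item</span>
  <ul><li>Feature one</li><li>Feature two</li></ul>
</div>
"""

builder = TreeBuilder()
builder.feed(sample)
builder.close()

title = builder.root.find_all("span")[0].text
features = [li.text for li in builder.root.find_all("li")]
print(title, features)
```

Once the page is a tree, extracting a field becomes a search for the right node rather than a text match on raw HTML.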

Parsing lets you obtain exactly the data you would see when visiting the site in a web browser.

Amazon Product Advertising Library

A PHP library is a convenient way to automate API requests for product information.

There are many libraries for several programming languages. We recommend the PHP-based Amazon Product Advertising Library, which uses the Product Advertising API over REST and SOAP (V1 only). ApaiIO is a highly flexible PHP library for querying the Product Advertising API using REST or SOAP. You can implement your own request or response classes.

Magento Extension with parsing features

The Amazon Magento Parser Extension, which is based on the ApaiIO PHP library, includes a parsing function. With this HTML-parsing function you can easily extract Amazon stock, price, product features, and customer reviews and import them into Magento.

Parse Amazon quantity, price, reviews and large images

Parsing Amazon for data helps with many tasks:

Grab product details

Grab product details that you can’t get with the Product Advertising API. Import Amazon products into Magento with quantity, price, reviews, hi-res images, and features.

Online Price Monitoring

Monitor changes in price, stock count/availability, rating, etc. in real time. Get new data as it is updated. Once a product is downloaded to Magento, its price, stock, images, reviews, etc. are automatically synced with Amazon.
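The core of monitoring is comparing the latest parsed values with the previously stored ones. The following is a minimal sketch of that comparison in Python. The Snapshot fields, the changed_fields helper, and the sample values are hypothetical and do not reflect how the extension stores or syncs its data.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record of the values parsed from one product page."""
    price: float
    stock: int
    rating: float


def changed_fields(old, new):
    """Return {field: (old_value, new_value)} for every field that differs."""
    before, after = asdict(old), asdict(new)
    return {key: (before[key], after[key])
            for key in before if before[key] != after[key]}


# Invented values for illustration only.
previous = Snapshot(price=19.99, stock=12, rating=4.5)
latest = Snapshot(price=17.49, stock=12, rating=4.6)

diff = changed_fields(previous, latest)
print(diff)
```

Only the fields that actually changed need to be written back to the store, which keeps each sync small.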

Get Amazon Product Reviews

Get reviews and post them on your eCommerce site. Import Amazon customer reviews into the Magento database. They look like native Magento product reviews.


Demo

The demo doesn't use any HTML parsing features because we can't publish copyright-protected data on our site. To test this feature, you can enable it in the settings:

Amazon Magento extension - HTML parsing features settings

Amazon Magento extension - Load Amazon customer reviews into Magento

Amazon blocks HTML output if requests are sent too frequently, so you need to add a captcha solver.

You can use the deathbycaptcha.com service to solve captchas. A captcha solver account can be added in the extension settings. In our demo, the captcha solver credentials are not set.