Amazon provides a Product Advertising API, but like most APIs, it doesn't expose all the information that Amazon shows on a product page. The API does not provide exact data on a product's quantity, price, or reviews, and it does not return large product images. How can one fetch an Amazon product page's HTML/XML and extract information from it? The way to extract all of the product information is to use a web parser.
Parsing is the syntactic analysis of a document's HTML structure. It is the basis for extracting data from web pages formatted in HTML and XHTML.
A parsing module downloads many HTML pages of the same type over HTTP and extracts specific information from them.
A non-strict HTML parser accepts invalid HTML, as browsers do. Invalid HTML includes unclosed tags, > and < signs inside <script></script> tags, attribute values without quotes, etc. Most of the HTML found on the Internet is invalid to some extent.
Parsing the page lets you get the same data you see when you visit the site in a web browser.
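As a rough illustration of non-strict parsing, here is a minimal sketch using Python's standard html.parser module. It is not the extension's PHP code. The sample markup, the element ids "title" and "price", and the names FieldParser and extract_fields are all assumptions made up for this example and do not reflect Amazon's real page structure.

```python
from html.parser import HTMLParser


class FieldParser(HTMLParser):
    """Captures the first non-empty text that follows each element with a wanted id."""

    def __init__(self, wanted_ids):
        super().__init__(convert_charrefs=True)
        self.wanted_ids = set(wanted_ids)
        self.fields = {}
        self._current = None  # id whose text we are still waiting for

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        if element_id in self.wanted_ids and element_id not in self.fields:
            self._current = element_id

    def handle_data(self, data):
        text = data.strip()
        if self._current and text:
            self.fields[self._current] = text
            self._current = None


def extract_fields(html, wanted_ids):
    """Return {id: text} for the wanted ids found in the (possibly invalid) HTML."""
    parser = FieldParser(wanted_ids)
    parser.feed(html)
    parser.close()
    return parser.fields


# Hypothetical markup with unclosed tags, unquoted attribute values,
# a stray "<" in text, and "<" / ">" inside a script element.
SAMPLE = """
<div id=product>
  <span id=title>Example Widget
  <p>Free shipping on orders < $25
  <span id="price" class=price>$19.99</span>
  <script>if (a < b) { document.write("<b>"); }</script>
  <li>unclosed item
</div>
"""

fields = extract_fields(SAMPLE, ["title", "price"])
print(fields)
```

The parser does not raise on the broken markup. It reports start tags and text as it finds them, which is enough to pick out individual fields.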
Amazon Product Advertising Library
A PHP library is a convenient way to automate API requests for information.
There are many libraries available for several programming languages. We recommend ApaiIO, an Amazon Product Advertising library for PHP that uses the Product Advertising API over REST and SOAP (only V1). ApaiIO is highly flexible, and you can implement your own request or response classes.
Magento Extension with parsing features
The Amazon Magento Parser Extension, which is based on the ApaiIO PHP library, includes a parsing function. With this HTML-parsing function you can easily extract Amazon stock, price, product features and customer reviews and import them into Magento.
Parsing Amazon for data will help you with many things:
Grab product details
Grab product details that you can’t get with the Product Advertising API. Import Amazon products to Magento with quantity, price, reviews, hi-res images and features.
Online Price Monitoring
Monitor changes in price, stock count/availability, rating, etc. in real time, and get new data as it is updated. Once a product is downloaded to Magento, its price, stock, images, reviews, etc. are automatically synced with Amazon.
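One way to think about monitoring is as a comparison between two snapshots of the same product taken at different times. The sketch below is only an illustration of that idea in Python, not the extension's actual sync logic. The Snapshot fields and the diff_snapshots helper are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    """Values parsed from one visit to a product page (hypothetical fields)."""
    price: Optional[float]
    in_stock: bool
    rating: Optional[float]


def diff_snapshots(old, new):
    """Return {field: (old_value, new_value)} for every field that changed."""
    old_values, new_values = asdict(old), asdict(new)
    return {
        name: (old_values[name], new_values[name])
        for name in old_values
        if old_values[name] != new_values[name]
    }


yesterday = Snapshot(price=19.99, in_stock=True, rating=4.5)
today = Snapshot(price=17.49, in_stock=True, rating=4.5)

# Only the changed fields would need to be written back to the store.
changes = diff_snapshots(yesterday, today)
print(changes)
```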
Get Amazon Product Reviews
Get reviews and post them on your eCommerce site. Import Amazon customer reviews into the Magento database. They will look like native Magento product reviews.
The demo doesn't use any HTML parsing features because we can't publish copyright-protected data on our site. To test this feature, you can enable it in the settings.
Amazon blocks HTML output if requests are sent too frequently, so you need to add a captcha solver.
You can use the deathbycaptcha.com service to solve captchas. You can add a captcha solver account in the extension settings. In our demo, the captcha solver credentials are not set.
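Because blocking is triggered by request frequency, a parser can also slow down when it notices a blocked response. The following Python sketch shows that idea under stated assumptions: the looks_blocked heuristic (status 429 or 503, or the word "captcha" in the body) is a guess rather than documented Amazon behavior, and fetch_with_backoff and its fetch callback are hypothetical names, not part of the extension or of any captcha service API.

```python
import time


def looks_blocked(status_code, body):
    """Heuristic (an assumption): throttling status codes or a captcha page."""
    return status_code in (429, 503) or "captcha" in body.lower()


def fetch_with_backoff(fetch, url, attempts=4, base_delay=2.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call fetch(url) -> (status_code, body), waiting longer after each block.

    Returns the body of the first response that does not look blocked,
    or None if every attempt was blocked.
    """
    delay = base_delay
    for attempt in range(attempts):
        status_code, body = fetch(url)
        if not looks_blocked(status_code, body):
            return body
        if attempt < attempts - 1:
            sleep(delay)                       # wait before the next try
            delay = min(delay * 2, max_delay)  # double the wait, up to a cap
    return None
```

The fetch and sleep arguments are passed in so the pacing logic can be tried with a fake fetcher and without real waiting.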