A new, reliable tool for crawling, extracting, and parsing content from Amazon. With this solution you can extract Amazon stock, price, description, images, and title data and sync it with your system. You can easily get this data from Amazon directly.

System requirements

  • Dedicated server
  • Rotating proxies

User guide for our demo. Contents:


Amazon Data Extraction Engine Dashboard. How to check the Engine health.

On the dashboard, you can see statistics and key performance indicators.

  • Key performance indicators.
  • Graph of products synced over the last 24 hours. Sync delay spread graph.
  • Proxy usage statistics.

Key performance indicators

Amazon Data Extraction Engine - key performance indicators

"Total products" is the number of all products in the Engine database, including blacklisted and disabled products.

"Active products" is the number of products that are enabled for sync.

"Products in the queue" is the number of products that are due for synchronization. A product might be synced several times a day, depending on how often its price or stock is likely to change.

"Active connections" shows how many products are being processed at this moment. The "outdated" number shows how many processes have been hanging longer than usual.

It is good to have a ratio of 0.7-0.9 between “In stock” products and “Active” products. If your ratio is around 0.5, check for products that have been out of stock for a long period. They may be unlikely to come back in stock, and you can remove them from the “Active” list.
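
The ratio check above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample counts are assumptions, and the threshold of 0.7 is the lower bound of the range recommended above.

```python
def stock_ratio(in_stock: int, active: int) -> float:
    """Return in-stock / active, or 0.0 when there are no active products."""
    if active <= 0:
        return 0.0
    return in_stock / active


def ratio_status(ratio: float, healthy_min: float = 0.7) -> str:
    """Compare the ratio with the lower bound of the recommended 0.7-0.9 range."""
    return "ok" if ratio >= healthy_min else "review out-of-stock products"


# Hypothetical counts read off the dashboard.
ratio = stock_ratio(in_stock=450, active=900)
print(f"{ratio:.2f}: {ratio_status(ratio)}")  # 0.50: review out-of-stock products
```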

Graph of products synced over the last 24 hours. Sync delay spread graph.

Below the key performance indicators, you can see the graph of products synced over the last 24 hours and the sync delay spread graph. You can control what is displayed by enabling or disabling individual data series in the graph.

The first graph shows how the Engine has been working over the last 24 hours.

It shows the total number of synced products and how many of them had their stock or price updated.

Data Extraction Engine - graph of Amazon products synced per 24 hours

The "Sync delay spread" graph shows the sync delay on the horizontal axis and the number of products on the vertical axis.

Data Extraction Engine - sync delay spread graph
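
To make the shape of this graph concrete, here is a minimal Python sketch that groups products by sync delay, which is what a spread graph of this kind plots. The function name, the one-hour bucket size, and the sample delays are assumptions for illustration, not values taken from the Engine.

```python
from collections import Counter


def delay_spread(delays_hours, bucket_hours=1):
    """Map each bucket start (horizontal axis) to a product count (vertical axis).

    Assumes non-negative delays expressed in hours.
    """
    counts = Counter(int(d // bucket_hours) * bucket_hours for d in delays_hours)
    return dict(sorted(counts.items()))


# Hypothetical sync delays, one value per product.
sample = [0.2, 0.7, 1.5, 1.9, 2.1, 5.0]
print(delay_spread(sample))  # {0: 2, 1: 2, 2: 1, 5: 1}
```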

Proxy usage statistics

Here you can see how each proxy is performing. Every proxy has three stat lines covering the last 1 hour, 3 hours, and 24 hours. You can also see the number of requests performed during each interval.

Amazon Data Extraction Engine - proxy usage statistic

A blue line means that all requests from the Engine to Amazon through this proxy are getting a positive response. Non-blue colors indicate failed requests.

If half of the line is red (zero response), the proxy is blocked for 1 hour. After that, it is automatically re-enabled.

The quality of a proxy depends on many factors: the proxy origin, the user agent (and other headers), and the website being scraped. The same proxy may work on Amazon.ca but fail on Amazon.com.
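
The blocking rule described above (half of the requests getting zero response leads to a 1-hour block, followed by automatic re-enabling) could be modeled roughly as follows. This is a sketch under stated assumptions, not the Engine's actual code: the class name, method names, and the idea of evaluating one statistics window at a time are all hypothetical.

```python
from datetime import datetime, timedelta

BLOCK_DURATION = timedelta(hours=1)  # block length stated in this guide
FAILURE_THRESHOLD = 0.5              # "half of the line is red"


class ProxyState:
    """Tracks whether a single proxy is currently blocked (hypothetical model)."""

    def __init__(self):
        self.blocked_until = None

    def record_window(self, total, zero_responses, now):
        """Block the proxy if at least half of the requests got no response."""
        if total > 0 and zero_responses / total >= FAILURE_THRESHOLD:
            self.blocked_until = now + BLOCK_DURATION

    def is_available(self, now):
        """A proxy is usable if it was never blocked or the block has expired."""
        return self.blocked_until is None or now >= self.blocked_until


proxy = ProxyState()
start = datetime(2024, 1, 1, 12, 0)
proxy.record_window(total=100, zero_responses=50, now=start)
print(proxy.is_available(start + timedelta(minutes=30)))  # False
print(proxy.is_available(start + timedelta(hours=1)))     # True
```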



How to search products by Amazon category

If you wish to find products in your niche, use the search on Amazon.

Data Extraction Engine - search products at Amazon

This is the easiest way to find a proper category URL. Amazon offers customized options to search and browse, located in the top navigation bar or in the search box to the left. Use these options to search and further refine the selection available in a category. Specify all required filter conditions: seller, price range, Prime, free shipping, etc.

Copy the category URL and paste it into the Category URL field.

The Engine parses all found pages in real time, which might take a while. At this stage the Engine only scrapes ASINs from the category pages. The price and all other attributes are scraped later, during the first sync. Found products go directly to the product list with the sync status "Move to Active?".
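
As an illustration of collecting only ASINs at this stage, the sketch below pulls ASINs out of product links. It assumes the common Amazon link forms /dp/<ASIN> and /gp/product/<ASIN> with a 10-character ASIN. The function name and sample URLs are made up, and the Engine's own extraction logic may work differently.

```python
import re

# Assumption: a 10-character ASIN follows "/dp/" or "/gp/product/" in the link.
ASIN_IN_URL = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?=[/?]|$)")


def extract_asins(urls):
    """Return the unique ASINs found in the given links, in first-seen order."""
    seen = {}
    for url in urls:
        match = ASIN_IN_URL.search(url)
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)


# Hypothetical links, as they might appear on a category page.
links = [
    "https://www.amazon.com/dp/B000000001",
    "https://www.amazon.com/Some-Title/dp/B000000002/ref=sr_1_1",
    "https://www.amazon.com/gp/product/B000000001?th=1",
    "https://www.amazon.com/s?k=shoes",
]
print(extract_asins(links))  # ['B000000001', 'B000000002']
```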



How to upload your ASINs into the Engine database

Amazon Data Extraction Engine - upload ASINs into database

Choose the locale and upload the list of products to sync. Make sure your ASIN list is in CSV format.

The Engine syncs products automatically. The price and all other attributes are scraped later, during the first sync.
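
Before uploading, it can help to check the list locally. The sketch below assumes a CSV with one ASIN per row in the first column and treats an ASIN as 10 uppercase letters or digits. The guide does not specify the exact column layout, so both the layout and the function name are assumptions.

```python
import csv
import io
import re

ASIN_PATTERN = re.compile(r"^[A-Z0-9]{10}$")


def read_asins(csv_text):
    """Split first-column values into valid ASINs and rejected values."""
    valid, rejected = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank rows and rows with an empty first cell.
        if not row or not row[0].strip():
            continue
        value = row[0].strip().upper()
        (valid if ASIN_PATTERN.match(value) else rejected).append(value)
    return valid, rejected


# Hypothetical file contents.
sample = "B000000001\nb000000002\nnot-an-asin\n\n"
valid, rejected = read_asins(sample)
print(valid)     # ['B000000001', 'B000000002']
print(rejected)  # ['NOT-AN-ASIN']
```

Note that a header row such as "ASIN" would land in the rejected list with this check, so remove it or skip it before validating.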



How to select products for any operation

At the top of the "Products" page, you can find standard pagination and standard buttons for selecting or deselecting products.

You can filter products by locale, ASIN and parent ASIN, name, price, quantity, and sync status. There are also filters for the last sync time and the last update time. The sync status filter covers active, inactive, blacklisted, and deleted products, as well as products with the statuses "Check variations" or "Move to Active?".

Data Extraction Engine - products filter

The Product list section offers the following selection options:

  • Select all
  • Unselect all
  • Select visible
  • Unselect visible



How to manage your product list

In the products table, you can manage or export results. Manage your product and listing data with a user-friendly interface designed to filter the data you want to view, compare, and export.

The main operations for a product list are "Mass Actions" and "Download CSV".

The "Mass Actions" drop-down menu includes changing the sync status and running operations on products.

Data Extraction Engine - mass action: Active / Inactive / Check variations / Blacklisted / Sync selected

Data Extraction Engine - Sync statuses

  • Active
    Sync status "Active" means this product is active and properly synced by the Engine.
  • Inactive
    Sync status "Inactive" means you do not need to sync this product. Instead of deleting it, you mark it as "Inactive" so that it is not rechecked in the future when the same categories are searched.
  • Check variations
    "Check variations" is the initial sync status. During the first sync, such a product is checked for variations and, if there are any, all variations are also imported into the Engine.
    Data Extraction Engine - Check variations sync status
  • Move to Active?
    This means you need to review the product and set its sync status.
    Data Extraction Engine - Move to Active sync status
  • Blacklisted
    Sync status "Blacklisted" is assigned automatically during synchronization if the product matches the blacklist options (blacklisted seller, ASIN, or brand).
  • Deleted
    Sync status "Deleted" means you do not need to sync this product. Instead of removing it, you mark it as "Deleted" so that it is not rechecked in the future when the same categories are searched.



How to sync selected products with Amazon. How to check price and qty.

To sync your products with Amazon, choose the "Sync selected" mass action.

When you're done, click the SUBMIT button.

Amazon Data Extraction Engine - Sync products with Amazon

After synchronization, the quantity in the Engine will match the quantity on the Amazon site.

You can check the price and quantity via the "View on site" link.


Links to additional information

Data Extraction Engine - sync operations

Manual start of synchronization for a particular product.

"Sync" is a link for the manual start of synchronization.

Check the available Amazon data for a particular product.

"Sync page" is a link to the page where you can see available Amazon data for a particular product. Sync page displays complete product data after synchronization.

How to check price and qty. Link to the Amazon product page.

"View on site" is a link to the Amazon product page. You can check the product price and quantity via this link.



How to download a CSV-file with product data

With our demo, you can download a CSV file with product data. There is a limit on the number of items that can be exported per file.

On your own server, all data can be exported in any format you need, or you can integrate it with a database.

Amazon Parser - Download csv data file



How to create blacklists by brand, by sellers, by ASINs

Data Extraction Engine - blacklist configuration

In the configuration section, you can see the blacklist of “Brands” per locale and the blacklist of “Seller/ASINs” per locale. This is an example of a completed form.

Data Extraction Engine - seller / ASIN blacklist configuration

  • Only ASIN
    If only an ASIN is given, this ASIN is blacklisted
  • Only seller
    If only a seller is given, all offers from this seller are skipped
  • ASIN + seller
    If both an ASIN and a seller are given, only offers matching this ASIN and this seller are skipped
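
The three rules above amount to a simple matching function. The Python sketch below is one way to express them. The BlacklistEntry type, the function name, and the sample ASINs and seller names are hypothetical and only illustrate the matching logic.

```python
from typing import NamedTuple, Optional


class BlacklistEntry(NamedTuple):
    """One row of the Seller/ASIN blacklist form (hypothetical representation)."""
    asin: Optional[str] = None
    seller: Optional[str] = None


def is_blacklisted(offer_asin, offer_seller, entries):
    """Return True if the offer matches any blacklist entry.

    A field left empty in an entry acts as a wildcard, which gives the three
    cases: only ASIN, only seller, and ASIN + seller.
    """
    for entry in entries:
        if not (entry.asin or entry.seller):
            continue  # ignore completely empty rows
        asin_ok = entry.asin is None or entry.asin == offer_asin
        seller_ok = entry.seller is None or entry.seller == offer_seller
        if asin_ok and seller_ok:
            return True
    return False


entries = [
    BlacklistEntry(asin="B000000001"),                        # only ASIN
    BlacklistEntry(seller="BadSeller"),                       # only seller
    BlacklistEntry(asin="B000000002", seller="OtherSeller"),  # ASIN + seller
]
print(is_blacklisted("B000000002", "OtherSeller", entries))  # True
print(is_blacklisted("B000000002", "GoodSeller", entries))   # False
```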



How to change XML configuration files

Each package is tailored to your needs.
You get full access to the process: sort, filter, and build advanced scraping algorithms. All data can be exported in any format you need, or you can integrate it with a database.

The Engine follows XML configuration instructions and can accommodate HTML structure changes quickly. Rich XML configurations make crawler support easier. The parsing configuration files are located here:

[YourDir]/data/parser/Config/Profile/

It is easy to configure the Engine for specific needs. You can set the system to parse by a certain seller or by Prime status. The following screenshot shows an example of a config file that parses Prime status and seller. You can also set Amazon API account settings (if you need to use the API).

Amazon Parser - parsing Amazon prime and seller



Change your account settings

This is a profile setting where you can change your account credentials.

  1. In the top right, click your username.
  2. At the top, choose the Profile page.
  3. Make your changes.
  4. After you're done, click Save Changes at the bottom.

Data Extraction Engine - user account
