Some questions regarding setup of the Web Data Extraction Engine

We have more than 8 years of combined experience in web analytics, custom web development, SEO, and IT. For the last 5 years, we have focused mainly on web analytics and web data extraction.

Q: Can you add or change functions as needed? Can you also adapt the parser when the page layout of a target site changes?

A: The parser was developed with the understanding that HTML structure is not constant, so most of the parsing settings are kept in XML files. Most changes can be handled by editing those XML instructions.
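To illustrate the idea of keeping parsing rules in XML, here is a minimal Python sketch. It is not the engine's actual configuration format, which this page does not show: the `<profile>` and `<field>` elements, the use of regular expressions as rules, and the sample HTML are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical profile format (assumption): one <field> per value to extract,
# with a regular expression as its text. The engine's real XML schema may differ.
PROFILE_XML = r"""
<profile site="example-shop">
  <field name="title"><![CDATA[<h1[^>]*>(.*?)</h1>]]></field>
  <field name="price"><![CDATA[class="price">\$([0-9.]+)<]]></field>
</profile>
"""


def load_rules(profile_xml):
    """Read the XML profile and compile one regex per field name."""
    root = ET.fromstring(profile_xml)
    return {
        field.get("name"): re.compile(field.text.strip(), re.S)
        for field in root.iter("field")
    }


def extract(html, rules):
    """Apply every rule to the page; fields that do not match become None."""
    record = {}
    for name, pattern in rules.items():
        match = pattern.search(html)
        record[name] = match.group(1).strip() if match else None
    return record


# Made-up page used only to exercise the sketch.
SAMPLE_HTML = (
    '<html><body><h1 id="t">Blue Widget</h1>'
    '<span class="price">$19.99</span></body></html>'
)

record = extract(SAMPLE_HTML, load_rules(PROFILE_XML))
print(record)
```

With this kind of layout, a change on the target site means editing a pattern in the XML profile, while the extraction code stays the same.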

We will continue to maintain the software. We usually estimate every task and deliver the work in small chunks of less than $1000 each.

Q: Are there any initial or running costs other than those quoted?

A: The initial payment is $590. You also need a server to run the parser and a proxy account.

A VPS, even a small one, is fine to start with; it should run Apache, PHP, and MySQL. The parser itself is not very heavy; substantial resources are needed only to run a lot of threads.

A proxy is required to run the parser, so you need a rotating proxy account. For example, Stormproxies offers a 40-thread account for $40/mo.

If you only want to test the parser, you can probably start with an entry-level proxy account from Proxyrotator. It costs less than $10/mo and would be enough for some tests.

A 40-thread proxy account allows roughly 40 HTML pages to be parsed per second. For Amazon, that works out to about 1 million products parsed per day, which should be enough for an average load. You do not need the proxy right from the start, only once we have finished one of your parsing profiles.
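The throughput figures above can be checked with a little arithmetic. The sketch below assumes each thread completes about one page per second (which is what 40 pages per second on 40 threads implies) and that the parser runs around the clock; both are simplifying assumptions, and the function name is made up for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def pages_per_day(threads, pages_per_thread_per_second=1.0):
    """Upper bound on pages fetched per day with the given thread count.

    Assumes continuous operation with no downtime, retries, or throttling.
    """
    return int(threads * pages_per_thread_per_second * SECONDS_PER_DAY)


# 40 threads at roughly one page per second each.
daily_pages = pages_per_day(40)

# Ratio between the raw page ceiling and the quoted 1 million products per day.
pages_per_product_budget = daily_pages / 1_000_000

print(daily_pages)
print(pages_per_product_budget)
```

Under these assumptions the ceiling is 3,456,000 pages per day, so the quoted figure of about 1 million products per day leaves a budget of roughly 3.5 page requests per product. One plausible reading is that this budget covers extra requests per product, retries, and idle time, but the page itself does not say how the estimate was derived.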

In short, you need a server to run the parser and a proxy account. Nothing else is needed.

Q: Can we protect our project with an NDA?

A: Of course. We can sign an NDA and legally agree to keep our work private. Confidentiality means protecting information: we cannot share confidential work without permission.

What is the Web Data Extraction Engine?

A high-performance tool for crawling, extracting, and parsing content from websites. It can extract data from sites that use anti-scraping technology. The data found on each page can be structured differently depending on its content. Get your data in the format most useful to you.

The best solution for web data scraping and extraction purposes.
