Can I grab data automatically from a Yellow Pages website and save it in Excel? The site is running on ASPX.
I want to develop a procedure that will go to the page, copy specific data, then follow the Next button to open the second page and copy its data, and so on.

2006-07-18 07:10:45 · 5 answers · asked by vickyhanif 1 in Computers & Internet > Programming & Design

5 answers

Hmm, what you need is a simple script. Any language that has sockets can do this.

What you could do is the following. Since you're trying to copy information from a website, you're technically trying to duplicate its database. So if you're doing that, you can't keep it all dynamic, i.e., store it all in memory, since that is too much information.

The way I'd go about it is the following:
- Open an HTTP socket to the website URL (in most languages this is just one line)
- Save the contents of each page to a local folder, so you download every single page. Your program can move on to the next page, since you can parse out the NEXT PAGE portion of the site easily with any string function or regular expression (see the sketch after this list)
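
A minimal sketch of that loop in PHP (one of the languages suggested at the end of this answer); the start URL and the regular expression for the Next link are assumptions you would have to adapt to the actual site:

```php
<?php
// Sketch: follow the NEXT PAGE link until there isn't one.
$url  = 'http://example.com/results.aspx';  // hypothetical start URL
$page = 1;

while ($url !== null) {
    $html = file_get_contents($url);
    if ($html === false) {
        break;                               // fetch failed, stop
    }
    file_put_contents("site1page{$page}.html", $html);
    // Hypothetical pattern for the next-page link; adjust to the real markup.
    if (preg_match('/<a href="([^"]+)"[^>]*>\s*Next\s*<\/a>/i', $html, $m)) {
        // Note: a relative href would need to be resolved against the base URL.
        $url = $m[1];
        $page++;
    } else {
        $url = null;                         // no NEXT PAGE link left
    }
}
?>
```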

Alternatively, some websites show the number of results, so you can instantly work out the number of pages for the result set you wish to capture: http://localhost/?page=1, http://localhost/?page=2, and so on.
As you can see, you can open each of those URLs in your program and save its content to disk, as in the sketch below. You will end up with site1page1.html, site1page2.html, ... until the end of the results.
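
If the site exposes the result count, the loop is even simpler; a sketch under the same assumptions, with the URL pattern taken from the example above and the page count made up:

```php
<?php
// Sketch: when the result count tells you how many pages exist up front.
$baseUrl  = 'http://localhost/?page=';  // hypothetical URL pattern
$numPages = 25;                         // read this off the site's "number of results"

for ($page = 1; $page <= $numPages; $page++) {
    $html = file_get_contents($baseUrl . $page);  // fetching a page really is one line
    if ($html === false) {
        break;                                    // stop on a failed fetch
    }
    file_put_contents("site1page{$page}.html", $html);
}
?>
```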

Once you finish downloading the pages, you can parse each one as STEP 2 of the program. Every page should have exactly the same layout, so you just use simple string functions to capture the exact location of the line you want to save. Once you have the information, store it in an array or an object. When you finish parsing a page, dump that information into a database (or, since you want the data in Excel, a CSV file; a sketch follows), and continue the loop over the pages you saved locally.
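
A sketch of that STEP 2 in PHP; the regular expression is hypothetical and would need to match the real listing markup. Writing to CSV instead of a database gives you a file Excel opens directly, which is what the question asked for:

```php
<?php
// STEP 2 sketch: parse every saved page and dump the rows into a CSV.
$out = fopen('listings.csv', 'w');
fputcsv($out, array('name', 'phone'));   // header row

foreach (glob('site1page*.html') as $file) {
    $html = file_get_contents($file);
    // Hypothetical pattern: adjust to how the real site marks up each entry.
    preg_match_all('/<span class="name">(.*?)<\/span>\s*<span class="phone">(.*?)<\/span>/s',
                   $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        fputcsv($out, array(strip_tags($m[1]), strip_tags($m[2])));
    }
}
fclose($out);
?>
```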

As you can see, this is simply a two-step process: you download the pages to your local drive, then you parse them one by one and save them to the database.

In VB6 you could do it in a couple of lines. In PHP it is also a couple of lines, since the built-in file_get_contents function is given to you. In C++ it will take some time; not hard, just more programming. In C# it would be simple too, since many functions are given to you.

Good luck! And remember, what you're doing might be illegal! :)

2006-07-18 07:36:08 · answer #1 · answered by ? 6 · 0 0

The Yellow Pages site would have to be set up as a web service unless you jury-rig your own scraping code. Look into XML.

2006-07-18 07:12:22 · answer #2 · answered by Anonymous · 0 0

Yes. It's called manual data entry on your part.

2006-07-18 07:14:53 · answer #3 · answered by rob 3 · 0 0

What you're referring to is screen scraping.

Try this site for some guidance.

http://www.rexx.com/~dkuhlman/quixote_htmlscraping.html

2006-07-18 07:16:06 · answer #4 · answered by Panther 3 · 0 0

That info is copyrighted. You'd better get permission first.

2006-07-18 07:15:50 · answer #5 · answered by Anonymous · 0 0
