
I don't know how to explain this very well, but for example, let's say I wanted to see every page cnn.com has posted on the internet. Maybe they have published pages that aren't linked from anywhere else. Is there software, or some other way, to see everything under http://www.cnn.com/?

I want to do this for a much smaller website, of course.

2007-05-26 23:20:19 · 4 answers · asked by Me 4 in Computers & Internet Internet Other - Internet

4 answers

You will have to use spidering software of some kind. Several programs are available; http://www.trellian.com/sitespider/index.html has one.
On many sites, CNN included, the pages are NOT actually published as files. They are stored in a database, and a page only gets "built" when someone clicks a link to it or requests it. These active pages can only be downloaded or viewed with software that forces them to be generated, such as a web browser.

2007-05-26 23:36:00 · answer #1 · answered by Tracy L 7 · 1 0

Click every button on the navigation bar on the home page. Then click every button on any other navigation bar that appears on the pages you reach from the home page. It would be easier to just search within the site; cnn.com has a site search, but most personal websites don't.

2007-05-27 06:28:55 · answer #2 · answered by Marissa 6 · 0 1

Google "site:cnn.com"; that will return every page on cnn.com that Google has indexed. Or you could just code your own spider, it's not that hard: "Download Page -> Scan For Links -> Repeat"
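
For anyone who wants to try the "Download Page -> Scan For Links -> Repeat" idea, here is a rough sketch in Python using only the standard library. It is just one way to do it, not a finished tool. The names LinkParser, fetch, and crawl are made up for this example, and it assumes you only want pages on the same host as the start URL and that a cap of 100 pages is enough.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Hypothetical helper: download one page and return its HTML as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=100):
    """Breadth-first crawl that stays on the start URL's host."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)            # download page
        except Exception:
            continue                          # skip pages that fail to load
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)                     # scan for links
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)            # repeat
    return visited
```

Calling crawl("http://www.example.com/") would return the list of pages it reached, in the order it visited them. Note that a spider like this only finds pages that are linked from somewhere on the site, so pages that nothing links to will still be missed.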

2007-05-27 06:32:27 · answer #3 · answered by Barrucadu 2 · 1 0

You would probably need a crawler program. Even though there are some available for download on the internet, it's best to write one yourself if you know how to. I don't.

2007-05-27 06:23:55 · answer #4 · answered by Anonymous · 0 0
