Saturday, January 03, 2009

Supplementary Index Issues

Oh, and what else have I learned about supplementary index stuff?

Here's how to find pages (and the total number) that Google knows about (example site
(don't use the quotes)
Search: "" Total listing: 560
Search: "*" Main listing: 367
Subtract to get the supplementary listing: 193
(You can get the math done for you by the mapelli tool)

Next question: is there any simple method or tool for sorting through the Google results to find the 193 pages that aren't in the main index?

Conceptually, I see two approaches to finding out which of my pages are in the supplementary index:

1. I take the total listing and copy the results into a document. Since Google returns results 10 at a time, I will have to do this 56 times. Then I'll go through the main listing results one by one and remove each from my list, until only the 193 supplementary pages are left.

2. The webmaster tool can be used to find the pages that have zero internal or external links. While these pages might actually have links, this seems to be Google's way of communicating that these pages have been put into the supplementary index.
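
The remove-one-by-one step in the first approach is really just a set difference, so once both listings are copied out it can be done in a few lines. Here is a rough sketch in Python, assuming each listing has been pasted into a list with one URL per entry. The function name `supplementary_pages` and the example.com URLs are made up for illustration, and treating a trailing slash as insignificant is my own assumption, not something Google documents.

```python
def supplementary_pages(total_urls, main_urls):
    """Return URLs from the total listing that are missing from the main listing.

    Assumption: two URLs are the same page if they match after trimming
    whitespace and a trailing slash. Adjust normalize() if that is too loose.
    """
    def normalize(url):
        return url.strip().rstrip("/")

    # Everything Google shows in the main listing.
    main = {normalize(u) for u in main_urls if u.strip()}

    seen = set()
    leftover = []
    for url in total_urls:
        key = normalize(url)
        # Skip blank lines, pages in the main listing, and repeats.
        if not key or key in main or key in seen:
            continue
        seen.add(key)
        leftover.append(key)
    return leftover


# Hypothetical listings, just to show the shape of the data.
total = [
    "http://example.com/a",
    "http://example.com/b/",
    "http://example.com/c",
]
main = ["http://example.com/b"]

for page in supplementary_pages(total, main):
    print(page)
```

If both listings were copied completely, the length of the result should match the subtraction above (193 in my case); if it doesn't, some results were probably missed while paging through.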

Once I know which pages are in the supplementary index, I imagine I'll see why they ended up there. They might be:
- duplicate or nearly duplicate pages
- pages with no link support
- pages with no text content (all graphics or Flash)


