Announcement

**David** · May-25-2008, 11:41 PM

If you see the "indexing" message, then yes, the page would have been indexed.

Content might be missed if:
1) There is invalid HTML on the page.
2) The content is not in the HTML but is generated on the fly, client side, by a script of some sort (e.g. JavaScript).
3) You have excluded text from being indexed using ZOOMSTOP tags.
4) You have configured Zoom not to index page content (on the indexing options tab).
5) You have tricky server-side browser sniffing that returns different content for different browsers.
6) You have an authentication scheme running on your server and, if you are not logged in, the content of every page says something like "Please login to view this page", so the real page content is not visible.

If you still have a problem, can you post the URL for the page in question and some examples of words that you think should have been found on the page?
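As a rough way to check cause 2 yourself before posting, the sketch below tests whether a word is actually present in the raw HTML text, ignoring anything inside script and style blocks. This is only an illustration using Python's standard library; the names `VisibleText` and `word_in_static_text` are made up for this example, and it is not how Zoom itself parses pages.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text present in the raw HTML, skipping script/style bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def word_in_static_text(html, word):
    """Return True if `word` appears (case-insensitively) in the static text."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return word.lower() in text.lower()
```

If a word shows up in the browser but this returns False for the saved page source, the word is probably being written by a script and would not be in the static HTML an indexer sees.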

**lfeuling** · May-26-2008, 12:49 AM

Content not being included

Thanks for the quick response! The subject pages do normally require authentication using htaccess/htpasswd. I disabled the htaccess so you can take a look. An example page is:
http://www.feuling.org/family/valentin/phg01.htm
and an example word on that page is: ProGenealogists

I have authentication set up in the Zoom Search configuration and it appears to be working properly. The results of the indexing process appear to be the same with the directory authentication enabled and with it disabled.

**David** · May-26-2008, 01:04 AM

The HTML on the phg01.htm page is invalid. This is an extract from the HTML on your page:

```html
<html>
<head>
</head>
<body background="images/background.jpg">
<html>
<head>
<title>The Feuling Family Genealogy</title>
</head>
<body background="background.jpg">
</body>
</html>
<html>
<head>
<title>Genealogy page footer</title>
</head>
<body background="family/background.jpg">
</body>
</html>
</html>
```

That is 3 body tags, 2 title tags and 7 HTML tags. You also have large blocks of NULL characters (0x00 in hex) in the document, which will cause problems. In short, it is a bit of a mess.

The W3C HTML validator reports 277 errors on the page in question. See:
http://validator.w3.org/check?uri=ht...Inline&group=0
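For a quick local check of the two specific problems mentioned above (repeated structural tags and NULL bytes), something like the following sketch may help. It is not a validator and not part of Zoom; `structure_report` is a hypothetical helper name, and it only counts opening tags, so its numbers will not match a count that includes closing tags.

```python
import re
from collections import Counter


def structure_report(raw: bytes):
    """Count opening html/head/body/title tags and NUL bytes in a page."""
    # latin-1 maps every byte to a character, so decoding never fails
    text = raw.decode("latin-1")
    opening = Counter(
        m.group(1).lower()
        for m in re.finditer(r"<\s*(html|head|body|title)\b", text, re.IGNORECASE)
    )
    return {"opening_tags": dict(opening), "nul_bytes": raw.count(b"\x00")}
```

To try it on a saved copy of a page, read the file in binary mode, e.g. `structure_report(open("phg01.htm", "rb").read())`. A well-formed page should report one of each tag and zero NUL bytes; anything else is worth a closer look with the validator.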

**lfeuling** · May-28-2008, 11:16 PM

The invalid pages have existed for years, so I never thought to look closely at them (they are generated by a genealogy program). Thanks again for your quick help and for getting me pointed in the right direction! I've figured out how to correct the HTML mess and now Zoom Search is working perfectly. Zoom Search is a great product with awesome support!
