Announcement

**Ray** · May-02-2007, 03:48 AM

Presuming that you have server-side scripts which query the SQL database and report each PDF file's active/inactive status, and presuming that these pages are the only places where links to the PDF files are offered, you can consider the following:

When a page displays an HTML link to an inactive PDF file, add some scripting to the page so that it surrounds the link with the `<!--ZOOMSTOP-->` and `<!--ZOOMRESTART-->` tags. This will prevent Zoom's spider mode from following the link and downloading (and thus indexing) the file.
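As a rough illustration of the idea above, here is a minimal Python sketch of a helper that a server-side page could call when rendering each link. The function name `render_pdf_link` and its parameters are made up for this example, and it assumes the page already knows each file's active/inactive status from the database.

```python
from html import escape

# Zoom's tags are HTML comments placed around the content to be skipped.
ZOOM_STOP = "<!--ZOOMSTOP-->"
ZOOM_RESTART = "<!--ZOOMRESTART-->"


def render_pdf_link(url: str, title: str, active: bool) -> str:
    """Return an HTML link to a PDF (hypothetical helper).

    Links to inactive files are wrapped in ZOOMSTOP/ZOOMRESTART so the
    spider skips them; links to active files are returned unchanged.
    """
    link = f'<a href="{escape(url, quote=True)}">{escape(title)}</a>'
    if active:
        return link
    return f"{ZOOM_STOP}{link}{ZOOM_RESTART}"
```

The visitor still sees every link; only the markup around the inactive ones changes.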

More information on the ZOOMSTOP and ZOOMRESTART tags can be found in the Users Guide and Help files.

Note that when crawling the site in spider mode, you will need to block off every link to these inactive PDF files on your website. So if you have other pages with direct links to the PDF files, you will need to do the same on those pages too.

**jlperry** · May-02-2007, 12:51 PM

Actually, I'm not using Zoom to index the website itself, which will contain active/inactive links based on a user's input to a database search.

Zoom is meant to offer a full-text search of the active PDFs, which of course the database can't offer. I use Zoom just to search the directory of PDFs itself, using the directory listing as a guide to search the PDFs. I was thinking that there may be some way to use the Skip Links configuration, but I would want the database to generate the Skip Links and not have to manually go through the GUI config tool to add all those files... I figured there might be a better way to do it. Any other thoughts?

Thank you for your very quick response.

Jen

**Ray** · May-03-2007, 12:59 AM

If you are an experienced developer, we provide an SDK for Zoom which includes documentation on the ZCFG file format. This means that you can create a script that generates a new ZCFG file each time (with a different skip list reflecting the inactive statuses), and call Zoom from the command line to re-index the files accordingly. More information on the SDK here:
http://www.wrensoft.com/zoomsdk/index.html
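The first half of such a script, collecting the names to skip, might look like the Python sketch below. It is only an assumption-laden illustration: the `pdfs` table, its `filename` and `active` columns, and the use of SQLite are all hypothetical stand-ins for the real database. Writing the names into a ZCFG file and invoking Zoom are deliberately left out, since those details come from the SDK documentation.

```python
import sqlite3


def inactive_pdf_names(conn: sqlite3.Connection) -> list[str]:
    """Return the file names currently marked inactive, sorted by name.

    Assumes a hypothetical table pdfs(filename TEXT, active INTEGER),
    where active = 0 means the file must not be indexed.
    """
    rows = conn.execute(
        "SELECT filename FROM pdfs WHERE active = 0 ORDER BY filename"
    ).fetchall()
    return [row[0] for row in rows]
```

The returned list is what the script would then write out as the skip list before each re-index.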

Alternatively, you could use Spider Mode and point it to a specially created script/page which generates HTML links to only the active PDF files. This way Zoom will never find the inactive PDF files.
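A page like that could be produced along the lines of the following Python sketch. The function name `build_active_index` and the `(filename, active)` input shape are assumptions for illustration; in practice the records would come from the database query.

```python
from html import escape
from typing import Iterable, Tuple
from urllib.parse import quote


def build_active_index(pdfs: Iterable[Tuple[str, bool]]) -> str:
    """Build an HTML page linking only to active PDFs (hypothetical helper).

    Each item is a (filename, active) pair. Inactive files are left out
    entirely, so a spider starting from this page never discovers them.
    """
    items = [
        f'<li><a href="{quote(name)}">{escape(name)}</a></li>'
        for name, active in pdfs
        if active
    ]
    return "<html><body><ul>\n" + "\n".join(items) + "\n</ul></body></html>"
```

The spider's start URL would then be set to wherever this page is served, and the page itself does not need to be linked from the public site.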

**jlperry** · May-03-2007, 01:22 AM

Thank you. Those are excellent suggestions. I didn't realize that you had an SDK. I will look at it.

Best,
Jen

PDF indexing strategy