Scheduler on IIS won't index pages

  • Scheduler on IIS won't index pages

    I expect this problem has a simple solution. I've installed ZoomIndexer for use on our company intranet. It works a treat until I try to set up scheduled indexing. I've set the task up to run as a user called "schedule" with appropriate permissions and password (this account is used elsewhere for other scheduled tasks, so I know this bit is OK). The process runs when it's told to.

    My intranet runs on IIS and is scripted in PHP. It is set up to use Windows authentication if available, or anonymous access if not. If you visit it via IE on Windows, the intranet knows who you are and serves pages accordingly. If you visit on a Mac, it won't know who you are, so you're treated as a guest user. This is all done by a small bit of PHP at the top of each page that checks for a value in a session variable and, if it isn't set, redirects to a logon.php script.

    (Still with me?) My problem (at last) is that the scheduled task for ZoomIndexer never gets past the logon.php page. I've set the start location to my sitemap, so it indexes Word docs, PDFs and suchlike, but only one PHP page (logon.php). The only way I can get it to index properly is to run ZoomIndexer manually, which is obviously not ideal. Interestingly, I need to run the program twice to get it to pick up all the PHP pages, so the standalone program must do something with sessions?!

    Sorry this was so long!!! I'll probably get told off. Would appreciate any ideas/help.

    Many thanks

    Andy
    -=-
    Mac is best

  • #2
    If you haven't seen it yet, take a look at our support page regarding indexing websites requiring authentication:
    http://www.wrensoft.com/zoom/support/auth.html

    It sounds like you may have cookie-based authentication and the reason it works manually is that you log in via IE before your indexing session (so that the cookie is set in IE and shared with Zoom). See the bottom part of the above page for more information.

    There are a few alternative methods described on that page, such as allowing for the login information to be passed in via the URL or changing your script so that a client identifying as ZoomSpider is allowed access.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine

    • #3
      Did the trick

      Thanks for the reply. The info page is one I managed to miss while hunting through the docs...

      In case anyone else reads this and has a similar problem, the solution that worked best for me was to change the small piece of code I've embedded at the start of most pages on my site. In there, I test for a session variable and, if it's not set, redirect to a logon script. As suggested, I've added a line that checks for "ZoomSpider" in HTTP_USER_AGENT and, if found, skips the redirect regardless of what's in the session variable.

      As this code is embedded by use of an include directive I only had to change it once...
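
For anyone wanting to try the same thing, here is a minimal sketch of that kind of check. The actual include file wasn't posted in this thread, so the function name `needs_logon` and the session key `user` are hypothetical; only the "ZoomSpider" user-agent test and the redirect to logon.php come from the description above.

```php
<?php
// Sketch only: needs_logon() and the 'user' session key are assumed names,
// not taken from the original site code.

/**
 * Decide whether the current request should be sent to the logon script.
 *
 * @param array  $session   Session data (normally $_SESSION).
 * @param string $userAgent User-Agent header (normally $_SERVER['HTTP_USER_AGENT']).
 */
function needs_logon(array $session, string $userAgent): bool
{
    // Let the Zoom spider through regardless of what is in the session.
    // stripos() makes the match case-insensitive.
    if (stripos($userAgent, 'ZoomSpider') !== false) {
        return false;
    }

    // Everyone else needs the session variable to be set.
    return empty($session['user']);
}

// Typical use at the top of the shared include file (commented out here
// so the sketch can also be run from the command line):
//
// session_start();
// if (needs_logon($_SESSION, $_SERVER['HTTP_USER_AGENT'] ?? '')) {
//     header('Location: logon.php');
//     exit;
// }
```

One thing to keep in mind: the user-agent string is supplied by the client, so any visitor who sends "ZoomSpider" in it would skip the logon redirect too. That may be acceptable on an internal intranet, but it is worth knowing.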

      Thanks for the advice, the scheduled task is now happily indexing my entire site on a regular basis. Cool.

      Regards

      Andy
      -=-
      Mac is best
