Adding PDFs blocks results

  • Adding PDFs blocks results

    I am confused by this one. The search engine works 100% and I am over the moon with it. The ability to configure and style it to your own liking is brilliant.

    However, when I add PDFs into the "scan options" I encounter a problem with the output search results.

    To see an example...
    Search with no PDFs:
    http://www.metadigm.co.uk/search_nopdf/search.php

    Search with PDFs:
    http://www.metadigm.co.uk/search_pdf/search.php

    Try putting "email" as the search term.

    The only difference is that in the "Zoom Indexer Configuration" => "Scan Options" I added .pdf to the "Scan Extensions". Nothing else changed.

    Why is it that one gives a full listing of results and links, but the other (the one with PDFs enabled) only gets as far as showing the number of results?

    Specialists In Network Security

  • #2
    The problem is nothing to do with PDF files. It is related to the size of the index.

    You have this server-side configuration issue:
    "PHP script returns no results on Apache"

    Your server is killing the script before it completes. By adding all the PDF files you increased the size of your index by a factor of 5, which means that more RAM is used on the server and the execution time is slightly longer. That triggers the issue referenced above.

    • #3
      Thanks for your help. It provided a good starting point with the problem, but unfortunately didn't rectify it.

      In the end I added
      ini_set("memory_limit","32M");
      to the top of settings.php

      And it worked first time. We undid all the server side config changes we had made to double-check that this one simple line was making the difference and it was.
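A slightly more defensive variant of that one-line fix, as a minimal sketch: it raises the limit only when the current value is lower than the one requested, so it never lowers a more generous server setting. The helper names `to_bytes` and `raise_memory_limit` are made up for this illustration and are not part of Zoom; the "32M" figure is simply the value quoted above.

```php
<?php
// Hypothetical helpers for illustration only; not part of the Zoom scripts.

// Convert a php.ini shorthand size such as "32M" into bytes.
function to_bytes(string $value): int
{
    $value = trim($value);
    if ($value === '') {
        return 0;
    }
    if ($value === '-1') {
        return -1; // -1 means "no limit" for memory_limit
    }
    $number = (int) $value; // "32M" casts to 32
    switch (strtolower(substr($value, -1))) {
        case 'g':
            return $number * 1024 * 1024 * 1024;
        case 'm':
            return $number * 1024 * 1024;
        case 'k':
            return $number * 1024;
        default:
            return $number;
    }
}

// Raise memory_limit to $wanted unless it is already at least that high.
// Returns false if the host refuses the change.
function raise_memory_limit(string $wanted): bool
{
    $current = to_bytes((string) ini_get('memory_limit'));
    if ($current === -1 || $current >= to_bytes($wanted)) {
        return true; // nothing to do
    }
    return ini_set('memory_limit', $wanted) !== false;
}

raise_memory_limit('32M');
```

If `raise_memory_limit()` returns false, the host is probably not allowing the setting to be changed at runtime, and the limit would have to be raised in the server configuration instead.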

      Maybe something to add to settings.php for future downloads?

      Thanks for the great software and also a swift reply and starting point to my problem. Hope I have helped you and others in providing a solution to a problem.

      Specialists In Network Security

      • #4
        Glad to hear that you now have it working.

        The "memory_limit" setting is not available on all PHP configurations, so it is probably not something that we should include in our default script. We will, however, add this to our documentation on the issue. I find it surprising, though, that PHP did not return a warning or error message when this limit was reached. We may do some further testing to confirm PHP's behaviour when this setting is enabled. There is also the "max_execution_time" setting, which may have a similar effect.
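One way to find out whether PHP is stopping the script silently is to record the last error at shutdown. The following is a rough sketch only, assuming a standard PHP setup where shutdown functions still run after a fatal error; `describe_fatal` is a name invented for this example, and the wording of the log line is arbitrary.

```php
<?php
// Hypothetical helper for illustration; not part of the Zoom scripts.

// Turn the array returned by error_get_last() into a log line,
// or return null when the last error was not a fatal one.
function describe_fatal(?array $error): ?string
{
    if ($error === null) {
        return null;
    }
    $fatalTypes = [E_ERROR, E_PARSE, E_CORE_ERROR, E_COMPILE_ERROR];
    if (!in_array($error['type'], $fatalTypes, true)) {
        return null;
    }
    return sprintf(
        'Search script stopped: %s (%s line %d)',
        $error['message'],
        $error['file'],
        $error['line']
    );
}

// Runs when the script ends, including after most fatal errors.
register_shutdown_function(function (): void {
    $report = describe_fatal(error_get_last());
    if ($report !== null) {
        error_log($report); // goes to the configured PHP error log
    }
});
```

With this near the top of the search script, hitting either "memory_limit" or "max_execution_time" should leave a line in the PHP error log even when error display is switched off on the server.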

        Of further note, we do not recommend modifying the "settings.php" file. This file is overwritten and updated by the Indexer each time you re-index your site. If you need to add additional PHP commands to the script, you should do so by clicking on "Templates"->"Modify search script source code". Alternatively, if you wish to have multiple copies of customized scripts, you can save various versions of the scripts in different folders, and specify the path to your customized script per config file (under the "Advanced" tab).
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine

        • #5
          Originally posted by Ray:
          "Glad to hear that you now have it working."

          Well, the initial reply and your subsequent one have proved most useful. They gave me a great starting point and then a place to play around with the search script in order to give the results a bit of a "Google feel".

          Have a look at:
          www.metadigm.co.uk/search/search.php

          and search for "utm"

          Note how the number of results now follows the reporting style of Google: the "Results for:", "x results found" and "x pages of results" lines have been dropped, and the time taken has moved to the top.

          If there are no results (just search for a random string of characters), then it is reported back nicely.

          Add to that a script that doesn't allow you to search for a zero-character string (try it!!), and I've ended up with a very nice search engine that works a lot better than I initially set out to build.
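The script used for the empty-search check is not shown in this thread. A server-side guard along these lines would have a similar effect; this is a sketch under assumptions, where `is_searchable_query` is a made-up helper and `zoom_query` is assumed to be the name of the query parameter.

```php
<?php
// Hypothetical guard for illustration; the poster's actual script is not shown.

// True only for a string containing at least one non-whitespace character.
function is_searchable_query($raw): bool
{
    return is_string($raw) && trim($raw) !== '';
}

// Assumption: the search form submits the term as "zoom_query".
$query = $_GET['zoom_query'] ?? null;

if (!is_searchable_query($query)) {
    echo 'Please enter a search term.';
}
```

A check like this would sit before the search itself runs, so an empty submission gets a short message rather than a results page.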

          Thank you!!!

          Specialists In Network Security

          • #6
            Looks good! We're pleased to see that it's working well for you.
            --Ray
            Wrensoft Web Software
            Sydney, Australia
            Zoom Search Engine
