Search results not displaying the main .html pages


  • Search results not displaying the main .html pages

    I am using offline indexing and then uploading to our servers. The results are not displaying on the results page and I cannot figure out why.
    The main .html pages are not showing in the results at all. I am using the JavaScript version; we don't have access to server-side scripting due to our IT department.
    I have the .html pages set up correctly with titles, etc., and added the boost meta tags.
    My weightings are set to +5 boost for the page titles, description headings.
    The keywords from the .pdfs are coming up ahead of all other settings and they are not related in most cases, including the main .html pages. I have tried several different settings to see what the results are and they remain the same.

    Any ideas why or how to fix the problem?

  • #2
    I failed to mention that the .html files are located within folders named by category, but the HTML pages inside these folders are all named index.html. The page titles are the same as the folder names. I have boosted these pages. Is there any way to get the search to find the index pages by the page title or description?



    • #3
      Are the HTML files actually indexed?
      Are they valid HTML files?
      If you are indexing in offline mode, then I assume you realise that any server side scripting will be ignored during the indexing process.
      The page titles and meta descriptions are (at least by default) indexed. So you should be able to search for them.
      Is this page online somewhere where we can see it?



      • #4
        Thank you for your reply.
        Yes, the .html files are indexed and they are valid.
        I am indexing in offline mode and then uploading to our servers.
        We don't have server side scripting.
        It seems the .pdf documents are being found instead of the .html files.



        • #5
          How do I know if the .html files are actually indexed and how can I correct this if they are not?



          • #6
            Originally posted by jknotek
            How do I know if the .html files are actually indexed and how can I correct this if they are not?
            The "Log" tab shows, during and after indexing, which files have been indexed and which files were skipped. You need to make sure the files in question haven't been skipped or excluded for some reason. The log will tell you what that reason is (and it may be due to your configuration).

            Originally posted by jknotek
            The keywords from the .pdfs are coming up ahead of all other settings and they are not related in most cases, including the main .html pages. I have tried several different settings to see what the results are and they remain the same.
            I presume the words actually do appear in the PDF files. The PDF files are so large, and the words occur in them so often, that they swamp the results: they score considerably higher than small HTML pages with only a few mentions of the same word.

            To fix this, use the setting under "Configure"->"Weightings"->"Content density". Set this to "Strong adjustment" and you will give preference to smaller web pages, over larger documents.
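
To illustrate why a length adjustment changes the ordering, here is a minimal sketch. It is not Zoom's actual scoring formula: the `Doc` class, both scoring functions, the file names, and the word counts are all hypothetical, chosen only to show the general idea of ranking by raw hit count versus hits per word.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    name: str          # hypothetical file name
    total_words: int   # length of the document in words
    hits: int          # occurrences of the search word in the document


def raw_score(doc: Doc) -> float:
    # Ranking by raw hit count favours very large documents.
    return float(doc.hits)


def density_score(doc: Doc) -> float:
    # Assumed adjustment: hits per word, so short focused pages can outrank long ones.
    return doc.hits / doc.total_words


docs = [
    Doc("manuals/guide.pdf", total_words=40000, hits=25),
    Doc("pumps/index.html", total_words=200, hits=4),
]

by_raw = sorted(docs, key=raw_score, reverse=True)
by_density = sorted(docs, key=density_score, reverse=True)

print([d.name for d in by_raw])      # large PDF ranks first
print([d.name for d in by_density])  # small HTML page ranks first
```

In this toy data the PDF wins on raw count (25 hits against 4), while the small page wins once the count is divided by document length, which is the kind of preference the "Strong adjustment" setting is described as giving.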
            --Ray
            Wrensoft Web Software
            Sydney, Australia
            Zoom Search Engine



            • #7
              Ray,
              Thank you for the responses. Fixing the weightings was most helpful.
              It turns out that the "content filtering" was blocking all of our .html pages. Not sure why.
              We had a few keywords listed that were not listed on the .html pages. We removed the content filtering and used the skip options instead and this seemed to fix our problem. Wanted to let you know in case anyone else experiences the same issues with Javascript.

              Thanks



              • #8
                Originally posted by jknotek
                It turns out that the "content filtering" was blocking all of our .html pages. Not sure why.
                We had a few keywords listed that were not listed on the .html pages.
                This can be tricky to spot if you're not fully aware of what Content Filtering does.

                For one thing, there's the "+" and "-" syntax to take into account. Secondly, it applies to the HTML source code, not just the page content. So if you're not careful, it is easy to filter out pages you did not intend to, especially if you have not checked the HTML source.
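
One way to check whether a filter term could be matching the source rather than the visible content is sketched below. It does not reproduce Zoom's filtering logic or its "+" and "-" rules; the helper names (`VisibleText`, `visible_text`, `where_found`) and the sample page are hypothetical. It only reports whether a term occurs anywhere in the raw HTML source and whether it occurs in the text a visitor would actually see.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text nodes, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def visible_text(html: str) -> str:
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def where_found(term: str, html: str) -> dict:
    # Case-insensitive substring check against the source and the visible text.
    t = term.lower()
    return {
        "in_source": t in html.lower(),
        "in_visible_text": t in visible_text(html).lower(),
    }


# Hypothetical page: "archive" appears only inside a stylesheet path.
page = """<html><head><title>Pumps</title>
<meta name="keywords" content="pumps, valves">
<link rel="stylesheet" href="/css/archive.css">
</head><body><h1>Pumps</h1><p>Product overview.</p></body></html>"""

print(where_found("archive", page))
print(where_found("pumps", page))
```

A term that shows up in the source but not in the visible text (here, "archive" in a stylesheet path) is the kind of match that is hard to spot without viewing the page source.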

                If you would still like to use the feature, I would suggest clicking on the Help button and reading the examples and documentation.

                If you still think it's doing something it shouldn't, then provide us with your .zcfg file containing your content filtering list, and also the URL to the web page you are indexing (that is being filtered), and we can take a look.
                --Ray
                Wrensoft Web Software
                Sydney, Australia
                Zoom Search Engine

