Rename search_template.html

  • #16
    There's nobody confused here; I'm trying to figure out why this program indexes the way it does. A link is sometimes part of the content and sometimes it isn't. For my purposes it never is. There must be a simple configuration option to choose whether or not to index the text in links. I am indexing hundreds of dynamically generated pages and there is no way to manually insert ZOOMSTOP and ZOOMRESTART.



    • #17
      Originally posted by webtrade
      There's nobody confused here,
      My comment was in response to your questions about the Indexing Word rules, and to your suggestion that they affect the parsing of HTML tags and the indexing of link text. They do not. I hope you did not take my comment personally; I was just pointing out that some of your assumptions rested on incorrect premises, and I had to clear them up.

      Originally posted by webtrade
      I'm trying to figure out why this program indexes the way it does. A link is sometimes part of the content and sometimes it isn't. For my purposes it never is. There must be a simple configuration option to choose whether or not to index the text in links.
      What you are asking for is not normal or typical behaviour, and I have explained why. I understand that it's not exactly what you would like it to do right now, and it is fair to say that an option to toggle off the indexing of link text would be helpful in your scenario. However, I disagree that there "must" be such an option, or that this should be expected behaviour (as you implied earlier, when you asked "why would anyone...?").

      The content of a page is essentially all the text that you can see in the browser. It would be nice if artificial intelligence could be employed to guess what counts as a navigation menu, a header, a footer, and so on, but this is not technically possible or practical.

      In the search engine world, excluding link text from content is a very rare option. In fact, I've never seen it in action. As I said, I understand that there are situations where it would be handy, but I'm arguing against the idea that we are being unreasonable in treating link text as content. I'm also pointing out that there are reasonable workarounds (explained below).

      To further illustrate this, note that link text is included in Google search results. Here's a Google search for the word "login", where you can see link text in the description for the Facebook site:
      http://www.google.com/search?q=Login

      Originally posted by webtrade
      I am indexing hundreds of dynamically generated pages and there is no way to manually insert ZOOMSTOP and ZOOMRESTART.
      Actually, dynamically generated pages should be the easiest ones to change in this way. Typically, a single script (PHP, ASP, etc.) is responsible for generating many different pages, so a one-line change in that script (in the part that creates the "Previous" and "Next" links) will apply to ALL pages it generates.
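
      As a rough illustration of that idea, here is a minimal sketch in Python rather than PHP or ASP. It assumes the skip tags are written as HTML comments (<!--ZOOMSTOP--> and <!--ZOOMRESTART-->); the function names render_pager and render_page are made up for the example and are not part of Zoom or of anyone's actual site.

```python
from html import escape

# Assumption: the skip tags are written as HTML comments in the page source.
ZOOMSTOP = "<!--ZOOMSTOP-->"
ZOOMRESTART = "<!--ZOOMRESTART-->"


def render_pager(prev_url, next_url):
    """Return the Previous/Next links wrapped in the skip tags (hypothetical helper)."""
    links = []
    if prev_url:
        links.append(f'<a href="{escape(prev_url, quote=True)}">Previous</a>')
    if next_url:
        links.append(f'<a href="{escape(next_url, quote=True)}">Next</a>')
    if not links:
        return ""
    # The only change needed: wrap the link markup once, here.
    return ZOOMSTOP + " | ".join(links) + ZOOMRESTART


def render_page(title, body_html, prev_url=None, next_url=None):
    """Hypothetical template that every generated page passes through."""
    return (
        f"<html><head><title>{escape(title)}</title></head><body>"
        f"{body_html}{render_pager(prev_url, next_url)}"
        "</body></html>"
    )
```

      Because every page is built by render_page, the wrapping in render_pager reaches all of them without touching any individual page.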

      Even if the change has to be applied to many scripts, a careful search and replace across your files should be possible (e.g. the code that generates the Previous link should be similar, if not exactly the same, in each script). It is no different from a requirement to change the header or footer of a page. These things should be possible.
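
      A search and replace of that kind might look roughly like the sketch below. It rests on several assumptions: that the skip tags take the HTML comment form <!--ZOOMSTOP--> / <!--ZOOMRESTART-->, that the links are plain anchors whose visible text is exactly "Previous" or "Next", and that the scripts match *.php. The pattern will not match anchors whose attributes contain a ">" (for example inline script tags), so it would need adapting to the real markup. Run it with dry_run=True first and keep a backup.

```python
import re
from pathlib import Path

# Assumption: the skip tags are written as HTML comments.
ZOOMSTOP = "<!--ZOOMSTOP-->"
ZOOMRESTART = "<!--ZOOMRESTART-->"

# Matches an anchor whose visible text is exactly "Previous" or "Next".
NAV_LINK = re.compile(r"<a\b[^>]*>\s*(?:Previous|Next)\s*</a>", re.IGNORECASE)


def wrap_nav_links(source):
    """Wrap each matching link in skip tags, leaving already-wrapped links alone."""
    def repl(match):
        if source[:match.start()].endswith(ZOOMSTOP):
            return match.group(0)  # already wrapped on an earlier run
        return ZOOMSTOP + match.group(0) + ZOOMRESTART
    return NAV_LINK.sub(repl, source)


def patch_tree(root, pattern="*.php", dry_run=True):
    """Apply wrap_nav_links to every matching file; return the paths that would change."""
    changed = []
    for path in Path(root).rglob(pattern):
        text = path.read_text(encoding="utf-8")
        patched = wrap_nav_links(text)
        if patched != text:
            changed.append(path)
            if not dry_run:
                path.write_text(patched, encoding="utf-8")
    return changed
```

      With dry_run=True, patch_tree only reports which files contain links it would wrap, which makes it easy to review the list before writing anything.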
      --Ray
      Wrensoft Web Software
      Sydney, Australia
      Zoom Search Engine
