
Regarding index of the content inside a href tag


  • Regarding index of the content inside a href tag

    Hi again. When Zoom indexes the HTML of a page and finds a hyperlink like <a href="http://www.yale.com">Yale</a>, it only indexes the word "Yale" for searching. Is there a way to also index the content of the href attribute? The idea is that you could type "yale" in the search box, but also www.yale.com.

  • #2
    Hmm, there's no functionality to do this. But I'm curious as to why you think this is necessary. In most cases, the link URLs are not very meaningful for searching. You need to remember that links can be created in many different ways: as relative paths (e.g. "../myfile.html"), absolute paths ("/home/myfile.html") or absolute URLs ("http://www.blah.com/myfile.html"). How would the other formats be searchable? Do you only want to search by domains as in your example? Do you expect it to strip out the filename and paths etc. and only keep the domain name (also stripping out the "http://" part of the URL)? You might need to be more specific as to what you are trying to achieve.
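
To make the three link forms above concrete, here is a minimal Python sketch, not anything Zoom does, of how an external script could resolve each form against the page it appears on and keep only the host name. The page URL `http://www.example.com/docs/index.html` and the helper name `link_host` are assumptions made up for illustration.

```python
from urllib.parse import urljoin, urlparse

def link_host(page_url, href):
    """Resolve href against the page it appears on and return the lowercase host."""
    absolute = urljoin(page_url, href)          # handles relative and absolute forms
    return (urlparse(absolute).hostname or "").lower()

# Hypothetical page that contains the links.
page = "http://www.example.com/docs/index.html"

for href in ("../myfile.html", "/home/myfile.html", "http://www.blah.com/myfile.html"):
    print(href, "->", urljoin(page, href), "->", link_host(page, href))
```

Note that the relative and absolute-path forms both resolve to the host of the page itself, so only absolute URLs to other sites add anything new to search on.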

    There is already an option to index the filename of the file being indexed. But what you are asking for sounds more like it is actually indexing outgoing links. This would be similar to the "link:" search functionality in Google, which allows you to find all pages containing a certain link. If this is what you are asking for, you would be the first person to ask for it at this point, but we'll keep it in mind.
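
As a rough illustration of what "indexing outgoing links" would mean, the following Python sketch builds a map from linked host name to the pages that contain such a link. It is a standalone example under assumed inputs (the `pages` dictionary and the names `LinkCollector` and `build_link_index` are hypothetical), not a description of how Zoom or Google implement this.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def build_link_index(pages):
    """Map each linked host name to the set of page URLs that link to it."""
    index = defaultdict(set)
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            host = urlparse(urljoin(page_url, href)).hostname
            if host:
                index[host.lower()].add(page_url)
    return index

# Hypothetical site content, keyed by page URL.
pages = {
    "http://site.example/a.html": '<a href="http://www.yale.com">Yale</a>',
    "http://site.example/b.html": '<a href="../c.html">local</a>',
}
index = build_link_index(pages)
print(sorted(index["www.yale.com"]))
```

A lookup in this index answers "which pages link to this host", which is a different question from "which pages are about this topic".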
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #3
      Thanks for answering. My boss wants to be able to search by URL instead of by the name of the place. In the example I wrote above, the logical behavior would be to search for the word Yale, but in some situations it would be desirable to search for the URL www.yale.com. This kind of search seems odd with this example, because the URL is just the word Yale plus the usual www and .com. But there are quite a few cases where this does not hold. For example, many people know the Massachusetts Institute of Technology by the acronym MIT, so it makes sense to search our website for www.mit.edu if you are not sure what MIT means.



      • #4
        If there are a limited number of keywords (or domains in this case) that you wish to allow users to search for, you can perhaps add them as Synonyms.

        Assuming you have "Dots" enabled as a word joining character (on the "Index Options" tab), you would be able to create a synonym for "yale" as "www.yale.com" so that searches for either would yield the same result.
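
The idea can be sketched in a few lines of Python. This is only a conceptual model of synonyms combined with dots as a word-joining character; in Zoom itself both are configured in the Indexer rather than in code, and the `SYNONYMS` table, the `TOKEN` pattern and `normalize_query` are hypothetical names used for illustration.

```python
import re

# Hypothetical synonym table: each domain maps to the word the pages actually contain.
SYNONYMS = {
    "www.yale.com": "yale",
    "www.mit.edu": "mit",
}

# Dots join word characters into one token, so "www.yale.com" stays a single term
# instead of being split into "www", "yale" and "com".
TOKEN = re.compile(r"\w+(?:\.\w+)*")

def normalize_query(query):
    """Lowercase the query, split it into terms and replace known synonyms."""
    terms = TOKEN.findall(query.lower())
    return [SYNONYMS.get(term, term) for term in terms]

print(normalize_query("www.yale.com"))
print(normalize_query("Yale history"))
```

With this mapping, a search for either "yale" or "www.yale.com" ends up looking for the same indexed word, so both return the same pages.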

        As I mentioned before, searching and indexing the link address might not actually be what you're expecting. For example, if you search for "www.yale.com" in your original example above, you would be directed to pages that contain the most links to Yale, and not pages that contain the most information on Yale. There is a significant difference.

        I'm thinking the Synonyms solution would be better suited for the scenario you describe. See the Users Guide for details.
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine
