Possible to convert the score to a percent number?


  • Possible to convert the score to a percent number?

    Wondering if there is a way to convert the score in the search output to a percent number - like 76%, etc. Maybe a future release item? Thanks.

  • #2
    If it was a percentage value, then what would it be a percentage of? We have seen percentage values on some products, but this only makes sense if you don't think too hard about it.

    ------
    David



    • #3
      Total all of the scores and then it becomes a % of the total score value for that particular search. You hit the nail on the head - from a consumer perspective they do not want to think a whole lot about it - when you see "score", I feel people do not understand what that means - a percentage is just pretty straightforward...

      By the way, I purchased Zoom instead of going with Atomz ($5,000 quote) and absolutely LOVE IT! With a few simple modifications to the search_template file - which I am publishing out of our Collage content management system - the results wrap straight into our site. I still want to do a few tweaks, but I am thrilled with it so far.

      Check it out at www.childrensdayton.org

      Thanks for a great product!

      Dave



      • #4
        Your suggestion would not give a very good range of percentage values. Here are a couple of examples using your method.

        Example 1:
        Searching for a rare word on a small site returns 1 page.
        This page would be given a percent of 100%, even though the document might not really be relevant.

        Example 2:
        The next day, a 2nd page, containing the rare word from example 1, is added to the site. The percentage of the initial page drops from 100% to 50%, even though the document is just as relevant as it was the day before.

        Example 3:
        Searching for a common word on a large site returns 20,000 pages.
        The number 1 result would probably be given a percent of around 10%, but the document could be very relevant.

        So you could have irrelevant documents with high percentages and relevant documents with low percentages, the exact opposite of what would be intuitively expected. That would probably lead to more confusion, not less.
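The share-of-total idea from post #3 and the first two examples above can be sketched roughly as follows. This is only an illustration, not how Zoom computes anything: the function name `percent_of_total` and the raw score of 30 are hypothetical.

```python
def percent_of_total(scores):
    """Convert raw scores into each result's share of the summed score."""
    total = sum(scores)
    if total == 0:
        # No score mass to share out; avoid dividing by zero.
        return [0.0 for _ in scores]
    return [100.0 * s / total for s in scores]


# Example 1: a single matching page always receives 100%,
# whatever its raw score (30 is an assumed value).
one_page = percent_of_total([30])

# Example 2: a second page with the same assumed score is added,
# and the first page's value halves although its content is unchanged.
two_pages = percent_of_total([30, 30])

print(one_page, two_pages)
```

The percentage depends on how many other pages matched, not only on the page itself, which is the objection raised in the examples.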

        Thanks for the positive comments about our software.

        -----
        David



        • #5
          I am not the developer - just giving suggestions from a customer perspective. As you mentioned in your first response - other companies have apparently figured it out.

          In your reply, you state in example #1 that the single returned page may not be relevant but then in example #2 you say it is just as relevant as the day before (?). If in example #1 that was the only page returned on the entire site - seems like it should receive a 100% rating. In example #2, if each page only contains the rare word 1 time each - then 50% for each would (in a ratio perspective - which is what percentages are about) be exactly correct. In the current scoring method - wouldn't the 2 pages show an identical score of say 30 (or whatever the score)? How is that any different?

          In the third example it would be a lower percentage but again - it is all about ratios and just displaying a number in a format people are used to seeing. Sure, the percentage may be lower - say 29% - and the score might be 350 but it is still the same result order. I can do a search today and the highest score I receive is 25...

          The key is the results will be displayed in the exact same order whether it shows the score or a percentage.
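The ordering claim can be checked with a small sketch. Dividing every score by the same positive total does not change how the results rank against each other. The scores below are made up for illustration, and `rank_order` is a hypothetical helper, not part of Zoom.

```python
def rank_order(values):
    """Return result indices sorted from highest value to lowest."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


# Assumed raw scores for three results of one search.
scores = [120, 350, 25]

# The same results expressed as a share of the summed score.
total = sum(scores)
percents = [100.0 * s / total for s in scores]

order_by_score = rank_order(scores)
order_by_percent = rank_order(percents)

print(order_by_score, order_by_percent)
```

Both lists come out identical, so switching the display from score to percentage would change only the number shown, not the order of the results.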

          Thanks for listening...



          • #6
            The potential problem is that people will think that the percentage value would be an indication of relevancy. A user could (quite reasonably) think that a result of 100% means the document is an exact match for what they are searching for. But of course what it would really mean was that nothing better could be found on this site. The percent value would in no way indicate if the document was relevant to the search or not.

            So I think that if percentages were to be used, a better algorithm would be required. And at the moment we don't have an algorithm that makes sense from both a user perspective and a technical perspective.

            Yes, other products have done it, but I still maintain that their solutions don't make sense (if you think about it).

            Also note that none of the major search engines (Google, Yahoo, etc) use percentage values. Internally however they would all have a score value, like Zoom does, in order to rank pages.

            -----
            David



            • #7
              Is there a "max" number of points that a document could presumably earn based on the search term? I know nothing about the points algorithm, but for example let's say that if the term is in the file name that would be +5 points, if the term is in the document that would be +10 points, and if the term is in the document's title that would be +15 points. So if those are the only ways to earn points, the max would be 30 points. Then a percentage could be calculated based on the number of points earned out of the total possible (i.e., if the term is only in the document's title, it would be 15 out of 30 points, or 50% relevant).
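The points-out-of-maximum idea above could be sketched like this. The weights are the made-up values from the post (+5, +10, +15), not Zoom's real weightings, and `percent_of_max` is a hypothetical name.

```python
# Hypothetical points per location where the search term is found.
WEIGHTS = {"filename": 5, "body": 10, "title": 15}


def percent_of_max(matched_locations):
    """Points earned as a percentage of the most points one term could earn."""
    max_points = sum(WEIGHTS.values())  # 30 with the weights above
    # Count each location once, however many times it is listed.
    earned = sum(WEIGHTS[loc] for loc in set(matched_locations))
    return 100.0 * earned / max_points


title_only = percent_of_max(["title"])
all_three = percent_of_max(["filename", "body", "title"])

print(title_only, all_three)
```

This only works because the sketch assumes a single search word that counts at most once per location, so the maximum is known in advance.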



              • #8
                The algorithm is broadly in line with your description but significantly more complicated. You have only considered searching for a single word and assumed the word only appears in each document once.

                You also need to consider multiword searches (AND and OR), exact phrase searches, wildcard searches, date sorting, and a mixture of all of the above.

                There is also the problem that not all pages have a valid title or meta data.

                On top of this the user has control over the weightings for different types of text and individual pages can be boosted or de-boosted via meta data.

                In V4.1 we are further complicating matters by weighting heading text differently and taking into account word density.

                The output of the calculation is a score ranging from 1 to 65,000, but for most searches on most sites the score will be less than 5,000.

                So the realistic maximum will vary somewhat from site to site.
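To see why the fixed upper limit is not a useful denominator either, here is a rough sketch using the two figures quoted above (a ceiling of 65,000 and typical scores under 5,000). The function name `percent_of_ceiling` is hypothetical.

```python
# Upper end of the score range quoted in the post.
SCORE_CEILING = 65_000


def percent_of_ceiling(score):
    """Express a raw score as a percentage of the fixed score ceiling."""
    return 100.0 * score / SCORE_CEILING


# A score of 5,000 is already at the high end for most searches on most sites.
typical_high = percent_of_ceiling(5_000)

print(round(typical_high, 1))
```

Under these assumptions even a strong result would show well under 10%, so a percentage would need a per-site or per-search maximum, which is the quantity that varies.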

                ---------
                David

