Some codes not showing up in search


  • Some codes not showing up in search

    On my site www.icdmeister.com/site/ one feature is searching for diagnosis codes to find their descriptions. Examples of codes include V70.0, 311, 599.0, 706.1, V25.03, etc. I've configured ZoomSearch with the indexing word rule that allows dots to join words. The limit on the maximum number of unique words to index (I've set it to 300000) has not been exceeded.

    All of the codes listed above seem to have been indexed and are listed appropriately when searched for. Other codes (V65.46, 305.1, V25.04) are not found by Zoom Search.

    Interestingly, V25.03 and V25.04 are on the same page. The first code is searchable, the second is not. The source code for the content seems indistinguishable to my eye:
    <li><a name="Initiation_of_other_contraceptive_measures"> Initiation of other contraceptive measures</a>
    <span class="code">V25.02</span><ul>
    <li>including fitting of diaphragm, prescription of foams, creams, or other agents</li>
    </ul></li>
    <li>
    <a name="Encounter_for_emergency_contraceptive_counseling_and_prescription">Encounter for emergency contraceptive counseling and prescription</a>
    <span class="code">V25.03</span></li>
    <li><a name="Counseling_and_instruction_in_natural_family_planning">
    Counseling and instruction in natural family planning</a> to avoid
    pregnancy<span class="code">V25.04</span></li>
    <li> <a name="advice">Other general counseling</a> and advice <span class="code">V25.09</span><ul>
    <li>including family planning advice</li>
    </ul>
    Any suggestions to fix this?

    Thanks,

    Andrew

  • #2
    The problem is in your markup. You have used span tags in the markup and there is no space between the word "pregnancy" and "V25.04" as seen in the code extract below:

    Code:
    <li><a name="Counseling_and_instruction_in_natural_family_planning">
      Counseling and instruction in natural family planning</a> to avoid 
      pregnancy<span class="code">V25.04</span></li>
    Span tags are inline HTML tags. This means they do not introduce a break in the text, much like bold <b>, italic <i> and underline <u> tags. As far as the markup is concerned, "pregnancyV25.04" is a single word.

    It does not look like this in your browser because your CSS adds spacing between the two parts. But if you open the page without the CSS file, or turn off stylesheets in your browser, you will see the text as follows:

    Counseling and instruction in natural family planning to avoid pregnancyV25.04
    And if you use Zoom to search for the word "pregnancyV25.04", you will find the search result returned correctly.

    The same thing is happening for the other codes that are failing to match. The ones that are returning have a space or newline before the span tag.
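
As a rough illustration of the point above, here is a minimal Python sketch of a text extractor that only treats block-level tags as word breaks. It is not Zoom's actual indexing code; the `BLOCK_TAGS` set, the `TextExtractor` class and the `extract_words` helper are hypothetical names made up for this example.

```python
from html.parser import HTMLParser

# Hypothetical set of block-level tags that this sketch treats as word breaks.
BLOCK_TAGS = {"li", "ul", "ol", "p", "div", "br"}


class TextExtractor(HTMLParser):
    """Collects text, inserting a space only around block-level tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Inline tags such as <span>, <b> and <a> add nothing here.
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def extract_words(html):
    """Return the whitespace-separated words found in an HTML fragment."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).split()


no_space = 'to avoid pregnancy<span class="code">V25.04</span></li>'
with_space = 'to avoid pregnancy <span class="code">V25.04</span></li>'

print(extract_words(no_space))    # the code is glued to "pregnancy"
print(extract_words(with_space))  # the code is a word of its own
```

With the space missing, the last word comes out as "pregnancyV25.04", which is why a search for "V25.04" alone finds nothing.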

    Hope that helps and let us know if you have further questions.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine


    • #3
      Thanks Raymond. That explains it.

      If I'm understanding this correctly, I should be able to fix it with a search and replace across my content that adds an &nbsp; before each span tag, followed by a reindex. I'll try this and reply with the results.

      Andrew


      • #4
        I think it would be better to insert a normal space character rather than a non-breaking space character. But the idea is correct.
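
One possible way to do that search and replace is sketched below in Python. It assumes every code is wrapped in exactly `<span class="code">`, as in the extracts in this thread; the function name `add_space_before_code_span` is made up for the example, and you would still need to apply it to your own files and reindex afterwards.

```python
import re

# Matches an opening <span class="code"> that directly follows
# a non-whitespace character (for example "pregnancy<span ...>").
SPAN_WITHOUT_SPACE = re.compile(r'(?<=\S)(<span class="code">)')


def add_space_before_code_span(html):
    """Insert a normal space before each code span that lacks one."""
    return SPAN_WITHOUT_SPACE.sub(r" \1", html)


before = 'pregnancy<span class="code">V25.04</span>'
print(add_space_before_code_span(before))
```

Spans that already have a space or newline in front of them are left untouched, so running the replacement twice does not add a second space.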


        • #5
          Very interesting. This is the first I've heard of the difference between a normal space and a non-breaking space. I've been seeing (and occasionally using) the code &nbsp; for years without realizing it stood for non-breaking space. So I did some Google searching to try to understand the concept and found an HTML entity for a normal space, but I hadn't come across that code before. I'm not sure whether it would make sense to use it in place of &nbsp;.
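
For anyone else puzzling over the difference, a small Python sketch can show that the two are different characters. The numeric reference `&#32;` is used here as an assumed example of an entity for a normal space; it is not necessarily the one mentioned above.

```python
import html

# &nbsp; decodes to U+00A0 (non-breaking space).
nbsp = html.unescape("&nbsp;")
# &#32; decodes to U+0020 (the ordinary space character).
space = html.unescape("&#32;")

print(hex(ord(nbsp)))   # 0xa0
print(hex(ord(space)))  # 0x20
print(nbsp == space)    # False
```

Whether an indexer treats U+00A0 as a word separator depends on the tool, which is one reason a plain space is the safer choice.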

          For my website, I tried inserting a space in the HTML before the span tag. This didn't make the unindexed codes indexed. I didn't expect this to work since my understanding is that HTML ignores whitespace.

          Next I tried inserting an &nbsp; before each span tag. This seems to have partially fixed my problem. After reindexing, Zoom Search now finds V65.46 and V25.04. It still doesn't find 305.1. In fact it doesn't seem to find any of the codes in the 305 series. You can locate the pages that contain them by searching for *305*, but they don't show up if I just search for 305.1 or 305.62, for instance.

          I've tried to insert a whitespace either before or after the &nbsp;
          <li>Continuous &nbsp;<span class="code">305.61</span></li>
          and
          <li>Continuous&nbsp; <span class="code">305.61</span></li>
          but this doesn't seem to allow these codes to be indexed either.

          I tried substituting the normal-space entity for &nbsp; but this didn't make code 305.61 indexed either.

          What can I try next to fix this?

          Andrew


          • #6
            Actually, I suspect you're still indexing cached pages. Click on "Configure", check the option "Reload all files (do not use cache)" and re-index.

            I just tried indexing the page here and it indexed "Continuous" and "305.61" as two separate words.

            Originally posted by aschecht
            For my website, I tried inserting a space in the HTML before the span tag. This didn't make the unindexed codes indexed. I didn't expect this to work since my understanding is that HTML ignores whitespace.
            Not quite. Extra whitespace is ignored for the rendering and layout of the content, but whitespace is still a valid and significant character in the HTML markup. You can say that runs of multiple whitespace characters collapse into one, but the presence or absence of a single whitespace character is definitely important.

            That is, while the HTML:

            Code:
            <p>cat  dog</p>
            (with two spaces in between) will be the same as:

            Code:
            <p>cat dog</p>
            that does not mean that "<p>catdog</p>" is the same as "<p>cat dog</p>" (obviously).

            And likewise,

            Code:
            Continuous<span class="code">305.61</span>
            is not the same as:

            Code:
            Continuous <span class="code">305.61</span>
            Just as the following HTML:

            Code:
            cat<b>dog</b>
            is rendered as one word like so: catdog
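
The whitespace rule described above can be sketched in a few lines of Python. This is only an approximation of how a renderer or indexer normalizes text, and `collapse_whitespace` is a hypothetical helper written for this example.

```python
import re


def collapse_whitespace(text):
    """Replace each run of whitespace with a single space."""
    return re.sub(r"\s+", " ", text)


# Two spaces and one space normalize to the same text...
print(collapse_whitespace("cat  dog") == collapse_whitespace("cat dog"))  # True
# ...but no space at all is still a different string.
print(collapse_whitespace("catdog") == collapse_whitespace("cat dog"))    # False
```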

            Hope that makes it more clear.
            --Ray


            • #7
              I thought to clear the cache when viewing the pages in the browser but didn't realize Zoom Search had its own cache. Clearing the cache seems to have fixed the problem: all the codes are now searchable (as far as I can tell). Thanks for the help, and thanks also for the tips on better understanding HTML whitespace.

              Andrew
