multibyte language problem with space character

  • multibyte language problem with space character

    I am interested in the Zoom Search Engine script.

    I am Japanese, and I have tried the free trial.
    We type our search queries in a multibyte language.
    However, it does not seem to support a multibyte space when the query has two or more words.
    For example, to search for two words we type: word1, a multibyte space, then word2.
    A query of word1, a single-byte space, then word2 works well.
    But we do not usually switch between multibyte and single-byte input for every search.
    Do you have any ideas for fixing this problem?

  • #2
    Are you using UTF-8? Or some other character set?

    What Zoom script are you using, PHP, ASP, CGI or JS?

    What version of Zoom are you using?

    I searched Google for an exact definition of a multi-byte space but didn't find anything precise. I did find several contradictory claims about it being U+1680, U+FEFF, U+3000, or U+2000. Do you know which one you are using?

    There is also this page listing 18 different types of spaces.
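
To compare the candidate code points listed above, a short Python sketch can print each one's Unicode name, general category, and UTF-8 bytes. Python is used here only for illustration (the thread itself concerns the PHP script), and `describe` is a hypothetical helper name. Japanese input methods typically produce U+3000 (IDEOGRAPHIC SPACE) for a full-width space, though the thread does not confirm which character the poster's queries contained.

```python
import unicodedata

# Candidate code points named in the post above.
CANDIDATES = [0x1680, 0x2000, 0x3000, 0xFEFF]

def describe(code_point):
    """Return (name, general category, UTF-8 bytes as hex) for a code point."""
    ch = chr(code_point)
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        ch.encode("utf-8").hex(" "),  # separator argument needs Python 3.8+
    )

for cp in CANDIDATES:
    name, category, utf8_hex = describe(cp)
    print(f"U+{cp:04X}  {category}  {utf8_hex}  {name}")
```

Category `Zs` marks a space separator. U+FEFF falls in category `Cf` (a format character), so it is not a space in that sense.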


    • #3
      >Are you using UTF-8? Or some other character set?
      Yes, I am using UTF-8.
      Not only the script's template but also the server and database character settings are UTF-8.

      >What Zoom script are you using, PHP, ASP, CGI or JS?
      PHP

      >What version of Zoom are you using?
      Version 5.1 Free Edition.

      Please check it.


      • #4
        It would help if you can provide us with a URL to a page which demonstrates the problem.

        Also, some example search queries to perform on this page, which illustrate what is being found, what is not, and what you are expecting.
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine


        • #5
          Thanks Ray.
          I will send an e-mail to Wrensoft.

          Makoto.


          • #6
            We had a look at your site and have confirmed the problem.

            However, we noticed that you have not enabled the option to “Support single-case languages (eg. asian languages)”. This can be found on the “Languages” tab of the Configuration window.

            Enabling this option allows for better searching in Asian languages. It will handle the fact that words are not necessarily separated by spaces as they are in western languages.

            With this option enabled, your search queries with the multi-byte space will now return results.

            But this is not perfect, as the script is currently unaware of multi-byte spacing. What happened is that the query was divided into three “words” based on character type (with the multi-byte space being treated as a word). This means that selecting “match all search words” will fail to return results.
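
The behaviour described above can be illustrated with a rough Python sketch. This is not Zoom's actual algorithm: it assumes, purely for illustration, that "character type" means Unicode general category, and `split_query`, `matches`, and the sample index are hypothetical.

```python
import unicodedata
from itertools import groupby

def split_query(query):
    """Split on ASCII spaces, then break each piece into runs of one Unicode category."""
    terms = []
    for piece in query.split(" "):
        for _, run in groupby(piece, key=unicodedata.category):
            terms.append("".join(run))
    return terms

def matches(terms, indexed_words, match_all):
    """Return True if all (or any) of the terms appear in the page's indexed words."""
    hits = [term in indexed_words for term in terms]
    return all(hits) if match_all else any(hits)

indexed_words = {"東京", "大阪"}        # hypothetical index for one page
terms = split_query("東京\u3000大阪")    # full-width space between the two words

print(terms)                                            # three terms; the middle one is the space
print(matches(terms, indexed_words, match_all=True))    # the space term is never found
print(matches(terms, indexed_words, match_all=False))   # either real word is enough
```

Under these assumptions the full-width space becomes a search term of its own, which no page contains, so a "match all" search cannot succeed while a "match any" search still can.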

            We will address this in the next version (V6) and add handling of multibyte space characters. In the meantime, enabling “Support single-case languages” and disabling the “match all search words” option should return reasonable results and act as a workaround.
            --Ray
            Wrensoft Web Software
            Sydney, Australia
            Zoom Search Engine
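
One way multi-byte space handling could work is to normalize the query before splitting it into words. The Python sketch below is only an assumption about the general approach, not a description of what V6 does; `normalize_spaces` and `split_terms` are hypothetical names.

```python
import unicodedata

def normalize_spaces(query):
    """Replace every Unicode space separator (category Zs) with an ASCII space."""
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in query)

def split_terms(query):
    """Normalize spaces, then split on runs of whitespace."""
    return normalize_spaces(query).split()

# A full-width space (U+3000) and an ASCII space now give the same terms.
print(split_terms("東京\u3000大阪"))
print(split_terms("東京 大阪"))
```

With the query normalized first, both forms of the space yield the same two search terms, so a "match all search words" search would no longer include the space as a term.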


            • #7
              Thanks.
              I understand.

              I will wait for the V6 release.
              Does the 3rd quarter of 2008 mean July to September?


              • #8
                >Does 3rd qtr of 2008 mean from July to September?
                Yes, that is the current plan.
