I currently run an English-based web site that has been using Zoom Search for a while with great success. The nature of the content on my site has led me to start developing a Japanese sister site for it, and I wish to continue using Zoom Search. Looking around on the site, I see that Japanese is a supported language, but I was concerned by this statement:
This means that an entire sentence may be indexed as a "word". However, if you enable "Substring match for all searches" on the "Languages" tab of the Configuration window, then searches which appear within a sentence will match correctly.
Zoom does not currently support indexing Shift-JIS pages. You will have to convert your website to UTF-8 if you wish to use it with Zoom.
Some words in the Japanese language utilize both kanji and hiragana to write out a single word, and my concern is that this would cause searches to fail. For example, the word "split" is written with one kanji and one hiragana. If Zoom splits at the hiragana that follows the single kanji and indexes it as two words, what would happen when someone searches for the kanji-and-hiragana string?
If your website is encoded in UTF-8, Zoom will successfully index your site, and will be capable of performing searches. However, search performance and accuracy are limited, as Zoom will only split words by:
- Formatting (spaces between words, or paragraphs, etc.)
- Change of character type (from hiragana to katakana, etc.)
This means that an entire sentence may be indexed as a "word". However, if you enable "Substring match for all searches" on the "Languages" tab of the Configuration window, then searches which appear within a sentence will match correctly.
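To make the splitting rule and the substring option concrete, here is a rough sketch in Python. It is not Zoom's actual code: the function names `char_type`, `split_words`, and `substring_search` are made up for illustration, the Unicode ranges are a simplification, and treating kanji as its own character type is an assumption (the description above only mentions hiragana to katakana "etc.").

```python
def char_type(ch):
    """Classify a character by Unicode block (simplified, assumed ranges)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"  # assumption: kanji counts as a separate type
    return "other"


def split_words(text):
    """Split on whitespace and on any change of character type."""
    words = []
    current = ""
    prev_type = None
    for ch in text:
        if ch.isspace():
            # Formatting (spaces, line breaks) always ends the current word.
            if current:
                words.append(current)
            current = ""
            prev_type = None
            continue
        t = char_type(ch)
        if current and t != prev_type:
            # Character type changed, so start a new word.
            words.append(current)
            current = ""
        current += ch
        prev_type = t
    if current:
        words.append(current)
    return words


def substring_search(query, indexed_words):
    """Return every indexed word that contains the query as a substring."""
    return [w for w in indexed_words if query in w]


words = split_words("これはテストです")
print(words)
print(substring_search("テス", words))
print(split_words("割る"))
```

Under these assumptions, a word such as 割る (one kanji followed by one hiragana) would be indexed as two separate pieces. Whether a search for the full string then matches would depend on whether the query is split by the same rule, which the description above does not say.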
Zoom does not currently support indexing Shift-JIS pages. You will have to convert your website to UTF-8 if you wish to use it with Zoom.
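If the pages are currently saved as Shift-JIS, one possible way to convert them is a small Python script like the one below. This is only a sketch: the function name `convert_shift_jis_to_utf8` and the file names are placeholders, and it assumes the files are plain Shift-JIS (pages saved by Windows tools may need the `cp932` codec instead).

```python
from pathlib import Path


def convert_shift_jis_to_utf8(src_path, dst_path, source_encoding="shift_jis"):
    """Read a file in Shift-JIS and write the same text back out as UTF-8."""
    # Decoding raises UnicodeDecodeError if the file is not valid Shift-JIS,
    # which is safer than silently writing damaged text.
    text = Path(src_path).read_bytes().decode(source_encoding)
    Path(dst_path).write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    # Placeholder file names; skipped if the input file is not present.
    if Path("page_sjis.html").exists():
        convert_shift_jis_to_utf8("page_sjis.html", "page_utf8.html")
```

Any charset declaration inside the page itself (for example a meta tag that names Shift_JIS) would also need to be changed to UTF-8 by hand, since the script only changes the byte encoding.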