The HTML documents are saved from MS Word using the "Filtered" export option (note that no Word documents are being indexed, only HTML). These documents are then cleaned up and formatted using CSS in Dreamweaver. All documents originally carried a Word template title in the TITLE tag, in this case "FWR Template". The title tag was changed to reflect the correct title for each HTML document. I cannot seem to get rid of the "FWR Template" title when the documents are indexed using Zoom (results page). I have searched sitewide and removed every occurrence of that string, yet the page titles in the search results still carry "FWR Template", even though all the docs are titled correctly. I wonder if there is some metadata somewhere that's not in the file and is overriding the title field in the HTML. The files still show up with the Word HTML icon, although I have removed all references to Word from the file. As far as I can see, the only place this shows up is the Zoom results page.
Problem with Title in Search Result
-
If you are using Spider Mode, you might want to try turning off caching (in the Zoom Configuration window, on the "General" tab, check the option "Reload all files (do not use cache)") and re-indexing.
But if you still have this problem (or you are using Offline Mode, where this would not apply), then you will have to provide us with an example of one of these Word to HTML pages you created. Since you said that you have modified them in Dreamweaver, etc., we cannot know what they actually contain without seeing the file, and it is possible that there is metadata stored within the HTML file that you are not aware of. Viewing the HTML source of the page should show this pretty clearly though.
If you want us to take a look at it, you can either upload the file to your website and provide us with a URL to the page; or e-mail us the page in question.
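If you would rather check the source yourself first, here is a minimal sketch (not part of Zoom) of how the leftover string could be hunted down across a folder of HTML files. The function names `find_string` and `scan_folder`, and the folder path in the usage note, are made up for illustration. It reports whether the string appears in the title, in a meta tag, or anywhere else in the raw source, such as inside a comment.

```python
from html.parser import HTMLParser
from pathlib import Path


class TitleMetaParser(HTMLParser):
    """Collects the <title> text and every <meta> name/content pair."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.in_title = False
        self.title = ""
        self.meta = []  # list of (name, content) tuples

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            name = attrs.get("name") or attrs.get("http-equiv") or ""
            self.meta.append((name, attrs.get("content") or ""))

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def find_string(html_text, needle):
    """Return the places (title, meta tags, raw source) where needle occurs."""
    parser = TitleMetaParser()
    parser.feed(html_text)
    parser.close()
    low = needle.lower()
    hits = []
    if low in parser.title.lower():
        hits.append("title")
    for name, content in parser.meta:
        if low in content.lower():
            hits.append("meta:" + name)
    # The raw scan also catches comments and Office-specific markup.
    if low in html_text.lower():
        hits.append("raw source")
    return hits


def scan_folder(folder, needle):
    """Map each .htm/.html file under folder to its hits (files without hits are skipped)."""
    results = {}
    for path in Path(folder).rglob("*.htm*"):
        text = path.read_text(encoding="utf-8", errors="replace")
        hits = find_string(text, needle)
        if hits:
            results[str(path)] = hits
    return results
```

For example, `scan_folder(r"C:\path\to\site", "FWR Template")` (a hypothetical path) would return an empty dictionary if the string really is gone from every file.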
-
Fixed
First off, I was using Offline Mode. When you asked for an example file, I went and created a test site, as I am working with highly confidential documents on a company intranet, so I was going to create a few test documents using the same Word template on a new site and send that. When I did that, it worked perfectly. The only thing I did differently was that there was no test site (in Dreamweaver you can specify a local testing server with IIS running on your machine): I set up everything on the actual network path and had Zoom output the files directly to the search folder used by the website. There was no reference to the template words anywhere in the code, unless it was somehow using metadata that's not inside the file (if there is such a thing), as you could open the HTML files in Notepad and remove any references.
Although this seemed to solve the problem for me, it doesn't make sense.
-
Zoom does not pick up metadata from outside the source code of an HTML file. It will use ".desc" files if enabled, for other file formats, but not for HTML.
If you are sure that the file does not contain the text anywhere in its source code, I would say that it's most likely you were using a different set of index files from the ones most recently updated. Now that you have it outputting the files directly to the desired folder, this should prevent mistakes when copying the files manually. It is a common mistake to copy the wrong files, or omit some of the necessary files, when copying the Required Files manually, especially as you end up with multiple sets of index files in different folders from various attempts.
Try it again using your actual document files (but again, specifying the output folder to be the actual destination path) and see if that solves it.
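If you do need to copy the files manually at some point, a quick way to rule out a stale copy is to compare the folder Zoom wrote to against the folder the website actually serves. The following is only a rough sketch under that assumption. It does not rely on any particular Zoom file names, and `compare_folders` and both folder arguments are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file, read in chunks so large index files are fine."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_folders(output_dir, deployed_dir):
    """Return (missing, differing) file names relative to the indexer's output folder."""
    out = {p.name: p for p in Path(output_dir).iterdir() if p.is_file()}
    dep = {p.name: p for p in Path(deployed_dir).iterdir() if p.is_file()}
    # Files the indexer produced that never reached the deployed folder.
    missing = sorted(set(out) - set(dep))
    # Files present in both places whose contents do not match.
    differing = sorted(
        name for name in set(out) & set(dep)
        if file_digest(out[name]) != file_digest(dep[name])
    )
    return missing, differing
```

If both lists come back empty, the deployed folder holds the same files that were just generated. Anything listed in either one points to an incomplete or outdated copy.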
-
I should add that Word is known for creating terrible HTML files. Microsoft Word adds a whole bunch of proprietary "Office" tags that are not standards-compliant and are even known to cause problems in Microsoft's own browsers. These tags are usually obscured in HTML comments, and you may not see them unless you examine the HTML source code carefully using a text editor that does not try to hide or suppress certain information (as WYSIWYG web page applications often do).
There is some more information in this FAQ (note that in your case this has nothing to do with the "search_template.html" file as described, but the general issues with HTML files produced by MS Word are relevant):
http://www.wrensoft.com/zoom/support...#msword_markup
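As a rough illustration of what to look for in the source, the sketch below counts a few well-known traces of Word-generated HTML: conditional comments, the Office XML namespace, `o:` tags, `mso-` style properties, `Mso...` class names, and the Word generator meta tag. The pattern list is not exhaustive, and the name `word_markup_report` is made up for this example.

```python
import re

# Common traces of Word-generated HTML (an illustrative, non-exhaustive list).
WORD_MARKERS = {
    "conditional comment": re.compile(r"<!--\[if\s[^\]]*\]>", re.I),
    "office namespace": re.compile(r"urn:schemas-microsoft-com:office", re.I),
    "o: tag": re.compile(r"</?o:\w+", re.I),
    "mso- style": re.compile(r"\bmso-[a-z-]+\s*:", re.I),
    "Mso class": re.compile(r"class\s*=\s*[\"']?Mso\w+", re.I),
    "generator meta": re.compile(r"<meta[^>]+Microsoft\s+Word", re.I),
}


def word_markup_report(html_text):
    """Return {marker label: number of matches} for every marker that was found."""
    report = {}
    for label, pattern in WORD_MARKERS.items():
        count = len(pattern.findall(html_text))
        if count:
            report[label] = count
    return report
```

Running this on the raw text of a page gives an empty dictionary when none of these traces remain. Because it works on the raw source, it also sees markup that is tucked away inside HTML comments.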