Any way to get rid of parts of MS Word MetaData?


  • Any way to get rid of parts of MS Word MetaData?

    Making these a skip word just inhibits being able to search for them...

    I am pulling in the metadata from the MS Word plugin for search results, to help my users quickly head for articles most likely to be of interest for a search.

    When doing so, MS Word [Office 2003 and newer] documents that have autofields always end up with the ugly "MERGEFORMAT" at the very beginning of the result:

    \* MERGEFORMAT Article Header would be here Author: \* MERGEFORMAT author name would be here...

    Is there a hack available to NOT pull in Microsoft field codes and yet still pull in the other metadata, typically from File Properties?

  • #2
    There is currently no built-in method for filtering out Word field codes, such as MERGEFORMAT switches, when indexing Word documents.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #3
      Thanks

      In that case, I'll abandon that idea and strip them out in the output window with a bit of PHP or JavaScript.
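
For anyone landing here with the same problem, the client-side cleanup could look roughly like the sketch below. It is only an illustration, not anything Zoom ships: the function name `stripMergeFormat` is made up, and the pattern assumes the switch always shows up in the result text as a backslash, an asterisk, optional whitespace, and the word MERGEFORMAT, as in the sample above.

```typescript
// Hypothetical helper (not part of Zoom Search Engine): removes the
// "\* MERGEFORMAT" field switch that Word autofields leave in metadata text.
function stripMergeFormat(text: string): string {
  // \\ matches a literal backslash, \* a literal asterisk; the whitespace
  // between the switch marker and the keyword is treated as optional.
  return text.replace(/\\\*\s*MERGEFORMAT\s*/gi, "").trim();
}

// Example with a result string shaped like the one in the first post.
const cleaned = stripMergeFormat(
  "\\* MERGEFORMAT Article Header would be here Author: \\* MERGEFORMAT author name would be here..."
);
console.log(cleaned);
```

How this gets applied to the text of each search result depends on how the results page is generated, so that wiring is left out here. The same regular expression should carry over to PHP's `preg_replace` with little change.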

