Bug #16986
Closed
Word documents are not indexed correctly
Description
TYPO3 4.0.4
Indexed Search 2.9.3
catdoc 0.91.5
I have configured indexed_search to also index external files. This works well for pdf, sxw, rtf, odt, and ppt, but not for Word ".doc" files: instead of the text, only question marks are indexed. It is most likely a charset problem.
In a shell, I can get the correct text from the file with catdoc's -8 option:
catdoc -dutf-8 -8 worddok.doc
Is this a bug in catdoc? Should I use a newer version?
I have attached the file that could not be indexed correctly.
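To make the comparison reproducible, here is a minimal Python sketch that runs catdoc with and without the -8 option and measures how much of the output consists of question marks. The function names (build_catdoc_command, question_mark_ratio, extract_text) are hypothetical helpers, not part of TYPO3 or indexed_search. Only the catdoc flags quoted above are used, and catdoc is assumed to be on the PATH.

```python
import subprocess


def build_catdoc_command(path, force_8bit=False, charset="utf-8"):
    """Build the catdoc argument list using only the flags quoted in this report."""
    cmd = ["catdoc", "-d" + charset]
    if force_8bit:
        cmd.append("-8")  # the option that produced correct text in the shell
    cmd.append(path)
    return cmd


def question_mark_ratio(text):
    """Return the fraction of non-whitespace characters that are '?'."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return chars.count("?") / len(chars)


def extract_text(path, force_8bit=False):
    """Run catdoc (assumed to be installed) and return its output as text."""
    result = subprocess.run(
        build_catdoc_command(path, force_8bit),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


# Example usage (requires catdoc and the attached file):
# for flag in (False, True):
#     text = extract_text("worddok.doc", force_8bit=flag)
#     print(flag, round(question_mark_ratio(text), 2))
```

A ratio close to 1.0 without -8 and a much lower ratio with -8 would point to catdoc's charset detection rather than to the indexer.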
(issue imported from #M4985)
Files
Updated by Michael Stucki almost 18 years ago
I have catdoc 0.94 here, so maybe you should try updating first.
Updated by Michael Stucki almost 18 years ago
If your catdoc doesn't work without the "-8" option, you'll need to fix it on that side. An upgrade should really help. With my 0.94 version the "-8" option is not required and the file is parsed correctly.
It's definitely not a TYPO3 bug!