Feature #14355
Closed
catdoc with default charsets on indexed search
0%
Description
By default, catdoc parses .doc files with a source code page that is not Western European, which results in garbled search results. The usual workaround is to compile catdoc with default source and destination charsets. Problem: this is not possible without hands-on access to the server. Solution: pass -s and -d to catdoc when it is executed.
Proposed solution: TS setup flags for source and destination charset, e.g.
source=cp1252
dest=8859-1
(catdoc charsets: http://www.45.free.net/~vitus/ice/catdoc/charsets.html) and modify readFileContent in class.indexer.php accordingly, i.e.:
catdoc -s[ts-source] -d[ts-dest], so that all search result displays are correct according to locale
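To make the proposal concrete, here is a rough sketch of how the command line could be assembled from the two configured values. It is written in Python purely for illustration and is not the actual readFileContent code; the function name build_catdoc_args and the file name are hypothetical, and cp1252 is used as an example of a Western European source charset. Only catdoc's -s (source charset) and -d (destination charset) options are assumed.

```python
def build_catdoc_args(path, source=None, dest=None):
    """Return the argument list for a catdoc call.

    Hypothetical helper: 'source' and 'dest' stand in for the two
    configured charset values from the proposal above.
    """
    args = ["catdoc"]
    if source:
        args += ["-s", source]  # -s: charset the .doc file is read as
    if dest:
        args += ["-d", dest]    # -d: charset of the extracted text
    args.append(path)
    return args


# With both values configured:
print(build_catdoc_args("report.doc", "cp1252", "8859-1"))
# With nothing configured, catdoc falls back to its compiled-in defaults:
print(build_catdoc_args("report.doc"))
```

Leaving both values optional keeps the current behaviour for installations that do not set them.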
(issue imported from #M417)
Updated by Michael Stucki about 19 years ago
Fixed in 3.8.0. The indexer now indexes all external files as UTF-8 and converts the text back right before it is displayed in the frontend.
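For readers unfamiliar with this approach, the idea can be sketched as follows. This is an illustration in Python under stated assumptions, not the code that went into 3.8.0: the function name to_display_charset is hypothetical, and ISO-8859-1 is only an example of a frontend charset.

```python
def to_display_charset(indexed_utf8, display_charset):
    """Convert UTF-8 bytes from the index to the frontend charset.

    Hypothetical helper. Characters that do not exist in the target
    charset are replaced with '?' instead of raising an error.
    """
    text = indexed_utf8.decode("utf-8")
    return text.encode(display_charset, errors="replace")


# Index side: the extracted text is always stored as UTF-8.
indexed = "café".encode("utf-8")

# Display side: convert once, right before output.
print(to_display_charset(indexed, "iso-8859-1"))
```

Storing a single charset in the index means the source charset of each external file only matters at extraction time, and the display charset only at output time.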