Bug #19254
Closed
Indexing of records containing HTML leads to concatenated words
Start date:
2008-08-26
% Done:
0%
Description
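When records are indexed by indexSingleRecord() in class.crawler.php, the field contents are only passed through strip_tags(). That is not sufficient: strip_tags() removes the tags without inserting any whitespace, so words that were separated only by markup end up concatenated.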
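A minimal sketch of the effect, using a made-up input string (the variable names are illustrative and not taken from the crawler code):

```php
<?php
// Two adjacent block elements with no whitespace between them (made-up sample).
$html = '<p>Hello</p><p>world</p>';

// strip_tags() removes the markup but puts nothing in its place,
// so the two words are glued together.
$stripped = strip_tags($html);

echo $stripped, PHP_EOL; // prints "Helloworld"
```

An indexer that splits on whitespace would then see the single word "Helloworld" instead of "Hello" and "world".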
splitHTMLContent() in class.indexer.php already solves this as follows:
// remove tags, but first make sure we don't concatenate words by doing it
$contentArr['body'] = str_replace('<',' <',$contentArr['body']);
$contentArr['body'] = trim(strip_tags($contentArr['body']));
The same has to be done in indexSingleRecord():
$theContent = '';
foreach ($fieldList as $k => $v) {
    if (!$k) {
        $theTitle = $r[$v];
    } else {
        $theContent .= $r[$v] . ' ';
    }
}
// add the following lines to prevent concatenated words
$theTitle = str_replace('<', ' <', $theTitle);
$theContent = str_replace('<', ' <', $theContent);
// Indexing the record as a page (but with parameters set, see ->backend_setFreeIndexUid())
$indexerObj->backend_indexAsTYPO3Page(
    strip_tags($theTitle),
    '',
    '',
    strip_tags($theContent),
    $GLOBALS['LANG']->charSet, // Requires that
    $r[$GLOBALS['TCA'][$cfgRec['table2index']]['ctrl']['tstamp']],
    $r[$GLOBALS['TCA'][$cfgRec['table2index']]['ctrl']['crdate']],
    $r['uid']
);
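The two steps (pad each tag with a space, then strip) could also be wrapped in one function so that title and content are treated identically. The following is only a sketch: stripTagsKeepingWordBoundaries() is a hypothetical name, not an existing TYPO3 function, the whitespace collapsing is an extra that the proposed fix does not include, and the scalar type declarations assume PHP 7 or later.

```php
<?php
/**
 * Hypothetical helper (not part of TYPO3): strips tags without
 * gluing the surrounding words together.
 */
function stripTagsKeepingWordBoundaries(string $html): string
{
    // Put a space before every tag so adjacent text nodes stay separated.
    $spaced = str_replace('<', ' <', $html);

    // Remove the markup, then collapse the resulting whitespace runs.
    // The collapsing step is an assumption added here for tidier output.
    $text = (string) preg_replace('/\s+/', ' ', strip_tags($spaced));

    return trim($text);
}

// Made-up record for illustration only.
$row = [
    'title'    => '<b>News</b><i>today</i>',
    'bodytext' => '<p>First</p><p>second</p>',
];

echo stripTagsKeepingWordBoundaries($row['title']), PHP_EOL;    // prints "News today"
echo stripTagsKeepingWordBoundaries($row['bodytext']), PHP_EOL; // prints "First second"
```

With such a helper, the strip_tags($theTitle) and strip_tags($theContent) arguments would be replaced by calls to it, and the two separate str_replace() lines would no longer be needed.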
(issue imported from #M9229)
Updated by Dmitry Dulepov over 14 years ago
Revisions:
- 7147 for 4.3
- 7146 for 4.4