Bug #95238
open
Metatags Keywords are not indexed by indexed_search
Description
Hi, I had the problem that indexed_search does not find any meta keywords in the search results.
Then I debugged this part:
https://github.com/TYPO3/typo3/blob/master/typo3/sysext/indexed_search/Classes/Indexer.php#L400-L415
if ($this->conf['index_metatags']) {
    $meta = [];
    $i = 0;
    while ($this->embracingTags($headPart, 'meta', $dummy, $headPart, $meta[$i])) {
        $i++;
    }
    // @todo The code below stops at first unset tag. Is that correct?
    for ($i = 0; isset($meta[$i]); $i++) {
        // decode HTML entities, meta tag content needs to be encoded later
        $meta[$i] = GeneralUtility::get_tag_attributes($meta[$i], true);
        if (stripos($meta[$i]['name'], 'keywords') !== false) {
            $contentArr['keywords'] .= ',' . $this->addSpacesToKeywordList($meta[$i]['content']);
        }
        if (stripos($meta[$i]['name'], 'description') !== false) {
            $contentArr['description'] .= ',' . $meta[$i]['content'];
        }
    }
}
The problem is that the while loop only finds the hreflang meta tag and then stops.
Maybe this part could be changed as follows, using the new MetaTag API; it worked for me :)
// Get keywords and description meta tags via the MetaTag API.
// Requires: use TYPO3\CMS\Core\MetaTag\MetaTagManagerRegistry;
if ($this->conf['index_metatags']) {
    // Get keywords
    $metaTagManager = GeneralUtility::makeInstance(MetaTagManagerRegistry::class)->getManagerForProperty('keywords');
    $keywords = $metaTagManager->getProperty('keywords');
    if (!empty($keywords[0]['content'])) {
        $contentArr['keywords'] .= ',' . $this->addSpacesToKeywordList($keywords[0]['content']);
    }
    // Get description
    $metaTagManager = GeneralUtility::makeInstance(MetaTagManagerRegistry::class)->getManagerForProperty('description');
    $pageDescription = $metaTagManager->getProperty('description');
    if (!empty($pageDescription[0]['content'])) {
        $contentArr['description'] .= ',' . $pageDescription[0]['content'];
    }
}
Updated by Christian Hackl over 3 years ago
I have looked around a bit in the Indexer class:
The title comes from "indexedDocTitle" for whatever reason(?) and the rest of the meta tags from the HTML content. But the HTML content is not fully rendered at that point: it still contains the placeholders, something like "<!-- ###META<hash>### -->".
E.g. Indexer.php, line 319: $this->conf['content'];
At that point it is already clear that the indexer cannot parse out any meta tags...
Updated by Christian Hackl over 3 years ago
Workaround:
In your own "ext_localconf.php", write:
unset($GLOBALS['TYPO3_CONF_VARS']['SC_OPTIONS']['tslib/class.tslib_fe.php']['contentPostProc-cached']['indexed_search']);
Then create a PSR-15 middleware and register it AFTER typo3/cms-frontend/tsfe.
In the middleware's process() method, call something like:
public function process(ServerRequestInterface $request, RequestHandlerInterface $handler): ResponseInterface
{
    // ...
    $tsfe = $GLOBALS['TSFE'];
    $typoScriptFrontendHook = GeneralUtility::makeInstance(\TYPO3\CMS\IndexedSearch\Hook\TypoScriptFrontendHook::class);
    $typoScriptFrontendHook->indexPageContent([], $tsfe);
    // ...
}
This solution does not take the "no_cache" condition into account.
If you want to respect "no_cache", write something like:
public function process(ServerRequestInterface $request, RequestHandlerInterface $handler): ResponseInterface
{
    // Let the rest of the stack render the page first
    $response = $handler->handle($request);
    $tsfe = $GLOBALS['TSFE'];
    $typoScriptFrontendHook = GeneralUtility::makeInstance(\TYPO3\CMS\IndexedSearch\Hook\TypoScriptFrontendHook::class);
    // Only index pages that are allowed to be cached
    if (!$tsfe->no_cache) {
        $typoScriptFrontendHook->indexPageContent([], $tsfe);
    }
    return $response;
}
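For the registration step mentioned above, a minimal sketch of what a Configuration/RequestMiddlewares.php in your own extension could look like. The middleware identifier and the class name are hypothetical placeholders; only the 'after' entry typo3/cms-frontend/tsfe is taken from the workaround itself.

```php
<?php
// Configuration/RequestMiddlewares.php (sketch; identifier and class name are
// hypothetical and must be replaced with those of your own extension)
return [
    'frontend' => [
        'my-vendor/my-extension/index-page-content' => [
            // The PSR-15 middleware class containing the process() method
            'target' => \MyVendor\MyExtension\Middleware\IndexPageContentMiddleware::class,
            // Run after the TSFE middleware, as described in the workaround
            'after' => [
                'typo3/cms-frontend/tsfe',
            ],
        ],
    ],
];
```

After changing this file, the caches usually need to be flushed before the new middleware order takes effect.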
I just put something together, in case someone needs it:
https://github.com/Hauer-Heinrich/hh_indexed_search
Updated by B. Kausch about 2 years ago
- TYPO3 Version changed from 10 to 11
This is clearly a bug. Since switching to the Metatag API, the head content looks like this:
<head>
<meta charset="utf-8">
<!--
This website is powered by TYPO3 - inspiring people to share!
TYPO3 is a free open source Content Management Framework initially created by Kasper Skaarhoj and licensed under GNU/GPL.
TYPO3 is copyright 1998-2023 of Kasper Skaarhoj. Extensions are copyright of their respective owners.
Information and contribution at https://typo3.org/
-->
<!-- ###TITLEdfaab768279a90e4d957fa20450f0d20### -->
<!-- ###METAdfaab768279a90e4d957fa20450f0d20### -->
<!-- ###CSS_LIBSdfaab768279a90e4d957fa20450f0d20### -->
<!-- ###CSS_INCLUDEdfaab768279a90e4d957fa20450f0d20### -->
<!-- ###CSS_INLINEdfaab768279a90e4d957fa20450f0d20### -->
<!-- ###JS_LIBSdfaab768279a90e4d957fa20450f0d20### -->
<!-- ###JS_INCLUDEdfaab768279a90e4d957fa20450f0d20### -->
<!-- ###JS_INLINEdfaab768279a90e4d957fa20450f0d20### -->
<!-- ###HEADERDATAdfaab768279a90e4d957fa20450f0d20### -->
</head>
No meta tags to be found...
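To illustrate why nothing can be found, here is a small standalone sketch (plain PHP with ext-dom, not TYPO3 or indexed_search code). The helper name collectNamedMetaTags and the sample keyword and description values are made up for illustration; the placeholder markers are copied from the head shown above.

```php
<?php
// Hypothetical helper: collect name => content pairs from all <meta> tags
// that carry both a "name" and a "content" attribute.
function collectNamedMetaTags(string $html): array
{
    $found = [];
    // Collect libxml warnings internally instead of emitting them
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if ($meta->hasAttribute('name') && $meta->hasAttribute('content')) {
            $found[strtolower($meta->getAttribute('name'))] = $meta->getAttribute('content');
        }
    }
    return $found;
}

// Head as the indexer sees it: placeholders instead of rendered meta tags
$placeholderHead = <<<'HTML'
<head>
<meta charset="utf-8">
<!-- ###TITLEdfaab768279a90e4d957fa20450f0d20### -->
<!-- ###METAdfaab768279a90e4d957fa20450f0d20### -->
</head>
HTML;

// Head after the placeholders were replaced (sample values, assumed)
$renderedHead = <<<'HTML'
<head>
<meta charset="utf-8">
<meta name="keywords" content="typo3,search">
<meta name="description" content="Example page">
</head>
HTML;

var_dump(collectNamedMetaTags($placeholderHead));
var_dump(collectNamedMetaTags($renderedHead));
```

With the placeholder head, the helper should return an empty array, because the only real meta tag is the charset declaration. Only the rendered head yields keywords and description.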
Updated by Martin Weymayer about 2 months ago
3 years later, the bug still exists :-( also in TYPO3 12 ...
Updated by Garvin Hicking about 2 months ago
Maybe a weird question, but: since meta tags nowadays are only used for display in search results and not evaluated by search engines, the keywords should always be contained in regular page content.
If that is the case, indexed search would also index these keywords (outside the meta scope). So I would actually recommend dropping the metatag indexing feature altogether. What do you think?