Bug #104387
openLinkvalidator shows wrong url for linked content
Added by Ulrich Mathes 4 months ago. Updated 6 days ago.
0%
Description
When Linkvalidator finds a broken link to an content element the url
has to be stored as [pageUid]#c[contentElementUid]
. In case of bodytext
this is the case and tested.
In case of header_link
for example only the contentElementUid gets stored as url
. The report then shows a link to a page with the uid of the content element which is wrong.
First row shows the incorrect link, content element linked in header_link.
Second row shows the correct link, content element linked in bodytext.
As far as i can see, only fields with softref=typolink_tag
handling links to content elements right.
TYPO3 12 and 13 are affected.
Files
clipboard-202407121743-zpghz.png (34.2 KB) clipboard-202407121743-zpghz.png | Ulrich Mathes, 2024-07-12 15:43 | ||
parserResult.png (47.2 KB) parserResult.png | Sybille Peters, 2024-10-29 12:08 |
Updated by Garvin Hicking 4 months ago
Reproduced this via:
1. Create a page "Invisible"
2. Create a content element of type "Header Only". Can be empty.
3. Hide that content element.
4. Create a page "Linked".
5. Create a content element of type "Header Only". For the field "Link [header_link]" open the link browser, go to page "Invisible", select hidden content element there.
6. Go to linkvalidator, validate the page "Linked"
7. The resulting output will show the error from the screenshot, the broken link will lead to having the page UID set as the content element UID
I've tried to debug this for more than an hour, and the code is very hard for me to read.
https://github.com/TYPO3/typo3/blob/main/typo3/sysext/linkvalidator/Classes/LinkAnalyzer.php#L370
seems to be missing an array key like
$results[$type][$table . ':' . $field . ':' . $idRecord . ':' . $reference['tokenID']]['pageAndAnchor'] = $record['pid'];
but I can't see how to construct this to only do that in case of a "t3://page?uid=4711#0815" link.
In https://github.com/TYPO3/typo3/blob/main/typo3/sysext/linkvalidator/Classes/Linktype/InternalLinktype.php#L95
there's already some 't3://' parsing involved, unsure if there a specific key needs to be set.
Ultimately, in https://github.com/TYPO3/typo3/blob/main/typo3/sysext/linkvalidator/Classes/LinkAnalyzer.php#L211
there's:
if (!empty($entryValue['pageAndAnchor'] ?? '')) { // Page with anchor, e.g. 18#1580 $url = $entryValue['pageAndAnchor']; } else { $url = $entryValue['substr']['tokenValue']; }
where the proper URL would need to be referenced. As noted this "pageAndAnchor" would be a key that could be utilized. Otherwize the `$url` variable here could also be set to something from `$entryValue` but I don't really get where all this is coming from and to.
Jsut leaving this here in case it finds someone with expertise or more willpower than me.
Updated by Sybille Peters 4 months ago
To summarize: The problem only occurs if a link to a content element is used in header_link, not in bodytext?
I can reproduce this.
I think I should be able to provide a patch. Here is some more information:
About debugging: you ideally debug LinkAnalyzer::analyzeRecord when checking links, e.g. by putting a breakpoint there with condition, e.g. $table == 'tt_content' or putting the breakpoint on line 337.
I also checked the database first because I wanted to make sure the problem is in the checking:
select url,table_name,field from tx_linkvalidator_link where record_pid=663; +-------------+------------+-------------+ | url | table_name | field | +-------------+------------+-------------+ | 101999 | tt_content | header_link | | 664#c101999 | tt_content | bodytext | +-------------+------------+-------------+
This confirms: It is already wrong when written to the DB, the URL should be the same.
The problem seems to be that for bodytext TypolinkTagSoftReferenceParser is used and for header_link TypolinkSoftReferenceParser. Both (correctly) return the same result, but in line 337 of LinkAnalyzer there is an if switch:
if ($softReferenceParser->getParserKey() === 'typolink_tag') { $this->analyzeTypoLinks($parserResult, $results, $htmlParser, $record, $field, $table); } else { $this->analyzeLinks($parserResult, $results, $record, $field, $table); }
So for header_link analyzeLinks is called and for bodytext analyzeTypoLinks - even though it is also a typolink. Where in the one case pageAndAnchor is written and in the other not.
And you can see in line 214 what happens:
if (!empty($entryValue['pageAndAnchor'] ?? '')) { // Page with anchor, e.g. 18#1580 $url = $entryValue['pageAndAnchor']; } else { $url = $entryValue['substr']['tokenValue']; }
I did not thoroughly debug that part, but pretty confident that is the problem.
This code has not been changed much for years. I will also check other versions 11 and 12 to check if new problem or already existed.
Updated by Sybille Peters 4 months ago ยท Edited
I could also reproduce this in v11 and v12.
Updated by Ulrich Mathes 4 months ago
We just came across this issue as we show broken links with an dashboard widget. The editors don't usually use the backend module.
Updated by Sybille Peters 2 months ago
- Related to Bug #104902: Working link to pageuid#contenid in sys_file_reference.link is reported as broken (false positive) added
Updated by Sybille Peters 24 days ago
- File parserResult.png parserResult.png added
I think the problem is in the TypolinkSoftReferenceParser. This returns 2 results for the one link.
Is that correct behaviour? Should we change the behaviour in linkvalidator or in the parser?
for example: $parserResult = $softReferenceParser->parse('tt_content', 'header_link', 700944, 't3://page?uid=108222#700943');
Updated by Sybille Peters 24 days ago
Please ignore my previous comment. It seems to be correct that the link parser returns 2 items for a link with anchor. (However the comment may still be useful for understanding the code).
I have fixed the behaviour in "brofix" which uses similar code see https://github.com/sypets/brofix/pull/404/files
Linkvalidator handles the results from TypolinkTagSoftreferenceParser and TypolinkSoftReferenceParser difference, for the former analyzeTypoLinks is called and for the latter analyzeLinks.
analyzeTypoLinks handles the duplicate entries for one link for the specific usecase page with anchor, analyzeLinks does not.
Updated by Gerrit Code Review 19 days ago
- Status changed from New to Under Review
Patch set 1 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/86875
Updated by Gerrit Code Review 19 days ago
Patch set 2 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/86875
Updated by Gerrit Code Review 19 days ago
Patch set 3 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/86875
Updated by Gerrit Code Review 19 days ago
Patch set 4 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/86875
Updated by Gerrit Code Review 17 days ago
Patch set 5 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/86875
Updated by Gerrit Code Review 17 days ago
Patch set 6 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/86875
Updated by Gerrit Code Review 6 days ago
Patch set 1 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/87004