Project

General

Profile

Actions

Bug #104387

open

Linkvalidator shows wrong url for linked content

Added by Ulrich Mathes 4 months ago. Updated 6 days ago.

Status:
Under Review
Priority:
Should have
Assignee:
-
Category:
Linkvalidator
Target version:
-
Start date:
2024-07-12
Due date:
% Done:

0%

Estimated time:
TYPO3 Version:
13
PHP Version:
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

When Linkvalidator finds a broken link to an content element the url has to be stored as [pageUid]#c[contentElementUid]. In case of bodytext this is the case and tested.

In case of header_link for example only the contentElementUid gets stored as url. The report then shows a link to a page with the uid of the content element which is wrong.


First row shows the incorrect link, content element linked in header_link.
Second row shows the correct link, content element linked in bodytext.

As far as i can see, only fields with softref=typolink_tag handling links to content elements right.

TYPO3 12 and 13 are affected.


Files

clipboard-202407121743-zpghz.png (34.2 KB) clipboard-202407121743-zpghz.png Ulrich Mathes, 2024-07-12 15:43
parserResult.png (47.2 KB) parserResult.png Sybille Peters, 2024-10-29 12:08

Related issues 1 (0 open1 closed)

Related to TYPO3 Core - Bug #104902: Working link to pageuid#contenid in sys_file_reference.link is reported as broken (false positive)Closed2024-09-11

Actions
Actions #1

Updated by Ulrich Mathes 4 months ago

  • Description updated (diff)
Actions #2

Updated by Ulrich Mathes 4 months ago

  • Description updated (diff)
Actions #3

Updated by Garvin Hicking 4 months ago

Reproduced this via:

1. Create a page "Invisible"
2. Create a content element of type "Header Only". Can be empty.
3. Hide that content element.
4. Create a page "Linked".
5. Create a content element of type "Header Only". For the field "Link [header_link]" open the link browser, go to page "Invisible", select hidden content element there.
6. Go to linkvalidator, validate the page "Linked"
7. The resulting output will show the error from the screenshot, the broken link will lead to having the page UID set as the content element UID

I've tried to debug this for more than an hour, and the code is very hard for me to read.

https://github.com/TYPO3/typo3/blob/main/typo3/sysext/linkvalidator/Classes/LinkAnalyzer.php#L370 seems to be missing an array key like

            $results[$type][$table . ':' . $field . ':' . $idRecord . ':' . $reference['tokenID']]['pageAndAnchor'] = $record['pid'];

but I can't see how to construct this to only do that in case of a "t3://page?uid=4711#0815" link.

In https://github.com/TYPO3/typo3/blob/main/typo3/sysext/linkvalidator/Classes/Linktype/InternalLinktype.php#L95 there's already some 't3://' parsing involved, unsure if there a specific key needs to be set.

Ultimately, in https://github.com/TYPO3/typo3/blob/main/typo3/sysext/linkvalidator/Classes/LinkAnalyzer.php#L211 there's:

                if (!empty($entryValue['pageAndAnchor'] ?? '')) {
                    // Page with anchor, e.g. 18#1580
                    $url = $entryValue['pageAndAnchor'];
                } else {
                    $url = $entryValue['substr']['tokenValue'];
                }

where the proper URL would need to be referenced. As noted this "pageAndAnchor" would be a key that could be utilized. Otherwize the `$url` variable here could also be set to something from `$entryValue` but I don't really get where all this is coming from and to.

Jsut leaving this here in case it finds someone with expertise or more willpower than me.

Actions #4

Updated by Sybille Peters 4 months ago

To summarize: The problem only occurs if a link to a content element is used in header_link, not in bodytext?

I can reproduce this.

I think I should be able to provide a patch. Here is some more information:

About debugging: you ideally debug LinkAnalyzer::analyzeRecord when checking links, e.g. by putting a breakpoint there with condition, e.g. $table == 'tt_content' or putting the breakpoint on line 337.

I also checked the database first because I wanted to make sure the problem is in the checking:

select url,table_name,field from tx_linkvalidator_link where record_pid=663;
+-------------+------------+-------------+
| url         | table_name | field       |
+-------------+------------+-------------+
| 101999      | tt_content | header_link |
| 664#c101999 | tt_content | bodytext    |
+-------------+------------+-------------+

This confirms: It is already wrong when written to the DB, the URL should be the same.

The problem seems to be that for bodytext TypolinkTagSoftReferenceParser is used and for header_link TypolinkSoftReferenceParser. Both (correctly) return the same result, but in line 337 of LinkAnalyzer there is an if switch:

if ($softReferenceParser->getParserKey() === 'typolink_tag') {
  $this->analyzeTypoLinks($parserResult, $results, $htmlParser, $record, $field, $table);
} else {
  $this->analyzeLinks($parserResult, $results, $record, $field, $table);
}

So for header_link analyzeLinks is called and for bodytext analyzeTypoLinks - even though it is also a typolink. Where in the one case pageAndAnchor is written and in the other not.

And you can see in line 214 what happens:

if (!empty($entryValue['pageAndAnchor'] ?? '')) {
  // Page with anchor, e.g. 18#1580
  $url = $entryValue['pageAndAnchor'];
} else {
  $url = $entryValue['substr']['tokenValue'];
}

I did not thoroughly debug that part, but pretty confident that is the problem.

This code has not been changed much for years. I will also check other versions 11 and 12 to check if new problem or already existed.

Actions #5

Updated by Sybille Peters 4 months ago ยท Edited

I could also reproduce this in v11 and v12.

Actions #6

Updated by Ulrich Mathes 4 months ago

We just came across this issue as we show broken links with an dashboard widget. The editors don't usually use the backend module.

Actions #7

Updated by Sybille Peters 2 months ago

  • Related to Bug #104902: Working link to pageuid#contenid in sys_file_reference.link is reported as broken (false positive) added
Actions #8

Updated by Sybille Peters 24 days ago

I think the problem is in the TypolinkSoftReferenceParser. This returns 2 results for the one link.

Is that correct behaviour? Should we change the behaviour in linkvalidator or in the parser?

for example: $parserResult = $softReferenceParser->parse('tt_content', 'header_link', 700944, 't3://page?uid=108222#700943');

Actions #9

Updated by Sybille Peters 24 days ago

Please ignore my previous comment. It seems to be correct that the link parser returns 2 items for a link with anchor. (However the comment may still be useful for understanding the code).

I have fixed the behaviour in "brofix" which uses similar code see https://github.com/sypets/brofix/pull/404/files

Linkvalidator handles the results from TypolinkTagSoftreferenceParser and TypolinkSoftReferenceParser difference, for the former analyzeTypoLinks is called and for the latter analyzeLinks.

analyzeTypoLinks handles the duplicate entries for one link for the specific usecase page with anchor, analyzeLinks does not.

Actions #10

Updated by Gerrit Code Review 19 days ago

  • Status changed from New to Under Review

Patch set 1 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/86875

Actions #11

Updated by Gerrit Code Review 19 days ago

Patch set 2 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/86875

Actions #12

Updated by Gerrit Code Review 19 days ago

Patch set 3 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/86875

Actions #13

Updated by Gerrit Code Review 19 days ago

Patch set 4 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/86875

Actions #14

Updated by Gerrit Code Review 17 days ago

Patch set 5 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/86875

Actions #15

Updated by Gerrit Code Review 17 days ago

Patch set 6 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/86875

Actions #16

Updated by Gerrit Code Review 6 days ago

Patch set 1 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/87004

Actions

Also available in: Atom PDF