Bug #99909

open

False positive broken links by parsing URLs not inside <a> tags

Added by Christian Ludwig about 1 year ago. Updated 8 months ago.

Status: Needs Feedback
Priority: Should have
Assignee: -
Category: Linkvalidator
Target version: -
Start date: 2023-02-09
Due date:
% Done: 0%
Estimated time:
TYPO3 Version: 10
PHP Version:
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

Linkvalidator always reports an error due to the appearance of "http://" or "https://" in the body text (just text, no href etc.).

Linkvalidator should only check real links, not text that merely looks like a link.

In addition, it would be great to have an icon in the "Listing of broken links" next to the pencil that can be clicked to ignore these errors (this link) in future crawls.


Files

image.png (17.7 KB): Message in Linkvalidator. Christian Ludwig, 2023-02-09 20:28
image_2.png (24 KB): Text in RTE. Christian Ludwig, 2023-02-09 20:29
image_3.png (20.1 KB): Text in RTE (source code view). Christian Ludwig, 2023-02-09 20:30
source-code-from-rendered-page.png (14 KB). Christian Ludwig, 2023-08-30 11:09

Related issues 5 (5 open, 0 closed)

Related to TYPO3 Core - Epic #85006: Reduce falsely reported broken links (New, 2018-02-11)
Related to TYPO3 Core - Bug #97937: Linkvalidator: Links and &nbsp; in tt_content.bodytext cause problems in UrlSoftReferenceParser (Under Review, 2022-07-14)
Related to TYPO3 Core - Bug #95878: In linkvalidator, soft reference parser extracts 2 links from rich text with URL as anchor text (Under Review, 2021-11-05)
Related to TYPO3 Core - Feature #85127: linkvalidator: Add possibility to exclude specific external URLs / domains or patterns (New, 2018-05-31)
Related to TYPO3 Core - Bug #101670: Linkvalidator reports some external URLs as "false positives" (New, 2023-08-13)

#1

Updated by Rémy DANIEL 9 months ago

  • Related to Epic #85006: Reduce falsely reported broken links added
#2

Updated by Sybille Peters 9 months ago

  • Status changed from New to Needs Feedback

Linkvalidator always reports an error due to the appearance of "http://" or "https://" in the body text (just text, no href etc.).

Can you give an example here and steps to reproduce (preferably reproducible in TYPO3 v13)?

Do you mean URLs directly in the body text, not enclosed in an <a> tag, such as

https://example.org/abc

as opposed to

<a href="https://example.org/abc">link</a>
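
To make the distinction concrete, here is a simplified sketch of the two ways of looking for links. It is not the code used by linkvalidator or by TYPO3's soft reference parsers; the function names and both regular expressions are assumptions chosen for illustration only.

```php
<?php
// Hypothetical illustration; NOT the logic of linkvalidator or UrlSoftReferenceParser.

/**
 * Naive scan: every "http://" or "https://" run in the text counts as a link,
 * whether or not it sits inside an <a> tag.
 */
function findUrlLikeStrings(string $html): array
{
    // Assumed pattern: scheme, then characters up to whitespace, a quote or an angle bracket.
    preg_match_all('~https?://[^\s"<>]+~u', $html, $matches);
    return $matches[0];
}

/**
 * Stricter scan: only the href values of <a> tags are returned.
 * Simplified: handles double-quoted href attributes only.
 */
function findAnchorHrefs(string $html): array
{
    preg_match_all('~<a\b[^>]*\bhref="([^"]*)"~i', $html, $matches);
    return $matches[1];
}

$plainText = '<p>https://example.org/abc</p>';
$realLink  = '<p><a href="https://example.org/abc">link</a></p>';

var_dump(findUrlLikeStrings($plainText)); // one entry: https://example.org/abc
var_dump(findAnchorHrefs($plainText));    // empty array
var_dump(findAnchorHrefs($realLink));     // one entry: https://example.org/abc
```

With the naive scan, the plain-text URL is reported just like the real link; with the stricter scan, only the `<a>` tag yields a result.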

Whether URLs are parsed as links also depends on how you configured the softref field in TCA.
For example:
$GLOBALS['TCA']['tt_content']['columns']['bodytext']['config']['softref'] = 'typolink_tag,email[subst],url';

If you remove the "url" here, URLs in bodytext will no longer be parsed as links.
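
As a minimal sketch of that change, assuming the softref value is the one quoted above and that the override lives in a site package of your own (the extension key `my_sitepackage` is hypothetical):

```php
<?php
// Hypothetical file: EXT:my_sitepackage/Configuration/TCA/Overrides/tt_content.php
// Assumption: the current value is 'typolink_tag,email[subst],url'.
// Dropping "url" means bare URLs in bodytext are no longer picked up as soft references.
$GLOBALS['TCA']['tt_content']['columns']['bodytext']['config']['softref'] = 'typolink_tag,email[subst]';
```

Check the effective value in your installation first, since other extensions may already have changed this setting.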

Compare also how these URLs are rendered in the Frontend by TYPO3.

So, depending on your configuration this may be intended and correct behavior.

I am very well aware that there are still some problems with parsing, but this issue needs a better description.

See also other issues for linkvalidator: https://forge.typo3.org/projects/typo3cms-core/issues?c%5B%5D=tracker&c%5B%5D=status&c%5B%5D=priority&c%5B%5D=subject&c%5B%5D=assigned_to&c%5B%5D=category&c%5B%5D=fixed_version&f%5B%5D=status_id&f%5B%5D=category_id&f%5B%5D=&group_by=&op%5Bcategory_id%5D=%3D&op%5Bstatus_id%5D=o&per_page=50&set_filter=1&sort=id%3Adesc&t%5B%5D=&utf8=%E2%9C%93&v%5Bcategory_id%5D%5B%5D=1493

#3

Updated by Sybille Peters 9 months ago

  • Related to Bug #97937: Linkvalidator: Links and &nbsp; in tt_content.bodytext cause problems in UrlSoftReferenceParser added
#4

Updated by Sybille Peters 9 months ago

  • Related to Bug #95878: In linkvalidator, soft reference parser extracts 2 links from rich text with URL as anchor text added
#5

Updated by Sybille Peters 9 months ago

In addition, it would be great to have an icon in the "Listing of broken links" next to the pencil that can be clicked to ignore these errors (this link) in future crawls.

For excluding specific external URLs there is already an issue: https://forge.typo3.org/issues/85127

You can also look at my extension "Broken Link Fixer" (brofix) which is a fork of linkvalidator and implements this, see https://extensions.typo3.org/extension/brofix

#6

Updated by Sybille Peters 9 months ago

  • Related to Feature #85127: linkvalidator: Add possibility to exclude specific external URLs / domains or patterns added
#7

Updated by Sybille Peters 9 months ago

  • Subject changed from False positive broken links to False positive broken links by parsing URLs not inside <a> tags

If this issue relates specifically to parsing, I would change the title to differentiate it from: https://forge.typo3.org/issues/101670

#8

Updated by Sybille Peters 9 months ago

  • Related to Bug #101670: Linkvalidator reports some external URLs as "false positives" added
#9

Updated by Christian Ludwig 8 months ago

In the described case it is simply text, with no anchor or href. Here is the example code from the rendered page, as it is shown to the visitor.

<p>
    ... You can recognize an encrypted connection by the browser's address bar
    changing from „http://“ to „https://“ and by the lock symbol in your browser bar.
</p>
