Epic #85006

open

Reduce falsely reported broken links

Added by Sybille Peters almost 6 years ago. Updated 8 months ago.

Status: New
Priority: Should have
Assignee: -
Category: Linkvalidator
Target version: -
Start date: 2018-02-11
Due date:
% Done: 62%
Estimated time: (Total: 0.00 h)
Sprint Focus:

Description

Falsely reported broken links are currently a main reason why fixing links with Linkvalidator is tedious and annoying: there is no way to remove them from the list of broken links. When searching for links to fix, you have to check several entries that are not really errors. Furthermore, these entries stay in the list while the real broken links disappear as they are fixed, so the ratio of falsely reported broken links to real broken links worsens over time.

By "falsely reported broken links" we mean links that Linkvalidator shows as broken but that are either not broken, cannot be edited by the editor, or are for some other reason irrelevant or impossible to fix.

We already have several issues and open patches addressing these problems. This epic serves to give an overview.

Main reasons for false broken links

  • External link checking may fail. This means we get false negatives (links that actually work but are evaluated as "broken" by Linkvalidator). We have already improved this, but it may still happen. (see #89488, #86918)
  • Some links are not broken but do not return HTTP status code 200. These are, for example, pages that require a login (403, 401).
  • Broken links are in fields that are no longer relevant, e.g. in tt_content.bodytext for content elements that do not use bodytext. This may happen if tt_content.ctype is changed (e.g. to plugin), which is common on older sites. (see #89182)
  • FIXED: The editor has no permission to edit the field or the record. (#84214)
  • The broken link information is "stale": the broken link has already been fixed but Linkvalidator has not rechecked the field, or the record has been deleted. (see #89426, #83847)
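
The second reason above (login-protected pages answering with 401 or 403) can be illustrated with a minimal sketch. This is not Linkvalidator's code: the names `LinkState`, `RESTRICTED_CODES` and `classify_status` are hypothetical, and treating every 2xx/3xx response as working is an assumption made for the example.

```python
from enum import Enum


class LinkState(Enum):
    OK = "ok"
    RESTRICTED = "restricted"  # target exists, but requires a login
    BROKEN = "broken"


# Assumption: these codes mean "access restricted", not "broken".
RESTRICTED_CODES = {401, 403}


def classify_status(status_code: int) -> LinkState:
    """Map an HTTP status code to a link state (hypothetical helper)."""
    if 200 <= status_code < 400:
        # Assumption: successful and redirect responses count as working.
        return LinkState.OK
    if status_code in RESTRICTED_CODES:
        return LinkState.RESTRICTED
    return LinkState.BROKEN
```

With a separate "restricted" state, a report could list such links apart from the real broken links or hide them entirely.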

Ideas for reducing false broken links

do not check some external links

  • Make it possible to exclude URLs from link checking in the configuration (TSconfig), e.g. URLs starting with http://intranet.mysite.com/
  • Make it possible to exclude a specific link from link checking (in the RTE)
  • For more ease of use: in the list of broken links, add an action button that adds the URL to the list of ignored URLs
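
As a rough illustration of the first idea, the following sketch filters out URLs that start with a configured prefix before they are checked. It is written in Python for brevity and does not reflect any existing Linkvalidator setting: `EXCLUDED_PREFIXES`, `is_excluded` and `links_to_check` are hypothetical names, and plain prefix matching is only one possible way to express the exclusion.

```python
# Hypothetical configuration: URL prefixes that should never be checked.
EXCLUDED_PREFIXES = ("http://intranet.mysite.com/",)


def is_excluded(url, prefixes=EXCLUDED_PREFIXES):
    """Return True if the URL starts with one of the excluded prefixes."""
    return any(url.startswith(prefix) for prefix in prefixes)


def links_to_check(urls, prefixes=EXCLUDED_PREFIXES):
    """Keep only the URLs that are not excluded, preserving their order."""
    return [url for url in urls if not is_excluded(url, prefixes)]
```

An "ignore" action button in the list of broken links would then only need to append the URL (or its domain) to the configured prefixes.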

check sooner

  • Ideally, the broken link information should be updated as soon as a record is changed (e.g. broken links are removed from the list of broken links as soon as the record is deleted), for instance by reacting to NEW / UPDATE / DELETE events
  • Alternatively, link checking could be done incrementally and more often, checking only the records that changed (see #92220)
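
The event-driven variant could look roughly like the sketch below. It is an in-memory Python model, not TYPO3 code: the class `BrokenLinkIndex`, its method names and the `is_broken` callback are assumptions chosen for illustration. The point is only that a save or delete event touches the entries of one record instead of triggering a full recheck.

```python
from dataclasses import dataclass, field


@dataclass
class BrokenLinkIndex:
    """Hypothetical store mapping (table, uid) to the record's broken URLs."""

    entries: dict = field(default_factory=dict)

    def on_record_deleted(self, table, uid):
        # A deleted record can no longer contain broken links.
        self.entries.pop((table, uid), None)

    def on_record_saved(self, table, uid, urls, is_broken):
        # Recheck only the links of the record that was just saved.
        broken = [url for url in urls if is_broken(url)]
        if broken:
            self.entries[(table, uid)] = broken
        else:
            # All links were fixed: drop the stale entry immediately.
            self.entries.pop((table, uid), None)
```

The incremental variant from the second bullet would call the same recheck logic from a scheduled job, limited to records changed since the last run.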

do not check some fields

  • only check fields that will be rendered, e.g. not tt_content.bodytext for ctype='plugin', etc. (see #89182)
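
A minimal sketch of this idea, again in Python and under stated assumptions: the mapping `RENDERED_FIELDS`, the fallback `DEFAULT_FIELDS` and the function `fields_to_check` are hypothetical, and the content types and field lists are invented examples rather than the real rendering rules.

```python
# Hypothetical mapping: fields that are actually rendered per content type.
RENDERED_FIELDS = {
    "text": {"header_link", "bodytext"},
    "plugin": {"header_link"},  # assumption: bodytext is not rendered here
}

# Assumption: for unknown content types, check all of these fields.
DEFAULT_FIELDS = {"header_link", "bodytext"}


def fields_to_check(record, candidate_fields):
    """Return the candidate fields that are rendered for this record."""
    rendered = RENDERED_FIELDS.get(record.get("ctype"), DEFAULT_FIELDS)
    return [name for name in candidate_fields if name in rendered]
```

For a record with ctype 'plugin', a leftover broken link in bodytext would no longer be reported, because bodytext is not among the fields to check.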

Files

ignore-url.png (6.22 KB), Sybille Peters, 2020-09-12 18:50

Subtasks 8 (3 open, 5 closed)

  • Feature #83847: Linkvalidator should remove repaired links from report after editing record (Closed, Sybille Peters, 2018-02-11)
  • Bug #84214: Linkvalidator should not check records without write permissions (Closed, Sybille Peters, 2018-03-12)
  • Feature #85127: linkvalidator: Add possibility to exclude specific external URLs / domains or patterns (New, 2018-05-31)
  • Bug #86918: Linkvalidator stops working on specific links (external URLs) (Closed, 2018-11-13)
  • Bug #89182: Linkvalidator should only check relevant fields in table (Under Review, 2019-09-16)
  • Bug #89488: HTML special characters fool linkvalidator (Closed, Sybille Peters, 2019-10-23)
  • Feature #92297: Make it possible to mark specific links to not get checked by linkvalidator (Closed, Sybille Peters, 2020-09-12)
  • Bug #101670: Linkvalidator reports some external URLs as "false positives" (New, 2023-08-13)

Related issues 5 (5 open, 0 closed)

  • Related to TYPO3 Core - Bug #99909: False positive broken links by parsing URLs not inside <a> tags (Needs Feedback, 2023-02-09)
  • Related to TYPO3 Core - Feature #96525: Add the possibility to repport only 404 errors (Don't show 403, 999 LinkedIn and SSL problems) (Accepted, 2022-01-12)
  • Related to TYPO3 Core - Feature #92822: Ignore button for link targets (Under Review, 2020-11-11)
  • Related to TYPO3 Core - Task #89287: Make linkvalidator crawling polite (New, 2019-09-26)
  • Related to TYPO3 Core - Bug #97937: Linkvalidator: Links and &nbsp; in tt_content.bodytext cause problems in UrlSoftReferenceParser (Under Review, 2022-07-14)

Actions #1

Updated by Riccardo De Contardi almost 6 years ago

  • Category set to Linkvalidator
Actions #2

Updated by Sybille Peters almost 6 years ago

  • Subject changed from Improve broken link handling in TYPO3 / linkvalidator rewrite to Epic: Improve broken link handling in TYPO3 / linkvalidator rewrite
Actions #3

Updated by Patrick Broens over 5 years ago

  • Related to Bug #84016: impexp: page links are parsed / replaced incorrectely due to error in SoftReferenceIndex added
Actions #4

Updated by Patrick Broens over 5 years ago

  • Related to Bug #85576: Linkvalidator not checking linked content elements with TypoLink added
Actions #5

Updated by Lina Wolf over 4 years ago

  • Related to Feature #89177: Change TsConfig Defaults of Linkvalidator and Enable all core fields containing links added
Actions #6

Updated by Sybille Peters over 4 years ago

  • Related to Bug #86918: Linkvalidator stops working on specific links (external URLs) added
Actions #7

Updated by Sybille Peters over 4 years ago

  • Subject changed from Epic: Improve broken link handling in TYPO3 / linkvalidator rewrite to Epic: Improve broken link handling in TYPO3 / linkvalidator
Actions #8

Updated by Sybille Peters over 4 years ago

  • Subject changed from Epic: Improve broken link handling in TYPO3 / linkvalidator to Reduce falsely reported broken links
  • Description updated (diff)
Actions #9

Updated by Sybille Peters over 4 years ago

  • Related to deleted (Bug #84016: impexp: page links are parsed / replaced incorrectely due to error in SoftReferenceIndex)
Actions #10

Updated by Sybille Peters over 4 years ago

  • Related to deleted (Bug #85576: Linkvalidator not checking linked content elements with TypoLink)
Actions #11

Updated by Sybille Peters over 4 years ago

  • Related to deleted (Feature #89177: Change TsConfig Defaults of Linkvalidator and Enable all core fields containing links)
Actions #12

Updated by Sybille Peters over 4 years ago

I changed the title, description and subtasks to make this epic more focused.

Actions #13

Updated by Sybille Peters over 4 years ago

  • Description updated (diff)
Actions #14

Updated by Sybille Peters over 3 years ago

Actions #15

Updated by Sybille Peters almost 3 years ago

  • Assignee deleted (Sybille Peters)
Actions #16

Updated by Rémy DANIEL 8 months ago

  • Related to Bug #99909: False positive broken links by parsing URLs not inside <a> tags added
Actions #17

Updated by Rémy DANIEL 8 months ago

  • Related to Feature #96525: Add the possibility to repport only 404 errors (Don't show 403, 999 LinkedIn and SSL problems) added
Actions #18

Updated by Rémy DANIEL 8 months ago

Actions #19

Updated by Rémy DANIEL 8 months ago

  • Related to Task #89287: Make linkvalidator crawling polite added
Actions #20

Updated by Sybille Peters 8 months ago

  • Description updated (diff)
Actions #21

Updated by Sybille Peters 8 months ago

  • Subtask #101670 added
Actions #22

Updated by Ralf Hettinger 8 days ago

  • Related to Bug #97937: Linkvalidator: Links and &nbsp; in tt_content.bodytext cause problems in UrlSoftReferenceParser added