Feature #96447

Linkvalidator concept & open decisions

Added by Sybille Peters 5 months ago.

Status:
New
Priority:
Should have
Assignee:
-
Category:
Linkvalidator
Target version:
-
Start date:
2022-01-01
Due date:
% Done:

0%

Estimated time:
PHP Version:
Tags:
concept, decision required
Complexity:
Sprint Focus:

Description

Before proceeding, it would be good to have a concept, a plan and a roadmap. To get there, a number of things need to be decided:

Priority

  • decide how to do the link checking; also decide on throttling external link checks and / or using a link target cache and an exclude list
  • address current problems: e.g. content which is not rendered is checked (e.g. bodytext in plugins)
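To make the first point more concrete, here is a minimal sketch of how an exclude list, a link target cache and per-host throttling could be combined around an external link check. It is written in Python for illustration only and is not linkvalidator code: the class `ExternalLinkChecker`, its parameters and the injected `check_url` callable are hypothetical names, and the interval and cache lifetime are arbitrary example values.

```python
import time
from urllib.parse import urlsplit


class ExternalLinkChecker:
    """Hypothetical wrapper around a URL check function (not linkvalidator's API)."""

    def __init__(self, check_url, min_interval=5.0, cache_ttl=86400.0,
                 excluded_urls=(), clock=time.monotonic, sleep=time.sleep):
        self._check_url = check_url        # callable: url -> bool (True = link is OK)
        self._min_interval = min_interval  # seconds between two requests to one host
        self._cache_ttl = cache_ttl        # seconds a cached result stays valid
        self._excluded = set(excluded_urls)
        self._clock = clock                # injectable so the sketch can be tested
        self._sleep = sleep
        self._cache = {}                   # url -> (is_ok, checked_at)
        self._last_request = {}            # host -> time of the last request

    def is_ok(self, url):
        # 1. Exclude list: known "false positives" are never requested.
        if url in self._excluded:
            return True

        # 2. Link target cache: reuse a result that is still fresh.
        now = self._clock()
        cached = self._cache.get(url)
        if cached is not None and now - cached[1] < self._cache_ttl:
            return cached[0]

        # 3. Throttling: wait until min_interval has passed for this host.
        host = urlsplit(url).netloc
        last = self._last_request.get(host)
        if last is not None:
            wait = self._min_interval - (now - last)
            if wait > 0:
                self._sleep(wait)

        result = self._check_url(url)
        checked_at = self._clock()
        self._last_request[host] = checked_at
        self._cache[url] = (result, checked_at)
        return result
```

Throttling per host rather than globally keeps a full check reasonably fast while still avoiding bursts against a single external site. Whether a blocking wait is acceptable depends on the decision about synchronous versus asynchronous checking.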

Open decisions

software architecture / structural / refactoring

  • split up classes
  • separate general API which can be moved to core and functionality specific to linkvalidator
  • look at the reference index (sys_refindex), which also contains references (e.g. page links), so there is a lot of overlap in functionality

configuration

  • make more configurable, e.g. make it possible for extensions to use the low-level API but create their own module
  • rethink usage of TSconfig

Testing

  • more tests, acceptance tests missing, test coverage low

link checking

  • when / how to do the link checking (e.g. do a full check via scheduler, do "on-the-fly" checking when content is changed, e.g. via hook, do immediate (synchronous) checking or asynchronous checking etc.)
  • external links: throttle checks to reduce excessive checking of external sites (which may lead to site being blocked), #89287
  • external link target cache to store result of checks
  • exclude / ignore list of external link targets (URLs) to exclude "false positives" (URLs that are reported as broken but are not broken) from link checking, #85127, #92822
  • figure out reasons for "false positives" and try to reduce them, #85006
  • address the problem that content which is not rendered in the FE and / or not editable in the Backend is checked nonetheless. This content should not be checked for broken links, and if it is not editable, clicking the pencil results in an error, e.g.
      * pages.url if doktype != 3
      * content which will not be rendered due to l18n_cfg
      * content which will not be rendered because it is in a hidden gridelement
      * content.bodytext where bodytext is not used due to ctype / list_type (e.g. plugins)
      * translated (connected mode), original content hidden, #95195
  • some things not checked, e.g. shortcut pages
  • decide what counts as links, "soft references", references etc. and what linkvalidator should cover (e.g. shortcut pages are redirects, not links; file references are not (necessarily) links). In TYPO3 it is sometimes not clearly defined what is what, and terms are used ambiguously. See #92542, #83835
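Two of the "not rendered" cases listed above can be expressed as a simple predicate that runs before a field is parsed for links. The following Python sketch is only an illustration: the function `should_check_field` and the set `CTYPES_RENDERING_BODYTEXT` are hypothetical, and the list of content types that actually render bodytext is an assumption that would have to be derived from the real configuration. The remaining cases (l18n_cfg, hidden gridelements, connected translations) need additional lookups and are left out.

```python
DOKTYPE_EXTERNAL_LINK = 3  # pages.url is only used for this page type

# Assumption: content types whose bodytext is rendered. Illustrative, not complete.
CTYPES_RENDERING_BODYTEXT = {"text", "textpic", "textmedia"}


def should_check_field(table, field, record):
    """Return False for fields whose content is not used for the given record."""
    if table == "pages" and field == "url":
        # The URL field may still hold an old value after the doktype changed.
        return record.get("doktype") == DOKTYPE_EXTERNAL_LINK
    if table == "tt_content" and field == "bodytext":
        # Plugins and other content types may carry unused bodytext.
        return record.get("CType") in CTYPES_RENDERING_BODYTEXT
    # All other table / field combinations are checked as before.
    return True
```

For example, `should_check_field("tt_content", "bodytext", {"CType": "list"})` returns `False`, so leftover bodytext in a plugin record would not be reported.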

scheduler

  • BUG (regression): no longer possible to enter several pids in scheduler, see #90848
  • FEATURE: if no start pid is given, use all start pids from sites configuration (see brofix)
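Both scheduler points come down to how the start pids are resolved. A minimal Python sketch, assuming the task stores the pids as a comma-separated string and that each site configuration exposes its root page as `rootPageId`; the function names and the shape of the `sites` list are hypothetical:

```python
def parse_pid_list(value):
    """Parse a comma-separated list of page ids, e.g. "1, 42,7"."""
    return [int(part) for part in value.split(",") if part.strip()]


def resolve_start_pids(configured_pids, sites):
    """Use the configured pids, or fall back to the root pages of all sites."""
    if configured_pids:
        return sorted(set(configured_pids))
    return sorted({site["rootPageId"] for site in sites})


# Example data standing in for the sites configuration.
sites = [
    {"identifier": "main", "rootPageId": 1},
    {"identifier": "shop", "rootPageId": 120},
]
```

With this data, an empty task setting would resolve to the root pages of both sites, while `"42, 7"` would resolve to exactly those two pages.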

GUI

  • move list to separate module?
  • add pagination (see brofix)
  • add sorting (see brofix)
  • add filtering (see brofix)
  • declutter list? - currently contains a lot of text, e.g. full page path (even if editor is working in a mountpoint)
  • add buttons to jump to the page or open page layout
  • add buttons to recheck a specific broken link?

email report

  • email report: what is relevant? E.g. in brofix the total number of links is counted and the percentage of broken links (broken links / total links) is calculated.
  • create email reports for editors (which show only the pages / content they have access to, as in the BE module)
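The percentage mentioned for the email report is simple arithmetic, but the case of zero checked links needs a decision. A small Python sketch; the function name and the choice to report 0.0 when no links were checked are assumptions, not brofix behaviour:

```python
def broken_link_percentage(broken, total):
    """Percentage of broken links among all checked links, rounded to one decimal."""
    if total <= 0:
        # Assumption: with no links checked there is nothing to report as broken.
        return 0.0
    return round(100.0 * broken / total, 1)
```

For a report limited to one editor, the same calculation would be applied to the counts of only those pages and content elements the editor has access to.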
