Feature #89426

Remove stale broken links in tx_linkvalidator_link

Added by Sybille Peters about 1 year ago. Updated 8 months ago.

Status:
New
Priority:
Should have
Assignee:
-
Category:
Linkvalidator
Target version:
-
Start date:
2019-10-15
Due date:
% Done:

0%

PHP Version:
Tags:
Complexity:
Sprint Focus:

Description

If a page (or content element, or ...) is removed, the related broken links are not removed from tx_linkvalidator_link.

They are not displayed, because only broken links on subpages of the current page are shown, so this is not a huge problem. Over time, however, these records clutter up the tx_linkvalidator_link table.

Currently, there is no way to remove stale (no longer relevant) broken links other than deleting them manually from the database table.
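For illustration, the manual cleanup could look roughly like the sketch below. It is not linkvalidator code: it uses an in-memory SQLite database as a stand-in for the real one, a heavily simplified schema (only the columns needed here), and the assumption that a record is stale when its record_pid does not point to a non-deleted row in pages. The names STALE_CONDITION, find_stale_link_uids and remove_stale_links are hypothetical.

```python
import sqlite3

# Stand-in database; the schema is a simplified assumption, not the real TYPO3 one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (uid INTEGER PRIMARY KEY, deleted INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE tx_linkvalidator_link (
        uid INTEGER PRIMARY KEY,
        record_uid INTEGER NOT NULL,
        record_pid INTEGER NOT NULL,
        url TEXT NOT NULL
    );
    -- Page 244 exists, page 245 is soft-deleted, page 999 is gone entirely.
    INSERT INTO pages (uid, deleted) VALUES (244, 0), (245, 1);
    INSERT INTO tx_linkvalidator_link (uid, record_uid, record_pid, url) VALUES
        (27, 269, 244, 'https://example.org/missing-a'),
        (28, 270, 245, 'https://example.org/missing-b'),
        (29, 271, 999, 'https://example.org/missing-c');
""")

# Assumption: "stale" means the owning page is missing or flagged as deleted.
STALE_CONDITION = "record_pid NOT IN (SELECT uid FROM pages WHERE deleted = 0)"


def find_stale_link_uids(connection):
    """Return the uids of link records whose page no longer exists."""
    rows = connection.execute(
        f"SELECT uid FROM tx_linkvalidator_link WHERE {STALE_CONDITION} ORDER BY uid"
    ).fetchall()
    return [row[0] for row in rows]


def remove_stale_links(connection):
    """Delete stale link records and return how many rows were removed."""
    cursor = connection.execute(
        f"DELETE FROM tx_linkvalidator_link WHERE {STALE_CONDITION}"
    )
    connection.commit()
    return cursor.rowcount


stale_uids = find_stale_link_uids(conn)   # inspect before deleting
removed = remove_stale_links(conn)
remaining = [
    row[0]
    for row in conn.execute("SELECT uid FROM tx_linkvalidator_link ORDER BY uid")
]
print(stale_uids, removed, remaining)
```

Listing the affected uids before deleting keeps the cleanup reviewable; on a real installation the same condition would have to be adapted to the actual schema and checked against a backup first.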

History

#1 Updated by Sybille Peters 12 months ago

  • Subject changed from Give possibility to remove stale broken links in tx_linkvalidator_link to Remove (or do not show) stale broken links in tx_linkvalidator_link

#2 Updated by Sybille Peters 12 months ago

  • Parent task set to #85006

#3 Updated by Riccardo De Contardi 9 months ago

  • Status changed from New to Needs Feedback

Is this one still reproducible? Could you add an example?

I tried with the following test on 10.3.0-dev:

1) I created a new "Header" content element (ID=269)
2) on the header_link field, I added a non-existent link: https://www.typo3.org/sidfhrthorteiheoprit
3) Info > linkvalidator > check links > the link is reported (external links)
4) In the database, the table tx_linkvalidator_link contains the row:

uid: 27
record_uid: 269
record_pid: 244
headline: header test
field: header_link
table_name: tt_content
link_title: NULL
url: https://www.typo3.org/sidfhrthorteiheoprit
url_response: {"valid":false,"errorParams":{"errorType":404,"exception":"Client error: `GET https:\/\/www.typo3.org\/sidfhrthorteiheoprit` resulted in a `404 Not Found` response:\n<!DOCTYPE html>\n<html lang=\"en-US\">\n<head>\n\n<meta charset=\"utf-8\">\n<!-- \n\tMaintained by the typo3.org team\n\n\tThis websit (truncated...)\n","message":"The requested url was not found (404)."}}
last_check: 1581633094
link_type: external
needs_recheck: 0

5) Delete the content element
6) Info > linkvalidator > check links (again) > now the broken external links reported are 0
7) on the database the row has been removed

Am I misinterpreting it? Is a different test necessary?

#4 Updated by Sybille Peters 8 months ago

Your test was perfect, except that to reproduce this you must remove the page (not just the content).

Clarification: Before links are rechecked, linkvalidator removes all links belonging to the pages it is rechecking from tx_linkvalidator_link. However, if a page no longer exists, it is never part of a recheck, so the corresponding broken links are never removed from tx_linkvalidator_link. They do not show up in the list of broken links in the BE. Over time, though, the table accumulates "stale" (no longer relevant) broken links, and their number will likely keep growing.

You can only see this if you look directly in the DB table.
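The behaviour described above can be modelled in a few lines. This is only a toy model of the description, not the actual linkvalidator implementation: the table is a plain list of (record_pid, url) tuples, and recheck and fake_checker are hypothetical names.

```python
# Toy model of the link table: one (record_pid, url) tuple per broken link.
link_table = [
    (244, "https://example.org/broken-on-244"),
    (245, "https://example.org/broken-on-245"),
]


def recheck(table, pages_to_check, find_broken_links):
    """Purge the rows of the pages being rechecked, then re-add current findings."""
    # Rows of pages that are NOT part of this recheck are left untouched.
    kept = [row for row in table if row[0] not in pages_to_check]
    for pid in sorted(pages_to_check):
        kept.extend((pid, url) for url in find_broken_links(pid))
    return kept


def fake_checker(pid):
    # Hypothetical checker: page 244 still contains its one broken link.
    return ["https://example.org/broken-on-244"] if pid == 244 else []


# Page 245 has been removed, so only page 244 is rechecked.
existing_pages = {244}
after = recheck(link_table, existing_pages, fake_checker)

# The row of the removed page survives every recheck: this is the stale record.
stale = [row for row in after if row[0] not in existing_pages]
print(stale)
```

Because the purge is keyed on the pages being rechecked, a row whose page has disappeared is never selected for deletion, which matches what can be observed in the DB table.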

Reproduce

  1. Create content with one or more broken links
  2. Check for broken links (in BE linkvalidator module or by using the scheduler task)
  3. Remove a page which contains content with broken links
  4. Check for broken links again
  5. Look in table tx_linkvalidator_link

Expected result

No records for deleted content in tx_linkvalidator_link

Actual result

The tx_linkvalidator_link DB table contains records for deleted content.

#5 Updated by Sybille Peters 8 months ago

  • Subject changed from Remove (or do not show) stale broken links in tx_linkvalidator_link to Remove stale broken links in tx_linkvalidator_link

#6 Updated by Sybille Peters 8 months ago

  • Description updated (diff)

#7 Updated by Sybille Peters 8 months ago

The title and description were wrong / misleading. I have corrected them.

#8 Updated by Sybille Peters 8 months ago

  • Parent task deleted (#85006)

#9 Updated by Riccardo De Contardi 8 months ago

  • Status changed from Needs Feedback to New
