Feature #83847
Epic #85006 (closed): Reduce falsely reported broken links
Linkvalidator should remove repaired links from report after editing record
100%
Description
This change will improve the workflow for editors who fix broken links.
In the Report view you can click on the pencil icon to edit the record. If you edit and close, the backend automatically jumps back to the Report view. If the broken link was fixed, I would expect it to be removed from the list of broken links.
This is currently not the case. It makes fixing links very tedious if you have several, because you have to keep switching to the "Check links" tab and rechecking. If you don't do that, you can easily lose track of what has already been fixed in a long list of broken links. Also, it is possible to deactivate the "Check links" tab so that not all editors can initiate unlimited crawling. In that case you are out of luck.
Reproduce
- Create 3 broken links in the body text of a content element (e.g. text & media)
- Info > Linkvalidator > Check links (current page, check all link types)
- Switch to the "Report" tab. You should see 3 broken links in the list
- Click on the pencil icon to edit the record
- Remove one of the broken links
- Click on save and close
Expected Result
You should now see only 2 broken links in the list.
Actual Result
There are still 3 broken links. You must go back to the "Check links" tab and check again.
Possible solutions
When the editor edits, only a specific field of a record gets edited, e.g. tt_content.bodytext or pages.url.
Only one of the following solutions should be implemented:
1. Check whether the broken links reported for the field the editor just edited are still present in that field. If not, remove them from the list.
2. Recheck all links in the record / field that was just edited and update the list.
3. Change the visual appearance of all broken links belonging to the edited record (e.g. dim them) to indicate that we are currently not sure whether the links are still broken. Display an icon to recheck the broken links. The recheck will default to behaviour 1 or 2.
4. Try to determine a diff of what changed and only recheck that.
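The first option can be illustrated with a minimal sketch. It is not TYPO3 or Linkvalidator code: the `BrokenLink` class, the `prune_repaired` function and the `href` regular expression are all hypothetical simplifications, and the sketch assumes that links can be recognised by their `href` attribute in the field value. It only shows the idea of dropping report entries whose link is no longer in the edited field, without checking any link over the network.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class BrokenLink:
    """Hypothetical, simplified stand-in for one row of the broken-link report."""
    table: str
    uid: int
    field: str
    url: str


# Assumption: links appear as href="..." attributes in the field value.
HREF_PATTERN = re.compile(r'href="([^"]*)"')


def extract_links(field_value: str) -> set[str]:
    """Return the set of link targets found in a field value."""
    return set(HREF_PATTERN.findall(field_value))


def prune_repaired(report: list[BrokenLink], table: str, uid: int,
                   field: str, new_value: str) -> list[BrokenLink]:
    """Drop entries for the edited field whose link is no longer in the field.

    Entries for other records or fields are kept untouched. No link is
    actually rechecked, so newly introduced broken links are not detected.
    """
    remaining = extract_links(new_value)
    return [
        entry for entry in report
        if (entry.table, entry.uid, entry.field) != (table, uid, field)
        or entry.url in remaining
    ]
```

With three reported links in `tt_content.bodytext` and one of them removed by the editor, `prune_repaired` returns the two remaining entries plus any entries that belong to other records.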
These are the pros and cons:
- For number 1: very lightweight. No link checking on the fly is necessary; we just check whether the links are still in the field. The disadvantage: if new broken links are introduced or a broken link is changed, the new broken links will not be added to the list. I think this is currently the best solution: the biggest problem is falsely reported broken links that are never removed from the list, so we optimize for reducing those.
- For number 2 (recheck of all links in the record): there may be a large number of links, and external link checking may be slow. The timeout is currently set to 10 seconds, so with slow or failing external pages a recheck may take 30 seconds or more in the worst case. This is probably something we want to do asynchronously, but then the list would not be refreshed immediately.
- For the diff (number 4): this could actually be done fairly easily using sys_history, which can be searched for entries matching recuid, tablename, userid and a timestamp range. The matching entry contains the previous field value in history_data. The links can be extracted from the old and new values and compared. (This only makes sense for RTE fields.)
- For the visual indication (number 3): this is clunky. It is a little more helpful than the current behaviour, but not by much.
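The comparison step of the diff approach could look roughly like the sketch below. It assumes the previous field value has already been read (for example from history_data, as described above) and is passed in as a plain string; how that lookup is done is not shown. The function names and the `href` regular expression are hypothetical and not part of TYPO3.

```python
from __future__ import annotations

import re

# Assumption: links appear as href="..." attributes in the field value.
HREF_PATTERN = re.compile(r'href="([^"]*)"')


def extract_links(field_value: str) -> set[str]:
    """Return the set of link targets found in a field value."""
    return set(HREF_PATTERN.findall(field_value))


def diff_links(old_value: str, new_value: str) -> tuple[list[str], list[str]]:
    """Compare the links of the previous and the current field value.

    Returns (removed, added), each sorted for stable output:
    - removed: links that were in the old value but not in the new one;
      their report entries could be deleted without any recheck.
    - added: links that only exist in the new value; these are the only
      ones that would need an actual link check.
    """
    old_links = extract_links(old_value)
    new_links = extract_links(new_value)
    return sorted(old_links - new_links), sorted(new_links - old_links)
```

Links that appear in both values are left alone, so only the added links would need to be fetched, which keeps the recheck small even for records with many links.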