Bug #89682

closed

Linkvalidator: external URLs containing `&amp;` or whitespace at the end not working

Added by Stefan Franke over 4 years ago. Updated over 4 years ago.

Status: Closed
Priority: Should have
Assignee: -
Category: Linkvalidator
Target version: -
Start date: 2019-11-14
Due date:
% Done: 100%
Estimated time:
TYPO3 Version: 9
PHP Version:
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

When inserting external links via CKEditor, ampersands (&) are converted to &amp; in the source code. That is fine for usage in the frontend, but the link validator seems to have problems with it: it tries to verify the link containing the HTML-entity version of the ampersand, which does not work.

Links ending with whitespace (e.g. <a href="https://www.typo3.org/ ">link</a>) also work in the frontend but not with the link validator.

The attached patch file should take care of both issues.


Related issues 1 (0 open, 1 closed)

Related to TYPO3 Core - Bug #89488: HTML special characters fool linkvalidator (Closed, Sybille Peters, 2019-10-23)

Actions #1

Updated by Jonas Eberle over 4 years ago

While &amp; (HTML) should be converted to & for the URL, whitespace as in your example is not supposed to work in the frontend. If it does, that is pure luck: the whitespace would be encoded to %20 and become part of the path.

So I think an `html_entity_decode()` call between fetching the URL from the HTML and further processing should be used.
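A minimal sketch of that decoding step, for illustration only. The function name `decodeHrefValue()` is hypothetical and not part of the linkvalidator code; it only shows what `html_entity_decode()` does to a URL as stored in the RTE source.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (assumption, not an existing TYPO3 function):
// converts an href value as stored in the HTML source into the URL to check.
function decodeHrefValue(string $href): string
{
    // Turn HTML entities such as &amp; back into their literal characters.
    return html_entity_decode($href, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

echo decodeHrefValue('https://example.org/?id=1&amp;id2=2'), PHP_EOL;
// prints https://example.org/?id=1&id2=2
```

A URL that contains no entities passes through unchanged, so the call is safe to apply to every extracted href.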

Actions #2

Updated by Sybille Peters over 4 years ago

  • Related to Bug #89488: HTML special characters fool linkvalidator added
Actions #3

Updated by Sybille Peters over 4 years ago

Thanks for pointing out the issue!

I think html_entity_decode() looks like a good solution.

The problem is that TYPO3 gives us (as the result of the linkref functions) the URL as it should be used in the BE form fields, so it is encoded with &amp;.

Is a URL that ends with (unencoded) whitespace valid? I think not. I agree with Jonas here. Actually, I think these things should be validated much earlier, preferably in the link wizard. Adding a trim() now would mask the problem that this is an invalid URL which should have been rejected or sanitized earlier.

Whitespace should be encoded in the URL, as stated above. Unencoded whitespace already causes problems today, independent of linkvalidator, if you enter such URLs in the link wizard, but that is something that should probably be handled in the link wizard itself.

Examples:

  • this is not a valid URL: "https://example.org/path with spaces?id=1&id2=2"
  • nor is this: "https://example.org/ "
  • this is: "https://example.org/path%20with%20spaces?id=1&id2=2"
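As a rough illustration of how the encoded form in the last example can be produced, the sketch below percent-encodes each path segment with `rawurlencode()`. The helper name `encodePath()` is an assumption made for this example and does not exist in TYPO3.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (assumption): percent-encode every path segment
// while keeping the slashes that separate the segments.
function encodePath(string $path): string
{
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

echo 'https://example.org' . encodePath('/path with spaces') . '?id=1&id2=2', PHP_EOL;
// prints https://example.org/path%20with%20spaces?id=1&id2=2
```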

You can use PHP filter_var

e.g.

if (filter_var($url, FILTER_VALIDATE_URL) === false) {
    print("not a valid URL");
}

see https://stackoverflow.com/questions/2058578/best-way-to-check-if-a-url-is-valid
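Applied to the three example URLs above, the check could look like the following sketch. The wrapper `isValidUrl()` is a hypothetical name introduced here; only `filter_var()` and `FILTER_VALIDATE_URL` are actual PHP.

```php
<?php
declare(strict_types=1);

// Hypothetical wrapper (assumption) around PHP's built-in URL validation.
function isValidUrl(string $url): bool
{
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}

$candidates = [
    'https://example.org/path with spaces?id=1&id2=2',     // unencoded spaces
    'https://example.org/ ',                               // trailing whitespace
    'https://example.org/path%20with%20spaces?id=1&id2=2', // properly encoded
];

foreach ($candidates as $url) {
    // var_export() quotes the string so trailing whitespace stays visible.
    printf("%s => %s\n", var_export($url, true), isValidUrl($url) ? 'valid' : 'not valid');
}
```

The first two candidates are reported as not valid, the third as valid. Note that this check does not catch a leftover &amp; in the query string, since that is still a syntactically valid URL, so the entity decoding discussed above would still be needed.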

Actions #4

Updated by Gerrit Code Review over 4 years ago

  • Status changed from New to Under Review

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/62634

Actions #5

Updated by Gerrit Code Review over 4 years ago

Patch set 1 for branch 9.5 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/62645

Actions #6

Updated by Sybille Peters over 4 years ago

  • Status changed from Under Review to Resolved
  • % Done changed from 0 to 100
Actions #7

Updated by Benni Mack over 4 years ago

  • Status changed from Resolved to Closed