Bug #14268: function substUrlsInPlainText in class.t3lib_div.php can't properly extract a URL that ends with a character other than a space - TYPO3 Core - TYPO3 Forge

Added by old_hoang over 20 years ago. Updated over 18 years ago.

Status: Closed
Priority: Should have
Assignee: Wolfgang Klinger
Category: Backend API
Start date: 2004-08-12
TYPO3 Version: 3.5.0 final

Description

The code $newParts = split('[[:space:]]|\)|\(',$v,2);
in the function can't properly extract a URL that is followed by a character
other than a space,
for example http://www.cantgetlinkproperly.de! or
http://www.cantgetlinkproperly.de<br><br>
The result is a wrong link in the table cache_md5params!

This occurs, for example, if you send a plaintext newsletter that embeds
HTML content objects containing HTML links.
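The behaviour can be reproduced with a minimal sketch. It makes two assumptions that are not from the report: split() no longer exists in PHP 7 and later, so preg_split() with the same alternation stands in for it, and the helper name firstUrlPart is hypothetical, used only to isolate the one line quoted above.

```php
<?php
// Hypothetical helper isolating the reported line.
// preg_split() stands in for the removed split() function.
function firstUrlPart(string $text): string
{
    // Same delimiters as the reported code: whitespace, ')' and '('.
    $parts = preg_split('/[[:space:]]|\)|\(/', $text, 2);
    return $parts[0];
}

// The HTML tags stay attached because '<' is not a delimiter.
echo firstUrlPart('http://www.cantgetlinkproperly.de<br><br> more text'), "\n";
// prints: http://www.cantgetlinkproperly.de<br><br>

// The trailing '!' stays attached as well.
echo firstUrlPart('http://www.cantgetlinkproperly.de! more text'), "\n";
// prints: http://www.cantgetlinkproperly.de!
```

In both cases the text up to the first space is treated as the URL, which matches the wrong links described above.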

(issue imported from #M284)

Related issues 1 (0 open — 1 closed)

Updated by old_hoang over 20 years ago

Changing

$newParts = split('[[:space:]]|\)|\(',$v,2);

to

$newParts = split("[[:space:]]|\)|\(|<",$v,2);

will handle HTML tags at the end of the URL. By the way, why are ( and ) in the regular expression?
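As a rough check of this change, here is a sketch with '<' added to the delimiters. It assumes preg_split() as a replacement for split(), which was removed in PHP 7; the helper name firstUrlPartWithLt is hypothetical.

```php
<?php
// Hypothetical helper; preg_split() stands in for the removed split().
function firstUrlPartWithLt(string $text): string
{
    // Delimiters: whitespace, ')', '(' and additionally '<'.
    $parts = preg_split('/[[:space:]]|\)|\(|</', $text, 2);
    return $parts[0];
}

// The URL now ends where the first HTML tag begins.
echo firstUrlPartWithLt('http://www.cantgetlinkproperly.de<br><br>'), "\n";
// prints: http://www.cantgetlinkproperly.de

// A trailing '!' is not affected by this change.
echo firstUrlPartWithLt('http://www.cantgetlinkproperly.de! more text'), "\n";
// prints: http://www.cantgetlinkproperly.de!
```

The second call shows that the '!' case from the description is not covered: RFC 2396 allows '!' inside a URL, so a delimiter list alone cannot tell a trailing '!' from one that belongs to the address.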

Updated by Ingmar Schlecht over 20 years ago

The '(' and ')' are in the regex for the same reason you added '<' to the list: to allow characters other than [:space:] to terminate the URL.

For example, in the following mail the ')' terminates the URL.

Dear User

You have won the car (see http://domain.tld/index.php?id=2&asdf)

Regards,
your sweepstakes team

I hope I have answered your question.

However, I don't really like your fix for the '<' problem.
I think the regexp should contain ALL characters that are not allowed in a URL.

After having a look at http://www.rfc-editor.org/rfc/rfc2396.txt, I'd say all of the following characters are possible as URL delimiters and should be checked by the regexp:
"<" | ">" | <"> | "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"

Consider a URL written like that:
"http://domain.tld/index.php?id=2&asdf"
Or like that:
<http://domain.tld/index.php?id=2&asdf>
Or like that:
(http://domain.tld/index.php?id=2&asdf)

All of those possibilities should be considered, and as the RFC forbids the use of these characters anyway, it should not be a problem. The only characters I'm not sure about are "[" and "]", because they are often illegally used by TYPO3 in URLs.

Anyway, if this bug gets fixed, the fix will not go into the 3.6 branch but rather into HEAD/3.7-dev.
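The delimiter set listed above could be sketched as a single character class. This is only an illustration and not the fix that was committed: preg_split() is assumed in place of split() (removed in PHP 7), and the helper name firstUrlPartRfc, its $bracketsTerminate flag, and the tx_ext[uid] parameter in the usage lines are hypothetical. Because of the doubt about "[" and "]", they are left out unless the flag is set.

```php
<?php
// Hypothetical helper: cut the text at the first character from the
// RFC 2396 based list discussed above.
function firstUrlPartRfc(string $text, bool $bracketsTerminate = false): string
{
    // Whitespace plus ( ) < > " { } | \ ^ `
    $class = '[:space:]()<>"{}|\\\\^`';
    if ($bracketsTerminate) {
        // Optional, since TYPO3 URLs often contain '[' and ']'.
        $class .= '\[\]';
    }
    $parts = preg_split('/[' . $class . ']/', $text, 2);
    return $parts[0];
}

// Closing quote, angle bracket and parenthesis all end the URL.
echo firstUrlPartRfc('http://domain.tld/index.php?id=2&asdf" rest'), "\n";
echo firstUrlPartRfc('http://domain.tld/index.php?id=2&asdf> rest'), "\n";
echo firstUrlPartRfc('http://domain.tld/index.php?id=2&asdf) rest'), "\n";
// each prints: http://domain.tld/index.php?id=2&asdf

// Brackets survive by default and terminate the URL only on request.
echo firstUrlPartRfc('http://domain.tld/index.php?tx_ext[uid]=5 rest'), "\n";
// prints: http://domain.tld/index.php?tx_ext[uid]=5
echo firstUrlPartRfc('http://domain.tld/index.php?tx_ext[uid]=5 rest', true), "\n";
// prints: http://domain.tld/index.php?tx_ext
```

Keeping the brackets behind a flag makes the trade-off visible: treating them as delimiters follows the RFC but truncates the bracketed parameters mentioned above.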

Updated by old_hoang over 20 years ago

Hello Ingmar,

But that's what my fix does! I don't understand! My problem is that I put a link into the newsletter as a typolink object with the wrap <br>|<br>. It is coded in TypoScript with the current page id, to give the newsletter reader a link to the page itself: the full newsletter with all images etc. stays on the server, and only a plaintext newsletter with a link to it is sent out. Because of the wrap, the link is not split correctly and is stored wrongly in the jumpurl table. If you want to add all characters that are not allowed, I'm fine with it.

Greets,

Chi
