Bug #14268
Closed
Function substUrlsInPlainText in class.t3lib_div.php cannot properly extract a URL that ends with a character other than a space
0%
Description
The code $newParts = split('[[:space:]]|\)|\(',$v,2);
in the function cannot properly extract a URL that ends with a character
other than a space.
For example: http://www.cantgetlinkproperly.de! or
http://www.cantgetlinkproperly.de<br><br>
The result is a wrong link in the table cache_md5params.
This occurs, for example, if you send a plain-text newsletter with embedded
HTML content objects that contain HTML links.
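A minimal sketch of the behaviour described above. It makes two assumptions: split() (POSIX regex) no longer exists in PHP 7 and later, so preg_split() with an equivalent pattern stands in for it, and urlAndRest() is a hypothetical helper name, not part of t3lib_div.

```php
<?php
// Sketch only: preg_split() replaces the removed split(); the terminators
// are the same as in the reported code (whitespace, ')' and '(').
// urlAndRest() is a hypothetical helper, not a TYPO3 function.
function urlAndRest(string $text): array
{
    // Limit 2: everything before the first terminator, then the rest.
    return preg_split('/[[:space:]]|\)|\(/', $text, 2);
}

$a = urlAndRest('http://www.cantgetlinkproperly.de! more text');
$b = urlAndRest('http://www.cantgetlinkproperly.de<br><br>more text');

echo $a[0], "\n"; // the '!' stays attached to the URL
echo $b[0], "\n"; // '<br><br>more' stays attached, up to the next space
```

In both cases the first element, which would be treated as the URL, contains trailing characters that do not belong to the link.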
(issue imported from #M284)
Updated by old_hoang over 20 years ago
Changing $newParts = split('[[:space:]]|\)|\(',$v,2);
to
$newParts = split("[[:space:]]|\)|\(|<",$v,2);
will do for HTML tags at the end of the URL. By the way, why are ( and ) in the regular expression?
Updated by Ingmar Schlecht over 20 years ago
The '(' and ')' are in the regex for the same reason you added '<' to the list: to allow characters other than [:space:] to terminate the URL.
For example, in the following mail the ')' terminates the URL.
Dear User
You have won the car (see http://domain.tld/index.php?id=2&asdf)
Regards,
your sweepstakes team
I hope I have answered your question.
However, I don't really like your fix to the problem about the '<' character.
I think the regexp should contain ALL characters that are not allowed in a URL.
After having a look at http://www.rfc-editor.org/rfc/rfc2396.txt, I'd say all of the following characters are possible as URL delimiters and should be checked by the regexp:
"<" | ">" | <"> | "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"
Consider a URL written like that:
"http://domain.tld/index.php?id=2&asdf"
Or like that:
<http://domain.tld/index.php?id=2&asdf>
Or like that:
(http://domain.tld/index.php?id=2&asdf)
All of those possibilities should be considered, and since the RFC forbids these characters in URLs anyway, it should not be a problem. The only characters I'm not sure about are "[" and "]", because they are often used illegally by Typo3 in URLs.
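A rough sketch of that idea, under a few assumptions: preg_split() is used instead of split(), extractUrl() is a hypothetical helper name, and "[" and "]" are deliberately left out of the terminator set because of the doubt mentioned above.

```php
<?php
// Hypothetical helper, not TYPO3 code: find the first "http://" and cut the
// URL at the first whitespace, parenthesis, or character that RFC 2396
// excludes from URIs: < > " { } | \ ^ `   ('[' and ']' intentionally omitted).
function extractUrl(string $text): ?string
{
    $start = strpos($text, 'http://');
    if ($start === false) {
        return null; // no URL in this text
    }
    // '\\\\' in a single-quoted PHP string is a literal backslash in the regex.
    $parts = preg_split('/[\s()<>"{}|\\\\^`]/', substr($text, $start), 2);
    return $parts[0];
}

$samples = [
    '"http://domain.tld/index.php?id=2&asdf"',
    '<http://domain.tld/index.php?id=2&asdf>',
    '(http://domain.tld/index.php?id=2&asdf)',
];
foreach ($samples as $sample) {
    echo extractUrl($sample), "\n";
}
```

With this terminator set, all three ways of writing the URL yield the same link without the surrounding quote, bracket, or parenthesis.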
Anyway, if there will be a fix to this bug, it will not go into the 3.6 branch but rather into HEAD/3.7-dev.
Updated by old_hoang over 20 years ago
Hello Ingmar,
But that's what my fix does! I don't understand. My problem is that I put a link into the newsletter as a typolink object with the wrap <br>|<br> (coded in TypoScript with the current page ID, to give the newsletter reader a link to the page itself, i.e. a server-based newsletter with all images etc., while only a plain-text newsletter with a link to it is sent out). Because of this, the link is not split correctly and is stored incorrectly in the jumpurl table. If you want to add all characters that are not allowed, I'm fine with that.
Greets,
Chi
Updated by old_chihoang over 19 years ago
Hello Ingmar,
Anyway, here is another fix (the problem should occur in Typo3 3.7 too):
$newParts = split('[[:space:]\<]|\)|\(',$v,2);
Greets,
Chi
Updated by Michael Stucki over 19 years ago
Hi Ingmar, have you fixed this yet? Is the bug still reproducible? I didn't test it myself...
Updated by Ingmar Schlecht over 19 years ago
No, I didn't fix it and I won't fix it for 3.8.0 because I don't have time for that right now.
Updated by Wolfgang Klinger over 18 years ago
Fixed in CVS.
It's a compromise:
the following characters are now allowed to terminate the URL:
any kind of whitespace (space, tab, ...) and
<>"{}|\^`()'
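As an illustration of what this set does and does not cover, here is a sketch that assumes the fix uses exactly the characters listed above. The committed code is not shown in this thread, so the pattern and the helper name splitAtTerminator() are assumptions, and preg_split() again stands in for split().

```php
<?php
// Hypothetical reconstruction of the terminator set listed above:
// whitespace plus < > " { } | \ ^ ` ( ) '
function splitAtTerminator(string $text): array
{
    // '\\\\' yields a literal backslash in the regex; \' is the apostrophe.
    $pattern = '/[\s<>"{}|\\\\^`()\']/';
    return preg_split($pattern, $text, 2);
}

$tag  = splitAtTerminator('http://www.cantgetlinkproperly.de<br><br>');
$bang = splitAtTerminator('http://www.cantgetlinkproperly.de! more text');

echo $tag[0], "\n";  // prints http://www.cantgetlinkproperly.de
echo $bang[0], "\n"; // prints http://www.cantgetlinkproperly.de!
```

Under that assumption the <br> case from the original report is handled, while a trailing '!' is still kept as part of the URL, because '!' is not in the list.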