Project

General

Profile

Actions

Bug #14547

closed

XHTML compliance of typolink (non-compliant links are generated when additional parameters are used)

Added by Dmitry Dulepov about 19 years ago. Updated about 18 years ago.

Status:
Closed
Priority:
Should have
Category:
Frontend
Target version:
-
Start date:
2005-02-10
Due date:
% Done:

0%

Estimated time:
TYPO3 Version:
4.0
PHP Version:
4
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

Additional parameters to typolink must use "&" (not "&"), which is not XHTML-compliant.

Problem is located in several places, including:
- TSFE->cHashParams
- tslib_content->typolink
- ??? some others?

These function should be changed to both accept "&" and "&" (for compatibility) but convert "&" to "&"

(issue imported from #M772)


Files

class.tslib_content.php.typolink_amp.patch (2.91 KB) class.tslib_content.php.typolink_amp.patch Administrator Admin, 2005-05-18 14:39
class.tslib_content.php.amp.patch (10.2 KB) class.tslib_content.php.amp.patch Administrator Admin, 2005-05-18 14:39
class.tslib_menu.php.amp.patch (1.5 KB) class.tslib_menu.php.amp.patch Administrator Admin, 2005-05-18 14:44
class.tslib_fe.php.amp.patch (2.02 KB) class.tslib_fe.php.amp.patch Administrator Admin, 2005-05-18 14:59
class.t3lib_div.php.amp.patch (1.16 KB) class.t3lib_div.php.amp.patch Administrator Admin, 2005-05-18 15:18
bug772.diff (18 KB) bug772.diff Administrator Admin, 2005-05-22 22:23

Related issues 2 (0 open2 closed)

Related to TYPO3 Core - Bug #15362: template::getTabMenu double ampersandedClosedChristian Kuhn2006-01-04

Actions
Has duplicate TYPO3 Core - Bug #14648: getTypoLink (tslib_class.tslib_content.php) function produces non standard conform links.ClosedSebastian Kurfuerst2005-04-07

Actions
Actions #1

Updated by Ingmar Schlecht about 19 years ago

Hi Dmitry,

could you supply a patch?

cheers,
Ingmar

Actions #2

Updated by Martin Kutschker about 19 years ago

For tslib_content->getTypoLink the solution is easy. Replace:

$conf['additionalParams'].='&'.$k.'='.rawurlencode($v);

with

$conf['additionalParams'].='&'.$k.'='.rawurlencode($v);

Two more simple occurence is in frameParams (also found in tslib_content.php). tslib_content->typolink has quite a number of occurences, some are easy (just grep for '%).

Actions #3

Updated by Dmitry Dulepov about 19 years ago

I am not familar with tools to create patches (never needed them before). Actualy I just found a problem and did not have much time to make a deep investigation: our production server goes live these days...

Actions #4

Updated by Martin Kutschker about 19 years ago

All (I hope so...) functions in tslib_content.php:

function SEARCHRESULT
function imageLinkWrap($string,$imageFile,$conf)
function filelink($theValue, $conf)
function locDataJU($jumpUrl,$conf)
function http_makelinks($data,$conf)
function mailto_makelinks($data,$conf)
function typoLink($linktxt, $conf)
function getTypoLink($label,$params,$urlParameters=array(),$target='')
function getClosestMPvalueForPage($pageId, $raw=FALSE)
function editIcons($content,$params, $conf=array(), $currentRecord='', $dataArr=array(),$addUrlParamStr='')
function editPanelLinkWrap($string,$formName,$cmd,$currentRecord='',$confirm='')
function editPanelLinkWrap_doWrap($string,$url,$currentRecord)
function frameParams($setup, $typeNum)

Actions #5

Updated by Sacha Vorbeck about 19 years ago

keyword:accessibility

Actions #6

Updated by Bernhard Kraft almost 19 years ago

Do you mean it should output "&" as separator in GET parameters when an XHTML doctype is used. and "&" when an 4.n doctype is set ?

Actions #7

Updated by Martin Kutschker almost 19 years ago

Bernhard, no.

The point is that you have to quote ALL ampersands anywhere, be it in a text node (to speak DOM) or in an attribure value - in the document.

The actual GET query, uses of course the unquoted ampersand, as it is interpreted and sent by the client (ie the browser).

To sum up, in XHTML documents ampersands MUST be quoted, in HTML the MAY be quoted.

Masi

Actions #8

Updated by Martin Kutschker almost 19 years ago

tslib_content->getTypoLink now uses t3lib_div::implodeArrayForUrl which unfortunately is also not XHTML compliant.

Actions #9

Updated by Martin Kutschker almost 19 years ago

class.tslib_content.php.typolink_amp.patch fixes typolink()

class.tslib_content.php.amp.patch fixes the following methods:

SEARCHRESULT
imageLinkWrap
filelink
locDataJU
http_makelinks
mailto_makelinks
getClosestMPvalueForPage
frameParams
netprintApplication_offsiteLinkWrap

Actions #10

Updated by Martin Kutschker almost 19 years ago

class.tslib_menu.php.amp.patch fixes the &MP-params

class.tslib_fe.php.amp.patch fixes a few urls. Note that tslib_fe->simulateStaticDocuments_pEnc_onlyP_proc now uses a different param-split (for now and old-style URLs):

preg_split("/&(amp;)?/",$url)

Note: t3lib_div::cHashParams is used now by TSFE->cHashParams.

Actions #11

Updated by Martin Kutschker almost 19 years ago

class.t3lib_div.php.amp.patch fixes the following functions:

implodeArrayForUrl
explodeUrl2Array
cHashParams

Again explode has been substituted by preg_split.

Actions #12

Updated by Michael Stucki almost 19 years ago

Generally I'm fine with the patch, but are you really sure that this doesn't break anything?

Is it clear for - say - IE 4 that & needs to be translated to & in hyperlinks?

Unless we're 100% sure that this works, it should be configurable...

Actions #13

Updated by Martin Kutschker almost 19 years ago

Well, I'm sure all current browers can deal with & - I have a site running with soe of the patches (or variants) installed. But I haven't tested it with older browsers.

I'm also not sure that I've found all functions like explodeUrl2Array that process the generated URL (the incoming URL has only &).

As for configuring it: how? There are so many places that it seems like a very impractical thing to do. Any suggestions?

Generally speaking I'm aware that these patches are risky. I'd put it only in TYPO3 if I knew we have an RC2.

Actions #14

Updated by Michael Stucki almost 19 years ago

OK so let's postpone this change. Please find attached (bug772.diff) my compilation of all your previous changes, merged into one file and made against CVS 1 hour ago.

Actions #15

Updated by Ernesto Baschny over 18 years ago

This should enter 4.0, as this is a major XHTML-validation problem. Ampersant in URLs are always bad (and have always been). See http://www.htmlhelp.com/tools/validator/problems.html#amp

Its even dangerous if you don't encode the ampersant as & especially if you use variable names in the GET-query that are equally named as HTML-entities! See http://www.456bereastreet.com/archive/200406/ampersands_and_validation/

Actions #16

Updated by Ernesto Baschny over 18 years ago

Ok, I've given this thing a better though and came to the following conclusion:

1) internally we should always work with "&"
2) we only need to transform the "&" in an URL to "&" when creating a (X)HTML attribute (e.g. "href"), which will be sent to the browser.
3) we don't need to transform the "&" into "&" if we are sending a "Location:" header to the browser, AFAIK.
4) we always get a "&" from the browser (even if the link in the a-href is formed with "&"!) and from any TypoScript the user will write, as this has always been this way.

Also very important to note, as it looks as if the patch creator didn't considered this: htmlspecialchars() converts "&" into "&"!

So, going through the bug772.diff patch with all this in mind (skip to the end if the details bore you...):

Patch to TYPO3core/t3lib/class.t3lib_div.php:

a) function implodeArrayForUrl(): used internally, so we should return $str with "&", so patch not required
b) function explodeUrl2Array(): same as above! we are not outputting anything A-tag directly from here
c) function cHashParams(): same

Patch to TYPO3core/typo3/sysext/cms/tslib/class.tslib_content.php:

a) function SEARCHRESULT: patch changes & with & in $urlParams. But some lines later we call "htmlspecialchars($urlParams...)", so we are doing it twice, not needed! It already works as intended without the patch! The call is also replaced with:

htmlspecialchars($urlParams.'&spointer='.($spointer+$theRange))

you can see that this is nonsense, as htmlspecialchars takes care of & already! No patch needed here.

b) function netprintApplication_offsiteLinkWrap: same as above, patch changes "$url", later the link is constructed with "<a href="'.htmlspecialchars($url)...", so again are doing it twice! Wrong patch.
c) function imageLinkWrap: patch changes the "$params", which later is part of "$url", and which is again passed through htmlspecialchars() to construct the A-Tag. Patch wrong again.
d) function filelink(): same thing again for $initP, $url and later $icon. All go through htmlspecialchars(). Wrong patch.
e) function locDataJU(): no tag generated here, we are just generating the URL (to be used whoever knows where!). Its the responsibility of the caller to htmlspecialchars() if its going to be used inside an (X)HTML attribute, see 1)! As far as I can see, only filelink() calls this, and the output will go through htmlspecialchars before being put in an A-Tag. So patch not needed here.
f) function http_makelinks(): again, $initP and $res go through htmlspecialchars later. Patch not needed.
g) function mailto_makelinks(): same thing
h) function typoLink(): same stuff here. Patch is not correct. The only place where there is no htmlspecialchars() is for spamProtectEmailAddresses=ascii, as the URL will already be transformed to HTML-entities with TSFE->encryptEmail(). So its already correct.
i) function getClosestMPvalueForPage(): generates "&", but this is for internal use! So, as of 1), we keep it. All calls of this function make it go as $addParams to tmpl->linkData(), which is also just for internal use. Anyone outputting LD['totalURL'], for example, should htmlspecialchar's it. And skipping through some ocurrences of this, this is being done already.
j) function frameParams(): patch changes a commented out section, so not needed. New behaviour here is to go through linkData(). This returns some (X)html attributes (src="..." name="..."), but its being htmlspecialchar'ed, so no problem here. The only usage of this outputs the attributes directly, so correct too (no double-htmlspecialchar'ing)

Patch to TYPO3core/typo3/sysext/cms/tslib/class.tslib_fe.php:

a) function setIDfromArgV(): exploding on t3lib_div::getIndpEnv('QUERY_STRING')! As of 4), we will get a "&" from browser, not a "&", so patch is wrong here, current behaviour is correct.
b) function jumpUrl(): URL is composed for a Location: header. As of 3), no transformation is necessary, so patch is unnecessary.
c) function simulateStaticDocuments_pEnc_onlyP_proc(): exploding on linkVars, which come from the browser. As of 4), we don't need to explode on "&", patch not required. Same for the return value: internal-use, so no transformation yet to "&", the called should do this.

Patch to TYPO3core/typo3/sysext/cms/tslib/class.tslib_menu.php:

This is a bit more complicated, but to sum up: all things generated here are for internal use only, there's no direct href="" output going on here, so we should work with "&". So patch not needed.

In german I would say "lange Rede, kurzer Sinn": this patch is not needed, I cannot see any situation where TYPO3 isn't already generating "&" in HREF links. If someone can show me a setup (with 3.8.0) where this is the case, please let me know. It might be an extention that doesn't call htmlspecialchars() before generating an A-Tag, or something else not yet discovered.

The original bug reporter Dmitry said "Additional parameters to typolink must use "&" (not "&"), which is not XHTML-compliant.", is a bit confusing. So maybe he could explain better what is his setup, what is the current output and what is the expected output.

Also there is qcom_htmlcleaner which has a XCLASS for cObj, which makes typoLink translate & into &. I also don't understand why this is needed, as htmlspecialchars() is already doing that. If you use the URL directly (e.g. with "typolink.returnLast = url"), you have to be aware that you need to htmlSpecialChars' it (stdWrap function) before outputting it. Its a matter of discussion if we should htmlspecialchar() the URL before returning it, but as of 1), I would say no.

Actions #17

Updated by Michael Stucki over 18 years ago

Thanks Ernesto for your comments. I fully agree with you.
Masi: So can we close this bug?

- michael

Actions #18

Updated by Oliver Klee over 18 years ago

Well, AFAICT the kickstarter produces front end plug-ins that suffer from that problem (and if the extension authors are not aware of this, their plugins still have that problem). I've filed bug 0001890 about this, but I'm not sure whether this needs to be fixed in the kickstarter or in tslib_pibase.

Actions #19

Updated by Ernesto Baschny over 18 years ago

Please note that bug#1890 was a misunderstanding, no problem encountered there. So I suggest we close this (#772) and the other (#1890) bugs. Someone?

Actions #20

Updated by Michael Stucki over 18 years ago

Martin, can you please check if this bug can be closed?

Actions #21

Updated by Michael Stucki about 18 years ago

I'm still quite sure that this can be changed. Just waiting for Martins approval.

Actions #22

Updated by Martin Kutschker about 18 years ago

I'm willing to believe that Ernesto is right and that all problems arise with improper TS and broken extensions. But I have had no time to prove all claims.

Let's close the bug and open another when at some point any issue arises.

Actions #23

Updated by Axel Jindra about 18 years ago

There's one more point concerning XHTML 1.0 STRICT because there's no "target" attribute allowed. So if you set config.doctype = xhtml_strict the target attributes should be stripped.

Actions #24

Updated by Martin Kutschker about 18 years ago

Axel, does TYPO3 itself output any target-attributes where it shouldn't?

If you want no targets you have to check your TS/constants, so no target attributes are added.

Anyway, there is already code in place that manages target attributes, so please, name all the occurences you have found where a target is outputted (tag: form, a, ..., relevant settings of config, configuration of FORM or typolink if relevant).

Actions #25

Updated by Axel Jindra about 18 years ago

Hi Martin,
as far as I have tested there are no problems with target attributes from menus if you set
PAGE_TARGET =
styles.content.links.extTarget =
styles.content.links.target =,
from mail forms if you set tt_content.mailform.20.target=,
and so on,
or targets from RTE generated content, unless an editor sets the target explicitly.
You can disable the possibility to set a target for page elements by configuring $TCA['pages']['interface']['showRecordFieldList'], but not for content from SR's RTEhtmlarea extension.
So my note refers to my proposal [#002619] to disable the target attribute setting in the Typo3Browser Plugin with some UserTSConfig attribute.
If it's not done there, the Typo3 core should for example remove any target attribute from typolinks if the doctype is set to xhtml_strict.

Actions #26

Updated by Martin Kutschker about 18 years ago

Just picking tt_content.mailform.20.target as an example. This is the configuration of a FORM cObject. It might be argued that target must not be used when xhtml_strict is in effect even when the user did set a value.

If you want to modify the ouput of non-compliant plugins I suggest you use xhtml cleaning functions.

Actions #27

Updated by Ernesto Baschny about 18 years ago

I think we should close this bug, as the notes are getting pretty off-topic ("target-attribute": this is another issue). I'll willing to start some discussions on the Content Rendering Group about the issues that we want to solve in 4.1/4.5 and then new bug reports/feature requests will be generated. Thanks!

Actions #28

Updated by Martin Kutschker about 18 years ago

Closing the bug. Please file new bugs for specific issues of typolink handling or links within menus.

Actions

Also available in: Atom PDF