Bug #89389

xmlRecompileFromStructValArray creates invalid HTML code

Added by Jo Hasenau about 1 month ago. Updated about 1 month ago.

Status:
Under Review
Priority:
Must have
Assignee:
Category:
Miscellaneous
Target version:
Start date:
2019-10-10
Due date:
% Done:

0%

TYPO3 Version:
10
PHP Version:
Tags:
Complexity:
easy
Is Regression:
Sprint Focus:

Description

While making use of xml2tree the function xmlRecompileFromStructValArray is involved to "implode an array of XML parts (made with xml_parse_into_struct()) into XML again."

If there are opening and closing tags in the same line, the type reported by xml_parse_into_struct is "complete".
In this case, there is a check for the content of the given tag and if that is empty, the tag will be made a self closing tag no matter what.

if ($type == 'complete') {
  if (isset($val['value'])) {
    $XMLcontent .= '>' . htmlspecialchars($val['value']) . '</' . $val['tag'] . '>';
  } else {
    $XMLcontent .= '/>';
  }
}

This will convert code like

<script type="text/javascript" src="https://some.tld/something.js"></script>
to
<script type="text/javascript" src="https://some.tld/something.js"/>
which is valid XML but invalid HTML5 and will lead to a broken frontend output.
Same for other tags like <span>

We found out about this behaviour, because translated HTML content elements were broken after exporting and reimporting them with the L10nmgr.
Still this should be a problem for any other usage of xml2tree dealing with empty tags, that should not be self closing.

This is broken in any supported version from CMS 6 up to CMS 10.

History

#1 Updated by Jo Hasenau about 1 month ago

  • Description updated (diff)

#2 Updated by Jo Hasenau about 1 month ago

Just for clarification here is a good explanation of the problems caused by falsely self closing tags: https://stackoverflow.com/questions/3558119/are-non-void-self-closing-tags-valid-in-html5

#3 Updated by Jo Hasenau about 1 month ago

Here is a list of tags that are officially considered to be void: https://riptutorial.com/html/example/4736/void-elements

#4 Updated by Georg Ringer about 1 month ago

so I guess it would be enough to just add a check for the mentioned tag names?

#5 Updated by Jo Hasenau about 1 month ago

Yep - I did some quick fixes for the L10nmgr so people don't have to wait for a new core release.
But basically that's it:
https://gitlab.com/coderscare/l10nmgr/blob/7-0/Classes/Model/Tools/XmlTools.php#L237

#6 Updated by Jo Hasenau about 1 month ago

  • Status changed from New to In Progress
  • Assignee set to Jo Hasenau

#7 Updated by Gerrit Code Review about 1 month ago

  • Status changed from In Progress to Under Review

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/61946

#8 Updated by Gerrit Code Review about 1 month ago

Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/61946

#9 Updated by Gerrit Code Review about 1 month ago

Patch set 3 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/61946

Also available in: Atom PDF