Task #96635

open

Improve XML handling of XmlEncoder and XmlDecoder

Added by Alexander Nitsche almost 3 years ago.

Status: New
Priority: Should have
Assignee: -
Category: -
Start date: 2022-01-24
Due date:
% Done: 0%
Estimated time:
TYPO3 Version: 12
PHP Version:
Tags:
Complexity:
Sprint Focus:

Description

During the review phase of issue #83580, the following follow-up improvements to the XML handling came up:

1) Refactor $additionalOptions

Referring to: \TYPO3\CMS\Core\Encoder\XmlEncoder
Christian Kuhn: I still wonder if we need this huge confusing $additionalOptions implementation in our generic encoder / decoder implementation: afair, EXT:impexp is the only place that uses this huge bandwidth of things. Maybe we should have a generic implementation that is much more streamlined and easier, and EXT:impexp then has an own solution that does all the additional magic?

2) Make $options class member

Referring to: \TYPO3\CMS\Core\Encoder\XmlDecoder
Referring to: \TYPO3\CMS\Core\Encoder\XmlEncoder
Oliver Hader: I think $options should be a class member variable, defined in a constructor - this way passing options to internal methods can be avoided.
Alexander Nitsche: Yes, could be. Leaving it as method argument underlines the immutable character of this class though. Having no objections to change it, if still wanted.
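
A minimal sketch of how both points could be reconciled, assuming PHP 8.1+: the options are set once in the constructor as a readonly property, so internal methods no longer need an $options argument and the instance still cannot be mutated. The class name, the withOptions() method and the 'indent' option are hypothetical and only serve as illustration, they are not the Core API.

```php
<?php

declare(strict_types=1);

// Hypothetical class for illustration only, not the TYPO3 Core implementation.
final class OptionsAwareEncoderSketch
{
    /** @param array<string, mixed> $options */
    public function __construct(private readonly array $options = [])
    {
    }

    // Returns a new instance with merged options, the original stays untouched.
    public function withOptions(array $options): self
    {
        return new self(array_replace($this->options, $options));
    }

    public function encode(array $input): string
    {
        return $this->encodeArray($input, 0);
    }

    // Internal methods read $this->options instead of receiving them as argument.
    // Array keys are assumed to be valid XML element names in this sketch.
    private function encodeArray(array $input, int $level): string
    {
        $indent = str_repeat((string)($this->options['indent'] ?? '  '), $level);
        $xml = '';
        foreach ($input as $key => $value) {
            if (is_array($value)) {
                $xml .= $indent . '<' . $key . '>' . "\n"
                    . $this->encodeArray($value, $level + 1)
                    . $indent . '</' . $key . '>' . "\n";
            } else {
                $xml .= $indent . '<' . $key . '>'
                    . htmlspecialchars((string)$value, ENT_XML1 | ENT_QUOTES)
                    . '</' . $key . '>' . "\n";
            }
        }
        return $xml;
    }
}

$encoder = new OptionsAwareEncoderSketch(['indent' => "\t"]);
echo $encoder->encode(['a' => ['b' => '1 < 2']]);
```

Whether a wither method like this is wanted at all would still need to be decided, the sketch only shows that class-member options and immutability do not exclude each other.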

3) Harden CDATA encoding

Referring to: \TYPO3\CMS\Core\Encoder\XmlEncoder::parseArray()
Oliver Hader: It should be ensured that $value does not contain any XML literals - currently it seems to be possible to leave the CDATA section and create a new node, e.g. with

$value = ']]><evil>whatever</evil><![CDATA['

In other words: $value needs to be handled as well for that scope.
Oliver Hader: This is how PHP would do it:
<?php

$dom = new \DOMDocument();
$node = $dom->createElement('data');
$dom->appendChild($node);
$cdata = $dom->createCDATASection(' x ]]><evil>whatever</evil><![CDATA[ x ');
$node->appendChild($cdata);

echo $dom->saveXML();
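
If the encoder keeps building the XML as a string instead of using DOM, one common way to harden the CDATA scope is to split every "]]>" in the value across two CDATA sections. The following is only a sketch of that technique; the function name wrapInCdataSketch() is hypothetical and not part of the Core. It does not deal with characters that are invalid in XML in general (e.g. NUL bytes).

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of TYPO3 Core: wraps a value in CDATA and
 * splits every "]]>" so the value cannot terminate the section early.
 */
function wrapInCdataSketch(string $value): string
{
    return '<![CDATA[' . str_replace(']]>', ']]]]><![CDATA[>', $value) . ']]>';
}

$value = ']]><evil>whatever</evil><![CDATA[';
$xml = '<data>' . wrapInCdataSketch($value) . '</data>';

$dom = new \DOMDocument();
$dom->loadXML($xml);

// After parsing, the payload should come back as plain text of <data>
// and no <evil> element should exist in the document.
var_dump($dom->documentElement->textContent === $value);
var_dump($dom->getElementsByTagName('evil')->length);
```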

4) Improve binary data detection

Referring to: \TYPO3\CMS\Core\Encoder\XmlEncoder::isBinaryValue()
Oliver Hader: So basically control characters (x00-x1f) except TAB, NL, CR. It seems x7f (DEL) should be included as well (https://en.wikipedia.org/wiki/Control_character).
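
As a rough sketch of the suggested check, assuming the detection is done with a byte-wise regular expression: the character class below covers x00-x1f without TAB (x09), NL (x0a) and CR (x0d), plus x7f. The function name is hypothetical, the current isBinaryValue() implementation may look different.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not the Core method: returns true if the value
 * contains a control character other than TAB, NL or CR, including DEL (x7f).
 */
function containsControlCharsSketch(string $value): bool
{
    // No /u modifier: the match is byte-wise. Bytes of UTF-8 multi-byte
    // sequences are all >= x80 and therefore never hit this class.
    return preg_match('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/', $value) === 1;
}

var_dump(containsControlCharsSketch("line one\r\n\tline two"));
var_dump(containsControlCharsSketch("abc\x7Fdef"));
```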


Related issues 1 (0 open, 1 closed)

Related to TYPO3 Core - Bug #83580: GeneralUtility::xml2array() can't parse bigger files (> 10MB), Closed, 2018-01-16
