Bug #85635

Broken <script> tag after XML import

Added by Dmitry no-lastname-given over 5 years ago. Updated about 2 years ago.

Status: Closed
Priority: Should have
Assignee:
Category: Miscellaneous
Start date: 2018-07-24
Due date:
% Done: 0%
Estimated time:
TYPO3 Version: 8
PHP Version: 7.0
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

If bodytext contains <script src="/some/script.js"></script>, it is replaced with <script src="/script.js"/> after importing the XML file with the localization manager. This is not valid HTML and breaks the page. The translated file is valid XML containing valid HTML with a closing tag, so the replacement happens during the import process.

Actions #1

Updated by Coders.Care Extension Team over 5 years ago

  • Status changed from New to Needs Feedback

Could you please check your file again to see whether there is a <![CDATA[]]> section surrounding your HTML code?
This seems to make a difference, at least on our testing system: with CDATA, exactly your HTML code is imported unchanged, but without it the tag is imported as a self-closing tag.
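
For anyone who wants to check the CDATA behaviour outside TYPO3, here is a minimal sketch using only PHP's XML parser. The <data> wrapper element is an assumption made for this example, not the real L10nmgr file layout.

```php
<?php
// Parse a snippet whose HTML is wrapped in a CDATA section.
// The <data> element is a made-up wrapper for illustration only.
$cdataXml = '<data><![CDATA[<script src="/some/script.js"></script>]]></data>';

$cdataParser = xml_parser_create();
xml_parser_set_option($cdataParser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($cdataParser, XML_OPTION_SKIP_WHITE, 0);

$cdataVals = [];
xml_parse_into_struct($cdataParser, $cdataXml, $cdataVals);

// The parser never sees a <script> element here: the markup arrives as
// plain character data in the "value" of the surrounding element.
var_dump($cdataVals[0]['type']);
var_dump($cdataVals[0]['value']);
```

Because the script markup is only text to the parser, there is no empty element that could later be collapsed into a self-closing tag.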

Actions #2

Updated by Coders.Care Extension Team over 5 years ago

It seems to be a core bug, since the CatXmlImportManager uses

GeneralUtility::xml2tree

to transform the XML code.

That in turn uses xmlRecompileFromStructValArray to implode array structures back into XML.

If there is no value, this method automatically turns the element into a self-closing tag, even for tags that are not allowed to be self-closing and need an explicit closing tag.
If the type is detected as "cdata", the value is appended as escaped text instead of being rebuilt as a tag, which is why CDATA works.

    /**
     * This implodes an array of XML parts (made with xml_parse_into_struct()) into XML again.
     *
     * @param array $vals An array of XML parts, see xml2tree
     * @return string Re-compiled XML data.
     */
    public static function xmlRecompileFromStructValArray(array $vals)
    {
        $XMLcontent = '';
        foreach ($vals as $val) {
            $type = $val['type'];
            // Open tag:
            if ($type === 'open' || $type === 'complete') {
                $XMLcontent .= '<' . $val['tag'];
                if (isset($val['attributes'])) {
                    foreach ($val['attributes'] as $k => $v) {
                        $XMLcontent .= ' ' . $k . '="' . htmlspecialchars($v) . '"';
                    }
                }
                if ($type === 'complete') {
                    if (isset($val['value'])) {
                        $XMLcontent .= '>' . htmlspecialchars($val['value']) . '</' . $val['tag'] . '>';
                    } else {
                        $XMLcontent .= '/>';
                    }
                } else {
                    $XMLcontent .= '>';
                }
                if ($type === 'open' && isset($val['value'])) {
                    $XMLcontent .= htmlspecialchars($val['value']);
                }
            }
            // Finish tag:
            if ($type === 'close') {
                $XMLcontent .= '</' . $val['tag'] . '>';
            }
            // Cdata
            if ($type === 'cdata') {
                $XMLcontent .= htmlspecialchars($val['value']);
            }
        }
        return $XMLcontent;
    }

To get a better understanding of the types "open", "complete" and "close", see
http://php.net/manual/de/function.xml-parse-into-struct.php
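
To make the types concrete, here is a small sketch that parses a paragraph similar to the one from the description and prints what xml_parse_into_struct() reports for each part. The sample markup is made up, and setting the two parser options is an assumption about how xml2tree configures the parser.

```php
<?php
$sampleXml = '<p>Hello <script src="/some/script.js"></script> world</p>';

$parser = xml_parser_create();
// Assumed to match what xml2tree does: keep tag case and whitespace.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 0);

$vals = [];
xml_parse_into_struct($parser, $sampleXml, $vals);

// Print tag, type and value (if any) of every part the parser produced.
foreach ($vals as $val) {
    $value = isset($val['value']) ? var_export($val['value'], true) : '(not set)';
    printf("%-8s %-9s value=%s\n", $val['tag'], $val['type'], $value);
}
```

The empty script element is reported as "complete" without a "value" key, which is exactly the case that ends up in the '/>' branch of the method above.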

Actions #3

Updated by Coders.Care Extension Team over 5 years ago

  • Project changed from 240 to TYPO3 Core
  • Description updated (diff)
  • Category changed from 1636 to Miscellaneous
  • Assignee deleted (Jo Hasenau)
  • Target version changed from 3461 to next-patchlevel
  • PHP Version set to 7.0

Actions #4

Updated by Coders.Care Extension Team over 5 years ago

Since the method is quite old, I guess this happens with CMS 6 and 7 as well.

Actions #5

Updated by Coders.Care Extension Team over 5 years ago

According to the HTML5 spec, the following tags are "void" elements and may therefore be self-closing:

area, base, br, col, embed, hr, img, input, keygen, link, meta, param, source, track, wbr
Any other tag needs a closing tag to produce valid HTML5.
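
A rough sketch of how a recompile step could take this list into account. This is not the core method and not a proposed patch; the function name recompileHtmlAware is hypothetical, and it only illustrates the idea of emitting '/>' for void elements and an explicit closing tag for everything else.

```php
<?php
// Hypothetical variant of the recompile logic, for illustration only.
function recompileHtmlAware(array $vals): string
{
    // Void elements as listed above; all other tags get a closing tag.
    static $voidElements = [
        'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
        'keygen', 'link', 'meta', 'param', 'source', 'track', 'wbr',
    ];

    $out = '';
    foreach ($vals as $val) {
        $type = $val['type'];
        if ($type === 'open' || $type === 'complete') {
            $out .= '<' . $val['tag'];
            foreach ($val['attributes'] ?? [] as $k => $v) {
                $out .= ' ' . $k . '="' . htmlspecialchars($v) . '"';
            }
            $isVoid = in_array(strtolower($val['tag']), $voidElements, true);
            if ($type === 'complete' && !isset($val['value']) && $isVoid) {
                // Only void elements may be collapsed.
                $out .= '/>';
            } else {
                $out .= '>' . htmlspecialchars($val['value'] ?? '');
                if ($type === 'complete') {
                    $out .= '</' . $val['tag'] . '>';
                }
            }
        } elseif ($type === 'close') {
            $out .= '</' . $val['tag'] . '>';
        } elseif ($type === 'cdata') {
            $out .= htmlspecialchars($val['value']);
        }
    }
    return $out;
}

// Hand-built input in the shape produced by xml_parse_into_struct().
$parts = [
    ['tag' => 'p', 'type' => 'open', 'level' => 1, 'value' => 'Hello '],
    ['tag' => 'script', 'type' => 'complete', 'level' => 2, 'attributes' => ['src' => '/some/script.js']],
    ['tag' => 'br', 'type' => 'complete', 'level' => 2],
    ['tag' => 'p', 'type' => 'close', 'level' => 1],
];
$recompiled = recompileHtmlAware($parts);
echo $recompiled, "\n";
```

Whether a generic XML utility should know about HTML void elements at all is a separate design question; the sketch only shows that the distinction is cheap to make.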

Actions #6

Updated by Dmitry no-lastname-given over 5 years ago

Coders.Care Extension Team wrote:

Could you please check your file again to see whether there is a <![CDATA[]]> section surrounding your HTML code?
This seems to make a difference, at least on our testing system: with CDATA, exactly your HTML code is imported unchanged, but without it the tag is imported as a self-closing tag.

Hi,
Just checked: no, the block with the script tag is not wrapped in CDATA. Oddly, the file has 20 translated bodytext fields, but only 4 of them are wrapped in CDATA.

Actions #7

Updated by Coders.Care Extension Team over 5 years ago

  • Assignee set to Jo Hasenau

As a workaround, you can tick the "Do not check XML" checkbox. This automatically wraps content in CDATA to make sure it does not break the XML parser. This is only a workaround for the L10nmgr, though; the bug in xmlRecompileFromStructValArray should be fixed anyway.
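
If fields have to be wrapped by hand before the import, something along these lines might do. This is a sketch only: wrapInCdata is a hypothetical helper, not a function of L10nmgr or the TYPO3 core, and it does not claim to mirror what the checkbox does internally.

```php
<?php
// Hypothetical helper, not part of L10nmgr or the TYPO3 core.
function wrapInCdata(string $content): string
{
    // A literal "]]>" would end the section early, so split it
    // across two adjacent CDATA sections.
    $safe = str_replace(']]>', ']]]]><![CDATA[>', $content);
    return '<![CDATA[' . $safe . ']]>';
}

$wrapped = wrapInCdata('<script src="/some/script.js"></script>');
echo $wrapped, "\n";
```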

Actions #8

Updated by Dmitry no-lastname-given over 5 years ago

Coders.Care Extension Team wrote:

As a workaround, you can tick the "Do not check XML" checkbox. This automatically wraps content in CDATA to make sure it does not break the XML parser. This is only a workaround for the L10nmgr, though; the bug in xmlRecompileFromStructValArray should be fixed anyway.

It's already checked. If I don't check it, the export fails.

Actions #9

Updated by Benni Mack about 5 years ago

  • Target version changed from next-patchlevel to Candidate for patchlevel

Actions #10

Updated by Susanne Moog about 4 years ago

Can somebody summarize what the core bug here is and what needs fixing exactly?

Actions #11

Updated by Christian Kuhn about 2 years ago

  • Status changed from Needs Feedback to Closed

Hey. It is quite hard to understand what the real issue is in this case. A request from Susi to clarify has not been answered. At the moment I see no other solution than to close this issue for now. If the problem persists, we should probably start with a fresh issue describing the problem again.
