Bug #105943
Single quote encoded as &apos; in rte for link attributes
Description
Context: Richtext editor for bodytext. Rendering with Fluid Styled Content.
Input HTML: <p>This is an external <a href="https://somesite.tld" title="I'm the link">link</a></p>
Typed in the RTE, with the link set to an external URL.
Result in database: <p>This is an external <a href="https://somesite.tld" title="I&apos;m the link">link</a></p>
<p>This is an external <a href="https://somesite.tld" title="I'm the link">link</a></p>
<p>This is an external <a href="https://somesite.tld" title="I'm the link">link</a></p>
<p>This is an external <a href="https://somesite.tld" title="I'm the link">link</a></p>
I observed that TYPO3\CMS\Core\Html\HtmlParser::get_tag_attributes uses htmlspecialchars_decode without flags. Using the ENT_QUOTES flag would decode the single quote entity (&apos;).
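To make the decoding behaviour concrete, here is a minimal sketch in plain PHP, outside of TYPO3. The flag combinations are chosen for illustration and are not taken from the TYPO3 source. As far as the PHP documentation goes, &apos; is only recognized for the ENT_XML1, ENT_XHTML and ENT_HTML5 document types, so ENT_QUOTES on its own (with the default ENT_HTML401) may not be enough for this entity.

```php
<?php
// Attribute value as it would be stored after sanitizing.
$stored = 'I&apos;m the link';

// HTML 4.01 document type: &apos; is not a known entity there and stays as is.
$html401 = htmlspecialchars_decode($stored, ENT_QUOTES | ENT_HTML401);

// The numeric form &#039; is decoded with ENT_QUOTES, even for HTML 4.01.
$numeric = htmlspecialchars_decode('I&#039;m the link', ENT_QUOTES | ENT_HTML401);

// HTML5 document type plus ENT_QUOTES: &apos; becomes a literal single quote.
$html5 = htmlspecialchars_decode($stored, ENT_QUOTES | ENT_HTML5);

echo $html401, PHP_EOL; // I&apos;m the link
echo $numeric, PHP_EOL; // I'm the link
echo $html5, PHP_EOL;   // I'm the link
```

So a change to get_tag_attributes would probably need a document type flag in addition to ENT_QUOTES.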
I also observed that TYPO3\CMS\Core\Utility\GeneralUtility::implodeAttributes uses htmlspecialchars without flags.
I also observed that TYPO3\HtmlSanitizer\Serializer\Rules::enc uses htmlspecialchars with flags ENT_HTML5 and ENT_QUOTES, resulting in &apos; for single quotes. It is used in TYPO3\CMS\Core\Html\RteHtmlParser::htmlSanitize, called from TYPO3\CMS\Core\DataHandling\DataHandler.
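The effect of the flags on the encoding side can be checked with plain PHP. This is only a sketch of htmlspecialchars itself, not of the TYPO3 or sanitizer code. The flags are passed explicitly because the defaults differ between PHP versions.

```php
<?php
$title = "I'm the link";

// Flags as described for the sanitizer: the single quote becomes &apos;.
$html5 = htmlspecialchars($title, ENT_QUOTES | ENT_HTML5);

// Same quote handling, HTML 4.01 document type: numeric entity instead.
$html401 = htmlspecialchars($title, ENT_QUOTES | ENT_HTML401);

// ENT_COMPAT only touches double quotes, the single quote is kept.
$compat = htmlspecialchars($title, ENT_COMPAT | ENT_HTML401);

echo $html5, PHP_EOL;   // I&apos;m the link
echo $html401, PHP_EOL; // I&#039;m the link
echo $compat, PHP_EOL;  // I'm the link
```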
Thus when passing the initial value to TYPO3\CMS\Core\Html\RteHtmlParser::transformTextForPersistence from DataHandler, it is transformed to <p>This is an external <a href="https://somesite.tld" title="I'm the link">link</a></p> before the call to htmlSanitize and to <p>This is an external <a href="https://somesite.tld" title="I&apos;m the link">link</a></p> after the call, and persisted as is.
When displaying the value in the frontend, the value is passed to parseFunc and finally TYPO3\CMS\Frontend\ContentObject\ContentObjectRenderer::parseFuncInternal is called. This uses TYPO3\CMS\Core\Utility\GeneralUtility::get_tag_attributes, which does the same as TYPO3\CMS\Core\Html\HtmlParser::get_tag_attributes without the metadata. Thus &apos; is not decoded and is then incorrectly encoded as &amp;apos; by GeneralUtility::implodeAttributes, through the TypoLink behaviour in TYPO3\CMS\Frontend\Typolink\LinkFactory::addAdditionalAnchorTagAttributes, which reads parameters from the parseFuncInternal definition of TYPO3\CMS\Frontend\ContentObject\ContentObjectRenderer::parameters.
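The double encoding can be reproduced with the two PHP functions alone. The sketch below assumes, as described above, that both calls are made without explicit flags. It does not call any TYPO3 code, and the variable names are only illustrative.

```php
<?php
// Value as persisted, with the single quote stored as &apos;.
$stored = 'I&apos;m the link';

// Decoding without flags uses the HTML 4.01 document type,
// which does not know &apos;, so the entity survives unchanged.
$decoded = htmlspecialchars_decode($stored);

// Encoding without flags then escapes the leading ampersand again.
$rendered = htmlspecialchars($decoded);

echo $decoded, PHP_EOL;  // I&apos;m the link
echo $rendered, PHP_EOL; // I&amp;apos;m the link
```

A browser shows the second value as the literal text &apos; in the title attribute.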
When displaying the value in the backend editor, the value is passed to TYPO3\CMS\Core\Html\RteHtmlParser::transformTextForRichTextEditor, which makes use of TYPO3\CMS\Core\Html\HtmlParser::get_tag_attributes and TYPO3\CMS\Core\Utility\GeneralUtility::implodeAttributes, which do not understand &apos; and encode it as &amp;apos;.
I would recommend changing the methods get_tag_attributes to decode single quotes.
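As a rough illustration of what symmetric handling could look like, the sketch below decodes and encodes with the same flag set. The two helper functions and the constant are hypothetical names, not TYPO3 API and not a proposed patch. They only show that the value survives a round trip when both directions agree on ENT_QUOTES and ENT_HTML5.

```php
<?php
declare(strict_types=1);

// Hypothetical shared flag set for both directions.
const ATTRIBUTE_FLAGS = ENT_QUOTES | ENT_HTML5;

// Hypothetical helper: turn a stored attribute value into plain text.
function decodeAttributeValue(string $value): string
{
    return htmlspecialchars_decode($value, ATTRIBUTE_FLAGS);
}

// Hypothetical helper: turn plain text back into an attribute value.
function encodeAttributeValue(string $value): string
{
    return htmlspecialchars($value, ATTRIBUTE_FLAGS);
}

$stored = 'I&apos;m the link';
$plain = decodeAttributeValue($stored);
$again = encodeAttributeValue($plain);

echo $plain, PHP_EOL; // I'm the link
echo $again, PHP_EOL; // I&apos;m the link
```

A check like the last three lines could serve as a starting point for a regression test of this scenario.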
I have checked the main branch on GitHub for changes in these methods and saw none.
Updated by Garvin Hicking about 1 month ago
- Category set to Link Handling & Redirect Handling
(Thanks for this detailed report! Will try to see if this is still reproducible in main, as some things in the sanitizing have changed which may affect this. Of course this is something with security relevance so we'll need to add tests for this scenario)
Updated by Pierrick Caillon about 1 month ago
- Subject changed from Quote encoded as &apos; in rte for link attributes to Single quote encoded as &apos; in rte for link attributes
- Description updated (diff)
Changed "quote" to "single quote" for correct understanding.
Updated by Oliver Hader 27 days ago
TYPO3\CMS\Core\Html\RteHtmlParser::htmlSanitize is only processed if the feature flag security.backend.htmlSanitizeRte is enabled (which is still disabled per default, to avoid invalid HTML being sanitized/destroyed in the database).
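For anyone trying to reproduce this, the flag can presumably be switched on like other feature toggles. The file location below is an assumption and depends on the TYPO3 version and installation type (for example config/system/additional.php in a v12 Composer setup).

```php
<?php
// Assumed location: the installation's additional configuration file.
// Enables sanitizing of RTE content on persistence.
$GLOBALS['TYPO3_CONF_VARS']['SYS']['features']['security.backend.htmlSanitizeRte'] = true;
```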
Updated by Oliver Hader 27 days ago
I changed the complexity from "easy" to "medium", since changing the encoding/decoding behavior of HTML strings bears a high risk of introducing new regressions, or even new security vulnerabilities.