Bug #14884
Closed
'removeTag' does not remove closing tags
Added by Robert Markula over 19 years ago. Updated about 7 years ago.
0%
Description
RTE.default.proc.entryHTMLparser_db.removeTags doesn't remove whole tags, only the opening tags. The closing tags remain.
RTE.default.proc.entryHTMLparser_db = 1
RTE.default.proc.entryHTMLparser_db {
removeTags= font
}
should do the trick, but this function removes only the opening <font> tags and leaves the closing </font> tags in place.
This bug has been observed with <font> and <p> tags in lists (<ul> and <ol>).
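For reference, the behaviour expected here can be sketched outside TYPO3. The function below is a hypothetical stand-in, not the t3lib_parsehtml implementation: removeTagsSketch() and its regular expression are assumptions made purely for illustration. It drops both the opening and the closing form of each listed tag and keeps the inner content.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical illustration only (not TYPO3 code): strip the opening AND
 * closing form of every tag named in $tagNames, keeping the inner content.
 */
function removeTagsSketch(string $html, array $tagNames): string
{
    if ($tagNames === []) {
        return $html;
    }
    $names = implode('|', array_map(
        static fn (string $name): string => preg_quote($name, '~'),
        $tagNames
    ));
    // "</?" makes the slash optional, so closing tags match as well.
    // The lookahead keeps "font" from also matching e.g. "<fontx>".
    $pattern = '~</?(?:' . $names . ')(?=[\s/>])[^>]*>~i';

    return preg_replace($pattern, '', $html) ?? $html;
}

echo removeTagsSketch('<ul><li><font color="red">item</font></li></ul>', ['font']);
// prints: <ul><li>item</li></ul>
```

A regular expression is enough for this illustration, but it is not a substitute for the real parser, which also has to deal with nesting and attribute rules.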
(issue imported from #M1318)
Files
1318.diff (546 Bytes) | Administrator Admin, 2010-10-18 11:47
test.t3x (3.2 KB) | Administrator Admin, 2010-12-20 19:32
Updated by Todd about 19 years ago
I've seen this happen with other tags, also, such as bold and span.
The closing tags remain.
Updated by Steffen Kamper over 18 years ago
it's in 4.0 also
lib.parseFunc_RTE.externalBlocks.table.stdWrap.HTMLparser.removeTags = p
this leaves the closing tag of p
important !
Updated by Robert Markula over 18 years ago
Let's try to get this fixed for 4.0.1. Does anybody have an idea where this error comes from?
I'm curious why this problem hasn't been recognized before, because it seems to be quite an obvious one.
Updated by Steffen Kamper over 18 years ago
Why do none of the developers take notice of this bug?
For me it's a fundamental bug and must be fixed ASAP.
Updated by Martin Kutschker over 18 years ago
Steffen, have you seen how many open bugs this bug tracker lists?
BTW, you could speed things up if you cared to provide a patch.
Updated by Jochen Weber over 18 years ago
I think the problem is the function t3lib_parsehtml->XHTML_Cleaner()
I do not understand the HTMLParser completely, but when I add the following lines after line 776 in the file class.t3lib_parsehtml.php, the parsing seems to work correctly:
else {
    if( $tagName=="p" && isset($tags['table']) ) {
        $setTag = false;
    }
}
Updated by Jochen Weber over 18 years ago
Bug-Id 0003528 seems to be the same...
Updated by Steffen Kamper over 18 years ago
@Martin - I know, but I saw the date, and if I had a solution, I would post it here ;)
I think this is a very complex thing; maybe it's because the parse function is called recursively, so the content is parsed twice.
Maybe a solution could be something like that:
# Disable repeated parsing of table cells so that content is not wrapped in p again
lib.parseFunc_RTE.externalBlocks.table.HTMLtableCells.default.callRecursive = 0
# Re-enable links in table cells afterwards
lib.parseFunc_RTE.externalBlocks.table.HTMLtableCells.default.stdWrap.parseFunc {
  makelinks = 1
  makelinks.http.keep = scheme,path,query
  makelinks.mailto.keep = path
  tags {
    link = TEXT
    link {
      current = 1
      typolink.parameter.data = parameters : allParams
    }
  }
}
but it's not a bugfix, only another way to work around the problem ...
Updated by Lars Houmark over 18 years ago
I am also getting this bug in 4.01 - does anybody have a consistent fix? I would be glad to test it ;) I have plenty of testdata...
Updated by Francois Suter almost 18 years ago
I confirm that the patch proposed by Jochen Weber (see note 0009514) works, but I can't judge whether it is an elegant solution or an ugly patch. It's probably not elegant, as it implies hardcoding a test on the p tag, but at least it works.
Updated by Martin Kutschker about 16 years ago
Also affects <span> tags (within paragraphs).
Updated by sharquedo about 16 years ago
Adding:
lib.parseFunc_RTE.externalBlocks.table.HTMLtableCells.default >
lib.parseFunc_RTE.externalBlocks.table.HTMLtableCells.default.stdWrap.parseFunc =< lib.parseFunc
to Page TSconfig seems to be a working alternative method to remove P tags from table data cells.
[quote]
Everything seems to work fine: The <p>-Elements I entered deliberately are
kept, no magical ones are added, and my links are transformed properly.
Ciao, Uschi
[/quote]
Source: http://lists.netfielders.de/pipermail/typo3-project-rte/2006-August/000726.html
Updated by Xavier Perseguers about 14 years ago
How to reproduce:
Configuration:
plugin.tx_myext_pi1 {
  parseFunc < lib.parseFunc
  parseFunc {
    allowTags = h1, b, i
    denyTags >
    nonTypoTagStdWrap.HTMLparser {
      allowTags < plugin.tx_myext_pi1.parseFunc.allowTags
      denyTags >
      keepNonMatchedTags = 0
      removeTags = center, font, o:p, sdfield, strike, u, span
      # Avoid content being HSC'ed twice
      htmlSpecialChars = 0
      tags {
        p.allowedAttribs = 0
        b < .p
        b.remap = strong
        i < .p
        i.remap = em
      }
    }
  }
}
in myext, function main():
$content = <<<EOD
<h1><span lang=FR-CH style='mso-ansi-language:FR-CH'>Voici un test<o:p></o:p></span></h1>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Avec du <span class=SpellE><span style='color:red'>contenu</span></span>
<i style='mso-bidi-font-style:normal'>pour <span class=SpellE>voir</span> </i>à
<span class=SpellE>quel</span> <b style='mso-bidi-font-weight:normal'>point <span
class=SpellE><span style='background:yellow;mso-highlight:yellow'>c’est</span></span>
<span class=SpellE><span class=GramE><u>moche</u></span></span></b><span
class=GramE> !</p>
EOD;
$content = $contentObj->parseFunc($content, $this->conf['parseFunc.']);
return $content;
Updated by Xavier Perseguers about 14 years ago
- trunk (rev. 9137)
- TYPO3_4-4 (rev. 9138)
- TYPO3_4-3 (rev. 9139)
- TYPO3_4-2 (rev. 9140)
and for the sake of completeness as the bug was reported for TYPO3 3.8:
- TYPO3_4-1 (rev. 9141)
- TYPO3_4-0 (rev. 9142)
- TYPO3_3-8-1 (rev. 9143)
- TYPO3_3-8 (rev. 9144)
Updated by Stanislas Rolland almost 14 years ago
This fix seems to be causing major problems: #0016760.
Updated by Xavier Perseguers almost 14 years ago
Do you have an idea how to fix both problems? The behavior before this fix was an invalid content stored in the DB.
Updated by Uwe Wiebach almost 14 years ago
As written at 0016760:
Could this be a solution (not much testing done):
$setTag = ($endTag || trim($tagParts1)) ? 1 : !$tags[$tagName]['rmTagIfNoAttrib'];
Updated by Stanislas Rolland almost 14 years ago
$setTag = 1 must remain.
In this instance of the processing loop, if we are processing a closing tag (endTag), then the contents trim($tagParts1) is always null because it contains the contents of the closing tag, not the contents of the opening tag. At this stage, there is no obvious way to test the contents of the opening tag in order to remove the closing tag if the opening tag was removed.
Updated by Stanislas Rolland almost 14 years ago
I think you could do:
$setTag = 1;
// Remove endtag if $tagName was among removeTags
if ($endTag && $tags[$tagName]['allowedAttribs'] == 0 && $tags[$tagName]['rmTagIfNoAttrib'] == 1) {
    $setTag = 0;
}
Updated by Stanislas Rolland almost 14 years ago
I must admit however, that I am unable to reproduce the reported issue with current SVN branch 4.4. (and $setTag = 1;).
Updated by Xavier Perseguers almost 14 years ago
Just created a test extension (attached) to reproduce the bug.
Your solution behaves the same as the patch that did that:
$setTag = !$tags[$tagName]['rmTagIfNoAttrib'];
But actually it does not work if you see the output of the plugin. However I'm really sure it worked when I committed the patch (I don't say anything about the related, nasty, bug) but now the end tag is always removed, even for allowed tags. Perhaps this little extension helps understanding the real problem.
Updated by Stanislas Rolland almost 14 years ago
Interesting.
Your example HTML is not valid. The italic and bold tags are not closed. I suppose you would need to set
plugin.tx_test_pi1.parseFunc.nonTypoTagStdWrap.HTMLparser.globalNesting = i,em,b,strong
But then your italic and bold tags will also be removed because not closed.
I was entering your example HTML in the RTE in text mode. But the RTE was closing these tags. So I was unable to see any problem...
If you close these tags in your example text, I think that you will see that my solution works.
Updated by Stanislas Rolland almost 14 years ago
Well, I found another case where my proposal also breaks the RTE transformation but in another way.
This seems to depend on other parts of the parser configuration.
I don't think I can find a solution today...
Updated by Stanislas Rolland almost 14 years ago
ok, here it is:
$setTag = 1;
// Remove this closing tag if $tagName was among $TSconfig['removeTags']
if ($endTag && $tags[$tagName]['allowedAttribs'] === 0 && $tags[$tagName]['rmTagIfNoAttrib'] === 1) {
    $setTag = 0;
}
This is because of the way HTMLparserConfig works.
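A standalone sketch of that guard may make the role of the strict comparison clearer. It assumes, as the comment in the snippet suggests, that tags listed in removeTags end up in the parser configuration with integer values (allowedAttribs = 0, rmTagIfNoAttrib = 1), whereas values set through a TypoScript tags block arrive as strings. That difference is an assumption and has not been checked against the TYPO3 source here; keepTagSketch() and the sample array are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical illustration of the guard above (not TYPO3 code).
 * Returns false only for a closing tag whose configuration carries the
 * integer markers assumed to be written for removeTags entries.
 * Opening tags are not decided here; they always return true.
 */
function keepTagSketch(array $tags, string $tagName, bool $endTag): bool
{
    $conf = $tags[$tagName] ?? [];
    $setTag = true;
    if ($endTag
        && ($conf['allowedAttribs'] ?? null) === 0
        && ($conf['rmTagIfNoAttrib'] ?? null) === 1
    ) {
        $setTag = false;
    }
    return $setTag;
}

$tags = [
    // Assumed shape of a removeTags entry: integer values.
    'font' => ['allowedAttribs' => 0, 'rmTagIfNoAttrib' => 1],
    // Assumed shape of an entry configured via TypoScript: string values.
    'p' => ['allowedAttribs' => '0', 'rmTagIfNoAttrib' => '1'],
];

var_dump(keepTagSketch($tags, 'font', true)); // bool(false): closing </font> is dropped
var_dump(keepTagSketch($tags, 'p', true));    // bool(true): closing </p> is kept
```

With a loose == comparison the p entry in this sample would match as well, because '0' == 0 and '1' == 1 in PHP, so the closing tag of an allowed tag would be dropped too.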
Updated by Francois Suter almost 14 years ago
This works for me. I'll notify people in issue 0016760 so that we have more tests.
Updated by Xavier Perseguers almost 14 years ago
OK. Interesting. Then all is good. I did not have a further look at this example, this was copy and paste from Word at some point of time but I guess I missed some closing tags somewhere in my process ;-)
Good to see that this will work.
Thanks for this further testing.
Updated by Xavier Perseguers almost 14 years ago
Ah! I know where those tags were lost... Yesterday, I copied my example HTML from this bugtracker entry and did not notice that Mantis removed many tags. My original code (which I did not have anymore since I wrote it there) was valid.
Updated by Stanislas Rolland almost 14 years ago
See patch attached to issue #24349.
Updated by Oliver Hader over 13 years ago
- Target version changed from 1076 to 4.2.17
Updated by Riccardo De Contardi about 7 years ago
- Status changed from Resolved to Closed