Bug #14884

closed

'removeTag' does not remove closing tags

Added by Robert Markula almost 19 years ago. Updated over 6 years ago.

Status: Closed
Priority: Should have
Category: Frontend
Target version:
Start date: 2005-07-29
Due date:
% Done: 0%
Estimated time:
TYPO3 Version: 4.5
PHP Version: 5.3
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

RTE.default.proc.entryHTMLparser_db.removeTags removes only the opening tags, not the whole tags. The closing tags remain.

RTE.default.proc.entryHTMLparser_db = 1
RTE.default.proc.entryHTMLparser_db {
  removeTags = font
}
should do the trick, but it removes only the opening <font> tags and leaves the closing </font> tags in place.

This bug has been observed with <font> and <p> tags in lists (<ul> and <ol>).
(issue imported from #M1318)


Files

1318.diff (546 Bytes), Administrator Admin, 2010-10-18 11:47
test.t3x (3.2 KB), Administrator Admin, 2010-12-20 19:32

Related issues 3 (0 open, 3 closed)

Related to TYPO3 Core - Bug #24349: RTE transformation removes all span tags on save after upgrade TYPO3 4.4.4 to 4.4.5 (Closed, Stanislas Rolland, 2010-12-17)
Has duplicate TYPO3 Core - Bug #16165: HTMLparser.removeTags doesn't work correct with p (Closed, Xavier Perseguers, 2006-05-21)
Has duplicate TYPO3 Core - Bug #23699: wrong table-cell rendering in RTE included tables (Closed, Stanislas Rolland, 2010-10-10)

Actions #1

Updated by Todd over 18 years ago

I've seen this happen with other tags, also, such as bold and span.
The closing tags remain.

Actions #2

Updated by Steffen Kamper almost 18 years ago

It's in 4.0 as well.

lib.parseFunc_RTE.externalBlocks.table.stdWrap.HTMLparser.removeTags = p

This leaves the closing tag of p in place.
Important!

Actions #3

Updated by Robert Markula almost 18 years ago

Let's try to get this fixed for 4.0.1. Does anybody have an idea where this error comes from?

I'm curious why this problem hasn't been recognized before, because it seems to be quite an obvious one.

Actions #4

Updated by Steffen Kamper almost 18 years ago

Why does none of the developers take notice of this bug?
For me it's a fundamental bug that must be fixed ASAP.

Actions #5

Updated by Martin Kutschker almost 18 years ago

Steffen, have you seen how many open bugs this bug tracker lists?

BTW, you could speed things up if you cared to provide a patch.

Actions #6

Updated by Jochen Weber almost 18 years ago

I think the problem is in the function t3lib_parsehtml->XHTML_Cleaner()

I do not understand the HTMLParser completely, but when I add the following lines after line 776 of the file class.t3lib_parsehtml.php, the parsing seems to work correctly:

else {
    // Drop p tags when a table configuration is present
    if ($tagName == "p" && isset($tags['table'])) {
        $setTag = false;
    }
}

Actions #7

Updated by Jochen Weber almost 18 years ago

Bug-Id 0003528 seems to be the same...

Actions #8

Updated by Steffen Kamper almost 18 years ago

@Martin - I know, but I saw the date, and if I had a solution, I would post it here ;)

I think this is a very complex thing. Maybe it's because the parse function is called recursively, so the content is parsed twice.

Maybe a solution could be something like this:
# Disable re-parsing of table cells so that content is not wrapped in p again
lib.parseFunc_RTE.externalBlocks.table.HTMLtableCells.default.callRecursive = 0

# Re-enable links in table cells afterwards
lib.parseFunc_RTE.externalBlocks.table.HTMLtableCells.default.stdWrap.parseFunc {
  makelinks = 1
  makelinks.http.keep = scheme,path,query
  makelinks.mailto.keep = path
  tags {
    link = TEXT
    link {
      current = 1
      typolink.parameter.data = parameters : allParams
    }
  }
}

But it's not a bugfix, only a workaround ...

Actions #9

Updated by Lars Houmark almost 18 years ago

I am also getting this bug in 4.01 - does anybody have a consistent fix? I would be glad to test it ;) I have plenty of testdata...

Actions #10

Updated by Francois Suter over 17 years ago

I confirm that the patch proposed by Jochen Weber (see note 0009514) works, but I can't judge whether it is an elegant solution or an ugly patch. It's probably not elegant, as it implies hardcoding a test on the p tag, but at least it works.

Actions #11

Updated by Martin Kutschker over 15 years ago

Also affects <span> tags (within paragraphs).

Actions #12

Updated by sharquedo over 15 years ago

Adding:

lib.parseFunc_RTE.externalBlocks.table.HTMLtableCells.default >
lib.parseFunc_RTE.externalBlocks.table.HTMLtableCells.default.stdWrap.parseFunc =< lib.parseFunc

to Page TSconfig seems to be a working alternative for removing P tags from table data cells.

[quote]

Everything seems to work fine: The <p>-Elements I entered deliberately are
kept, no magical ones are added, and my links are transformed properly.

Ciao, Uschi

[/quote]

Source: http://lists.netfielders.de/pipermail/typo3-project-rte/2006-August/000726.html

Actions #13

Updated by Xavier Perseguers over 13 years ago

How to reproduce :

Configuration:

plugin.tx_myext_pi1 {

  parseFunc < lib.parseFunc
  parseFunc {
    allowTags = h1, b, i
    denyTags >
    nonTypoTagStdWrap.HTMLparser {
      allowTags < plugin.tx_myext_pi1.parseFunc.allowTags
      denyTags >
      keepNonMatchedTags = 0
      removeTags = center, font, o:p, sdfield, strike, u, span
      # Avoid content being HSC'ed twice
      htmlSpecialChars = 0
      tags {
        p.allowedAttribs = 0
        b < .p
        b.remap = strong
        i < .p
        i.remap = em
      }
    }
  }
}

in myext, function main():

$content = <<<EOD
<h1><span lang=FR-CH style='mso-ansi-language:FR-CH'>Voici un test<o:p></o:p></span></h1>

<p class=MsoNormal><o:p>&nbsp;</o:p></p>
<p class=MsoNormal>Avec du <span class=SpellE><span style='color:red'>contenu</span></span>
<i style='mso-bidi-font-style:normal'>pour <span class=SpellE>voir</span> </i>à
<span class=SpellE>quel</span> <b style='mso-bidi-font-weight:normal'>point <span
class=SpellE><span style='background:yellow;mso-highlight:yellow'>c’est</span></span>
<span class=SpellE><span class=GramE><u>moche</u></span></span></b><span
class=GramE> !</p>
EOD;

$content = $contentObj->parseFunc($content, $this->conf['parseFunc.']);

return $content;

Actions #14

Updated by Xavier Perseguers over 13 years ago

- trunk (rev. 9137)
- TYPO3_4-4 (rev. 9138)
- TYPO3_4-3 (rev. 9139)
- TYPO3_4-2 (rev. 9140)

and for the sake of completeness as the bug was reported for TYPO3 3.8:

- TYPO3_4-1 (rev. 9141)
- TYPO3_4-0 (rev. 9142)
- TYPO3_3-8-1 (rev. 9143)
- TYPO3_3-8 (rev. 9144)

Actions #15

Updated by Stanislas Rolland over 13 years ago

This fix seems to be causing major problems: #0016760.

Actions #16

Updated by Xavier Perseguers over 13 years ago

Do you have an idea how to fix both problems? Before this fix, invalid content was stored in the DB.

Actions #17

Updated by Uwe Wiebach over 13 years ago

As written at 0016760:

Could this be a solution (not much testing done):

$setTag = ($endTag || trim($tagParts[1])) ? 1 : !$tags[$tagName]['rmTagIfNoAttrib'];

Actions #18

Updated by Stanislas Rolland over 13 years ago

$setTag = 1 must remain.

In this instance of the processing loop, if we are processing a closing tag (endTag), then trim($tagParts[1]) is always empty, because it holds the contents of the closing tag, not those of the opening tag. At this stage, there is no obvious way to test the contents of the opening tag in order to remove the closing tag when the opening tag was removed.
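
To illustrate the point about closing tags, here is a minimal standalone sketch (not the actual t3lib_parsehtml code). The helper splitTagSketch() is hypothetical. It only assumes that a tag is split into a name and an attribute string at the first whitespace, which means a closing tag never yields any attribute content to test.

```php
<?php
// Hypothetical helper for illustration only, not part of t3lib_parsehtml.
// Splits one tag into [isEndTag, tagName, attributeString].
function splitTagSketch(string $tag): array
{
    // Strip the angle brackets and surrounding whitespace
    $inner = trim($tag, "<> \t\n\r");
    // A leading slash marks a closing tag
    $isEndTag = strncmp($inner, '/', 1) === 0;
    $inner = ltrim($inner, '/');
    // Split at the first run of whitespace: name first, attributes second
    $parts = preg_split('/\s+/', $inner, 2);
    $attributes = isset($parts[1]) ? trim($parts[1]) : '';
    return [$isEndTag, strtolower($parts[0]), $attributes];
}
```

For </font> the attribute part comes back as an empty string, so a check based on it cannot tell whether the matching opening tag was removed.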

Actions #19

Updated by Stanislas Rolland over 13 years ago

I think you could do:

$setTag = 1;
// Remove endtag if $tagName was among removeTags
if ($endTag && $tags[$tagName]['allowedAttribs'] == 0 && $tags[$tagName]['rmTagIfNoAttrib'] == 1) {
    $setTag = 0;
}

Actions #20

Updated by Stanislas Rolland over 13 years ago

I must admit however, that I am unable to reproduce the reported issue with current SVN branch 4.4. (and $setTag = 1;).

Actions #21

Updated by Xavier Perseguers over 13 years ago

Just created a test extension (attached) to reproduce the bug.

Your solution behaves the same as the patch that did that:

$setTag = !$tags[$tagName]['rmTagIfNoAttrib'];

But actually it does not work if you see the output of the plugin. However I'm really sure it worked when I committed the patch (I don't say anything about the related, nasty, bug) but now the end tag is always removed, even for allowed tags. Perhaps this little extension helps understanding the real problem.

Actions #22

Updated by Stanislas Rolland over 13 years ago

Interesting.

Your example HTML is not valid. The italic and bold tags are not closed. I suppose you would need to set
plugin.tx_test_pi1.parseFunc.nonTypoTagStdWrap.HTMLparser.globalNesting = i,em,b,strong

But then your italic and bold tags will also be removed because not closed.

I was entering your example HTML in the RTE in text mode. But the RTE was closing these tags. So I was unable to see any problem...

If you close these tags in your example text, I think that you will see that my solution works.

Actions #23

Updated by Stanislas Rolland over 13 years ago

Well, I found another case where my proposal also breaks the RTE transformation but in another way.

This seems to depend on other parts of the parser configuration.

I don't think I can find a solution today...

Actions #24

Updated by Stanislas Rolland over 13 years ago

ok, here it is:

$setTag = 1;
// Remove this closing tag if $tagName was among $TSconfig['removeTags']
if ($endTag && $tags[$tagName]['allowedAttribs'] === 0 && $tags[$tagName]['rmTagIfNoAttrib'] === 1) {
    $setTag = 0;
}

This is because of the way HTMLparserConfig works.
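
For anyone who wants to see the effect in isolation, here is a minimal sketch of the idea, not the core implementation. Both functions are hypothetical stand-ins. The sketch assumes, as the condition above implies, that every removeTags entry ends up in the tag configuration with allowedAttribs = 0 and rmTagIfNoAttrib = 1 as integers. The regular expression is a simplification of the real tag splitting.

```php
<?php
// Hypothetical stand-in for the removeTags handling in HTMLparserConfig.
// Assumption: each listed tag gets integer 0 / 1 for the two settings.
function removeTagsConfigSketch(string $removeTags): array
{
    $tags = [];
    $names = array_filter(array_map('trim', explode(',', strtolower($removeTags))));
    foreach ($names as $name) {
        $tags[$name] = ['allowedAttribs' => 0, 'rmTagIfNoAttrib' => 1];
    }
    return $tags;
}

// Hypothetical, heavily simplified cleaner loop. $withFix switches the
// closing-tag check from this comment on or off.
function cleanSketch(string $html, array $tags, bool $withFix): string
{
    return preg_replace_callback(
        '#<(/?)([a-zA-Z][a-zA-Z0-9:]*)([^>]*)>#',
        function (array $m) use ($tags, $withFix): string {
            $endTag = $m[1] === '/';
            $tagName = strtolower($m[2]);
            if (!isset($tags[$tagName])) {
                return $m[0]; // tag is not configured: keep it untouched
            }
            $conf = $tags[$tagName];
            if ($endTag) {
                $setTag = 1;
                // Remove this closing tag if $tagName was among removeTags
                if ($withFix && $conf['allowedAttribs'] === 0 && $conf['rmTagIfNoAttrib'] === 1) {
                    $setTag = 0;
                }
                return $setTag ? $m[0] : '';
            }
            // Opening tag: allowedAttribs = 0 strips all attributes ...
            $attribs = $conf['allowedAttribs'] === 0 ? '' : trim($m[3]);
            // ... and rmTagIfNoAttrib then drops the attribute-less tag
            if ($attribs === '' && $conf['rmTagIfNoAttrib']) {
                return '';
            }
            return '<' . $tagName . ($attribs !== '' ? ' ' . $attribs : '') . '>';
        },
        $html
    );
}
```

With $tags = removeTagsConfigSketch('font, u') and the input <p><font color="red">x</font></p>, the sketch returns <p>x</font></p> without the check and <p>x</p> with it. Tags that are not listed in removeTags pass through unchanged in both cases.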

Actions #25

Updated by Francois Suter over 13 years ago

This works for me. I'll notify people in issue 0016760 so that we have more tests.

Actions #26

Updated by Xavier Perseguers over 13 years ago

OK. Interesting. Then all is good. I did not have a further look at this example, this was copy and paste from Word at some point of time but I guess I missed some closing tags somewhere in my process ;-)

Good to see that this will work.

Thanks for this further testing.

Actions #27

Updated by Xavier Perseguers over 13 years ago

Ah! I know where those tags were lost... Yesterday, I copied my example HTML from this bugtracker entry and did not notice that Mantis had removed many tags. My original code (which I no longer had, since I wrote it there) was valid.

Actions #28

Updated by Stanislas Rolland over 13 years ago

See patch attached to issue #24349.

Actions #29

Updated by Oliver Hader almost 13 years ago

  • Target version changed from 1076 to 4.2.17
Actions #30

Updated by Riccardo De Contardi over 6 years ago

  • Status changed from Resolved to Closed