Bug #22512

closed

htmlArea RTE: Tables may get lost when using remove format feature

Added by Dimitri Koenig over 14 years ago. Updated about 6 years ago.

Status: Closed
Priority: Should have
Category: -
Target version: -
Start date: 2010-04-26
% Done: 0%

Description

If you copy and paste Word content that contains a table and then clean it up with the "Remove format" button, the table gets lost.
If I just save the document without using the "Remove format" feature, everything is saved correctly, but the markup is ugly.

(issue imported from #M14202)


Actions #1

Updated by Stanislas Rolland over 14 years ago

What version of TYPO3 and what version of what browser?

Actions #2

Updated by Dimitri Koenig over 14 years ago

IE8
TYPO3 4.3.2

Actions #3

Updated by Dimitri Koenig over 14 years ago

I could isolate the problem at least to line 140 in remove-format.js, the part commented "keep tags, strip attributes".

Actions #4

Updated by Dimitri Koenig over 14 years ago

OK, I found a solution, but it's a very interesting one.
These are the original lines:
// keep tags, strip attributes
var regMS3 = new RegExp(" style=\"[^>\"]*\"", "gi");
var regMS4 = new RegExp(" (class|align)=(([^>\s\"]+)|(\"[^>\"]*\"))", "gi");

These are the fixed ones:
// keep tags, strip attributes
var regMS3 = new RegExp("style=\"[^>\"]*\"", "gi");
var regMS4 = new RegExp("(class|align)=(([^>\s\"]+)|(\"[^>\"]*\"))", "gi");

I just removed the leading space from each pattern, and that's it. But how will this affect other things?
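
One possible explanation for why removing the space helps, offered as a sketch and not confirmed anywhere in this thread: inside a JavaScript string literal, `"\s"` is simply `"s"`, so the character class in the original regMS4 excludes the letter "s" instead of whitespace. The sample markup below (uppercase tag, unquoted attribute values, the class name MsoNormalTable) is an assumption about what IE8 might hand to the cleaner, and regMS4Escaped is a hypothetical variant, not the committed fix.

```javascript
// In a string literal the unknown escape "\s" collapses to a plain "s".
var escapeIsDropped = ("\s" === "s"); // true

// Original pattern as quoted above: the class is really [^>s"].
var regMS4 = new RegExp(" (class|align)=(([^>\s\"]+)|(\"[^>\"]*\"))", "gi");
// Pattern with the leading space removed, as in the proposed fix above.
var regMS4NoSpace = new RegExp("(class|align)=(([^>\s\"]+)|(\"[^>\"]*\"))", "gi");
// Hypothetical variant with the backslash doubled so the regex sees \s.
var regMS4Escaped = new RegExp(" (class|align)=(([^>\\s\"]+)|(\"[^>\"]*\"))", "gi");

// Assumed IE8-style markup with unquoted attribute values.
var ieStyle = '<TABLE class=MsoNormalTable border=1>';
// Markup with quoted attribute values.
var quoted = '<table class="MsoNormalTable" border="1">';

// The unquoted branch stops at the first "s", so only " class=M" is removed
// and the tag name fuses with the rest of the value.
var brokenResult = ieStyle.replace(regMS4, "");
// '<TABLEsoNormalTable border=1>'

// Without the leading space the tag name survives, but a stray token remains.
var noSpaceResult = ieStyle.replace(regMS4NoSpace, "");
// '<TABLE soNormalTable border=1>'

// With \s escaped correctly the whole unquoted attribute is removed.
var escapedResult = ieStyle.replace(regMS4Escaped, "");
// '<TABLE border=1>'

// Quoted values take the second branch and are stripped cleanly.
var quotedResult = quoted.replace(regMS4, "");
// '<table border="1">'
```

If this is what happens, it would also fit the differing browser results mentioned later in the thread, since the quoted form is unaffected by the character-class problem.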

Actions #5

Updated by Stanislas Rolland over 14 years ago

Please try

var regMS4 = new RegExp(" (class|align)=\"[^>\"]*\"", "gi");

without changing the other line.

Actions #6

Updated by Rik Wasmus over 14 years ago

Shouldn't that be:

var regMS4 = new RegExp("( |\t|\n)(class|v?align)=\"[^>\"]*\"", "gi");

An HTML tag can be spread over several lines, so an attribute can start at the beginning of a line. In that case it is preceded by a newline (\r, \n) instead of a space (or, more generally, by a tab).

Removing the leading whitespace altogether would work in about 99% of cases, but in HTML5 custom attributes are allowed AFAIK, so let's take that into account.

<script type="text/javascript">
var re_space = new RegExp(" (class|v?align)=\"[^>\"]*\"", "gi");
var re_whitespace = new RegExp("( |\t|\n)(class|v?align)=\"[^>\"]*\"", "gi");

// space-only pattern: only the plain-space case is stripped
alert('<input\nclass="bar"/>'.replace(re_space,''));
alert('<input class="bar"/>'.replace(re_space,''));
alert('<input\tclass="bar"/>'.replace(re_space,''));

// whitespace pattern: newline, space and tab are all stripped
alert('<input\nclass="bar"/>'.replace(re_whitespace,''));
alert('<input class="bar"/>'.replace(re_whitespace,''));
alert('<input\tclass="bar"/>'.replace(re_whitespace,''));
</script>
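
The two concerns above (attributes preceded by a tab or newline, and custom attributes whose names merely end in "class") can be checked side by side. This is a small sketch using regex literals; the names reNoLeadingSpace and reAnyWhitespace and the data-class attribute are made up for illustration and are not taken from remove-format.js.

```javascript
// No leading whitespace required: also matches inside longer attribute names.
var reNoLeadingSpace = /(class|v?align)="[^>"]*"/gi;
// Any single whitespace character (space, tab, newline) must precede the name.
var reAnyWhitespace = /\s(class|v?align)="[^>"]*"/gi;

// Hypothetical markup with a custom attribute next to a real class attribute.
var custom = '<td data-class="x" class="y">';

// The custom attribute is cut in half and left as a dangling "data-".
var damaged = custom.replace(reNoLeadingSpace, "");
// '<td data- >'

// Requiring leading whitespace leaves the custom attribute alone.
var preserved = custom.replace(reAnyWhitespace, "");
// '<td data-class="x">'

// Tab- and newline-separated attributes are stripped as well.
var afterTab = '<input\tclass="bar"/>'.replace(reAnyWhitespace, "");
var afterNewline = '<input\nclass="bar"/>'.replace(reAnyWhitespace, "");
// both '<input/>'
```

Using `\s` in a regex literal avoids the string-escaping step that a `new RegExp("...")` pattern needs.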

Actions #7

Updated by Dimitri Koenig over 14 years ago

@Stanislas: that works too, thanks.

I wonder why I get different results in Firefox 3.6 and IE8... any ideas?

Actions #8

Updated by Stanislas Rolland over 14 years ago

@Dimitri : In what way are the results different? Do you mean the results of the paste or the results of the remove format operation? Certainly, if you paste content from MS Word, all browsers will do it differently.

Actions #9

Updated by Stanislas Rolland over 14 years ago

@Rik: yes, but newlines are replaced by spaces a few instructions earlier. Not tabs, though. The attached patch will fix this.
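
The contents of the patch are not shown in this thread. As an illustration only, one way to cover tabs while keeping a space-only attribute pattern is to normalize tabs and line breaks to spaces first. The function name normalizeWhitespace and the sample markup are hypothetical.

```javascript
// Hypothetical helper: turn runs of tabs and line breaks into a single space.
function normalizeWhitespace(html) {
  return html.replace(/[\t\r\n]+/g, " ");
}

// Space-only attribute pattern, as suggested earlier in the thread.
var reSpaceOnly = new RegExp(" (class|align)=\"[^>\"]*\"", "gi");

// Sample markup with a tab and a newline between attributes.
var pasted = '<td\tclass="MsoNormal"\nalign="left">cell</td>';

// Without normalization neither attribute is preceded by a space: no change.
var untouched = pasted.replace(reSpaceOnly, "");

// After normalization both attributes are stripped.
var cleaned = normalizeWhitespace(pasted).replace(reSpaceOnly, "");
// '<td>cell</td>'
```

Note that this sketch also rewrites whitespace inside text content, which matters for preformatted text.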

Actions #10

Updated by Stanislas Rolland over 14 years ago

Committed to SVN TYPO3core trunk (revision 7565) and branch TYPO3_4-3 (revision 7566).

Actions #11

Updated by Benni Mack about 6 years ago

  • Status changed from Resolved to Closed