RemoveFormat destroys the HTML sometimes
Due to a bug in a RegExp, applying RemoveFormat to remove HTML or Word formatting sometimes produces invalid HTML code.
The attached patch (rtehtmlarea-eb.diff) corrects this issue and additionally removes the cellpadding, cellspacing, frame and bgcolor attributes, which are used to style tables. (I expect them to disappear when I choose to "remove formatting", so that I can style the tables in CSS or with classes provided by the RTE.)
Try the attached HTML fragment (rtehtmlarea-removeformatbug.html) in an rtehtmlarea (paste it in "text" mode) and you will see the problem "in action". Just choose remove format and select HTML formatting and Word formatting. You end up with "<tdspan3>" tags and other bizarre things.
After applying the patch, you will get a "clean" output after using the removeformat function, as expected.
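For illustration only, here is a minimal TypeScript sketch of the kind of attribute cleaning described above. It is not the code from rtehtmlarea-eb.diff: the function name stripPresentationalAttributes, the constant PRESENTATIONAL_ATTRIBUTES and the regular expressions are assumptions made for this example. The idea it shows is to match each attribute together with its leading whitespace and its value, so that the tag name and the remaining attributes (such as colspan) stay intact.

```typescript
// Hypothetical helper, not part of rtehtmlarea: attribute names taken from the report.
const PRESENTATIONAL_ATTRIBUTES = ["cellpadding", "cellspacing", "frame", "bgcolor"];

function stripPresentationalAttributes(html: string): string {
  const names = PRESENTATIONAL_ATTRIBUTES.join("|");
  // Leading whitespace + attribute name + optional value (double-quoted,
  // single-quoted or unquoted). The lookahead requires the attribute to end
  // at whitespace, "/" or ">", so "frameborder" is not mistaken for "frame".
  const attribute = new RegExp(
    `\\s+(?:${names})(?:\\s*=\\s*(?:"[^"]*"|'[^']*'|[^\\s"'>]+))?(?=[\\s/>])`,
    "gi"
  );
  // Only rewrite the inside of opening tags; text content is left untouched.
  return html.replace(/<[a-z][^<>]*>/gi, (tag) => tag.replace(attribute, ""));
}

const sample =
  '<table cellpadding="2" bgcolor="#fff" class="grid"><tr><td colspan="3">x</td></tr></table>';
console.log(stripPresentationalAttributes(sample));
// prints: <table class="grid"><tr><td colspan="3">x</td></tr></table>
```

This sketch assumes attribute values do not contain "<" or ">"; markup that does would need a real parser or DOM-based cleaning rather than regular expressions.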
(issue imported from #M2084)
Updated by Stanislas Rolland over 15 years ago
I cannot reproduce the reported problem using the supplied test case, with either the compressed or the non-compressed version of the script.
However, I noticed a problem when the toggleborders option is set.
Please try the attached patch.
The patch includes the additional attribute cleaning.