Bug #77642

closed

preg_match: Compilation failed: regular expression is too large at offset 27

Added by Tobias Schaefer over 7 years ago. Updated over 1 year ago.

Status: Closed
Priority: Must have
Assignee: -
Category: Frontend
Target version: -
Start date: 2016-08-25
Due date:
% Done: 100%
Estimated time:
TYPO3 Version: 11
PHP Version: 7.2
Tags:
Complexity:
Is Regression: No
Sprint Focus:

Description

If the cropHTML function (typo3/sysext/frontend/Classes/ContentObject/ContentObjectRenderer.php) is called to crop at 1074 or more characters, it fails with this error message:
PHP Warning: preg_match(): Compilation failed: regular expression is too large at offset 27 in typo3/sysext/frontend/Classes/ContentObject/ContentObjectRenderer.php

Increasing pcre.backtrack_limit or pcre.recursion_limit in php.ini doesn't help.
To reproduce the bug, you can use the news extension. Create a news record with a teaser text of more than 1073 characters and set up the list plugin to crop the teaser text at 1074 characters. Cropping at 1073 characters works.

Here is a discussion of this problem:
http://stackoverflow.com/questions/31172837/regular-expression-is-too-large-error-in-php

Possible solutions:
- If you are trying to match/parse HTML, I would recommend using DOMDocument to parse the HTML and then walk the DOM tree or build an XPATH to find what you're looking for.
- Shorten the regular expression by using a (?(DEFINE)...) group for any redundant sub-expressions.
- Split your regular expression at | and process the resulting sub-expressions separately. If the regex is essentially numerous keywords separated by |, then converting to a strtok or a loop with strpos may be a better & faster choice.

TYPO3: 6.2.26
PHP: 5.5.14
Linux: SLES 12 SP1

Cheers,
Tobias


Related issues: 1 (0 open, 1 closed)

Related to TYPO3 Core - Task #97125: Replace regex for stdWrap cropHTML (Closed, 2022-03-07)

Actions #1

Updated by Patrick Broens over 7 years ago

This still seems to be a problem in TYPO3 version 7.6.11; I can reproduce it in that version.

Actions #2

Updated by Rémy DANIEL over 7 years ago

I have this problem too: I am calling cropHTML with a maximum of 1500 characters.
Inside cropHTML, the pattern that tries to find HTML entities ( #(&[^&\\s;]{2,8};|.){0,X}#uis ) blows up because of the PCRE LINK_SIZE setting, which caps the size of the compiled pattern.

This Stack Overflow post explains this limit very well: http://stackoverflow.com/a/33988643/1053453

But I am not a regex guru, so I don't know if this pattern could be optimised, or if another approach needs to be taken.
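
One possible alternative, sketched here under assumptions and not taken from the TYPO3 core patch: instead of repeating the entity-or-character group up to X times inside a single pattern, match one unit at a time with preg_match_all() and keep the first X matches. The pattern stays tiny regardless of the crop length. The function name cropEntityAware is hypothetical, and the sketch only covers plain text with entities, not the tag handling that cropHTML does around it.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of TYPO3): returns the first $max "characters"
 * of $text, counting an HTML entity such as &amp; as a single character.
 */
function cropEntityAware(string $text, int $max): string
{
    // Match ONE unit per hit: either an entity or any single character.
    // There is no large {0,N} quantifier, so the compiled pattern stays small.
    $count = preg_match_all('#&[^&\s;]{2,8};|.#us', $text, $matches);
    if ($count === false) {
        // Invalid UTF-8 or another PCRE error: leave the text untouched.
        return $text;
    }
    // Keep only the first $max units and glue them back together.
    return implode('', array_slice($matches[0], 0, max(0, $max)));
}

echo cropEntityAware('a &amp; b &uuml;ber', 5), "\n"; // prints "a &amp; b"
```

The trade-off is that every unit of the input is collected into an array before slicing, so this costs more memory than a single anchored match on very long texts.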

Actions #3

Updated by Riccardo De Contardi over 5 years ago

I think it is still valid in TYPO3 8.7.19

My test with a fresh TYPO3 8.7.19 installation:

1) In the TypoScript setup, write:

page.120 = TEXT
page.120.value (
 //Write here a very long text with 2000+ characters, I omit it here :)
) 
page.120.cropHTML = 964| ... 

2) Go to the frontend and refresh

Results:

1) In the frontend, the text is not cropped

2) In the TYPO3 Log module, you get two warnings:

Core: Error handler (FE): PHP Warning: preg_match(): Compilation failed: regular expression is too large at offset 26 in /TYPO3-dists/typo3_src-8.7.19/typo3/sysext/frontend/Classes/ContentObject/ContentObjectRenderer.php line 3752
Core: Error handler (FE): PHP Warning: preg_match(): Compilation failed: regular expression is too large at offset 26 in /TYPO3-dists/typo3_src-8.7.19/typo3/sysext/frontend/Classes/ContentObject/ContentObjectRenderer.php line 3746

After several trials I found that the "magic number" is 964; with page.120.cropHTML = 963| ... the crop works fine. I don't know whether that depends on my environment settings.
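
To check whether the threshold depends on the environment, a small probe like the following could be run on each system. It is a minimal sketch: the helper name cropPatternCompiles is hypothetical, the pattern is the one quoted earlier in this thread, and the crop lengths in the loop are just the values mentioned in the comments above. Which lengths compile will vary with the PCRE build, so no output is shown.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: reports whether the local PCRE build can compile the
 * entity-aware pattern from this thread for a given crop length.
 */
function cropPatternCompiles(int $chars): bool
{
    $pattern = '#(&[^&\s;]{2,8};|.){0,' . $chars . '}#uis';
    // @ suppresses the "Compilation failed" warning; preg_match() returns
    // false when the pattern cannot be compiled.
    return @preg_match($pattern, '') !== false;
}

// Crop lengths reported in this issue as working or failing.
foreach ([963, 964, 1073, 1074, 1500] as $chars) {
    printf("%4d: %s\n", $chars, cropPatternCompiles($chars) ? 'compiles' : 'too large');
}
```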

Actions #4

Updated by Susanne Moog about 4 years ago

  • Category set to Frontend
Actions #5

Updated by Christian Toffolo almost 4 years ago

  • Priority changed from Should have to Must have
  • TYPO3 Version changed from 6.2 to 11
  • PHP Version changed from 5.5 to 7.2

Still present in TYPO3 9 LTS and probably also in 10 LTS and current master, since the code of the cropHTML method has not changed.

Actions #6

Updated by Gerrit Code Review over 2 years ago

  • Status changed from New to Under Review

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/72048

Actions #7

Updated by Gerrit Code Review over 2 years ago

Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/72048

Actions #8

Updated by Gerrit Code Review over 2 years ago

Patch set 3 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/72048

Actions #9

Updated by Gerrit Code Review over 2 years ago

Patch set 4 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/72048

Actions #10

Updated by Gerrit Code Review over 2 years ago

Patch set 5 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/72048

Actions #11

Updated by Gerrit Code Review over 2 years ago

Patch set 6 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/72048

Actions #12

Updated by Gerrit Code Review over 2 years ago

Patch set 7 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/72048

Actions #13

Updated by Gerrit Code Review over 2 years ago

Patch set 8 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/72048

Actions #14

Updated by Gerrit Code Review over 2 years ago

Patch set 1 for branch 10.4 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/72149

Actions #15

Updated by Stefan Bürk over 2 years ago

  • Status changed from Under Review to Resolved
  • % Done changed from 0 to 100
Actions #16

Updated by Nikita Hovratov about 2 years ago

  • Related to Task #97125: Replace regex for stdWrap cropHTML added
Actions #17

Updated by Benni Mack over 1 year ago

  • Status changed from Resolved to Closed