Bug #81644

GeneralUtility::getUrl() socket method doesn't support chunked Transfer-Encoding

Added by Jigal van Hemert over 3 years ago. Updated 3 months ago.

Status: Closed
Priority: Should have
Assignee: -
Category: Link Handling, Site Handling & Routing
Target version: -
Start date: 2017-06-21
Due date:
% Done: 0%
Estimated time:
TYPO3 Version: 7
PHP Version:
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

With pageNotFound_handling set to the URL of a page, the page-not-found handler uses GeneralUtility::getUrl() to retrieve the contents of that page. Because it requests the headers too, the socket method is used if useCurl is not set.
After reading the headers, the rest of the stream is simply read in a single operation. If the server uses Transfer-Encoding: chunked, it sends the content in chunks and puts the length of each chunk in hexadecimal before it (plus a zero after the last chunk).

getUrl() fails to process the chunks correctly and the chunk sizes are simply included in the content.

Guzzle doesn't seem to handle chunked data itself, but in most cases it uses cUrl internally, which does handle it. I still have to test whether v8/master has the same issue when cUrl is disabled.

We can use a simple function to decode the chunked data.
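
As an illustration of what such a function could look like, here is a minimal sketch of a chunked-body decoder. It is written in Python only to show the logic; it is not the TYPO3 code, and the name decode_chunked is hypothetical. It assumes the complete body (everything after the headers) has already been read into memory, and it ignores chunk extensions and trailers.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked message body into the plain payload.

    Hypothetical helper for illustration; assumes the whole body is in memory.
    """
    decoded = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a line holding its size in hexadecimal,
        # optionally followed by ";extension" parameters, which are ignored.
        line_end = body.find(b"\r\n", pos)
        if line_end == -1:
            raise ValueError("missing chunk-size line")
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        if size < 0:
            raise ValueError("negative chunk size")
        pos = line_end + 2
        if size == 0:
            # A zero-sized chunk marks the end; optional trailers are ignored.
            break
        chunk = body[pos:pos + size]
        if len(chunk) < size:
            raise ValueError("truncated chunk")
        decoded += chunk
        # Skip the chunk data and the CRLF that terminates it.
        pos += size + 2
    return bytes(decoded)


# The sizes "4" and "5" and the final "0" are what currently leak into the content.
raw = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
print(decode_chunked(raw))  # prints b'Wikipedia'
```

Decoding should only be applied when the response headers actually announce Transfer-Encoding: chunked; otherwise the body must be left untouched.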


Files

monitoring.jpg (103 KB): After applying the patch (Markus Klein, 2020-06-03 20:48)

Related issues

Related to TYPO3 Core - Bug #91582: Fetching an internal page as 404 content breaks browser output and CDNs (Closed, Markus Klein, 2020-06-04)

#1

Updated by Riccardo De Contardi 12 months ago

  • Category set to Link Handling, Site Handling & Routing
#2

Updated by Benni Mack 6 months ago

  • Status changed from New to Needs Feedback

Hey Jigal,

We avoid using getUrl() in most places now and use Guzzle instead. Are you looking for a way to fetch a page-not-found page that is delivered "chunked"?

#3

Updated by Jigal van Hemert 6 months ago

Hey Benni,

At the time of submitting the issue I couldn't find any indication that Guzzle handles chunked Transfer-Encoding automatically. cUrl does support chunked data and removes the size numbers before returning the content.

If the web server is configured to use chunked Transfer-Encoding AND cUrl is disabled AND the page-not-found handling is set to fetch the contents of a page, THEN the output is broken and displays the chunk sizes.
The workaround is to change one of those three conditions. But it would be nice if chunked data was supported.

IIRC Guzzle will automatically detect if cUrl can be used, so the chances of this problem happening are drastically reduced (most web servers will have cUrl).

#4

Updated by Benni Mack 6 months ago

Jigal van Hemert wrote:

> The workaround is to change one of those three conditions. But it would be nice if chunked data was supported.
>
> IIRC Guzzle will automatically detect if cUrl can be used, so the chances of this problem happening are drastically reduced (most web servers will have cUrl).

Yes, I believe so too. However, I don't have such a setup at hand to build tests around it and make sure we can support this (with Guzzle). How do you suggest we proceed?

#5

Updated by Markus Klein 6 months ago

We face a similar issue with v10 and the PageContentErrorHandler. The fetched page is returned with "Transfer-Encoding: chunked", and this exact response is used to answer the original request. The problem is that the body of this response is not actually sent in chunks, which yields a failed connection in the browser or, stranger still, several retries by a proxy to "get the other chunks", ultimately DoS-attacking the server with useless requests.

I suggest removing the header; see https://review.typo3.org/c/Packages/TYPO3.CMS/+/64672
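
To make the idea concrete, the following is a rough sketch of dropping the header before a fetched response is reused. It is not the content of the linked review, which is not reproduced here; it is written in Python purely for illustration, and the function name drop_transfer_encoding and the plain dictionary of headers are assumptions.

```python
from typing import Dict, Mapping


def drop_transfer_encoding(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the headers without Transfer-Encoding.

    Hypothetical helper: header names are compared case-insensitively,
    because HTTP header names are not case-sensitive.
    """
    return {
        name: value
        for name, value in headers.items()
        if name.lower() != "transfer-encoding"
    }


fetched = {"Content-Type": "text/html", "Transfer-Encoding": "chunked"}
print(drop_transfer_encoding(fetched))  # prints {'Content-Type': 'text/html'}
```

With the header gone, the server that answers the original request can decide for itself how to frame the already decoded body.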

#7

Updated by Markus Klein 6 months ago

  • Related to Bug #91582: Fetching an internal page as 404 content breaks browser output and CDNs added
#8

Updated by Benni Mack 4 months ago

Hey all,

now that the changes are merged, is this issue resolved for everybody?

#9

Updated by Markus Klein 4 months ago

Yes, for me.

#10

Updated by Riccardo De Contardi 3 months ago

  • Status changed from Needs Feedback to Closed

I am closing this issue for now, in agreement with the reporter.

If you think that this is the wrong decision or experience the issue again, please open a new issue with a reference to this one.

Thank you
