Bug #67136

Using pageNotFound handler with a cURL proxy can cause HTTP headers to be displayed

Added by Alexander Rothmund about 3 years ago. Updated over 1 year ago.

Status: Resolved
Priority: Must have
Assignee: -
Category: Frontend
Target version:
Start date: 2015-05-26
Due date: 2016-09-30
% Done: 100%
TYPO3 Version: 6.2
PHP Version:
Tags:
Complexity: easy
Is Regression: No
Sprint Focus: On Location Sprint

Description

There is an issue when a cURL proxy is used together with a pageNotFound_handling setting that points to a URL. In that case, two blocks of HTTP headers can be returned via the proxy server. TYPO3 strips only the first block and mistakenly treats the second one as content.

Here is the (anonymized) example setup with which I have experienced this issue:

$TYPO3_CONF_VARS['FE']['pageNotFound_handling'] = 'index.php?id=123';

... as well as having configured a cURL proxy:

    'SYS' => array(
        'curlProxyServer' => 'http://proxy.example.org:80/',
        'curlUse' => '1',
    ),

I have analyzed the root of this problem, and have found the following things.

Inside TypoScriptFrontendController::pageErrorHandler, TYPO3 requests the 404 page including its HTTP headers. It then checks those headers for a "Content-Type" header, so that the 404 content is delivered with the correct Content-Type.

TYPO3 does this by simply reading up to the first empty line.

However, when using a cURL proxy, the response can look like this:

HTTP/1.0 200 Connection Established
Proxy-agent: Apache

HTTP/1.1 200 OK
Server: nginx
Date: Fri, 22 May 2015 16:30:14 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 190105
Connection: keep-alive
Strict-Transport-Security: max-age=63072000

<!DOCTYPE html>
...

TYPO3 then parses only the first two lines as the header and returns the rest as content. As a result, the second block of HTTP headers is shown to website users and can break the page layout.
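
The effect can be reproduced without a proxy by feeding a response shaped like the one above into the same kind of split. This is only a minimal sketch, not the core code: splitAtFirstEmptyLine() is a hypothetical helper that mirrors the "read up to the first empty line" behaviour described above, and the sample response is abbreviated.

```php
<?php
// Hypothetical helper mirroring the "split at the first empty line" approach.
function splitAtFirstEmptyLine(string $response): array
{
    $parts = explode("\r\n\r\n", $response, 2);
    return [$parts[0], $parts[1] ?? ''];
}

// Abbreviated sample of a response received through a proxy tunnel.
$response = "HTTP/1.0 200 Connection Established\r\n"
    . "Proxy-agent: Apache\r\n"
    . "\r\n"
    . "HTTP/1.1 200 OK\r\n"
    . "Content-Type: text/html; charset=utf-8\r\n"
    . "\r\n"
    . "<!DOCTYPE html>";

list($header, $content) = splitAtFirstEmptyLine($response);

// $header now holds only the proxy's block. $content still begins with the
// second status line, which is what would end up in the rendered page.
```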

No headers should be returned.


Related issues

Related to TYPO3 Core - Bug #65801: Headers are visible if URI of pageNotFound_handling has a redirect by .haccess New 2015-03-18

Associated revisions

Revision 9550cdf9 (diff)
Added by Michael Oehlhof over 1 year ago

[BUGFIX] Fix display of HTTP headers using pageNotFound handler

When using the pageNotFound handler with a curl proxy there are no longer
HTTP headers displayed.

Resolves: #67136
Releases: 7.6
Change-Id: I7c6a9fa3bffbd265345e1a7bfa3ebf25bb2d80b9
Reviewed-on: https://review.typo3.org/50876
Tested-by: TYPO3com
Reviewed-by: Jan Helke
Tested-by: Jan Helke
Reviewed-by: Anja Leichsenring
Tested-by: Anja Leichsenring

History

#1 Updated by Morton Jonuschat about 3 years ago

I think you have a bigger problem and what you are seeing is a result of that. The initial request was HTTP/1.0 and the response is HTTP/1.1 which isn't allowed.

The way that TYPO3 gets the headers is correct according to RFC 2616 and RFC 7230, which clearly state that the header ends at the first empty line.

#2 Updated by Alexander Rothmund about 3 years ago

Disclaimer: I am not a sysadmin :-)

I went and tried to confirm that this is not an issue with our setup but instead standard behavior.

On a random Debian VM, I installed cURL as well as two proxy servers that I was able to find. These are the results:

squid3:

➜  ~  curl https://www.google.ch --proxy localhost:3128 -i
HTTP/1.0 200 Connection established

HTTP/1.1 200 OK
Date: Thu, 28 May 2015 07:34:54 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=7aebf97cd842337d:FF=0:TM=1432798494:LM=1432798494:S=aNRlNomdtc7SbfrM; expires=Sat, 27-May-2017 07:34:54 GMT; path=/; domain=.google.ch
Set-Cookie: NID=67=GGlZHRlB-9BTD8_vWwIybx_FIMKTaoBHFjH1CarJOZsOB9PBcp8EuxE6fMdTy4VJMayr5SIfKzvd_tuyAYRpjL6-zSk_gFGUJHlEaf3FQC0-BwpuqeXfooFyBwXZdsSC; expires=Fri, 27-Nov-2015 07:34:54 GMT; path=/; domain=.google.ch; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info." 
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 443:quic,p=1
Accept-Ranges: none
Vary: Accept-Encoding
Transfer-Encoding: chunked

<!doctype html>

tinyproxy:

➜  ~  curl https://www.google.ch --proxy localhost:8888 -i                    
HTTP/1.0 200 Connection established
Proxy-agent: tinyproxy/1.8.3

HTTP/1.1 200 OK
Date: Thu, 28 May 2015 07:35:27 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=308977deae2cda42:FF=0:TM=1432798527:LM=1432798527:S=D8Hef5rt2z7ysz6R; expires=Sat, 27-May-2017 07:35:27 GMT; path=/; domain=.google.ch
Set-Cookie: NID=67=PyEkURqZoOmXuj8Il6gCkJUr-hsmQzQCJmcj-R7oHrGoTFMll_12ME6F0ivTXRfza7RZzKmdxHmV5MbjFVsJnNt30ltK1HFRapRTOdxGBKNgJ60f4pAukHY28mmXxSa5; expires=Fri, 27-Nov-2015 07:35:27 GMT; path=/; domain=.google.ch; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info." 
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 443:quic,p=1
Accept-Ranges: none
Vary: Accept-Encoding
Transfer-Encoding: chunked

<!doctype html>

Note that this only seems to happen with HTTPS.

➜  ~  curl http://www.google.ch --proxy localhost:8888 -i 
HTTP/1.0 200 OK
Via: 1.1 tinyproxy (tinyproxy/1.8.3)
Expires: -1
Vary: Accept-Encoding
Set-Cookie: PREF=ID=bfe72406174e7eda:FF=0:TM=1432798547:LM=1432798547:S=HpQcy4PEhFc-8718; expires=Sat, 27-May-2017 07:35:47 GMT; path=/; domain=.google.ch
Set-Cookie: NID=67=Q46_BnSHslCduNZev6ynSrrzZzV3_xNC3HKJPOu_3QfM_xPOJW6Qx2OJj4IFT-gt98Yc-epO4gYh-nfscxqKLUZ2QCTQ-LW8lS7Hc7yYplsbVKgDVPXS5G3VV5Exs-_u; expires=Fri, 27-Nov-2015 07:35:47 GMT; path=/; domain=.google.ch; HttpOnly
Accept-Ranges: none
X-Frame-Options: SAMEORIGIN
Date: Thu, 28 May 2015 07:35:47 GMT
Cache-Control: private, max-age=0
Server: gws
X-XSS-Protection: 1; mode=block
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info." 
Alternate-Protocol: 80:quic,p=0

<!doctype html>

While I am anything but knowledgeable about the HTTP definitions, getting two blocks of response headers seems to be the default behavior of cURL when using a proxy server for HTTPS requests.

#3 Updated by Morton Jonuschat about 3 years ago

Proxying HTTPS is a different beast...when using the CONNECT Method you are effectively establishing a tunnel where the response doesn't even need to conform to HTTP.
So the response you are seeing is correct and both ends behave according to standard, but the result isn't satisfactory at all. The HTTP Header in this scenario is still only the block until the first empty line.
The body is a raw http response (in the tunnel) where the „interesting“ header (the second block) is included.

#4 Updated by Franz Kugelmann about 3 years ago

I can confirm the behaviour and the resulting problem with the pageNotFoundHandler.
We are using 6.2.14 and curl behind a proxy (same as Alexander).
The second header block is shown on top of the page, because for TYPO3 it is already part of the content.
The response we get:

HTTP/1.1 200 Connection established

HTTP/1.1 200 OK
Date: Fri, 31 Jul 2015 07:05:02 GMT
Server: Apache
Content-Length: 12269
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8

<!DOCTYPE html>
...

#5 Updated by Jürg Blaser almost 3 years ago

My solution:

..\typo3\sysext\frontend\Classes\Controller\TypoScriptFrontendController.php
$res = GeneralUtility::getUrl($code, 1, $headerArr);
// Header and content are separated by an empty line
list($header, $content) = explode(CRLF . CRLF, $res, 2);
// jeb 2015-08-31 [added the following if]: header bug if the page is called without www. and the result is a 404 (see https://forge.typo3.org/issues/65801)
// If the content still starts with a status line, a second header block follows, so split again
if (substr($content, 0, 4) === 'HTTP') {
    list($header, $content) = explode(CRLF . CRLF, $content, 2);
}
$content .= CRLF;
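
The snippet above handles exactly one additional header block. As a hedged generalisation, the same check could be applied in a loop, so that any number of leading header blocks is stripped. This is a sketch only, not the patch that was merged: splitHeadersAndContent() is a hypothetical standalone helper, "\r\n" is written out in place of the CRLF constant, and a body that itself starts with the text "HTTP/" followed by a digit would be misread as another header block.

```php
<?php
// Hypothetical helper: strips every leading header block and returns the
// last one as the header, together with the remaining content.
function splitHeadersAndContent(string $response): array
{
    $header = '';
    $content = $response;
    // Keep splitting while the remainder still starts with a status line.
    while (preg_match('#^HTTP/\d#', $content) === 1) {
        $parts = explode("\r\n\r\n", $content, 2);
        $header = $parts[0];
        $content = $parts[1] ?? '';
    }
    return [$header, $content];
}

// Abbreviated sample shaped like the proxy responses reported in this issue.
$response = "HTTP/1.1 200 Connection established\r\n"
    . "\r\n"
    . "HTTP/1.1 200 OK\r\n"
    . "Server: Apache\r\n"
    . "Content-Type: text/html; charset=utf-8\r\n"
    . "\r\n"
    . "<!DOCTYPE html>";

list($header, $content) = splitHeadersAndContent($response);
// $header is the second block (the one carrying Content-Type),
// $content is the HTML without any header lines.
```

Unlike the snippet above, this variant returns an empty header when the response does not start with a status line at all.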

#6 Updated by Flummi no-lastname-given about 2 years ago

  • Target version set to next-patchlevel
  • % Done changed from 0 to 90

Thanks @Jürgen Blaser for your solution (your last comment) – I just tested it and it works. Can this fix be included in the next TYPO3 release (add these 3 lines at line 2182 of typo3\sysext\frontend\Classes\Controller\TypoScriptFrontendController.php)?

I don't know the workflow of the TYPO3 team, so I just set this to 90% done (it might need a code review by someone who knows the code) and set the target version to "next-patchlevel". Is this ok? Can somebody take care of this?

#7 Updated by Flummi no-lastname-given almost 2 years ago

  • Due date set to 2016-09-30
  • Priority changed from Should have to Must have
  • % Done changed from 90 to 100
  • Complexity set to easy

Could someone please add this patch? The solution from Jürgen Blaser works well and was posted here a year ago!

#8 Updated by Gerrit Code Review almost 2 years ago

  • Status changed from New to Under Review

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/50321

#9 Updated by Gerrit Code Review almost 2 years ago

Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/50321

#10 Updated by Michael Oehlhof over 1 year ago

  • Sprint Focus set to On Location Sprint

#11 Updated by Gerrit Code Review over 1 year ago

Patch set 3 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/50321

#12 Updated by Gerrit Code Review over 1 year ago

Patch set 4 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/50321

#13 Updated by Gerrit Code Review over 1 year ago

Patch set 5 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/50321

#14 Updated by Alexander Rothmund over 1 year ago

This patch is not needed in TYPO3 8, because the way the headers are added in master has changed with the switch to Guzzle.

The situation in which GeneralUtility::getUrl returns more than one HTTP header block no longer exists, as that was a side effect of how cURL added the headers with CURLOPT_HEADER set, which is no longer used.
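
For illustration, one way to receive headers separately from the body without CURLOPT_HEADER is a header callback set via CURLOPT_HEADERFUNCTION. The following is a sketch under assumptions and does not claim to show how TYPO3 or Guzzle implement this: makeHeaderCollector() is a hypothetical helper, and the URL and proxy in the commented usage are placeholders. Since the callback may be invoked for the proxy's CONNECT response as well as for the final response, the sketch groups lines into one block per status line.

```php
<?php
// Hypothetical helper: returns a callback suitable for CURLOPT_HEADERFUNCTION
// that groups header lines into blocks, one block per status line.
function makeHeaderCollector(array &$blocks): callable
{
    return function ($curlHandle, string $line) use (&$blocks): int {
        $trimmed = rtrim($line, "\r\n");
        if (strpos($trimmed, 'HTTP/') === 0) {
            // A status line starts a new block.
            $blocks[] = [$trimmed];
        } elseif ($trimmed !== '' && $blocks !== []) {
            // Any other non-empty line belongs to the current block.
            $blocks[count($blocks) - 1][] = $trimmed;
        }
        // cURL expects the number of bytes handled by the callback.
        return strlen($line);
    };
}

$blocks = [];
$collector = makeHeaderCollector($blocks);

// Usage with cURL (not executed here; URL and proxy are placeholders):
// $ch = curl_init('https://www.example.org/');
// curl_setopt($ch, CURLOPT_PROXY, 'http://proxy.example.org:80/');
// curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// curl_setopt($ch, CURLOPT_HEADER, false);
// curl_setopt($ch, CURLOPT_HEADERFUNCTION, $collector);
// $body = curl_exec($ch); // body only; headers end up in $blocks
```

With this approach the last entry in $blocks would be the header block of the final response, and the body never contains header lines.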

#15 Updated by Gerrit Code Review over 1 year ago

Patch set 1 for branch TYPO3_7-6 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/50876

#16 Updated by Gerrit Code Review over 1 year ago

Patch set 2 for branch TYPO3_7-6 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/50876

#17 Updated by Gerrit Code Review over 1 year ago

Patch set 3 for branch TYPO3_7-6 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/50876

#18 Updated by Michael Oehlhof over 1 year ago

  • Status changed from Under Review to Resolved
