Bug #18754

closed

improved 404 pagenotfound_handling not working for certain requested URLs/resources

Added by Matthew Kennewell almost 16 years ago. Updated over 6 years ago.

Status: Closed
Priority: Should have
Assignee: -
Category: Frontend
Target version: -
Start date: 2008-05-06
Due date:
% Done: 0%
Estimated time:
TYPO3 Version: 4.3
PHP Version: 4.3
Tags:
Complexity:
Is Regression: No
Sprint Focus:

Description

When using default TYPO3 .htaccess file and with or without 'simulate static documents' it appears that 404 pagenotfound_handling works only for requested files:

- that have .html as the file suffix
- and the file being called is requested from the root of a typo3 site, i.e. www.domain.com.au/file.html

If 'file.html' exists, page is shown

If 'wrongfile.html' does not exist, then 404 handling takes place correctly, due to function '$this->checkAndSetAlias()'

The '404 pagenotfound_handling' feature fails to show the 404 headers for the following requested resources:

- www.domain.com.au/file.htm
- www.domain.com.au/file.pdf

- www.domain.com.au/folder/
- www.domain.com.au/folder/file.html
- www.domain.com.au/folder/file.pdf

When the above requested resources fail, the browser is given a 200 OK HTTP header and is shown the home page of the website, with the requested resource's URL remaining in the browser's address bar. This could be due to $this->id being 'false' and then being set to '0' in function 'setIDfromArgV()'.

My suggested code to patch class.tslib_fe.php works on the premise that $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling'] value is TRUE, but perhaps the TYPO3 404 handling should still work when $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling'] is FALSE and therefore still show 404 headers and redirect to home page/ root page of website.

Please see attached file(s) class.tslib_fe.modified.php.txt & class.tslib_fe.orig.php.txt to review the suggested code as a basic idea working towards a possible patch of /typo3/sysext/cms/tslib/class.tslib_fe.php

(issue imported from #M8343)


Files

class.tslib_fe.modified.php.txt (162 KB), Administrator Admin, 2008-05-06 17:00
class.tslib_fe.orig.php.txt (162 KB), Administrator Admin, 2008-05-06 17:01
class.tslib_fe.modified.php__updated.txt (162 KB), Administrator Admin, 2008-05-22 04:04
effects_on_this-id_using_default_class.tslib_fe.php.txt (2.88 KB), Administrator Admin, 2008-05-29 14:36
effects_on_this-id_using_modified_class.tslib_fe.php.txt (2.25 KB), Administrator Admin, 2008-05-29 14:36
class.tslib_fe.original.php (198 KB), original file from typo3_src-4.5.29 source, Matthew Kennewell, 2013-09-12 09:55
class.tslib_fe.modified.php (199 KB), modified version placed in typo3_src-4.5.29 source, Matthew Kennewell, 2013-09-12 09:55

Related issues 2 (0 open, 2 closed)

Related to TYPO3 Core - Bug #21852: PageNotFound_handling works incorrectly (Rejected, 2009-12-13)
Related to TYPO3 Core - Bug #58728: Regression: unaccessible protected section with shortcut in rootline (Closed, 2014-05-12)

Actions #1

Updated by Matthew Kennewell almost 16 years ago

Additional information can be found in a post in the TYPO3 mailing list typo3.dev

Subject "An idea to further process ' page not found ' 404handling"

Originally posted "Tuesday, 29 April 2008"

Actions #2

Updated by Olivier Dobberkau almost 16 years ago

We have experienced a massive load due to this handling behaviour.

Actions #3

Updated by Matthew Kennewell almost 16 years ago

the attached file, class.tslib_fe.modified.php__updated.txt , has a line added in the following array, which was not in file class.tslib_fe.modified.php.txt

This added line, starting with 5 => , is required for outputting an error message when the suggested new function checkAndSetPageNotFound() sets $this->pageNotFound = 5

the following code block is from file /typo3/sysext/cms/tslib/class.tslib_fe.php around line 884-891
----------------------------------------

if ($this->pageNotFound && $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling'])    {
    $pNotFoundMsg = array(
        1 => 'ID was not an accessible page',
        2 => 'Subsection was found and not accessible',
        3 => 'ID was outside the domain',
        4 => 'The requested page alias does not exist',
        5 => 'The requested page or file resource does not exist'
    );
----------------------------------------
Actions #4

Updated by Matthew Kennewell almost 16 years ago

Well, I have completed some further research into SSD & 404 handling, and it seems that the code suggestions I have made to date may not work as expected (that is, produce correct 404 headers), so I have come up with the following as another suggested fix for this bug. This suggestion may not consider everything required; it's just a step towards fixing the bug.

In TYPO3 v4.1.6, file /typo3/sysext/cms/tslib/class.tslib_fe.php , in function fetch_the_id(), approx line 861:
replace:
if (!$this->id) {
with:
if (!$this->id && !$this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']) {

And the same potentially goes for:

In TYPO3 v4.2.0, file /typo3/sysext/cms/tslib/class.tslib_fe.php , in function fetch_the_id(), approx line 929:
replace:
if (!$this->id) {
with:
if (!$this->id && !$this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']) {

Summary:
If 'pageNotFound_handling' is set, then $this->id keeps its value of zero. When $this->id is later processed in function getPageAndRootline() and still no page exists, there is a call to $this->pageNotFoundAndExit()

If 'pageNotFound_handling' is not set, then $this->id falls back as before ("the id was not previously set, set it to the id of the domain" or "the first 'visible' page in that domain"), which is typically the home page.

note:
$this->id was set to zero from this function call $this->setIDfromArgV() in function determineId()
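
A minimal sketch of the decision described in this summary, written as a stand-alone PHP function rather than as actual TYPO3 code. The function name resolveOutcome() and its string return values are hypothetical and exist only for illustration; $pageNotFoundHandling stands in for $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling'].

```php
<?php
// Hypothetical model of the suggested change in fetch_the_id(); not TYPO3 code.
// $id                  : value of $this->id after setIDfromArgV() (0 if unresolved)
// $pageNotFoundHandling: stand-in for TYPO3_CONF_VARS['FE']['pageNotFound_handling']
function resolveOutcome(int $id, bool $pageNotFoundHandling): string
{
    if (!$id && !$pageNotFoundHandling) {
        // Original behaviour is kept: fall back to the domain's first visible page.
        return 'root page';
    }
    if (!$id) {
        // id stays 0, so the later page lookup fails and 404 handling takes over.
        return '404';
    }
    return 'page ' . $id;
}

echo resolveOutcome(0, true), "\n";   // 404
echo resolveOutcome(0, false), "\n";  // root page
echo resolveOutcome(42, true), "\n";  // page 42
```

The model only looks at the id and the setting, so it cannot tell a request for a missing file apart from a request for the bare domain.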

Actions #5

Updated by Matthew Kennewell almost 16 years ago

Whoops: just found out that if you call just the domain of your SSD website this loads the page set for pageNotFound handling.

Sorry guys & gals for the err on my part...

Therefore the above suggested line change needs to consider if NO SITE_SCRIPT exists. here is a new suggested line of code to fix SSD bug.

if (!$this->id && !($this->TYPO3_CONF_VARS['FE']['pageNotFound_handling'] && t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT'))) {

Actions #6

Updated by Matthew Kennewell almost 16 years ago

The previous submitted note with code suggestion, (of mine), did not consider if the requested URL was www.domain.com.au/index.php

So here is an update to the suggested line of code that supersedes any previous code suggestions of mine to work towards fixing 404 handling when using SSD:

if (!$this->id && (t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT')=='index.php' || !(t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT') && $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']))) {

I placed this line of code in my local copy of file /typo3/sysext/cms/tslib/class.tslib_fe.php around lines 861-873 in TYPO3v4.1.6 AND around lines 929-948 in TYPO3 v4.2.0

It worked for the following requested resources, when SSD set, config.simulateStaticDocuments = 1

- www.domain.com/ - showed root page
- www.domain.com/index.php - showed root page
- www.domain.com/contact_us.html - showed the contact us page

The following resources returned 404 headers and showed the page content set for 404 handling:
- www.domain.com/wrong_alias.html
- www.domain.com/wrong_file.pdf
- www.domain.com/wrong_folder/
- www.domain.com/wrong_folder/wrong_file.pdf
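
The condition above can be checked in isolation with a small sketch. This is a stand-alone model, not TYPO3 code: the helper name fallsBackToRootPage() is hypothetical, $siteScript stands in for t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT'), $handling stands in for the pageNotFound_handling setting, and it is assumed that $this->id is still 0 for each request when the line is reached.

```php
<?php
// Hypothetical stand-alone helper mirroring the suggested condition; not TYPO3 code.
// $siteScript stands in for t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT'),
// $handling for TYPO3_CONF_VARS['FE']['pageNotFound_handling'].
function fallsBackToRootPage(int $id, string $siteScript, bool $handling): bool
{
    return !$id
        && ($siteScript == 'index.php' || !($siteScript && $handling));
}

// Assumption: the id is unresolved (0) for every request listed here.
$siteScripts = array('', 'index.php', 'wrong_file.pdf', 'wrong_folder/', 'wrong_folder/wrong_file.pdf');
foreach ($siteScripts as $siteScript) {
    $outcome = fallsBackToRootPage(0, $siteScript, true) ? 'root page' : '404 handling';
    printf("%-30s %s\n", "'" . $siteScript . "'", $outcome);
}
```

With handling enabled, only the empty site script (bare domain) and index.php fall back to the root page in this model. With handling disabled, every unresolved request falls back, which matches the original behaviour.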

Please note: The line of code suggested above has not been tested with extensions realURL & coolURI and is likely to break them. Perhaps a way to support either the setting of SSD or realURL or CoolURI etc would be to create an 'if statement' associated with the suggested line of code above to check if config.simulateStaticDocuments set in main ts template, along the lines of:

if config.simulateStaticDocuments = 0 , if this statement is true then use original line of code " if (!$this->id) { "

if config.simulateStaticDocuments = 1 , if this statement is true then use suggested line of code above

BUT:
I tried to write an if statement to cover this check of simulateStaticDocuments myself, but I couldn't access the website's main ts template 'setup' values, specifically '$this->config['config']['simulateStaticDocuments']', while in function fetch_the_id() in class.tslib_fe.php.

Some research later, I now believe that there's no access to TypoScript values until after the ID of the requested page is known, (which now makes sense since ts is unique to a 'domain' and its page tree). I guess you could have 2 different websites set up in the one install of TYPO3 on different domains with one configured for SSD & the other set for realURL, m'mmm.

From my understanding, typoscript cannot be read into a config array until the requested page ID is known.

In file /typo3/sysext/cms/tslib/index_ts.php the call to $TSFE->determineId(), I think, takes care of working out the ID, and inside this function is where the above line of code is suggested to replace the existing line of code (line number above). After $TSFE->determineId() is processed and an ID is known, the functions inside index_ts.php continue to execute, and I think that the ts for the resolved domain & page is read into a config array by the function call $TSFE->getConfigArray()

Additional info: I created the following 2 files (see attached), both with a basic list of the function flow with respect to its effects on the value of $this->id, from the resource request to index.php through to the point where it is decided either to exit because the requested resource is false and therefore initiate pageNotFound (404), or to continue with resolving the page ID if the requested resource is true, etc.

- effects_on_this-id_using_default_class.tslib_fe.php.txt

- effects_on_this-id_using_modified_class.tslib_fe.php.txt

From here I don't know enough of the TYPO3 API, SSD, RealURL & CoolURI to know how to suggest a way to use my suggested code to fix 404 handling when SSD is set without breaking other SEF extensions.

Hope this information will be useful.

Cheers, Matt

Actions #7

Updated by Andi Phringer over 15 years ago

First of all thank you so much, Matt. I was really stuck with this problem for a long time and your approach worked for me as well. I just want to add a small remark as others might be facing this problem as well.

If you want to get the same working with Realurl you should comment out the following line in your realurl configuration:
// 'postVarSet_failureMode'=>'redirect_goodUpperDir',

Else the wrong page names on the root level of the website will still point to the Homepage.

Kind Regards
Andi

Actions #8

Updated by Thomas Deinhamer over 14 years ago

Will this be included into 4.3? I'd really appreciate it, as it makes serving error pages a LOT easier, and also SEO will get a true boost! Thanks so much!
PS: Does this also work with other page types and language ids? As far as I can remember, we had a lot of trouble with L other than 0 and type other than 0. Can someone confirm this further?

Actions #9

Updated by Matthew Kennewell over 13 years ago

Hi,

Will this be included in TYPO3 version 4.5 with Long Term Support or any other upcoming TYPO3 release?

Cheers

Actions #10

Updated by Björn Paulsen over 12 years ago

  • Target version changed from -1 to 4.5.9

This bug is also in TYPO3 4.5 LTS.

I solved this bug very easily:

Function "fetch_the_id()"
typo3/sysext/cms/tslib/class.tslib_fe.php:914

insert this Code:

if (($this->id == 0) && ($this->siteScript <> false)) {
    $this->pageNotFound = 1;
}

after:

$this->idParts = explode(',',$this->id);

// Splitting by a '+' sign - used for base64/md5 methods of parameter encryption for simulate static documents.
list($pgID,$SSD_p)=explode('+',$this->idParts[0],2);
if ($SSD_p) { $this->idPartsAnalyze($SSD_p); }
$this->id = $pgID; // Set id
// If $this->id is a string, it's an alias
$this->checkAndSetAlias();
*** insert here ***

I found no problems, and all 404 pages come up.
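
A minimal stand-alone sketch of what the inserted check decides, assuming $this->id has already been reduced to an integer at that point. The function name flagsPageNotFound() is hypothetical and this is not TYPO3 code; $siteScript stands in for $this->siteScript. Note that this check, unlike the line suggested earlier in this report, does not look at the pageNotFound_handling setting.

```php
<?php
// Hypothetical stand-alone model of the inserted check; not TYPO3 code.
// $siteScript stands in for $this->siteScript.
function flagsPageNotFound(int $id, string $siteScript): bool
{
    // '<>' in the snippet above is PHP's alternative spelling of '!='.
    return $id == 0 && $siteScript != false;
}

var_dump(flagsPageNotFound(0, 'folder/file.pdf')); // bool(true)
var_dump(flagsPageNotFound(0, ''));                // bool(false)
var_dump(flagsPageNotFound(7, 'page.html'));       // bool(false)
```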

Actions #11

Updated by Ernesto Baschny over 12 years ago

  • Target version changed from 4.5.9 to 4.5.12
Actions #12

Updated by Michael Cannon about 12 years ago

I confirm the problem and the suggested fix in TYPO3 4.5.4.

Actions #13

Updated by Matthew Kennewell over 11 years ago

Hi,

I recently updated to the TYPO3 source v4.5.19 and I see the 404 headers and page handling error referred to in this bug report still occurs, when using SSD.

Is there any chance that this could be reviewed and the core file class.tslib_fe.php be patched/amended to have 404 error handling working correctly?

Thanks in advance.

Matthew

Actions #14

Updated by Alexander Opitz over 10 years ago

  • Category deleted (Communication)
  • Target version deleted (4.5.12)
  • TYPO3 Version set to 4.3
  • Is Regression set to No

Hi,

As this issue is very old: does the problem still exist in newer versions of TYPO3 CMS (4.5 or 6.1)?

Actions #15

Updated by Alexander Opitz over 10 years ago

  • Status changed from New to Needs Feedback

Actions #16

Updated by Matthew Kennewell over 10 years ago

hi Alexander,

yes i believe this is still a problem.

As a matter of course, every time I upgrade to a new TYPO3 source I edit the file /typo3/sysext/cms/tslib/class.tslib_fe.php (currently from the typo3_src-4.5.29 source) so that the website outputs the correct HTTP 404 header response for any requested resource that does not exist, and then redirects to a dedicated '404 page not found' page within the website.

I just went into a demo website I have and tested whether this is still a problem.

I restored the original version of the class.tslib_fe.php file and tried the following requests, which mostly returned a 200 header response (one, I think, output 304), and all showed the Home (parent) page of the website.

- www.domain.com.au/file.htm
- www.domain.com.au/file.pdf

- www.domain.com.au/folder/
- www.domain.com.au/folder/file.html
- www.domain.com.au/folder/file.pdf

I reset back the modified version of the class.tslib_fe.php file and requested the above links and the website gave a 404 header response and showed the 404 page not found page in the website.

It's a shame this was not fixed all those years ago, as all typo3 websites that use simulate static documents have been outputting incorrect http header responses.

Thank you for looking into this issue.

oh, and attached are the original and modified versions of file class.tslib_fe.php from typo3_src-4.5.29 source, so you can compare and see what fixes this for me.

Note: if this issue is reviewed for potential implementation, then please check whether the suggested code affects other URL handlers like RealURL etc. before it is applied.

kind regards, Matthew

Actions #17

Updated by Alexander Opitz over 10 years ago

  • Category set to Content Rendering
  • Status changed from Needs Feedback to New

I don't know if this will go into 4.5, as it isn't critical and it changes the behavior of TYPO3.
But maybe for TYPO3 CMS 6.2.

Actions #18

Updated by Matthew Kennewell over 10 years ago

hi,

thanks, it will be great when included.

A shame though that it may not go into v4.5, as this version still has a year of maintenance to go.

Yes, it changes the behavior of TYPO3, in that it fixes the output of HTTP header responses to work properly, especially with 404 errors.

Would you agree this is an important aspect of a website's operation to get right sooner rather than later?

thanks again

Actions #19

Updated by Ernesto Baschny over 10 years ago

  • Status changed from New to Needs Feedback

Could you provide the proposed fix in form of a patch / review request (see http://wiki.typo3.org/CWT) and maybe an up-to-date explanation of the problem? I was not able to reproduce the issue from the original reporter...

Actions #20

Updated by Ernesto Baschny over 10 years ago

  • Category changed from Content Rendering to Frontend
Actions #21

Updated by Alexander Opitz over 10 years ago

As Ernesto already asked, can you provide the proposed fix in form of a patch and review request?

Actions #22

Updated by Markus Klein over 9 years ago

Last ping, before closing this issue.

Actions #23

Updated by Matthew Kennewell over 9 years ago

This is, I believe, still an active issue, and an important one related to 404 errors.

Actions #24

Updated by Matthew Kennewell over 9 years ago

hello,

m'mm, 6 years further down the TYPO3 road and this appears to still be a concern...

this potentially is a simple fix (potentially for all releases of TYPO3),

by just changing/updating one line of code to get 404 pagenotfound working properly.

for example in source TYPO3 4.5 LTS, typo3_src-4.5.35.tar.gz edit file /typo3/sysext/cms/tslib/class.tslib_fe.php

at line 957

if (!$this->id)    {

replace with:

if (!$this->id && (t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT')=='index.php' || !(t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT') && $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']))) {

though as mentioned when this bug was first submitted, it would be good for someone to check if this change influences or impacts other extensions like realURL & coolURI.

kind regards

Matthew

Actions #25

Updated by Alexander Opitz over 9 years ago

Then please answer the question Ernesto asked 12 months ago.

Actions #26

Updated by Matthew Kennewell over 9 years ago

hello Alexander,

I have given all the required information throughout this bug report, especially in the original post, on what the problem is and how easily it can be fixed with potentially a 1-line code change.

As I have mentioned a couple of times now, the suggested fix (mentioned/listed 11 days ago) potentially needs someone with experience of realURL, coolURI and similar extensions to test whether the suggested 1-line code change has any impact on these extensions.

thanks in advance

M

Actions #27

Updated by Matthew Kennewell over 9 years ago

hello all,

this still continues to be a concern in TYPO3 version 4.5.38

any incorrectly requested file or folder is redirected to the root of the website instead of going through the 404 page-not-found error handling

regards,
Matthew

Actions #28

Updated by Gerrit Code Review almost 9 years ago

  • Status changed from Needs Feedback to Under Review

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/40933

Actions #29

Updated by Matthew Kennewell almost 9 years ago

hello,

this would be really important and great to see added to TYPO3 core.

cheers

Matthew

Actions #30

Updated by Gerrit Code Review over 8 years ago

Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/40933

Actions #31

Updated by Riccardo De Contardi over 6 years ago

  • Status changed from Under Review to Closed

I close this issue for now; the patch on Gerrit has been abandoned, I report here the last comment that contains an explanation:

Abandoned

Handling resources via the same RequestHandler as any frontend page seems not useful, should be handled by either a custom RequestHandler (could be done via an extension as well) or via .htaccess directly

If you think that this is the wrong decision or that additional work should be done on this area, please reopen it or open a new issue with a reference to this one. Thank you.
