Bug #18754
Closed
improved 404 pagenotfound_handling not working for certain requested URLs/resources
Added by Matthew Kennewell over 16 years ago. Updated almost 7 years ago.
Description
When using the default TYPO3 .htaccess file, with or without 'simulate static documents', it appears that 404 pagenotfound_handling works only for requested files:
- that have .html as the file suffix
- and the file being called is requested from the root of a typo3 site, i.e. www.domain.com.au/file.html
If 'file.html' exists, page is shown
If 'wrongfile.html' does not exist then 404 handling takes place correctly, due to function '$this->checkAndSetAlias()'
The '404 pagenotfound_handling' feature fails to show the 404 headers for the following requested resources:
- www.domain.com.au/file.htm
- www.domain.com.au/file.pdf
- www.domain.com.au/folder/
- www.domain.com.au/folder/file.html
- www.domain.com.au/folder/file.pdf
When the above requested resources fail, the browser is given a 200 OK HTTP header and is shown the home page of the website, with the requested resource's URL remaining in the browser's address bar. This could be due to $this->id being 'false' and then $this->id being set to '0' in function 'setIDfromArgV()'.
My suggested code to patch class.tslib_fe.php works on the premise that $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling'] value is TRUE, but perhaps the TYPO3 404 handling should still work when $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling'] is FALSE and therefore still show 404 headers and redirect to home page/ root page of website.
Please see attached file(s) class.tslib_fe.modified.php.txt & class.tslib_fe.orig.php.txt to review suggested code as a basis idea working towards possible patching of /typo3/sysext/cms/tslib/class.tslib_fe.php
(issue imported from #M8343)
Files
class.tslib_fe.modified.php.txt (162 KB) | Administrator Admin, 2008-05-06 17:00
class.tslib_fe.orig.php.txt (162 KB) | Administrator Admin, 2008-05-06 17:01
class.tslib_fe.modified.php__updated.txt (162 KB) | Administrator Admin, 2008-05-22 04:04
effects_on_this-id_using_default_class.tslib_fe.php.txt (2.88 KB) | Administrator Admin, 2008-05-29 14:36
effects_on_this-id_using_modified_class.tslib_fe.php.txt (2.25 KB) | Administrator Admin, 2008-05-29 14:36
class.tslib_fe.original.php (198 KB) | original file from typo3_src-4.5.29 source | Matthew Kennewell, 2013-09-12 09:55
class.tslib_fe.modified.php (199 KB) | modified version placed in typo3_src-4.5.29 source | Matthew Kennewell, 2013-09-12 09:55
Updated by Matthew Kennewell over 16 years ago
Additional information can be found in a post in the TYPO3 mailing list typo3.dev
Subject "An idea to further process ' page not found ' 404handling"
Originally posted "Tuesday, 29 April 2008"
Updated by Olivier Dobberkau over 16 years ago
We have experienced a massive load due to this handling behaviour.
Updated by Matthew Kennewell over 16 years ago
the attached file, class.tslib_fe.modified.php__updated.txt , has a line added in the following array, which was not in file class.tslib_fe.modified.php.txt
This added line, starting with 5 => , is required for outputting an error message when the suggested new function checkAndSetPageNotFound() sets $this->pageNotFound = 5
the following code block is from file /typo3/sysext/cms/tslib/class.tslib_fe.php around line 884-891
----------------------------------------
if ($this->pageNotFound && $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']) {
    $pNotFoundMsg = array(
        1 => 'ID was not an accessible page',
        2 => 'Subsection was found and not accessible',
        3 => 'ID was outside the domain',
        4 => 'The requested page alias does not exist',
        5 => 'The requested page or file resource does not exist'
    );
----------------------------------------
Updated by Matthew Kennewell over 16 years ago
Well, I have completed some further research into SSD & 404 handling and it seems that the code suggestions I have made to date may not work as expected (that is, produce correct 404 headers), so I have come up with the following as another suggested fix for this bug. This suggestion may not consider everything required; it's just a step towards fixing the bug.
In TYPO3 v4.1.6, file /typo3/sysext/cms/tslib/class.tslib_fe.php , in function fetch_the_id(), approx line 861:
replace:
if (!$this->id) {
with:
if (!$this->id && !$this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']) {
And the same potentially goes for:
In TYPO3 v4.2.0, file /typo3/sysext/cms/tslib/class.tslib_fe.php , in function fetch_the_id(), approx line 929:
replace:
if (!$this->id) {
with:
if (!$this->id && !$this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']) {
Summary:
If 'pageNotFound_handling' set then $this->id maintains its setting of zero, then when $this->id is processed in function getPageAndRootline() and still no page exists then there is a call to $this->pageNotFoundAndExit()
If 'pageNotFound_handling' not set then $this->id will be set to "the id was not previously set, set it to the id of the domain" or "the first 'visible' page in that domain", this is typically the 'home page'.
note:
$this->id was set to zero from this function call $this->setIDfromArgV() in function determineId()
Updated by Matthew Kennewell over 16 years ago
Whoops: just found out that if you call just the domain of your SSD website this loads the page set for pageNotFound handling.
Sorry guys & gals for the err on my part...
Therefore the above suggested line change needs to consider if NO SITE_SCRIPT exists. here is a new suggested line of code to fix SSD bug.
if (!$this->id && !($this->TYPO3_CONF_VARS['FE']['pageNotFound_handling'] && t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT'))) {
Updated by Matthew Kennewell over 16 years ago
The previous submitted note with code suggestion, (of mine), did not consider if the requested URL was www.domain.com.au/index.php
So here is an update to the suggested line of code that supersedes any previous code suggestions of mine to work towards fixing 404 handling when using SSD:
*
if (!$this->id && (t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT')=='index.php' || !(t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT') && $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']))) {
*
I placed this line of code in my local copy of file /typo3/sysext/cms/tslib/class.tslib_fe.php around lines 861-873 in TYPO3v4.1.6 AND around lines 929-948 in TYPO3 v4.2.0
It worked for the following requested resources, when SSD set, config.simulateStaticDocuments = 1
- www.domain.com/ - showed root page
- www.domain.com/index.php - showed root page
- www.domain.com/contact_us.html - showed the contact us page
the following resources showed 404 headers and the page content set for 404 handling:
- www.domain.com/wrong_alias.html
- www.domain.com/wrong_file.pdf
- www.domain.com/wrong_folder/
- www.domain.com/wrong_folder/wrong_file.pdf
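The results listed above can be checked in isolation with a minimal sketch of the suggested condition. The function name fallsBackToRootPage() and its parameters are hypothetical and not part of TYPO3; they only stand in for $this->id, t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT') and $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']. The site-script values passed in are assumptions about what TYPO3 would report for each URL.

```php
<?php
// Hypothetical stand-alone model of the suggested condition; not TYPO3 code.
// $id                   stands in for $this->id (0 when nothing was resolved)
// $siteScript           stands in for t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT')
// $pageNotFoundHandling stands in for $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']
function fallsBackToRootPage($id, $siteScript, $pageNotFoundHandling)
{
    return !$id && ($siteScript == 'index.php' || !($siteScript && $pageNotFoundHandling));
}

// true  = the id would be replaced by the root page of the domain
// false = the id is left alone, so the later page lookup can end in 404 handling
var_dump(fallsBackToRootPage(0, '', true));                // www.domain.com/          -> bool(true)
var_dump(fallsBackToRootPage(0, 'index.php', true));       // www.domain.com/index.php -> bool(true)
var_dump(fallsBackToRootPage(0, 'wrong_file.pdf', true));  // unknown resource         -> bool(false)
var_dump(fallsBackToRootPage(0, 'wrong_file.pdf', ''));    // handling not configured  -> bool(true)
var_dump(fallsBackToRootPage(12, 'contact_us.html', true)); // id already resolved     -> bool(false)
```

In this model only the bare domain and index.php fall back to the root page while pageNotFound_handling is set; with the setting empty, the original fallback behaviour is kept.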
Please note: The line of code suggested above has not been tested with extensions realURL & coolURI and is likely to break them. Perhaps a way to support either the setting of SSD or realURL or CoolURI etc would be to create an 'if statement' associated with the suggested line of code above to check if config.simulateStaticDocuments set in main ts template, along the lines of:
if config.simulateStaticDocuments = 0 , if this statement is true then use original line of code " if (!$this->id) { "
if config.simulateStaticDocuments = 1 , if this statement is true then use suggested line of code above
BUT:
I tried to write an if statement to cover this check of simulateStaticDocuments myself, but I couldn't access the website's main ts template 'setup' values, specifically '$this->config['config']['simulateStaticDocuments']', while in function fetch_the_id() in class.tslib_fe.php.
Some research later, I now believe that there's no access to TypoScript values until after the ID of the requested page is known, (which now makes sense since ts is unique to a 'domain' and its page tree). I guess you could have 2 different websites set up in the one install of TYPO3 on different domains with one configured for SSD & the other set for realURL, m'mmm.
From my understanding, typoscript cannot be read into a config array until the requested page ID is known.
In file /typo3/sysext/cms/tslib/index_ts.php the call to $TSFE->determineId() i think takes care of working out the ID and inside this function is where the above line of code is suggested to replace the existing line of code, (line number above). After $TSFE->determineId() is processed and an ID is known then functions inside index_ts.php continue to execute and i think that the ts for the resolved domain & page is read into a config array from this function call $TSFE->getConfigArray()
Additional info: I created the following 2 files (see attached), both with a basic list of the function flow and its effects on the value of $this->id, from the resource request to index.php through to the point where it is decided either to exit because the requested resource is false and therefore initiate pageNotFound (404), or to continue resolving the page ID if the requested resource is true, etc.
- effects_on_this-id_using_default_class.tslib_fe.php.txt
- effects_on_this-id_using_modified_class.tslib_fe.php.txt
From here I don't know enough of the TYPO3 API, SSD, RealURL & CoolURI to know how to suggest a way to use my suggested code to fix 404 handling when SSD is set without breaking other SEF extensions.
Hope this information will be useful.
Cheers, Matt
Updated by Andi Phringer about 16 years ago
First of all thank you so much, Matt. I was really stuck with this problem for a long time and your approach worked for me as well. I just want to add a small remark as others might be facing this problem as well.
If you want to get the same working with Realurl you should comment out the following line in your realurl configuration:
// 'postVarSet_failureMode'=>'redirect_goodUpperDir',
Else the wrong page names on the root level of the website will still point to the Homepage.
Kind Regards
Andi
Updated by Thomas Deinhamer about 15 years ago
Will this be included into 4.3? I'd really appreciate it, as it makes serving error pages a LOT easier, and also SEO will get a true boost! Thanks so much!
PS: Does this also work with other page types and language ids? As far as I can remember we had a lot of trouble with L other than 0 and type other than 0. Can someone confirm this further?
Updated by Matthew Kennewell about 14 years ago
Hi,
Will this be included in TYPO3 version 4.5 with Long Term Support or any other upcoming TYPO3 release?
Cheers
Updated by Björn Paulsen almost 13 years ago
- Target version changed from -1 to 4.5.9
This bug is also in TYPO3 4.5 LTS.
I solved this bug very easily:
Function "fetch_the_id()"
typo3/sysext/cms/tslib/class.tslib_fe.php:914
insert this Code:
if( ($this->id == 0) && ($this->siteScript <> false))
$this->pageNotFound = 1;
after:
$this->idParts = explode(',',$this->id);
// Splitting by a '+' sign - used for base64/md5 methods of parameter encryption for simulate static documents.
list($pgID,$SSD_p)=explode('+',$this->idParts[0],2);
if ($SSD_p) { $this->idPartsAnalyze($SSD_p); }
$this->id = $pgID; // Set id
// If $this->id is a string, it's an alias
$this->checkAndSetAlias();
*** insert here ***
and I found no problems; all 404 pages come up.
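To see which requests this inserted check would flag, the condition can be modelled on its own. This is only a rough sketch: flagsPageNotFound() is a hypothetical helper, not TYPO3 code, and it assumes that $this->siteScript holds the requested script path relative to the site root (empty for the bare domain) and that $this->id is 0 when no page could be resolved.

```php
<?php
// Hypothetical model of the inserted check; not TYPO3 code.
// Returns true where the snippet above would set $this->pageNotFound = 1.
function flagsPageNotFound($id, $siteScript)
{
    // "!=" is equivalent to the "<>" operator used in the snippet above
    return ($id == 0) && ($siteScript != false);
}

var_dump(flagsPageNotFound(0, ''));                 // bare domain      -> bool(false)
var_dump(flagsPageNotFound(0, 'wrong_file.pdf'));   // unknown resource -> bool(true)
var_dump(flagsPageNotFound(12, 'contact_us.html')); // resolved page    -> bool(false)
var_dump(flagsPageNotFound(0, 'index.php'));        // plain index.php  -> bool(true)
```

Under these assumptions a request for index.php without an id would also be flagged, which is the case the earlier suggested line treated separately. Whether that happens in a real installation has not been checked here.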
Updated by Ernesto Baschny almost 13 years ago
- Target version changed from 4.5.9 to 4.5.12
Updated by Michael Cannon over 12 years ago
I confirm the problem and the suggested fix in TYPO3 4.5.4.
Updated by Matthew Kennewell about 12 years ago
Hi,
I recently updated to the TYPO3 source v4.5.19 and I see the 404 headers and page handling error referred to in this bug report still occurs, when using SSD.
Is there any chance that this could be reviewed and the core file class.tslib_fe.php be patched/amended to have 404 error handling working correctly?
Thanks in advance.
Matthew
Updated by Alexander Opitz about 11 years ago
- Category deleted (Communication)
- Target version deleted (4.5.12)
- TYPO3 Version set to 4.3
- Is Regression set to No
Hi,
As this issue is very old: does the problem still exist within newer versions of TYPO3 CMS (4.5 or 6.1)?
Updated by Alexander Opitz about 11 years ago
- Status changed from New to Needs Feedback
Updated by Matthew Kennewell about 11 years ago
- File class.tslib_fe.original.php added
- File class.tslib_fe.modified.php added
hi Alexander,
yes i believe this is still a problem.
as a matter of course, every time I upgrade to a new typo3 source I edit this file /typo3/sysext/cms/tslib/class.tslib_fe.php from typo3_src-4.5.29 source so that the website outputs the correct HTTP 404 header response for any requested resource that does not exist and then redirects to a dedicated '404 page not found' page within the website.
I just went into a demo website I have and tested whether this is still a problem.
I restored the original version of the class.tslib_fe.php file and tried the following requests which mostly returned a 200 header response, (one i think output 304), and all showed the Home (parent) page of the website.
- www.domain.com.au/file.htm
- www.domain.com.au/file.pdf
- www.domain.com.au/folder/
- www.domain.com.au/folder/file.html
- www.domain.com.au/folder/file.pdf
I reset back the modified version of the class.tslib_fe.php file and requested the above links and the website gave a 404 header response and showed the 404 page not found page in the website.
It's a shame this was not fixed all those years ago, as all typo3 websites that use simulate static documents have been outputting incorrect http header responses.
Thank you for looking into this issue.
oh, and attached are the original and modified versions of file class.tslib_fe.php from typo3_src-4.5.29 source, so you can compare and see what fixes this for me.
Note: if this issue is reviewed for potential implementing then please be aware if the suggested code affects other URL handlers like RealURL etc before being applied.
kind regards, Matthew
Updated by Alexander Opitz about 11 years ago
- Category set to Content Rendering
- Status changed from Needs Feedback to New
I don't know if this will go into 4.5, as it isn't critical and it changes the behavior of TYPO3.
But maybe for TYPO3 CMS 6.2.
Updated by Matthew Kennewell about 11 years ago
hi,
thanks, it will be great when included.
A shame though that it may not go into v4.5, as this version still has a year of maintenance to go.
Yes, it changes the behavior of TYPO3 in that it fixes the output of HTTP header responses to work properly, especially with 404 errors.
Would you agree this is an important aspect of a website's operation to get right sooner rather than later?
thanks again
Updated by Ernesto Baschny about 11 years ago
- Status changed from New to Needs Feedback
Could you provide the proposed fix in form of a patch / review request (see http://wiki.typo3.org/CWT) and maybe an up-to-date explanation of the problem? I was not able to reproduce the issue from the original reporter...
Updated by Ernesto Baschny about 11 years ago
- Category changed from Content Rendering to Frontend
Updated by Alexander Opitz almost 11 years ago
As Ernesto already asked, can you provide the proposed fix in form of a patch and review request?
Updated by Markus Klein over 10 years ago
Last ping, before closing this issue.
Updated by Matthew Kennewell over 10 years ago
this is i believe still an active issue, and an important one related to 404 errors
Updated by Matthew Kennewell over 10 years ago
hello,
m'mm, 6 years further down the TYPO3 road and this appears to still be a concern...
this potentially is a simple fix, (potentially for all releases of TYPO3),
by just changing/updating one line of code to get 404 pagenotfound working properly.
for example in source TYPO3 4.5 LTS, typo3_src-4.5.35.tar.gz edit file /typo3/sysext/cms/tslib/class.tslib_fe.php
at line 957
if (!$this->id) {
replace with:
if (!$this->id && (t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT')=='index.php' || !(t3lib_div::getIndpEnv('TYPO3_SITE_SCRIPT') && $this->TYPO3_CONF_VARS['FE']['pageNotFound_handling']))) {
though as mentioned when this bug was first submitted, it would be good for someone to check if this change influences or impacts other extensions like realURL & coolURI.
kind regards
Matthew
Updated by Alexander Opitz about 10 years ago
Then please answer the question from Ernesto from 12 months ago.
Updated by Matthew Kennewell about 10 years ago
hello Alexander,
I have given all the required information throughout this bug report, especially in the original post, on what the problem is and how easily it can be fixed with potentially a 1-line code change.
As I have mentioned a couple of times now, the suggested fix (mentioned/listed 11 days ago) potentially needs someone with experience of realURL & coolURI and similar extensions to test whether the suggested 1-line code change has any impact on these extensions.
thanks in advance
M
Updated by Matthew Kennewell almost 10 years ago
hello all,
this still continues to be a concern in TYPO3 version 4.5.38
any incorrectly requested file or folder is redirected to the root of the website and not the 404 page not found error handling method
regards,
Matthew
Updated by Gerrit Code Review over 9 years ago
- Status changed from Needs Feedback to Under Review
Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/40933
Updated by Matthew Kennewell over 9 years ago
hello,
this would be really important and great to see added to TYPO3 core.
cheers
Matthew
Updated by Gerrit Code Review about 9 years ago
Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/40933
Updated by Riccardo De Contardi almost 7 years ago
- Status changed from Under Review to Closed
I close this issue for now; the patch on Gerrit has been abandoned, I report here the last comment that contains an explanation:
Abandoned
Handling resources via the same RequestHandler as any frontend page seems not useful, should be handled by either a custom RequestHandler (could be done via an extension as well) or via .htaccess directly
If you think that this is the wrong decision or that additional work should be done on this area, please reopen it or open a new issue with a reference to this one. Thank you.