Bug #16729

closed

Converting external files to current charset fails

Added by Christian Buelter about 18 years ago. Updated about 11 years ago.

Status:
Closed
Priority:
Should have
Assignee:
-
Category:
Indexed Search
Target version:
-
Start date:
2006-11-20
Due date:
% Done:
0%

Estimated time:
TYPO3 Version:
4.0
PHP Version:
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

When using the crawler extension, converting the content of an indexed external URL to UTF-8 fails if the external page is encoded in "iso-8859-1" but does not declare its charset in the meta tags.

I solved the problem by adding one line to the function convertHTMLToUtf8 (in the file "class.indexer.php"):

// Find charset:
$charset = $charset ? $charset : $this->getHTMLcharset($content);
$charset = $this->csObj->parse_charset($charset);
// make the indexer convert the page if no charset is given...
if (!$charset) $charset='iso-8859-1';

Of course, this assumes that the page is in iso-8859-1 if no charset is given. But that is more likely to be true than assuming the page is in utf-8 (which is how it works now).
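
To illustrate the fallback idea outside of TYPO3, here is a minimal standalone sketch. It is not the indexer's actual code: the function names detectHtmlCharsetSketch and convertToUtf8WithFallback are hypothetical, the meta-tag regex is a simplification, and it assumes the mbstring extension is available. A charset name that mbstring does not know will raise an error on PHP 8, so a real patch would need to guard against that.

```php
<?php
// Hypothetical helper (not part of TYPO3): read the charset from a <meta> tag.
// Handles <meta charset="..."> and the older http-equiv/content form.
function detectHtmlCharsetSketch(string $html): ?string
{
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([a-zA-Z0-9_\-]+)/i', $html, $m)) {
        return strtolower($m[1]);
    }
    return null; // the page does not declare a charset
}

// Hypothetical helper: convert HTML to UTF-8, assuming $fallback
// when neither the caller nor the page names a charset.
function convertToUtf8WithFallback(string $html, ?string $charset = null, string $fallback = 'iso-8859-1'): string
{
    if (!$charset) {
        $charset = detectHtmlCharsetSketch($html);
    }
    if (!$charset) {
        // This is the assumption discussed above: undeclared pages are treated as iso-8859-1.
        $charset = $fallback;
    }
    if (strtolower($charset) === 'utf-8') {
        return $html; // already UTF-8, nothing to convert
    }
    return mb_convert_encoding($html, 'UTF-8', $charset);
}
```

With this sketch, a page without a charset declaration is converted from iso-8859-1, while a page that declares utf-8 is passed through unchanged.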

(issue imported from #M4537)

Actions #1

Updated by Alexander Opitz over 11 years ago

  • Target version deleted (0)
  • TYPO3 Version set to 4.0

This issue is very old. Does it still exist in newer versions of TYPO3 CMS (4.5 or 6.1)?

Actions #2

Updated by Alexander Opitz over 11 years ago

  • Status changed from New to Needs Feedback

Actions #3

Updated by Alexander Opitz about 11 years ago

  • Status changed from Needs Feedback to Closed

No feedback for over 90 days.
