Bug #20035

closed

Crawler does not crawl through relative links in an external page

Added by Dennis van over 15 years ago. Updated over 11 years ago.

Status: Closed
Priority: Should have
Assignee:
Category: Indexed Search
Target version: -
Start date: 2009-02-17
Due date:
% Done: 0%
Estimated time:
TYPO3 Version: 4.2
PHP Version:
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

The crawler does not follow relative links when it crawls an external page whose URL contains more than just the domain name.

Example:
Relative links work when you crawl http://www.somesite.com/

Relative links DO NOT work when you crawl http://www.somesite.com/somefolder/

This seems to be a small error in class.crawler.php. The original author seems to have forgotten to include the folder names from the URL of the page being crawled when building the link URLs.

The attached diff contains a fix: class.crawler.php_back is the original file and class.crawler.php is the fixed file.
(issue imported from #M10463)
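
To illustrate the behaviour described above, here is a minimal Python sketch (not the actual PHP code from class.crawler.php). The function `resolve_host_only` is a hypothetical stand-in for a resolver that prefixes only the scheme and host, which is an assumption about how the bug arises; `page2.html` is an invented link target. The standard-library `urljoin` shows the expected result.

```python
from urllib.parse import urljoin, urlsplit


def resolve_host_only(page_url: str, href: str) -> str:
    """Hypothetical buggy resolver: keeps scheme and host, drops the page's folder."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/{href}"


root_page = "http://www.somesite.com/"
folder_page = "http://www.somesite.com/somefolder/"
href = "page2.html"  # invented relative link found on the crawled page

# Crawling the bare domain: the host-only shortcut happens to be correct.
print(resolve_host_only(root_page, href))    # http://www.somesite.com/page2.html

# Crawling a page inside a folder: the folder is lost, so the link is wrong.
print(resolve_host_only(folder_page, href))  # http://www.somesite.com/page2.html

# Resolving against the full page URL keeps the folder.
print(urljoin(folder_page, href))            # http://www.somesite.com/somefolder/page2.html
```

The two start URLs are the ones from the example in the description; only the second exposes the difference.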


Files

crawler.diff (609 Bytes) Administrator Admin, 2009-02-17 15:44
0010463_v2.diff (739 Bytes) Administrator Admin, 2009-02-19 18:09

Related issues 2 (0 open, 2 closed)

Related to TYPO3 Core - Bug #22296: IS cannot not index files if absRefPrefix is set and indexExternalURLs is not (Closed, Dmitry Dulepov, 2010-03-18)

Related to TYPO3 Core - Bug #22229: External URL only indexes first page (Closed, Xavier Perseguers, 2010-03-03)

Actions #1

Updated by Rudy Gnodde over 15 years ago

Confirmed on TYPO3 4.2.6 with Indexed Search version 2.11.1 (the version included in TYPO3 4.2.6 as a sysext).

Actions #2

Updated by Jeff Segars over 15 years ago

Attaching an updated patch that also accounts for empty paths and links pointing back to the root of the site (starting with a leading slash).

I had started on a different patch that handled only the leading slash when I found this bug report, so I haven't tested this one yet.
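
The three cases mentioned in this comment (a page inside a folder, an empty path, and a link with a leading slash) can be sketched as follows. This is a Python illustration under stated assumptions, not the contents of 0010463_v2.diff: the function name `resolve_link` and the sample link targets are hypothetical, and the sketch does not normalize `..` segments or protocol-relative links.

```python
from urllib.parse import urlsplit


def resolve_link(page_url: str, href: str) -> str:
    """Hypothetical resolver covering the cases named in the comment above."""
    parts = urlsplit(page_url)
    origin = f"{parts.scheme}://{parts.netloc}"

    if "://" in href:
        return href  # already absolute, nothing to resolve

    if href.startswith("/"):
        return origin + href  # leading slash: points back to the site root

    # Empty path (e.g. "http://www.somesite.com") is treated as "/".
    path = parts.path or "/"
    # Keep everything up to and including the last slash: the page's folder.
    directory = path[: path.rfind("/") + 1]
    return origin + directory + href


print(resolve_link("http://www.somesite.com/somefolder/", "page2.html"))
# http://www.somesite.com/somefolder/page2.html
print(resolve_link("http://www.somesite.com", "page2.html"))
# http://www.somesite.com/page2.html
print(resolve_link("http://www.somesite.com/somefolder/", "/contact.html"))
# http://www.somesite.com/contact.html
```

In practice a library resolver such as Python's `urllib.parse.urljoin` covers these cases and more; the hand-written version is shown only to make each branch visible.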

Actions #3

Updated by Dmitry Dulepov about 14 years ago

I believe this issue is solved already. At least the code from these patches is present in TYPO3 4.3. Probably, this issue can be closed.

Actions #4

Updated by Stefan Galinski over 11 years ago

  • Status changed from Accepted to Closed
  • Target version deleted (0)
  • TYPO3 Version set to 4.2