Feature #13680
Closed
Refactor external link, do not follow senseless loop, add user agent
100%
Description
This takes care of most external link issues.
We still need to wait for the patch for getUrl, but this should be solved soon.
- add user-agent #13800
- correct handling of location (Thanks to Daniel Minder)
- move code up and down
- remove useless or duplicated code
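As an illustration of the "do not follow senseless loop" and "correct handling of location" points, here is a minimal sketch and not the code from the attached patch. The function name resolveRedirects, the $fetchHeaders callback, and the shape of its return value are all hypothetical; in the real Linkvalidator the headers would come from the core URL fetching code. The sketch only handles absolute Location values.

```php
<?php
// Hypothetical sketch, not the patch: follow Location headers with a hop
// limit and stop as soon as a URL is seen twice.
// $fetchHeaders is an assumed callback that returns
// array('status' => int, 'location' => string|null) for a given URL.
function resolveRedirects($url, $fetchHeaders, $maxHops = 5)
{
    $visited = array();
    for ($hop = 0; $hop <= $maxHops; $hop++) {
        if (isset($visited[$url])) {
            // We have been here before: the redirects form a loop.
            return array('ok' => false, 'error' => 'redirect loop', 'url' => $url);
        }
        $visited[$url] = true;

        $response = call_user_func($fetchHeaders, $url);
        $isRedirect = $response['status'] >= 300 && $response['status'] < 400;
        if (!$isRedirect || empty($response['location'])) {
            // Final response: treat only 2xx as a working link.
            $ok = $response['status'] >= 200 && $response['status'] < 300;
            return array('ok' => $ok, 'error' => null, 'url' => $url);
        }
        $url = $response['location'];
    }
    return array('ok' => false, 'error' => 'too many redirects', 'url' => $url);
}

// Usage with canned responses instead of real HTTP requests.
$fake = array(
    'http://a.example/' => array('status' => 301, 'location' => 'http://b.example/'),
    'http://b.example/' => array('status' => 302, 'location' => 'http://a.example/'),
    'http://c.example/' => array('status' => 200, 'location' => null),
);
$fetch = function ($url) use ($fake) {
    return $fake[$url];
};
$loopResult = resolveRedirects('http://a.example/', $fetch);
$okResult = resolveRedirects('http://c.example/', $fetch);
```

Tracking visited URLs catches a loop on the first repeat, while the hop limit covers long chains that never repeat.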
Files
Updated by Philipp Gampe over 13 years ago
- File linkval-cleanup.diff added
I will commit in 48 hours if nobody objects.
Updated by Chris topher over 13 years ago
Hi Philipp,
do we need the line t3lib_utility_Debug::debug($cookies);?
The rest looks fine.
Updated by Philipp Gampe over 13 years ago
No, of course not ...
I am currently rewriting the whole loop and I will come up with a patch. It will cover all open issues regarding external link handling.
Updated by Philipp Gampe over 13 years ago
- Subject changed from Cleanup external link, do not follow senseless loop to Cleanup external link, do not follow senseless loop, add user agent and cookies
Updated by Philipp Gampe over 13 years ago
- File linkval-cleanup.diff added
- % Done changed from 0 to 80
Updated by Philipp Gampe over 13 years ago
- File linkval-cleanup_2.diff added
- some code style cleanups
- add getUrl($url, 3, FALSE, &$report) [needs fix in core]
Updated by Philipp Gampe over 13 years ago
- Subject changed from Cleanup external link, do not follow senseless loop, add user agent and cookies to Refactor external link, do not follow senseless loop, add user agent and cookies
Updated by Daniel Minder over 13 years ago
I tested the patch (with getUrl($url, 2, ...) instead of 3) and found a bug in the Cookie handling.
It's not required that a path is given, and if it's present it does not need to be the first attribute. So, line 117 should be changed to:
if ((preg_match('/Set-Cookie: ([^;]+)/', $line, $cookie))) {
since the actual cookie stops at ; or at the end of the line. I'm not sure if the name or value of the cookie can be a quoted string that includes ;. http://tools.ietf.org/html/draft-ietf-httpstate-cookie-21 defines name and value to be a token as defined in RFC 2616 where token is "1*<any CHAR except CTLs or separators>". Then, the regexp would become more complex...
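To illustrate how the suggested pattern behaves, here is a small sketch. The helper name extractCookiePair and the sample header lines are made up for this example and are not part of the patch. It does not cover quoted values that contain a semicolon, which is the open question raised above.

```php
<?php
// Hypothetical helper: return the "name=value" part of one response header
// line, or NULL if the line is not a Set-Cookie header.
function extractCookiePair($line)
{
    // Capture everything after "Set-Cookie: " up to the first ";"
    // or, if there are no attributes, up to the end of the line.
    if (preg_match('/Set-Cookie: ([^;]+)/', $line, $cookie)) {
        return trim($cookie[1]);
    }
    return null;
}

// Sample header lines (invented for illustration).
$headers = array(
    'Set-Cookie: session=abc123; expires=Wed, 09 Jun 2021 10:18:14 GMT; path=/',
    'Set-Cookie: lang=en',
    'Content-Type: text/html',
);
$pairs = array_map('extractCookiePair', $headers);
// $pairs is array('session=abc123', 'lang=en', NULL)
print_r($pairs);
```

The second header shows the case that the original pattern missed: a cookie without a path attribute still matches, because the capture stops at the end of the line.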
Updated by Philipp Gampe over 13 years ago
A cookie can also have an expires entry.
I think Name= + value + expires + ; + path is pretty much standard.
I will do some tests this afternoon.
Updated by Philipp Gampe over 13 years ago
- Subject changed from Refactor external link, do not follow senseless loop, add user agent and cookies to Refactor external link, do not follow senseless loop, add user agent
Updated by Philipp Gampe over 13 years ago
- File linkval-user-agent-cleanup-external-url.diff added
- remove cookie support from patch; it does not always work
Updated by Philipp Gampe over 13 years ago
- remove headers-only fetching
This was dropped for now in core, so we will have to come up with a better solution for fetching URLs first.
Updated by Philipp Gampe over 13 years ago
- File deleted (linkval-user-agent-cleanup-external-url.diff)
Updated by Philipp Gampe over 13 years ago
Committed to trunk in r47965
and to 4.5 in r47966
Updated by Philipp Gampe over 13 years ago
- Status changed from Accepted to Resolved
- % Done changed from 70 to 100
Applied in changeset r47965.
Updated by Chris topher over 12 years ago
- Status changed from Resolved to Closed
Updated by Michael Stucki almost 11 years ago
- Project changed from 1510 to TYPO3 Core
- Category changed from Linkvalidator to Linkvalidator