Bug #88264

Epic #89797: HrefLang / Canonical issues

Canonical and hreflang with tracking params

Added by Marc Hirdes 9 months ago. Updated about 1 month ago.

Status: Closed
Priority: Should have
Category: SEO
Target version: -
Start date: 2019-05-03
Due date:
% Done: 0%
TYPO3 Version: 9
PHP Version:
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

Hi,

I am currently struggling with hreflang and tracking parameters. The parameters are not included in the hreflang URLs, but that breaks the connection between the language versions. hreflang links should always refer to each other; otherwise Google Search Console reports an error.

My question concerns a page such as /home/.

The canonical is
https://www.domain.org/home/

The hreflang is
<link rel="alternate" hreflang="de" href="https://www.domain.org/de/start/"/>

Everything is ok. On https://www.domain.org/de/start/

the canonical would be
https://www.domain.org/de/start/
and the hreflang
<link rel="alternate" hreflang="en" href="https://www.domain.org/home/"/>

If my page is in the Google index with
https://www.domain.org/home/?my_tracking_parameter=xy

then the canonical and hreflang would still be the same as above, i.e. without the parameter. This works for the canonical, but not for the hreflang: there is no return link from https://www.domain.org/de/start/ to https://www.domain.org/home/?my_tracking_parameter=xy

Therefore the hreflang should not be set if the canonical differs from the current URL. But then we face a challenge with the page cache: https://www.domain.org/home/?myparameter=xy would be served the same cached page as https://www.domain.org/home/.

The only solution I can see for this problem is to put the hreflang into the sitemap.xml instead of the HTML head, as shown here: https://support.google.com/webmasters/answer/189077#sitemap
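
A minimal sketch of what the sitemap variant could look like, assuming the usual sitemap format with xhtml:link alternates and a plain mapping of language to canonical URL. The function name build_hreflang_sitemap is hypothetical and this is not TYPO3 code; it only illustrates that every URL entry lists all language versions, with no tracking parameters involved.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(alternates):
    """Build a sitemap string for one page and its translations.

    alternates: dict mapping hreflang code -> canonical URL
    (hypothetical input shape, chosen for this illustration).
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in alternates.values():
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        # Each entry lists every language version, including itself,
        # so the annotations are reciprocal by construction.
        for lang, href in alternates.items():
            ET.SubElement(
                entry,
                f"{{{XHTML_NS}}}link",
                {"rel": "alternate", "hreflang": lang, "href": href},
            )
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_hreflang_sitemap({
    "en": "https://www.domain.org/home/",
    "de": "https://www.domain.org/de/start/",
})
print(sitemap_xml)
```

Because the sitemap is generated from the canonical URLs only, it does not depend on which URL variant a visitor or crawler requested, so the page cache concern does not arise here.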

What do you think?

hreflang.jpg (139 KB) Marc Hirdes, 2019-12-09 13:18

hreflang.jpg - correct version (139 KB) Marc Hirdes, 2019-12-09 13:21

hreflang_final.jpg (139 KB) Marc Hirdes, 2019-12-09 13:26


Related issues

Related to TYPO3 Core - Bug #89878: Hreflang links not using canonical urls New 2019-12-06

History

#1 Updated by Richard Haeser 3 months ago

Do I understand correctly that you want to pass your tracking params to the other languages by adding them to the hreflang? Why do you want that? That information is not relevant for a search engine, is it?

#2 Updated by Richard Haeser 3 months ago

  • Status changed from New to Needs Feedback

#3 Updated by Marc Hirdes 3 months ago

No. I want the hreflang to be omitted if the canonical differs from the current URL.

#4 Updated by Richard Haeser 3 months ago

Why do you want that? I don't see why you would hide the hreflang when it shows the right URL. Hreflang and canonical both use the canonicalized URLs, which is what the search engine needs. Maybe I don't understand your issue. Can you clarify?

#5 Updated by Marc Hirdes 3 months ago

Maybe this explains the point better: https://www.searchviu.com/en/hreflang-canonical/ but I also described it in the description of this ticket.

Another example.

mypage.com/page/ => canonical mypage.com/page/ => hreflang mypage.com/de/page/
mypage.com/de/page/ => canonical mypage.com/de/page/ => hreflang mypage.com/page/
is ok.

mypage.com/page/?param=xyz => canonical mypage.com/page/ => hreflang mypage.com/de/page/
mypage.com/de/page/ => canonical mypage.com/de/page/ => hreflang mypage.com/page/
The way back to mypage.com/page/?param=xyz is missing, although every hreflang link needs a return link.

mypage.com/page/?param=xyz => canonical mypage.com/page/ => hreflang mypage.com/de/page/?param=xyz
would be even worse.
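
The return-link rule from these examples can be sketched as a small check. This is only an illustration of the strict per-URL reading argued in this comment, not how any search engine is known to evaluate hreflang; the function name missing_return_links and the input shape are hypothetical.

```python
def missing_return_links(pages):
    """Find hreflang links that are not reciprocated.

    pages: dict mapping a URL to the set of hreflang target URLs
    rendered on that URL (hypothetical input shape).
    Returns sorted (source, target) pairs where target does not
    link back to source.
    """
    missing = []
    for source, targets in pages.items():
        for target in targets:
            if source not in pages.get(target, set()):
                missing.append((source, target))
    return sorted(missing)


pages = {
    "mypage.com/page/": {"mypage.com/de/page/"},
    "mypage.com/de/page/": {"mypage.com/page/"},
    # Same page requested with a tracking parameter:
    "mypage.com/page/?param=xyz": {"mypage.com/de/page/"},
}
print(missing_return_links(pages))
# [('mypage.com/page/?param=xyz', 'mypage.com/de/page/')]
```

Only the URL with the parameter is reported, because the German page links back to mypage.com/page/ and not to the parameterized variant.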

#6 Updated by iam li 3 months ago

So sorry to hijack your post but where do you put this in Typo3?

hreflang
<link rel="alternate" hreflang="en" href="https://www.domain.org/home/"/>

Thank you,

#7 Updated by Richard Haeser about 2 months ago

  • Parent task set to #89797

#8 Updated by Richard Haeser about 2 months ago

  • Tracker changed from Task to Bug

Regarding the tracking params, I still don't get it: you don't want a hreflang or canonical that includes your tracking params if they do not change your content, because that would tell Google to index two URLs with the same content.

If you are talking about the combination of a canonical URL with hreflang, then we do indeed have a bug.

<link rel="alternate" hreflang="en-US" href="https://core.ddev.site/canonicals/canonical-to-external-and-having-translations"/>
<link rel="alternate" hreflang="en-US" href="https://core.ddev.site/nl/canonicals/canonical-naar-externe-site-en-met-vertalingen"/>
<link rel="alternate" hreflang="x-default" href="https://core.ddev.site/canonicals/canonical-to-external-and-having-translations"/>

<link rel="canonical" href="https://www.richardhaeser.com"/>

This is not correct indeed, but has nothing to do with tracking params. I have created #89878 for this.

#9 Updated by Richard Haeser about 2 months ago

  • Related to Bug #89878: Hreflang links not using canonical urls added

#10 Updated by Andreas Kiessling about 2 months ago

The problem is that the addQueryString option simply adds every param it can find.
Marc suggests disabling the hreflang generation if the requested URL is already wrong. This could be a workaround, but I'd opt for cleaning up the typolink mess instead.

We have this problem in multiple projects as well. To reproduce it, simply clear the cache and request a page with ?foo=bar
The HrefLangGenerator uses the LanguageMenuProcessor, so addQueryString kicks in and screws up the links.
I also get the param added to the canonical URL; maybe Marc can double-check his setup.

To get rid of THIS problem, I installed https://github.com/sourcebroker/urlguard, which XCLASSes the ContentObjectRenderer and extends the addQueryString handling.
Now I have to explicitly include "simple" params like "foo=bar" that I want to keep; params in an extension namespace are included automatically. All other params are dropped automatically.

Having to manage a list of params that should not screw up your caching and link generation is just a dead end, and the hreflang entries cannot simply be rendered with the LanguageMenuProcessor. See also #89648: we need a better and more configurable way to control the params that end up in these URLs.
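
The allowlist idea described above can be sketched as follows. This is not the urlguard implementation and not TYPO3 code, just a minimal illustration under stated assumptions: filter_query is a hypothetical helper, and "tx_myext[" stands in for an extension namespace prefix.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def filter_query(url, allowed_params, allowed_prefixes=()):
    """Drop every query parameter that is not explicitly allowed.

    allowed_params: exact parameter names to keep (e.g. {"foo"}).
    allowed_prefixes: name prefixes to keep, e.g. a hypothetical
    extension namespace such as "tx_myext[".
    """
    parts = urlsplit(url)
    prefixes = tuple(allowed_prefixes)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in allowed_params or key.startswith(prefixes)
    ]
    # urlencode percent-encodes the brackets of namespaced params.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(filter_query(
    "https://www.domain.org/home/?foo=bar&fbclid=abc&tx_myext[id]=5",
    allowed_params={"foo"},
    allowed_prefixes=("tx_myext[",),
))
# https://www.domain.org/home/?foo=bar&tx_myext%5Bid%5D=5
```

Anything not on the list, such as fbclid here, never reaches the generated links, which is the behaviour the comment describes for unknown params.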

#11 Updated by Marc Hirdes about 2 months ago

The problem is not addQueryString. This can be solved via the canonicalParameter Settings.

The problem is that if a page has hreflang links, every linked page should also point back to the current page (the return links). I don't know how to describe it other than in the examples above, but here is one last try.

If my current URL links to another page via hreflang, then the other page also has to link back to the current URL.

So if my URL is mypage.org/?param=abc and my hreflang is mypage.org/en/, then in theory mypage.org/en/ has to carry a hreflang pointing to mypage.org/?param=abc. That is not the case if param=abc is not in my canonical parameter list. It is totally OK to exclude parameters, but in that case the hreflang should not be set, because there is no return link to the current page.
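
The proposal can be sketched like this: compute the canonical URL from a list of canonical parameters and render hreflang tags only when the requested URL equals it. This is a hedged illustration, not the TYPO3 HrefLangGenerator; canonical_url, hreflang_tags and the canonical_params argument are hypothetical names, and the comparison is a plain string comparison.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url, canonical_params=()):
    """Keep only query params that are part of the canonical URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in canonical_params
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def hreflang_tags(current_url, alternates, canonical_params=()):
    """Render hreflang tags only if current_url is already canonical.

    alternates: dict mapping hreflang code -> URL of that version.
    """
    if canonical_url(current_url, canonical_params) != current_url:
        # No return link can exist for this URL, so emit nothing.
        return []
    return [
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(href)}"/>'
        for lang, href in alternates.items()
    ]


alternates = {"en": "https://mypage.org/en/"}
print(hreflang_tags("https://mypage.org/?param=abc", alternates))  # []
print(hreflang_tags("https://mypage.org/", alternates))
```

Note that this decision depends on the requested URL, so it only works if the parameterized and the plain URL are not served from the same cached page, which is the caching problem mentioned in the description.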

I hope you get the point now. Otherwise we can also chat via Slack.

#12 Updated by Marc Hirdes about 2 months ago

#13 Updated by Marc Hirdes about 2 months ago

Maybe this image helps. The URLs in large bold type are the current pages.
As you can see, the page with param=abc points to the English version without the param. In that case, the English version should theoretically also point back to the page with the param.

The solution is to leave out the hreflang if the canonical differs from the current URL.

The third version here is the correct image. Please ignore the other hreflang.jpg files.

#15 Updated by Richard Haeser about 1 month ago

I get your point. Only thing is that I'm not sure if this is really the best practice. Will check it with the people of Yoast.

#16 Updated by Richard Haeser about 1 month ago

BTW, if this is wrong behaviour, at least Disney is doing this wrong as well: https://disney.de/?foo=bar

#17 Updated by Richard Haeser about 1 month ago

  • Status changed from Needs Feedback to Closed

OK, I got confirmation of what I already thought. Joost de Valk confirmed that the current behaviour is the right one. The hreflang should show the canonicalized version of the URL and not include tracking parameters.

Some examples that do it like TYPO3 does it out-of-the-box:
- https://www.booking.com/ski/country/it.en-gb.html?fbclid=IwAR2A1c5rg6qqpLbwyNgX2vVDDL6ipWkKtINoVXb7Ae0eVYVh5Pwoxn_QqDY
- https://www.disneyplus.com/?fbclid=IwAR2wiNK6D4IbBI6qCSmfOEku6OKWHbqLj80YbyZWUK5jdHSPg3oyq9FG09I
- https://edition.cnn.com/2019/12/10/politics/impeachment-articles-announced/index.html?fbclid=IwAR0CPdVHehy0FPF6gujqGla0sCqiS8u0mc0A8K3ulI94MP6vXx1Cn4F6WE8

I will close this issue. If you think it should be done differently you can use the PSR-14 event that will be introduced with https://review.typo3.org/c/Packages/TYPO3.CMS/+/59059

#18 Updated by Marc Hirdes about 1 month ago

Thanks Richard for your feedback, and thanks even more for the PSR-14 event.
