Bug #77644

closed

MySQL driver extension breaks searches with hyphens

Added by Christian Rieke over 7 years ago. Updated over 1 year ago.

Status: Closed
Priority: Should have
Assignee: -
Category: Indexed Search
Target version: -
Start date: 2016-08-25
Due date:
% Done: 100%
Estimated time:
TYPO3 Version: 6.2
PHP Version:
Tags:
Complexity:
Is Regression: No
Sprint Focus:

Description

If the MySQL driver extension is used, search phrases containing a hyphen, such as "eq-5d", return "no results". Such phrases are quite common on German-language sites, and they also occur when searching for dates ("2016-25-08") or medical abbreviations (as in the example above).

The problem is probably that MySQL itself does not handle full-text searches with hyphens very well. One solution seems to be to run the full-text search IN BOOLEAN MODE, but that, in turn, would break relevance sorting.
My current workaround is to not use the MySQL driver at all.
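
One plausible explanation for the empty result, sketched below under stated assumptions: MySQL's built-in full-text parser treats the hyphen as a word delimiter and ignores tokens shorter than a configured minimum length (the server defaults are 3 for InnoDB and 4 for MyISAM). The function tokenizeLikeFulltext() is a hypothetical, simplified model written for this illustration; it is not MySQL's actual parser and not TYPO3 code.

```php
<?php
// Simplified, hypothetical model of natural-language full-text tokenization.
// Assumption: anything that is not a letter, digit, or underscore splits words,
// and tokens shorter than $minTokenLength are dropped.
function tokenizeLikeFulltext(string $phrase, int $minTokenLength = 3): array
{
    $tokens = preg_split('/[^\p{L}\p{N}_]+/u', $phrase, -1, PREG_SPLIT_NO_EMPTY);

    // Keep only tokens that are long enough to be searchable.
    return array_values(array_filter(
        $tokens,
        static fn (string $token): bool => mb_strlen($token, 'UTF-8') >= $minTokenLength
    ));
}

// "eq-5d" splits into "eq" and "5d"; both are too short, so nothing is left to search for.
var_dump(tokenizeLikeFulltext('eq-5d'));

// A hyphenated compound made of longer parts survives as two separate words.
var_dump(tokenizeLikeFulltext('health-survey'));
```

Under this model a phrase like "eq-5d" produces no searchable tokens at all, which would match the "no results" behaviour described above.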


Related issues 1 (0 open, 1 closed)

Related to TYPO3 Core - Bug #80292: Indexed search does not respect hyphens in search string - Closed - 2017-03-15

Actions #1

Updated by Nico de Haen over 7 years ago

I wonder why your workaround works.

I have the same problem and investigated it: if MySQL full-text search is disabled, the indexer builds an index by populating the index_words table, but the Lexer seems to remove the hyphen completely in line 149:

foreach ($this->lexerConf['removeChars'] as $skipJoin) {
       $theWord = str_replace($this->csObj->UnumberToChar($skipJoin), '', $theWord);
}

Here $this->lexerConf['removeChars'] is a hard-coded array with a single value (45), which is the character code of the hyphen.

I checked older versions of TYPO3 and never found any word with a hyphen in index_words.

I have no clue why the hyphen is removed. Maybe because it is quite rare in English and mostly used to divide words at line breaks?

Actions #2

Updated by Nico de Haen over 7 years ago

OK, I found out why it works if the MySQL driver is not used: the hyphen in the search word is also removed before the comparison.
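
A minimal standalone sketch of that behaviour, assuming the same removal step runs on both the indexed word and the search word. The function stripRemoveChars() and the constant REMOVE_CHARS are hypothetical stand-ins for the Lexer loop quoted in #1, and mb_chr() is used here in place of the core's UnumberToChar():

```php
<?php
// Hypothetical stand-in for the lexer's removeChars list; 45 is the code point of "-".
const REMOVE_CHARS = [45];

// Remove every character whose code point is listed in $codePoints.
function stripRemoveChars(string $word, array $codePoints = REMOVE_CHARS): string
{
    foreach ($codePoints as $codePoint) {
        $word = str_replace(mb_chr($codePoint, 'UTF-8'), '', $word);
    }

    return $word;
}

// The word as it would be stored when the page is indexed ...
$indexedWord = stripRemoveChars('eq-5d');
// ... and the search word after the same normalization.
$searchWord = stripRemoveChars('eq-5d');

// Both sides lose the hyphen, so the comparison succeeds.
var_dump($indexedWord, $indexedWord === $searchWord);
```

A side effect of this normalization is that "eq-5d" and "eq5d" become indistinguishable in the index.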

Actions #3

Updated by Riccardo De Contardi about 6 years ago

I just ran a small test with 8.7.9 (note: indexed_search_mysql has been removed):

- installed indexed_search, frontend indexing: yes
- created a page with "eq-5d" in the title
- created a content element with "eq-5d" in the text
- visited the pages to index them

Both were found.

Actions #4

Updated by Benni Mack about 4 years ago

  • Related to Bug #80292: Indexed search does not respect hyphens in search string added
Actions #5

Updated by Gerrit Code Review over 2 years ago

  • Status changed from New to Under Review

Patch set 2 for branch 9.5 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/67617

Actions #6

Updated by Gerrit Code Review over 2 years ago

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/71885

Actions #7

Updated by Gerrit Code Review over 2 years ago

Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/71885

Actions #8

Updated by Gerrit Code Review over 2 years ago

Patch set 3 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/71885

Actions #9

Updated by Gerrit Code Review over 2 years ago

Patch set 4 for branch main of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/71885

Actions #10

Updated by Gerrit Code Review almost 2 years ago

Patch set 1 for branch 11.5 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/75146

Actions #11

Updated by Stephan Großberndt almost 2 years ago

  • Status changed from Under Review to Resolved
  • % Done changed from 0 to 100
Actions #12

Updated by Benni Mack over 1 year ago

  • Status changed from Resolved to Closed