
Search for Text with Umlaute

Posted: Mon Jun 08, 2015 8:25 pm
by Thomas Lohrum
Example:

The note text contains the word "öffentlich". When you search inside the editor for "offen", there is no match. That's correct.
However, when you search from the main window with "text only" enabled, the note is found. It shouldn't be, though.

Thomas

Re: Search for Text with Umlaute

Posted: Tue Jun 09, 2015 12:40 am
by usbpoweredfridge
Hi Thomas,

Can't reproduce here. I copied your word into a note, and then did two tests:
1. Inside the note editor, searched for 'offen'. Word not found, which is correct as you point out.
2. Inside the notes list, changed the search to 'text only' and then searched for 'offen'. Nothing found, which is also correct, but obviously does not match your results.

This is with 2.9 Beta 3 on an English Windows 7 64-bit system. Maybe it is related to the language settings in Windows?

Chris

Re: Search for Text with Umlaute

Posted: Tue Jun 09, 2015 12:26 pm
by CintaNotes Developer
Hi Thomas,
I can't reproduce this either in 2.9b3. Could you maybe send me the offending note XML? thanks

Re: Search for Text with Umlaute

Posted: Tue Jun 09, 2015 5:18 pm
by Thomas Lohrum
CintaNotes Developer wrote:Hi Thomas,
I can't reproduce this either in 2.9b3. Could you maybe send me the offending note XML? thanks

Sent by mail.

Re: Search for Text with Umlaute

Posted: Wed Jun 10, 2015 10:27 am
by CintaNotes Developer
Thanks, confirmed: I could reproduce it.
Passed on to Denis; it will most probably be fixed in 2.9.1.

Re: Search for Text with Umlaute

Posted: Wed Jun 17, 2015 10:22 am
by CintaNotes Developer
Hi Thomas,

After some research we discovered that this behavior is deeply ingrained in the Unicode
library that we use, and changing it would be difficult. Would it be acceptable
to do the opposite and make the editor search behave like the notes list search?

Re: Search for Text with Umlaute

Posted: Wed Jun 17, 2015 2:53 pm
by gunars
There was some discussion of what I think is the same issue in:

http://cintanotes.com/forum/viewtopic.php?f=3&t=1774&hilit=+accented#p7958
http://cintanotes.com/forum/viewtopic.php?f=3&t=1774&hilit=+accented#p7962

where Alex said:
I've done some research on the issue, and the problem is quite simple here: SQLite's LIKE operator ignores accented characters and treats them like 'normal' letters: 'ā' gets treated as 'a' etc.
So every time a LIKE search is used (the conditions are different for each note's field), this problem manifests itself.

When SIW is on, Tags and Link search use the LIKE operator => ditto.

Title, Text and Remarks field use FTS search (MATCH operator) - hence no problem. (But only when SIW=off).
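
To illustrate the effect described above, here is a minimal Python sketch of an "unaccenting" comparison. It is not CintaNotes or SQLite code; the helper names `strip_accents` and `contains` are made up for this example, and it assumes that folding means dropping Unicode combining marks after NFD decomposition.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks, so that 'ö' becomes 'o' and 'ā' becomes 'a'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def contains(haystack: str, needle: str, unaccent: bool) -> bool:
    """Case-insensitive substring test, optionally ignoring accents."""
    if unaccent:
        haystack = strip_accents(haystack)
        needle = strip_accents(needle)
    return needle.casefold() in haystack.casefold()


note = "\u00f6ffentlich"  # "öffentlich"

# Accent-folding search: "offen" is found inside the folded text.
print(contains(note, "offen", unaccent=True))

# Accent-sensitive search: "offen" is not found.
print(contains(note, "offen", unaccent=False))
```

With folding enabled the note matches "offen", which corresponds to the notes list result Thomas reported; without folding it does not, which corresponds to the editor search.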

Re: Search for Text with Umlaute

Posted: Thu Jun 18, 2015 8:26 am
by CintaNotes Developer
Thanks for the links, Gunars!

Yes, this time we dug a bit deeper, through SQLite down to the Unicode library.
However, the question remains open: would it be safe to change how the LIKE operator works?
I'm sure there are languages where search shouldn't distinguish between accented
a, o, u characters and unaccented ones.

Re: Search for Text with Umlaute

Posted: Sat Jun 20, 2015 2:21 pm
by Thomas Lohrum
CintaNotes Developer wrote:Hi Thomas, after some research we discovered that this behavior is deeply ingrained in the Unicode library that we use, and changing this behavior would be difficult. Would it be acceptable to do the opposite and make the editor search behave like the notes list search?

No, this would make things worse. However, it seems you found a way to work around the issue, since this has been fixed in the 2.9.1 beta :)

Thomas

Re: Search for Text with Umlaute

Posted: Mon Jun 22, 2015 10:59 am
by CintaNotes Developer
Thanks for your reply, Thomas!
Yes, we turned off the "unaccenting" feature of the library. As I said, this could introduce bugs if there are languages
in which accents shouldn't influence the search behavior. (Does anyone know of such a language? Maybe Spanish or French?)
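
One way to keep both behaviors available is to make the folding opt-in per query instead of building it into LIKE. The following is only a sketch in Python's `sqlite3` module, not what CintaNotes does: the `unaccent` SQL function and the `notes` table are hypothetical, and it assumes a stock SQLite build whose LIKE does no accent folding of its own.

```python
import sqlite3
import unicodedata


def unaccent(text):
    """Hypothetical SQL helper: strip combining marks from a string."""
    if text is None:
        return None
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


conn = sqlite3.connect(":memory:")
# Register the Python function so it can be called from SQL as unaccent(x).
conn.create_function("unaccent", 1, unaccent)

conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES (?)", ("\u00f6ffentlich",))

# Accent-sensitive search: plain LIKE on the stored text.
strict = conn.execute(
    "SELECT COUNT(*) FROM notes WHERE body LIKE ?", ("%offen%",)
).fetchone()[0]

# Accent-insensitive search: fold the column explicitly before matching.
folded = conn.execute(
    "SELECT COUNT(*) FROM notes WHERE unaccent(body) LIKE ?", ("%offen%",)
).fetchone()[0]

print(strict, folded)
```

A per-language or per-user setting could then choose between the two queries, so languages that want accents ignored are not affected by the stricter default.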