Search for Text with Umlaute

Thomas Lohrum
Posts: 1324
Joined: Tue Mar 08, 2011 11:15 am


Postby Thomas Lohrum » Mon Jun 08, 2015 8:25 pm

Example:

The note text contains the word "öffentlich". When you search inside the editor for "offen", there is no match. That's correct.
However, when you search from the main window with "text only" enabled, the note is found. It shouldn't be, though.

Thomas
usbpoweredfridge
Posts: 410
Joined: Fri Jan 17, 2014 11:08 pm

Re: Search for Text with Umlaute

Postby usbpoweredfridge » Tue Jun 09, 2015 12:40 am

Hi Thomas,

Can't reproduce here. I copied your word into a note, and then did two tests:
1. Inside the note editor, searched for 'offen'. Word not found, which is correct as you point out.
2. Inside the notes list, changed the search to 'text only' and then searched for 'offen'. Nothing found, which is also correct, but obviously does not match your results.

This is with 2.9 Beta 3 on an English Windows 7 64-bit system. Maybe it is related to the language settings in Windows?

Chris
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm

Re: Search for Text with Umlaute

Postby CintaNotes Developer » Tue Jun 09, 2015 12:26 pm

Hi Thomas,
I can't reproduce this in 2.9b3 either. Could you send me the offending note as XML? Thanks.
Alex
Thomas Lohrum
Posts: 1324
Joined: Tue Mar 08, 2011 11:15 am

Re: Search for Text with Umlaute

Postby Thomas Lohrum » Tue Jun 09, 2015 5:18 pm

CintaNotes Developer wrote: Hi Thomas,
I can't reproduce this in 2.9b3 either. Could you send me the offending note as XML? Thanks.

Sent by mail.
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm

Re: Search for Text with Umlaute

Postby CintaNotes Developer » Wed Jun 10, 2015 10:27 am

Thanks, confirmed: I could reproduce it.
I've passed it on to Denis; it will most probably be fixed in 2.9.1.
Alex
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm

Re: Search for Text with Umlaute

Postby CintaNotes Developer » Wed Jun 17, 2015 10:22 am

Hi Thomas,

After some research we discovered that this behavior is deeply ingrained in the Unicode
library that we use, and changing it would be difficult. Would it be acceptable
to do the opposite and make the editor search behave like the notes list search?
Alex
gunars
Posts: 234
Joined: Fri Nov 08, 2013 5:35 am

Re: Search for Text with Umlaute

Postby gunars » Wed Jun 17, 2015 2:53 pm

There was some discussion of what I think is the same issue in:

http://cintanotes.com/forum/viewtopic.php?f=3&t=1774&hilit=+accented#p7958
http://cintanotes.com/forum/viewtopic.php?f=3&t=1774&hilit=+accented#p7962

where Alex said:
I've done some research on the issue, and the problem is quite simple here: SQLite's LIKE operator ignores accented characters and treats them like 'normal' letters: 'ā' gets treated as 'a' etc.
So every time a LIKE search is used (the conditions are different for each note's field), this problem manifests itself.

When SIW is on, the Tags and Link searches use the LIKE operator, so the same problem occurs there.

The Title, Text and Remarks fields use FTS search (the MATCH operator), hence no problem (but only when SIW is off).
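
To illustrate the difference between the two search paths described above, here is a minimal Python sketch. It is not CintaNotes code; the function names are made up for illustration, and the folding step (Unicode NFD decomposition followed by dropping combining marks) is an assumption about what "treating 'ā' as 'a'" amounts to.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose to NFD, then drop combining marks (e.g. the umlaut dots)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_sensitive_contains(text: str, query: str) -> bool:
    """Case-insensitive substring test that keeps accents (editor-style search)."""
    def norm(s: str) -> str:
        return unicodedata.normalize("NFC", s).casefold()
    return norm(query) in norm(text)


def accent_folding_contains(text: str, query: str) -> bool:
    """Case-insensitive substring test that ignores accents (the reported behavior)."""
    def fold(s: str) -> str:
        return strip_accents(s).casefold()
    return fold(query) in fold(text)


if __name__ == "__main__":
    note = "Das ist öffentlich."
    print(accent_sensitive_contains(note, "offen"))  # accents kept
    print(accent_folding_contains(note, "offen"))    # accents folded away
```

With accents kept, "offen" does not occur in "öffentlich"; once the accents are folded away the note text becomes "offentlich" and the same query matches, which is the mismatch Thomas reported.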
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm

Re: Search for Text with Umlaute

Postby CintaNotes Developer » Thu Jun 18, 2015 8:26 am

Thanks for the links, Gunars!

Yes, this time we dug a bit deeper, through SQLite down to the Unicode library.
However, the question remains open: would it be safe to change how the LIKE operator works?
I'm sure there are languages where search shouldn't distinguish between accented
a, o, u characters and unaccented ones.
Alex
Thomas Lohrum
Posts: 1324
Joined: Tue Mar 08, 2011 11:15 am

Re: Search for Text with Umlaute

Postby Thomas Lohrum » Sat Jun 20, 2015 2:21 pm

CintaNotes Developer wrote: Hi Thomas, after some research we discovered that this behavior is deeply ingrained in the Unicode library that we use, and changing it would be difficult. Would it be acceptable
to do the opposite and make the editor search behave like the notes list search?

No, this would make things worse. However, it seems like you found your way to work around the issue, since this got fixed with 2.9.1 beta :)

Thomas
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm

Re: Search for Text with Umlaute

Postby CintaNotes Developer » Mon Jun 22, 2015 10:59 am

Thanks for your reply, Thomas!
Yes, we turned off the "unaccenting" feature of the library. As I said, this could introduce bugs if there are languages in which
accents shouldn't influence the search behavior (does anyone know of such a language? Maybe Spanish or French?)
Alex
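
For readers who want to experiment with the trade-off, the following is a rough sketch using Python's sqlite3 module. It relies on the documented SQLite mechanism of overriding the `like()` SQL function, which the LIKE operator calls as `like(pattern, value)`. Everything else is hypothetical: the `notes` table, the `body` column, the helper names and the `unaccent` switch are invented for illustration and say nothing about how CintaNotes or its Unicode library are actually implemented.

```python
import re
import sqlite3
import unicodedata


def strip_accents(text):
    """Decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def like_to_regex(pattern):
    """Translate a LIKE pattern (% and _ wildcards, no ESCAPE clause) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def make_like(unaccent):
    """Build a replacement for SQLite's like(pattern, value) function."""
    def like(pattern, value):
        if pattern is None or value is None:
            return None  # NULL in, NULL out
        pattern = unicodedata.normalize("NFC", str(pattern))
        value = unicodedata.normalize("NFC", str(value))
        if unaccent:  # hypothetical switch, standing in for the library feature
            pattern, value = strip_accents(pattern), strip_accents(value)
        return int(like_to_regex(pattern).fullmatch(value) is not None)
    return like


def open_notes_db(unaccent):
    """Create an in-memory database with one sample note (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("like", 2, make_like(unaccent))
    conn.execute("CREATE TABLE notes (body TEXT)")
    conn.execute("INSERT INTO notes VALUES ('Das ist öffentlich.')")
    return conn


def search(conn, term):
    """Return the bodies of all notes containing term, using the LIKE operator."""
    rows = conn.execute("SELECT body FROM notes WHERE body LIKE ?", (f"%{term}%",))
    return [row[0] for row in rows]
```

Calling `search(open_notes_db(unaccent=True), "offen")` finds the sample note, while the same call with `unaccent=False` does not. A per-language or per-user setting along these lines could be one way to serve languages where accents should be ignored, though that is a design idea and not something proposed in this thread.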
