Wrong results when searching in tags

Peter7885
Posts: 21
Joined: Sat Jan 16, 2016 4:08 pm
Contact:

Wrong results when searching in tags

Postby Peter7885 » Fri Jan 20, 2017 3:05 pm

I am not sure if this is a bug, but it sometimes produces wrong search results. For example, let's say I have three notes: the first note contains the word 'blue', the second the word 'black' and the third the word 'purple'. The third note has a two-word tag assigned to it: 'new-color'. Now, if the search is set to search in all notes and I type 'purple', it filters down to the third note, which contains the word 'purple'. But if I then start to type the first word of the tag, 'new', after 'purple', it hides the third note and doesn't display any results.
date
Posts: 243
Joined: Sat Aug 01, 2015 5:15 am
Contact:

Re: Wrong results when searching in tags

Postby date » Fri Jan 20, 2017 4:29 pm

Click 'Search across field boundaries,' then that note will be found.

http://cintanotes.com/help/#finding-4
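
To make the difference concrete, here is a minimal sketch of the behaviour described in this thread. It assumes that each search term is matched as a word prefix and that, with the option off, all terms have to be found inside one and the same field. The `Note` class and the `matches` helper are hypothetical names for illustration, not CintaNotes internals.

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    text: str
    tags: str


def field_has_all_terms(field: str, terms: list[str]) -> bool:
    """True if every term is a prefix of some word in this one field."""
    words = field.lower().replace("-", " ").split()
    return all(any(w.startswith(t) for w in words) for t in terms)


def matches(note: Note, query: str, across_fields: bool) -> bool:
    terms = query.lower().split()
    fields = [note.title, note.text, note.tags]
    if across_fields:
        # Treat all fields as one virtual field.
        return field_has_all_terms(" ".join(fields), terms)
    # Otherwise every term must be found inside the same field.
    return any(field_has_all_terms(f, terms) for f in fields)


note = Note(title="", text="purple", tags="new-color")
print(matches(note, "purple", across_fields=False))      # True
print(matches(note, "purple new", across_fields=False))  # False
print(matches(note, "purple new", across_fields=True))   # True
```

With the option off, 'purple' is only in the text and 'new' is only in the tag, so no single field contains both terms and the note drops out of the results.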
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm
Contact:

Re: Wrong results when searching in tags

Postby CintaNotes Developer » Fri Jan 20, 2017 6:11 pm

date is right
Alex
Peter7885
Posts: 21
Joined: Sat Jan 16, 2016 4:08 pm
Contact:

Re: Wrong results when searching in tags

Postby Peter7885 » Fri Jan 20, 2017 6:15 pm

Thanks, I didn't know about this option. My mistake.
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm
Contact:

Re: Wrong results when searching in tags

Postby CintaNotes Developer » Sat Jan 21, 2017 11:39 am

No problem! It's not your mistake; we probably should have turned this option on by default.
But performance suffers from it.
Alex
date
Posts: 243
Joined: Sat Aug 01, 2015 5:15 am
Contact:

Re: Wrong results when searching in tags

Postby date » Sat Jan 21, 2017 1:25 pm

The difference in performance is only marginal, if not non-existent. On a notebook with slightly fewer than 10,000 notes, I measured the time it took to display the search results for a two-word search:

search everywhere: on average 190 milliseconds (218, 171, 187, 203, 187)
everywhere, and across field boundaries: on average 180 milliseconds (203, 187, 171, 188, 171, essentially identical)
everywhere, inside words: on average 730 ms
everywhere, inside words and across field boundaries: on average 750 ms (some identical times, so 20 ms difference doesn't mean a thing)
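
For reference, the two sets of individual timings quoted above can be averaged directly. This is only a small arithmetic check on the numbers given in this post; the variable names are made up for the example.

```python
# Timings in milliseconds, copied from the measurements above.
everywhere = [218, 171, 187, 203, 187]
across_fields = [203, 187, 171, 188, 171]

avg_everywhere = sum(everywhere) / len(everywhere)
avg_across = sum(across_fields) / len(across_fields)

print(avg_everywhere)  # 193.2
print(avg_across)      # 184.0

# The ranges of the two series overlap almost completely.
print(min(everywhere), max(everywhere))        # 171 218
print(min(across_fields), max(across_fields))  # 171 203
```

The exact means are 193.2 ms and 184 ms, so the difference of about 9 ms is well inside the run-to-run spread of either series.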
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm
Contact:

Re: Wrong results when searching in tags

Postby CintaNotes Developer » Mon Jan 23, 2017 8:50 am

Great research, thanks date!
There are however quite a few gotchas there: a lot depends on the data you have.
For example, let's say 100+ notes have ~1MB of text in their Text (or Remarks) field.
With 'search across fields', every time you search for ANYTHING you'll be searching in these 100+MB of data.
(Even if what you're searching for is in the Title.)
This will definitely be noticeable. What do you think?
Alex
date
Posts: 243
Joined: Sat Aug 01, 2015 5:15 am
Contact:

Re: Wrong results when searching in tags

Postby date » Mon Jan 23, 2017 10:30 am

'What do you think?' :) Perhaps it would be faster if the search stopped looking in a note once the title matched (or did not search the note text at all where the title has already matched... I don't know how it works), but it doesn't seem to do it that way. I tried, and found no evidence that checking or unchecking 'search across fields' makes a difference in performance. (Of course, that's not considering indexes and those things, but then the time it took would not be consistent... I guess...)

100 notes (the log file actually, truncated to just over 1 MB, copied 100 times). Title is 'log.' Both with and without 'search across fields' it takes slightly less than 600 ms to find the word 'log.' (Which is slow; I guess CN is not meant for that kind of text. ;) )

The same thing, but with the title 'randomword,' which doesn't exist in the note text itself: both checked and unchecked around 600 ms.

Search only in title: between 30 and 40 ms.
Search anywhere for a word that is not there: between 30 and 40 ms. (Is that the indexing thing?)
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm
Contact:

Re: Wrong results when searching in tags

Postby CintaNotes Developer » Tue Jan 24, 2017 9:05 am

The way "search across fields" works is the following: all included fields are temporarily unified into one big text field, then CintaNotes searches in this virtual field using N subqueries (N = number of search terms). Other ways to do it involve separate M*N subqueries, where M = number of fields in the note.

Short-circuiting the computation like you suggest ("accept the note as soon as title matched") is possible with the M*N approach, but not possible the way CN currently works. It's highly probable though that since the Title comes first in the unified field, the rest is not searched, giving in effect the same short-circuiting.

Probably it might be worth it to switch over to the M*N approach. If the tests show that the performance difference is negligible, it might be possible to simply always use the "search across fields" behavior and remove the option altogether. If you need to search in only one field, you can still do it by limiting the search scope.
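
The two strategies can be sketched roughly as follows. This is an illustration only, not CintaNotes code: it uses plain Python substring checks in place of the real subqueries, and the function names and the `scanned` bookkeeping list are made up for the example. Search terms are assumed to contain no line breaks, so a term can never match across the seam between two joined fields.

```python
from typing import List, Optional


def match_unified(fields: List[str], terms: List[str]) -> bool:
    """N checks: glue all fields into one virtual field, test each term once."""
    unified = "\n".join(fields).lower()
    return all(term.lower() in unified for term in terms)


def match_per_field(fields: List[str], terms: List[str],
                    scanned: Optional[List[int]] = None) -> bool:
    """Up to M*N checks: test each term against each field in order."""
    for term in terms:
        for index, field in enumerate(fields):
            if scanned is not None:
                scanned.append(index)  # record which field was looked at
            if term.lower() in field.lower():
                break  # short-circuit: skip the remaining fields for this term
        else:
            return False  # no field contains this term
    return True


title, text, tags = "randomword", "x" * 1_000_000, "new-color"
fields = [title, text, tags]

scanned: List[int] = []
print(match_per_field(fields, ["randomword"], scanned))  # True
print(scanned)                                            # [0]
print(match_unified(fields, ["randomword", "new"]))       # True
print(match_per_field(fields, ["randomword", "new"]))     # True
```

In the per-field version a hit in the title means the large text field is never touched for that term, which is the short-circuiting discussed above; both versions still find terms that live in different fields.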
Alex
date
Posts: 243
Joined: Sat Aug 01, 2015 5:15 am
Contact:

Re: Wrong results when searching in tags

Postby date » Tue Jan 24, 2017 9:53 am

I realize my mistake, there was a flaw in the testing. Just now I tried the same thing with unique titles.

When 'search across fields' is disabled, the search is only faster (indeed much faster) when there are at least two unique words in one field. In any other case, the search speed is the same, or, more commonly, won't find anything.

So the performance difference is there, but the 'practical' performance difference is negligible.

Because
- Peter7885 expected to search anywhere by default, even with more than one word,
- there are not many circumstances where disabling the option is advantageous,
- the advantage of faster search in those circumstances can be emulated by a conscious choice by limiting the search scope
I think those are good reasons to remove the option and make search across fields default.

I think (don't know, just 'think' ;) ) more 'practical' performance enhancements can be gained by looking into possible optimisations in how CN handles (nested) tags, and tag autocompletion. Somehow with many tags (1000+) the notebook becomes noticeably slower.
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm
Contact:

Re: Wrong results when searching in tags

Postby CintaNotes Developer » Wed Jan 25, 2017 6:26 pm

date wrote:I think those are good reasons to remove the option and make search across fields default.

Can't think of any arguments against that. I guess it will be something that will be implemented in the new UI.

date wrote:I think (don't know, just 'think' ;) ) more 'practical' performance enhancements can be gained by looking into possible optimisations in how CN handles (nested) tags, and tag autocompletion. Somehow with many tags (1000+) the notebook becomes noticeably slower.

Can you pinpoint the exact operations that cause most pain?
Alex
date
Posts: 243
Joined: Sat Aug 01, 2015 5:15 am
Contact:

Re: Wrong results when searching in tags

Postby date » Wed Jan 25, 2017 11:29 pm

CintaNotes Developer wrote:Can you pinpoint the exact operations that cause most pain?
The search operation. Too many tags noticeably slow down the search; somehow CN can find titles faster than tags.
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm
Contact:

Re: Wrong results when searching in tags

Postby CintaNotes Developer » Thu Jan 26, 2017 1:51 pm

Do you mean that the search is slow only when tags are part of the search scope?
Alex
date
Posts: 243
Joined: Sat Aug 01, 2015 5:15 am
Contact:

Re: Wrong results when searching in tags

Postby date » Thu Jan 26, 2017 7:15 pm

CintaNotes Developer wrote:Do you mean that the search is slow only when tags are part of the search scope?
Search is (was... I have fewer tags now) slower when searching everywhere if there are too many tags (1500+), and I suspect it is because of tag autocompletion.
Too many tags will always slow down search, but 'search across fields' slows down search only in less common situations.
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm
Contact:

Re: Wrong results when searching in tags

Postby CintaNotes Developer » Sat Jan 28, 2017 5:48 am

I highly doubt that autocompletion is the reason; most probably it's because of searching in tag names.
(I wonder if searching in tag names is needed at all? I mean, collecting partial matches vs. matching full tag name segments).
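
The distinction between the two kinds of tag matching can be sketched like this. It is a hypothetical illustration, not the actual CintaNotes query: it assumes nested tags are written with "/" between segments, and the function names and sample tags are invented for the example.

```python
def partial_match(tag: str, term: str) -> bool:
    """Substring search inside the tag name (collects partial matches)."""
    return term.lower() in tag.lower()


def segment_match(tag: str, term: str, sep: str = "/") -> bool:
    """Match only when the term equals one full segment of the tag name."""
    return term.lower() in tag.lower().split(sep)


tags = ["color/new-color", "colorful", "work/color"]

print([t for t in tags if partial_match(t, "color")])
# ['color/new-color', 'colorful', 'work/color']
print([t for t in tags if segment_match(t, "color")])
# ['color/new-color', 'work/color']
```

Full-segment matching returns fewer hits and could in principle be answered from a lookup table of segments, whereas partial matching has to scan every tag name.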

The other question is that this high number of tags kind of defeats their purpose. But that's another topic ;)
Alex
date
Posts: 243
Joined: Sat Aug 01, 2015 5:15 am
Contact:

Re: Wrong results when searching in tags

Postby date » Sat Jan 28, 2017 10:50 am

CintaNotes Developer wrote:I highly doubt that autocompletion is the reason; most probably it's because of searching in tag names.
(I wonder if searching in tag names is needed at all? I mean, collecting partial matches vs. matching full tag name segments).

OK, but searching in 15 KB of text + IDs shouldn't be that big of a deal? Or does it combine all tags for every note, so there are many duplicates?

CintaNotes Developer wrote:The other question is that this high number of tags kind of defeats their purpose. But that's another topic ;)

Understood, but for now, it works for me to compare and eliminate, though it's indeed not practical for the most common uses. Also, you can make search-as-you-type slower (like waiting 100 ms before searching) or optional to accommodate larger DBs. But you can save those ideas for when you make CintaNotes Enterprise.
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm
Contact:

Re: Wrong results when searching in tags

Postby CintaNotes Developer » Fri Feb 03, 2017 9:07 am

date wrote:OK, but searching in 15 KB of text + IDs shouldn't be that big of a deal? Or does it combine all tags for every note, so there are many duplicates?


Probably some unnecessary inefficiency in the construction of the main SQL query. Will need to analyze it to see where exactly the slowdown is coming from. Are you 100% sure that the number of tags causes slowdown, and not, say, number or size of notes?
Have you tried making a copy of the DB but without tags?

date wrote:Understood, but for now, it works for me to compare and eliminate, though it's indeed not practical for the most common uses. Also, you can make search-as-you-type slower (like waiting 100 ms before searching) or optional to accommodate larger DBs. But you can save those ideas for when you make CintaNotes Enterprise.

Ideally it should search in real-time. But probably not achievable in the case of unlimited data size, so a little typing buffer is a good idea.
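
A typing buffer of this kind could look roughly like the sketch below. It is only an illustration of the idea, not how CintaNotes implements or plans to implement it: the `TypingBuffer` class is hypothetical, and the 100 ms default is taken from the suggestion quoted above. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from typing import Optional


class TypingBuffer:
    """Holds back the search until typing has paused for delay_ms."""

    def __init__(self, delay_ms: int = 100) -> None:
        self.delay_ms = delay_ms
        self._pending: Optional[str] = None
        self._last_key_ms = 0

    def key_pressed(self, text: str, now_ms: int) -> None:
        # Every keystroke replaces the pending query and restarts the wait.
        self._pending = text
        self._last_key_ms = now_ms

    def poll(self, now_ms: int) -> Optional[str]:
        # Return the query once the quiet period has elapsed, else None.
        if self._pending is not None and now_ms - self._last_key_ms >= self.delay_ms:
            query, self._pending = self._pending, None
            return query
        return None


buf = TypingBuffer(delay_ms=100)
buf.key_pressed("p", now_ms=0)
buf.key_pressed("pu", now_ms=40)
print(buf.poll(now_ms=100))  # None
print(buf.poll(now_ms=140))  # pu
print(buf.poll(now_ms=200))  # None
```

Only one search runs for the two keystrokes, at the cost of results appearing 100 ms after the last key.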
Alex
date
Posts: 243
Joined: Sat Aug 01, 2015 5:15 am
Contact:

Re: Wrong results when searching in tags

Postby date » Fri Feb 03, 2017 6:36 pm

CintaNotes Developer wrote:Will need to analyze it to see where exactly the slowdown is coming from. Are you 100% sure that the number of tags causes slowdown, and not, say, number or size of notes? Have you tried making a copy of the DB but without tags?

Here are two db's that are exactly the same, except one has around 2000 tags: https://yadi.sk/d/DXpp3z9v3Ci2gu
The difference in search speed is very noticeable, even with the naked eye.
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm
Contact:

Re: Wrong results when searching in tags

Postby CintaNotes Developer » Mon Feb 06, 2017 7:42 am

Thanks! Yes, I can notice the difference.
However, it's definitely not because of tag autosuggest.
It seems that the mere presence of tags in the notebook causes the slowdown.
For example, when you select the "Text and title" search scope, search speed doesn't really increase much, compared to the notebook without tags.
I'll dig into this a bit later.
Alex
