
Incorrect character count reporting

Posted: Tue Aug 20, 2013 9:03 pm
by Mark S.
The properties dialog appears to report approximately twice the number of characters known to be in the actual note. In the case of a single-character note, the dialog rounds up to the nearest 1k!

Mark S. (not the other one)
v2.2

Re: Incorrect character count reporting

Posted: Tue Aug 20, 2013 9:08 pm
by Thomas Lohrum
Mark S. wrote: The properties dialog appears to report approximately twice the number of characters known to be in the actual note. In the case of a single-character note, the dialog rounds up to the nearest 1k!

Mark S. (not the other one)
v2.2

Hi Mark,

This is probably because the text is stored as Unicode data, which can take two bytes per character.

Thomas

Re: Incorrect character count reporting

Posted: Wed Aug 21, 2013 6:23 am
by CintaNotes Developer
Hi Mark,

Yes, Thomas is right: CintaNotes doesn't report note size in characters; it reports it in KBytes (the units are stated explicitly after the number).
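
To illustrate why the number looks like "twice the character count", here is a minimal Python sketch. It assumes the note text is stored as UTF-16 (two bytes for most characters) and that the size is rounded up to whole kilobytes. Both are assumptions made for illustration, not confirmed details of how CintaNotes stores notes, and the function names are hypothetical.

```python
import math

def utf16_size_bytes(text: str) -> int:
    # Assumption: text is stored as UTF-16 without a byte-order mark.
    return len(text.encode("utf-16-le"))

def reported_size_kb(text: str) -> int:
    # Assumption: the size is rounded up to the next whole kilobyte (1024 bytes).
    return math.ceil(utf16_size_bytes(text) / 1024)

note = "a"
# Character count, storage bytes, and the size in KBytes after rounding up.
print(len(note), utf16_size_bytes(note), reported_size_kb(note))
```

Under these assumptions a single-character note occupies 2 bytes and is shown as 1 KByte, while a 1000-character plain ASCII note occupies 2000 bytes and is shown as 2 KBytes.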

Re: Incorrect character count reporting

Posted: Wed Aug 21, 2013 4:19 pm
by Mark S.
I guess I jumped to the conclusion that the number displayed would convey something useful. I doubt the number of storage bytes is of interest to typical users.

Anyone who would like to write a report or story in CN would be interested in (1) the number of words and (2) the number of characters. Must be on the map ... let's see ... wow, hard to believe -- only 6 votes. Ok, added my vote. It's one of those things you're not interested in until you ... you really are.

Thanks!
Mark

Re: Incorrect character count reporting

Posted: Wed Aug 28, 2013 6:50 am
by CintaNotes Developer
Mark,
thanks for your opinion. Indeed, the number of characters and words seems to be more useful information than the raw number of bytes.
Implementing a character count is straightforward and will be done in the next version.
Word count, however, is highly language-dependent, and I wonder if it can be done properly for East Asian languages (which put words together with no spaces) without becoming dependent on a huge library like ICU (19MB!).
However, a naïve implementation (splitting the text into words by spaces and punctuation) could be added as well.
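
As a rough illustration of what such a naïve implementation could look like, here is a short Python sketch. It is only an assumption about the approach, not actual CintaNotes code, and `naive_word_count` is a hypothetical name. It treats every run of letters, digits, or underscores as one word, so spaces and punctuation act as separators.

```python
import re

def naive_word_count(text: str) -> int:
    # \w+ matches runs of letters, digits, and underscores (Unicode-aware for str);
    # anything else, such as spaces and punctuation, separates words.
    return len(re.findall(r"\w+", text))

print(naive_word_count("Hello, world! This is a note."))
print(naive_word_count("我喜欢笔记"))
```

The first call counts 6 words. The second shows the limitation for text written without spaces: the whole run of characters is counted as a single word, and an apostrophe splits a word like "don't" in two.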

Re: Incorrect character count reporting

Posted: Wed Aug 28, 2013 11:18 am
by Thomas Lohrum
Mark S. wrote:I doubt the number of storage bytes is of interest to typical users.

To me it is of interest to know the size the note takes in the database.

Thomas

Re: Incorrect character count reporting

Posted: Wed Aug 28, 2013 5:05 pm
by Mark S.
CintaNotes Developer wrote:Mark,
thanks for your opinion. Indeed, the number of characters and words seems to be more useful information than the raw number of bytes.
Implementing a character count is straightforward and will be done in the next version.
Word count, however, is highly language-dependent, and I wonder if it can be done properly for East Asian languages (which put words together with no spaces) without becoming dependent on a huge library like ICU (19MB!).
However, a naïve implementation (splitting the text into words by spaces and punctuation) could be added as well.

Wow -- run-on text! Ancient western languages used to lack the space as well. I had to translate something like that in a language finals test. I thought it was a bit unfair of the prof, since we hadn't had to deal with any text like that for the entire semester!

It seems to me that a word count feature for the East Asian languages would be in a separate module, and the user would decide if it's worth installing. I'm also wondering if there isn't a standard library at the operating system level for these languages. It seems like a problem that must come up all the time.

"Naive" works for me -- I was looking through some old books, and found "naive" word counting routines in C going back to the early 90s.

Thanks!
Mark

Re: Incorrect character count reporting

Posted: Wed Aug 28, 2013 5:07 pm
by Mark S.
Thomas Lohrum wrote:
Mark S. wrote:I doubt the number of storage bytes is of interest to typical users.

To me it is of interest to know the size the note takes in the database.

Thomas

Hi Thomas,

I'm not sure you qualify as a typical user :)

What use do you make of this? Are you storing massive amounts of info in a single note?

Thanks --
Mark

Re: Incorrect character count reporting

Posted: Wed Aug 28, 2013 6:25 pm
by Thomas Lohrum
Mark S. wrote:What use do you make of this? Are you storing massive amounts of info in a single note?

Hi Mark S.,

I like to know the actual size a note takes in the database, out of curiosity and to control my data. Size completes the metadata, just as the creation and modification dates do. It can also help to decide which old notes to delete.

Thomas