Incorrect character count reporting

Mark S.
Posts: 81
Joined: Thu Aug 09, 2012 3:39 pm

Postby Mark S. » Tue Aug 20, 2013 9:03 pm

The properties dialog appears to report approximately twice the number of characters known to be in the actual note. In the case of a single-character note, the dialog rounds UP to the nearest 1k!

Mark S. (not the other one)
v2.2
Thomas Lohrum
Posts: 1324
Joined: Tue Mar 08, 2011 11:15 am

Re: Incorrect character count reporting

Postby Thomas Lohrum » Tue Aug 20, 2013 9:08 pm

Mark S. wrote:The properties dialog appears to report approximately twice the number of characters known to be in the actual note. In the case of a single-character note, the dialog rounds UP to the nearest 1k!

Mark S. (not the other one)
v2.2

Hi Mark,

This is probably due to the text being stored as Unicode data, which can take two bytes per character.

Thomas
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm

Re: Incorrect character count reporting

Postby CintaNotes Developer » Wed Aug 21, 2013 6:23 am

Hi Mark,

Yes, Thomas is right: CintaNotes doesn't report note size in characters; it reports it in KBytes (the units are stated explicitly after the number).
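
To make the two replies above concrete, here is a minimal Python sketch of how a size figure like this could come about. It assumes the text is stored as UTF-16 and that the size is rounded up to whole kilobytes; neither detail is confirmed for CintaNotes in this thread, and `stored_size` is a made-up helper name.

```python
import math

def stored_size(text):
    """Return (characters, bytes as UTF-16, size in KB rounded up).

    Assumption: UTF-16 storage and round-up-to-KB display, for illustration only.
    """
    chars = len(text)                      # number of Unicode code points
    raw = len(text.encode("utf-16-le"))    # 2 bytes per basic character, no BOM
    kb = math.ceil(raw / 1024)             # any non-empty note shows at least 1 KB
    return chars, raw, kb

print(stored_size("a"))         # (1, 2, 1)
print(stored_size("x" * 1000))  # (1000, 2000, 2)
```

Under these assumptions a 1000-character note occupies about 2000 bytes, which matches the "approximately twice" observation, and a single-character note still displays as 1 KB.
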
Alex
Mark S.
Posts: 81
Joined: Thu Aug 09, 2012 3:39 pm

Re: Incorrect character count reporting

Postby Mark S. » Wed Aug 21, 2013 4:19 pm

I guess I jumped to the conclusion that the number displayed would convey something useful. I doubt the number of storage bytes is of interest to typical users.

Anyone who would like to write a report or story in CN would be interested in (1) the number of words and (2) the number of characters. Must be on the map ... let's see ... wow, hard to believe -- only 6 votes. OK, added my vote. It's one of those things you're not interested in until you ... you really are.

Thanks!
Mark
CintaNotes Developer
Site Admin
Posts: 5001
Joined: Fri Dec 12, 2008 4:45 pm

Re: Incorrect character count reporting

Postby CintaNotes Developer » Wed Aug 28, 2013 6:50 am

Mark,
thanks for your opinion. Indeed, the number of characters and words seems to be more useful information than the raw number of bytes.
Implementing a character count is straightforward and will be done in the next version.
Word count, however, is highly language-dependent, and I wonder if it can be done properly for East Asian languages (which put words together with no spaces) without becoming dependent on a huge library like ICU (19MB!).
However, a naïve implementation (splitting the text into words by spaces and punctuation) could be added as well.
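
For illustration, the naïve approach described above could look roughly like the following Python sketch. The function names and the regular expression are assumptions made for this example, not anything taken from CintaNotes itself.

```python
import re

# Hypothetical pattern: a run of word characters, optionally joined by
# apostrophes so that "It's" counts as one word. Spaces and punctuation
# act as separators.
WORD_RE = re.compile(r"\w+(?:['’]\w+)*")

def count_characters(text):
    """Count Unicode code points in the text."""
    return len(text)

def count_words(text):
    """Naive word count: split on whitespace and punctuation."""
    return len(WORD_RE.findall(text))

print(count_words("Hello, world! It's fine."))  # 4
print(count_words("今天天气很好"))                 # 1
print(count_characters("今天天气很好"))            # 6
```

The second call shows the limitation mentioned above: a Chinese sentence written without spaces is counted as a single word, so a proper count for such languages would need real word segmentation.
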
Alex
Thomas Lohrum
Posts: 1324
Joined: Tue Mar 08, 2011 11:15 am

Re: Incorrect character count reporting

Postby Thomas Lohrum » Wed Aug 28, 2013 11:18 am

Mark S. wrote:I doubt the number of storage bytes is of interest to typical users.

To me it is of interest to know the size the note takes up in the database.

Thomas
Mark S.
Posts: 81
Joined: Thu Aug 09, 2012 3:39 pm

Re: Incorrect character count reporting

Postby Mark S. » Wed Aug 28, 2013 5:05 pm

CintaNotes Developer wrote:Mark,
thanks for your opinion. Indeed, the number of characters and words seems to be more useful information than the raw number of bytes.
Implementing a character count is straightforward and will be done in the next version.
Word count, however, is highly language-dependent, and I wonder if it can be done properly for East Asian languages (which put words together with no spaces) without becoming dependent on a huge library like ICU (19MB!).
However, a naïve implementation (splitting the text into words by spaces and punctuation) could be added as well.

Wow -- run-on text! Ancient western languages used to lack the space as well. I had to translate something like that in a language finals test. I thought it was a bit unfair of the prof, since we hadn't had to deal with any text like that for the entire semester!

It seems to me that a word count feature for the East Asian languages could live in a separate module, and the user would decide if it's worth installing. I'm also wondering if there isn't a standard library at the operating system level for these languages. It seems like a problem that must come up all the time.

"Naive" works for me -- I was looking through some old books, and found "naive" word counting routines in C going back to the early 90s.

Thanks!
Mark
Mark S.
Posts: 81
Joined: Thu Aug 09, 2012 3:39 pm

Re: Incorrect character count reporting

Postby Mark S. » Wed Aug 28, 2013 5:07 pm

Thomas Lohrum wrote:
Mark S. wrote:I doubt the number of storage bytes is of interest to typical users.

To me it is of interest to know the size the note takes up in the database.

Thomas

Hi Thomas,

I'm not sure you qualify as a typical user :)

What use do you make of this? Are you storing massive amounts of info in a single note?

Thanks --
Mark
Thomas Lohrum
Posts: 1324
Joined: Tue Mar 08, 2011 11:15 am

Re: Incorrect character count reporting

Postby Thomas Lohrum » Wed Aug 28, 2013 6:25 pm

Mark S. wrote:What use do you make of this? Are you storing massive amounts of info in a single note?

Hi Mark S.,

I like to know the actual size a note takes in the database, out of curiosity and to keep control of my data. Size completes the metadata, just as the creation and modification dates do. It can also help to decide which old notes to delete.

Thomas
