CintaNotes Developer wrote: Seems like you selected the whole page and that's why there's so much weird formatting - all those tables etc.
I selected exactly the same portion of the page as usbpoweredfridge; the difference is caused by how differently the browsers send the selection to the clipboard.
It might be better to remove all line breaks from the clipboard, so the contents of the clipboard are the same regardless of the browser, and work from there.
That's what happens now. An open-source HTML parser is used, so CN operates upon a DOM, not HTML source. However, it still needs to determine the logical places to put line breaks. This is where the current algorithm might fall short.
Ok, but the notes look different depending on the browser used, so the cause of the difference is located before the parser is used.
I already posted an example HTML, but here is an even simpler one. This is the page source:
Code:
<h1>Example</h1>
<ul>
<li>One</li>
<li>Two</li>
<li>Three</li>
</ul>
The list is not the issue here; the main point is that the HTML is formatted to be easier to read. After the first line there are two newlines, after <ul> there is a newline, and in front of each <li> there are either tabs or spaces. This is very common and makes the page source more readable, but it serves no other purpose.
Now, if I open this HTML with Internet Explorer (6 or 8) and examine the clipboard, this is what sits between <!--StartFragment--> and <!--EndFragment-->:
Code:
<H1>Example</H1>
<UL>
<LI>One
<LI>Two
<LI>Three </LI></UL>
Notice that this spans several lines: there is a newline after </H1> and after every other line as well. Except for the last item, IE puts a space where </li> would be. The note clipped into CN contains two empty bullets, 'One', an empty bullet, and then 'Two' and 'Three' (somehow no empty bullet there).
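For anyone who wants to reproduce this, the part between the fragment markers can be cut out of the raw clipboard text with a few lines of Python. This is only a minimal sketch: the function name `extract_fragment` is made up for this example, and it assumes the clipboard text has already been read into a string by some other means.

```python
START_MARKER = "<!--StartFragment-->"
END_MARKER = "<!--EndFragment-->"


def extract_fragment(clipboard_html: str) -> str:
    """Return the text between the fragment markers.

    Hypothetical helper for illustration only. If either marker is
    missing, the input is returned unchanged.
    """
    start = clipboard_html.find(START_MARKER)
    end = clipboard_html.rfind(END_MARKER)
    if start == -1 or end == -1 or end < start:
        return clipboard_html
    return clipboard_html[start + len(START_MARKER):end]


if __name__ == "__main__":
    sample = (
        "<html><body><!--StartFragment-->"
        "<H1>Example</H1>"
        "<!--EndFragment--></body></html>"
    )
    print(extract_fragment(sample))
```

Comparing the extracted fragments from each browser side by side makes the whitespace differences easy to spot.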
The clipboard with Firefox:
Code:
<h1>Example</h1>
<ul>
<li>One</li>
<li>Two</li>
<li>Three</li>
</ul>
It is exactly the same as the page source, including all the newlines, spaces and tabs. The note clipped from Firefox is the worst-looking one.
The clipboard with Vivaldi (probably the same as Chrome, the new Opera, etc.):
Code:
<h1 style="color: rgb(0, 0, 0); font-family: "Times New Roman"; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px;">Example</h1><ul style="color: rgb(0, 0, 0); font-family: "Times New Roman"; font-size: medium; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><li>One</li><li>Two</li><li>Three</li></ul>
Notice everything is on one line. Most notes clipped from there look nearly perfect (the simple example is perfect), but judging from the announcement clipping example above, only two things go wrong:
1) a few spaces are missing: the run-together texts are in different tags, but there is no space inside the tags (where the contents are). In the announcement example above it was probably the <dd>'s, but on other sites this also happens with table tags; and
2) at the end of the bulleted list there is always a <br> here, though that might be an exception limited to a few sites.
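To show what I mean by the missing spaces, here is a minimal Python sketch of text extraction that inserts a separator at tag boundaries. It is not CN's algorithm (CN uses its own parser, which I have not seen); the names `TextExtractor`, `html_to_text`, `BLOCK_TAGS` and `CELL_TAGS`, and the choice of which tags count as block-level, are assumptions for this example only.

```python
import re
from html.parser import HTMLParser

# Assumed tag sets for this sketch: a block tag starts a new line,
# a table cell is separated from its neighbour by a space.
BLOCK_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "div", "ul", "ol",
              "li", "dl", "dt", "dd", "table", "tr", "br"}
CELL_TAGS = {"td", "th"}


class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.parts = []

    def _boundary(self, tag):
        # HTMLParser passes tag names in lower case.
        if tag in BLOCK_TAGS:
            self.parts.append("\n")
        elif tag in CELL_TAGS:
            self.parts.append(" ")

    def handle_starttag(self, tag, attrs):
        self._boundary(tag)

    def handle_endtag(self, tag):
        self._boundary(tag)

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    text = re.sub(r"[ \t]+", " ", text)        # collapse runs of spaces/tabs
    text = re.sub(r" ?\n[ \n]*", "\n", text)   # collapse blank lines
    return text.strip()


if __name__ == "__main__":
    print(html_to_text("<dl><dt>Name</dt><dd>Value</dd></dl>"))
    print(html_to_text("<table><tr><td>A</td><td>B</td></tr></table>"))
```

In this sketch blank lines are collapsed as well, so a trailing <br> after a list does not leave an empty line. Whether that is desirable for real notes is a design choice.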
There are also a few WebKit browsers; their clipboard:
Code:
<span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: 'Times New Roman'; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; "><h1>Example</h1><ul><li>One</li><li>Two</li><li>Three</li></ul></span>
Also on one line, and the clipped note is perfect. So my finding is:
--> if the HTML is one line on the clipboard, the clipped note turns out nearly perfect.
If the HTML is spread out over multiple lines on the clipboard, perhaps with tabs and spaces as well, as Internet Explorer and Firefox do, the note looks considerably worse.
So, perhaps, change the clipboard before sending it to the HTML parser you are using.
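As a rough illustration of that suggestion, here is a minimal Python sketch of such a pre-processing step. It is not how CN works internally; the function name `normalize_clipboard_html` and the list of tags treated as block-level are assumptions made for this example. It collapses every whitespace run to one space and then drops the whitespace that touches a block-level tag, so spaces between inline tags survive.

```python
import re

# Assumed list of block-level tags for this sketch; extend as needed.
_BLOCK = r"h[1-6]|ul|ol|li|p|div|table|tr|td|th|dl|dt|dd|br"
_BLOCK_TAG_RE = re.compile(r"\s*(</?(?:%s)\b[^>]*>)\s*" % _BLOCK, re.IGNORECASE)


def normalize_clipboard_html(html: str) -> str:
    """Put clipboard HTML on one line before it is handed to a parser.

    Caveat: this also collapses whitespace inside <pre> blocks, which
    a real implementation would have to skip.
    """
    # Newlines, tabs and repeated spaces all become a single space.
    one_line = re.sub(r"\s+", " ", html)
    # Remove whitespace directly before or after block-level tags.
    return _BLOCK_TAG_RE.sub(r"\1", one_line).strip()


if __name__ == "__main__":
    firefox = ("<h1>Example</h1>\n\n<ul>\n\t<li>One</li>\n"
               "\t<li>Two</li>\n\t<li>Three</li>\n</ul>")
    print(normalize_clipboard_html(firefox))
```

The same function also tidies the Internet Explorer variant, although the unclosed <LI> tags are left for the parser to deal with.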
If the example was like this:
Code:
<h1>Example</h1><ul><li>One</li><li>Two</li><li>Three</li></ul>
on just one line, it would clip fine from Firefox. Internet Explorer (unless this has changed in newer versions) puts newlines on the clipboard even if they weren't in the source, so the clipboard would need some editing anyway.
In the first example I posted somewhere above, I also demonstrated a problem with the tag order, like <tag1><tag2></tag1></tag2>. Perhaps this only happens with links, but it is a minor issue.
Again, it is nearly perfect, thank you for adding this to CN.