Here I present to you a new command-line tool called "xml2txtdir".
Usage is very simple. Say you have an XML file with notes "D:\myexport.xml" (produced by CintaNotes XML export), and want to turn it into a bunch of txt files in the "D:\txt" directory.
You go to the folder where this tool is located and issue the following command:
xml2txtdir D:\myexport.xml D:\txt
By default the TXT files will have UTF-8 encoding, but you can specify a different encoding with the
"-e" parameter, as follows:
xml2txtdir D:\myexport.xml D:\txt -e utf-16
You can also use "ascii", "windows-1251" etc. encodings, but any non-representable characters will be turned into "?", so I recommend using either utf-8 or utf-16.
Known limitations: text formatting tags like <b> and <i> are not removed (also sometimes it can be of advantage).
And for completeness/tweaking, the source code of the Python script:
Code: Select all
import argparse as ap
import os, os.path
import xml.dom.minidom as xml
import time
VERSION = "1.0"
GREETING = "CintaNotes TXT folder exporter V%s.\n" % VERSION
def main():
print(GREETING)
argsParser = createArgsParser()
args = argsParser.parse_args()
print('Processing..')
count = xmlToTxtFiles(args.inputXML, args.outputFolder, args.encoding)
print('\n-> Written %d file(s).' % count)
def createArgsParser():
parser = ap.ArgumentParser(description = "Converts CintaNotes XML file into a set of TXT files, one TXT for each note.")
parser.add_argument("inputXML", help = 'Source XML file', type = str)
parser.add_argument("outputFolder", help = 'Folder to write TXT files to', type = str)
parser.add_argument("-e", "--encoding", dest="encoding",
help = 'Encoding of TXT files: utf-8 (default) or utf-16', type = str, default = 'utf-8')
return parser
def xmlToTxtFiles(inputXML, outputFolder, encoding):
doc = xml.parse(inputXML)
notes = doc.getElementsByTagName('note')
count = 0
for note in notes:
xmlToTxtFile(note, outputFolder, encoding)
count += 1
return count
def xmlToTxtFile(note, outputFolder, encoding):
filename = genFileName(note)
contents = genFileContents(note)
with open(os.path.join(outputFolder, filename), "wb") as f:
f.write(contents.encode(encoding, errors="replace"))
def genFileContents(note):
title = note.attributes['title'].value
text = note.firstChild.data if note.firstChild else ''
return title + '\n\n' + text
def genFileName(note):
created = note.attributes['created'].value
tags = makeValidFilePath(note.attributes['tags'].value)
title = makeValidFilePath(note.attributes['title'].value[:50])
return '%s [%s] %s.txt' % (created, tags, title)
def makeValidFilePath(s):
s = s.replace('/', '\u2044')
return ''.join(x for x in s if x.isalnum() or x in ' -.{}#@$%^&!_()[]\u2044')
if __name__ == '__main__': main()