The problem I need to solve is this: I have a lot of books that I have typeset in FrameMaker 5.5.6, PageMaker 7, or Word format. I need to convert them to e-books in mobi, epub, and similar formats. I will be using calibre.exe for the conversion. The e-book source is always html. I want to use proper html/css coding so that the e-publications have a uniform look & feel.
All of the word processing apps have an “export to html” function. Trouble is, the output html is all upper case, e.g., <P ALIGN="center">. I like to code web pages so that they comply with xhtml 1.0 strict, which means no upper case tags or attributes, among other things. (No ALIGN attribute, for instance!)
I have in mind a little app that would rest on the desktop, that I could drag and drop a file onto. Something like the good old DOS2UNIX.exe that converts line endings from CRLF to LF. No GUI, in other words.
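As far as I know, Windows hands a file dropped onto an executable or batch file to it as a command-line argument, so the no-GUI wrapper could be very small. Here is a minimal sketch of that idea, in Python purely for illustration (whether dropping onto a .py file works depends on how Python is installed). The names convert_file and placeholder_transform are made up for this example, and the transform is only a stand-in for the real conversion.

```python
import shutil
import sys
from pathlib import Path


def convert_file(path: Path, transform) -> None:
    """Back up `path` as <name>.bak, then rewrite it with transform(bytes) -> bytes."""
    backup = path.with_suffix(".bak")      # test-case.html -> test-case.bak
    shutil.copyfile(path, backup)          # overwrites any earlier backup
    data = path.read_bytes()               # bytes, so line endings are left untouched
    path.write_bytes(transform(data))


def placeholder_transform(data: bytes) -> bytes:
    # Hypothetical stand-in: the real tag-lowercasing logic would go here.
    return data


if __name__ == "__main__":
    # Each dropped file is expected to arrive as one command-line argument.
    for name in sys.argv[1:]:
        convert_file(Path(name), placeholder_transform)
```

Working on bytes rather than text is a deliberate choice here: it avoids guessing the file's encoding and leaves CRLF or LF line endings exactly as they were.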
I have started working on a little autohotkey app, but I'm not so skilled. (If someone would just help me with the code, I would be very grateful!) The code I have so far is copied below, but I consider it to be pseudocode rather than actual code.
BTW, this is exactly the type of app that is best written in assembler, because it's so simple. Why didn't anybody think of it before? --Google searches don't even come close.
Logic, in brief:
Set a couple of boolean variables to false, and a pointer to 0
Scan through the file byte by byte looking for "<" ** NOTE
If found, go into conversion mode, lowercase everything
if converting, and next character is " quote mark, temporarily stop converting
on second " quote mark, resume converting
until ">" is found
until EOF
** Since the source file is output from a word processor, all instances of “<” etc in the text are automatically converted to their html codes (&lt; etc), so any “<” encountered will have to be the start of an html tag.
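The logic above can be sketched as a small two-flag state machine. This is only an illustration in Python, not a finished tool: the function name lowercase_tags is made up, and skipping anything that starts with "<!" (comments, DOCTYPE) is an extra assumption on top of the steps listed above.

```python
def lowercase_tags(html: str) -> str:
    """Lowercase tag and attribute names; quoted values and body text are left alone."""
    out = []
    in_markup = False   # True between "<" and the matching ">"
    in_quote = False    # True between a pair of double quotes inside a tag
    for i, ch in enumerate(html):
        if in_markup:
            if in_quote:
                if ch == '"':
                    in_quote = False      # second quote mark: resume converting
            elif ch == '"':
                in_quote = True           # first quote mark: stop converting
            elif ch == ">":
                in_markup = False         # end of the tag
            else:
                ch = ch.lower()
        elif ch == "<" and html[i + 1:i + 2] != "!":
            in_markup = True              # start of a tag; "<!" is assumed not to be one
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(lowercase_tags('<P ALIGN="Center">Hello <B>World</B></P>'))
    # prints: <p align="Center">Hello <b>World</b></p>
```

Note that this only handles double-quoted attribute values. Unquoted values such as ALIGN=CENTER get lowercased along with the attribute name, and single-quoted values are not treated as quoted at all.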
------------------------------------ code follows
; #MaxMem defaults to 64 MB per variable, so the input file must be smaller than that
FileCopy, test-case.html, test-case.bak, 1 ; overwrite any existing bak file
FileRead, BigStr, test-case.html ; test-case.html is my test file
if ErrorLevel
{
    MsgBox, Could not read test-case.html
    ExitApp
}
InMarkup := false ; true between "<" and ">"
QuoStr := false ; true inside a "quoted" attribute value
Out := "" ; converted text is built up here
Loop, Parse, BigStr ; no delimiter given, so each pass gets one character
{
    TChar := A_LoopField ; temp char to hold BigStr character
    if (InMarkup)
    {
        if (QuoStr)
        {
            if (TChar = """") ; closing quote: resume converting
                QuoStr := false
        }
        else if (TChar = """") ; opening quote: stop converting
            QuoStr := true
        else if (TChar = ">") ; end of tag
            InMarkup := false
        else
            StringLower, TChar, TChar
    }
    else if (TChar = "<" && SubStr(BigStr, A_Index + 1, 1) != "!") ; skip <!-- and <!DOCTYPE
        InMarkup := true
    Out .= TChar
}
FileDelete, test-case.html
FileAppend, %Out%, *test-case.html ; leading * = binary mode, line endings written as-is
BigStr := "" ; Free the memory.
Out := ""
------------------------------------ code ends