I have a stack of several hundred HTML files. A root HTML file links to several sub-files, which in turn link to further sub-files, and so on through several levels.
There are also a lot of graphics files which are called up by the HTML files.
I'd like to convert this to a single HTML file with the content of all of the individual HTML files in sequence in the correct order complete with graphics and formatting.
Unfortunately the files have names rather than chapter numbers, and the folder structure on the disc doesn't necessarily match the structure required for the final composite file, so I can't just join them, as they'd end up out of sequence. The software needs to follow the hyperlinks in each HTML file to work out what to add where.
Each HTML file is only linked once.
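To make the "follow the hyperlinks" idea concrete, here is a rough Python sketch of the ordering step as I imagine it. It assumes the right sequence is a depth-first walk (a page, then everything under its first link, then its second link, and so on) and that links are ordinary `<a href="...">` tags. The names `LinkCollector`, `local_links` and `reading_order` are made up for illustration.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def local_links(page):
    """Return the existing local .htm/.html files that `page` links to, in order."""
    parser = LinkCollector()
    parser.feed(page.read_text(encoding="utf-8", errors="replace"))
    found = []
    for href in parser.hrefs:
        parsed = urlparse(href)
        if parsed.scheme or parsed.netloc or not parsed.path:
            continue  # external site, mailto:, or a same-page anchor
        target = (page.parent / unquote(parsed.path)).resolve()
        if target.suffix.lower() in (".htm", ".html") and target.is_file():
            found.append(target)
    return found


def reading_order(root_page):
    """Depth-first, pre-order walk starting at the root page."""
    order, seen = [], set()
    stack = [Path(root_page).resolve()]
    while stack:
        page = stack.pop()
        if page in seen:
            continue  # ignore "back to contents" style links
        seen.add(page)
        order.append(page)
        # Push children reversed so the first link is visited first.
        stack.extend(reversed(local_links(page)))
    return order
```

Calling `reading_order("index.html")` would give the list of files in the order they should be joined. The `seen` set is only there so that links back to a parent page don't cause a loop.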
The files are a mixture of .htm and .html extensions.
The software would need to restrict itself to files in the root folder and its sub-folders, so that it doesn't see a link to an external site and try to download the internet.
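As a rough illustration of that restriction, a check along these lines (Python, standard library only) would reject anything that is not a file under the root folder. The function name `resolve_local` and its parameters are hypothetical, and it assumes links inside the files are relative paths.

```python
from pathlib import Path
from urllib.parse import unquote, urlparse


def resolve_local(href, current_page, root_dir):
    """Return the file an href points to, or None if it is external
    or lies outside root_dir."""
    parsed = urlparse(href)
    if parsed.scheme or parsed.netloc or not parsed.path:
        return None  # http://..., mailto:, //host/..., or a bare #anchor
    target = (Path(current_page).parent / unquote(parsed.path)).resolve()
    try:
        # relative_to raises ValueError when target is not under root_dir
        target.relative_to(Path(root_dir).resolve())
    except ValueError:
        return None
    return target
```

For example, `resolve_local("../../outside.html", page, root)` returns `None` when the path climbs above the root folder, so the link would simply be skipped.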
It would need to strip out the headers and footers from the individual HTML files (except the first and last) so that the resulting composite file displays correctly.
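By "headers and footers" I mean roughly everything outside the `<body>` element. A minimal sketch of the joining step under that assumption might look like the following; the names `body_content` and `join_pages` are made up, and a regular expression is only a crude stand-in for a real HTML parser. It does not touch image paths, which would probably need rewriting for files taken from sub-folders.

```python
import re

# Matches the body element and captures what is inside it.
BODY_RE = re.compile(r"<body\b[^>]*>(.*?)</body\s*>", re.IGNORECASE | re.DOTALL)


def body_content(html):
    """Return what is inside <body>...</body>, or the whole text if no body tag is found."""
    match = BODY_RE.search(html)
    return match.group(1) if match else html


def join_pages(pages):
    """Join HTML strings (already in reading order) into one document.

    Keeps everything up to and including <body> from the first page and
    everything from </body> onwards from the last page.
    """
    first = BODY_RE.search(pages[0])
    last = BODY_RE.search(pages[-1])
    opening = pages[0][:first.start(1)] if first else "<html><body>"
    closing = pages[-1][last.end(1):] if last else "</body></html>"
    return opening + "\n".join(body_content(p) for p in pages) + closing
```

The first page supplies the `<head>` (title, stylesheet links and so on), so any formatting defined only in the later pages' heads would be lost with this approach.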
So basically I'm after an HTML joining program with some way of parsing the files to work out the correct sequence automatically.
Do any of you know of any software that I can point at the root HTML file and then let it add each of the sub-files in the correct order and generate a single output file?
Alternatively, is there a programmer here who would like to have a go at creating such software?