Hello,
In case this helps: I found a way to do this with a freeware tool, dnGrep:
http://code.google.com/p/dngrep/
I select the folder where the .htm files are located. Then I click the icon on the right and type "*.htm".
Here are the regexes:
1) Cut keyword2 and everything after it:
with Regex + Multiline + Dot as newline checked
replace
keyword2.*
with
nothing (leave the replace field empty)
+ hit Search, then hit Replace
2) Cut everything up to and including keyword1:
with Regex + Multiline + Dot as newline checked
replace
.*?keyword1
with
nothing (leave the replace field empty)
+ hit Search, then hit Replace
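
If you would rather script this than click through the GUI, here is a rough Python sketch of the same two replacements. It is not part of dnGrep: the function names trim_between and process_folder, the UTF-8 encoding, and the count=1 on the second substitution (so only the text before the first keyword1 is cut) are my own assumptions. Try it on a copy of the folder first, since it overwrites the files.

```python
import re
from pathlib import Path


def trim_between(text, start="keyword1", end="keyword2"):
    """Keep only the text between the first `start` and the first `end`."""
    # Step 1: drop the end keyword and everything after it.
    # re.DOTALL plays the role of "Dot as newline": "." also matches "\n".
    text = re.sub(re.escape(end) + r".*", "", text, flags=re.DOTALL)
    # Step 2: drop everything up to and including the first start keyword.
    # The lazy ".*?" stops at the first occurrence; count=1 (an assumption)
    # keeps later occurrences of the keyword untouched.
    text = re.sub(r".*?" + re.escape(start), "", text, count=1, flags=re.DOTALL)
    return text


def process_folder(folder, pattern="*.htm", encoding="utf-8"):
    """Apply trim_between to every matching file in `folder`, in place."""
    for path in Path(folder).glob(pattern):
        original = path.read_text(encoding=encoding)
        path.write_text(trim_between(original), encoding=encoding)
```

For example, trim_between("header\nkeyword1\nbody\nkeyword2\nfooter\n") returns "\nbody\n". A file that contains neither keyword comes back unchanged.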
Voilà!
See ya