DonationCoder.com Software > Post New Requests Here
IDEA: Copy text of web page while following links
kab122:
When reading an article on the web, you often have to click
Next >
Next >
Next > Next > Next > Next > Next >
Bleh.
I would love a macro which would copy all the text on a page, append it to a text file, and then click "next" for me. It would repeat copying/clicking until "next" is not found. Don't need the images, just the text.
Sometimes the link is "next page", sometimes "continue to page 2", etc. Perhaps the macro could prompt for the target text, or use the currently highlighted text.
Thanks!
kab122:
No takers, eh?
With much mucking about, I wrote one myself. Any review comments are welcome.
; CONFIG: CHOOSE YOUR HOTKEY ( # = Winkey )
f_Hotkey = #q
f_TargetText = Next
f_Pause = Y
f_outfile = harvest.txt
; END OF CONFIGURATION SECTION
; -----------------------------------
; Documentation:
; Selects all text on web page,
; copies it to file,
; searches web page for "next" button and clicks it,
; repeat.
; -----------------------------------
; Entry script:
#SingleInstance ; Needed since the hotkey is dynamically created.
Hotkey, %f_Hotkey%, f_Doit
MsgBox, Ready to start, hotkey is "%f_Hotkey%"
return
; -----------------------------------
; Hotkey script:
f_Doit:
WinGetActiveStats, Title, Width, Height, X, Y
; -- Header
InputBox, f_TargetText, Search Key, Enter text linking to next page (searching from bottom of page).
if ErrorLevel <> 0
    return
f_Pause = Y
MsgBox, 4, , Pause per screen?
IfMsgBox, No
    f_Pause = N
; Otherwise, the user picked yes.
FileDelete %f_outfile%
FileAppend Harvest of window: , %f_outfile%
FileAppend %Title% `n, %f_outfile%
; -- Body
Loop ; Since no number is specified with it, this is an infinite loop unless "break" or "return" is encountered inside.
{
    Send, {CTRLDOWN}a{CTRLUP}
    Sleep, 50
    ; copy
    Send, {CTRLDOWN}c{CTRLUP}
    Sleep, 50
    ; save
    FileAppend `n, %f_outfile%
    FileAppend ----- `n, %f_outfile%
    FileAppend %clipboard%, %f_outfile%
    ; find
    Send, {CTRLDOWN}f{CTRLUP}
    WinWait, Find,
    IfWinNotActive, Find, , WinActivate, Find,
    WinWaitActive, Find,
    ; WinWait, Microsoft Internet Explorer,
    ;Target string:
    Send, %f_TargetText%
    Sleep, 50
    Send, {ALTDOWN}u{ALTUP}{ENTER}
    Sleep, 50
    Send, {ESC}
    Sleep, 50
    ; WinWait, autohotkey - Google Search - Microsoft Internet Explorer,
    ; IfWinNotActive, autohotkey - Google Search - Microsoft Internet Explorer, , WinActivate, autohotkey - Google Search - Microsoft Internet Explorer,
    ; WinWaitActive, autohotkey - Google Search - Microsoft Internet Explorer,
    Send, {TAB}{ENTER}
    Sleep, 250
    ; Stop if search displays not found dialog
    IfWinExist, Microsoft Internet Explorer
    {
        WinActivate
        break
    }
    If f_Pause = N
    {
        WinWaitActive, %Title%,
        Sleep, 1500
    }
    else
    {
        ; Let them see bottom of page
        Send, {CTRLDOWN}{END}{CTRLUP}
        MsgBox, 4, , Would you like to continue?
        IfMsgBox, No
            break
    }
}
; end loop
FileAppend `n, %f_outfile%
FileAppend ----- `n, %f_outfile%
FileAppend EOF `n, %f_outfile%
MsgBox Done.
;return
ExitApp
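One possible refinement of the copy step, offered as a sketch rather than a tested fix: the script above relies on a fixed 50 ms sleep after Ctrl+C, which may be too short on a slow page. AutoHotkey's ClipWait command can wait until the clipboard actually holds data. The function name f_CopyPage and the 2-second timeout below are assumptions chosen for illustration; they are not part of the original script.

```autohotkey
; Sketch only (AutoHotkey v1 syntax). f_CopyPage is a hypothetical helper name.
; Copies all text in the active window and appends it to the given file.
; Returns true on success, false if nothing reached the clipboard in time.
f_CopyPage(outfile)
{
    Clipboard =          ; empty the clipboard so ClipWait can detect new data
    Send, ^a             ; select all
    Sleep, 50
    Send, ^c             ; copy
    ClipWait, 2          ; wait up to 2 seconds (assumed timeout) for data
    if ErrorLevel        ; ClipWait sets ErrorLevel to 1 when it times out
        return false
    FileAppend, `n-----`n%Clipboard%, %outfile%
    return true
}
```

Inside the loop, the select/copy/save lines could then be replaced by a call such as `if (!f_CopyPage(f_outfile))` followed by `break`, so the harvest stops instead of appending stale clipboard contents.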
mouser:
Wow, that's pretty cool - I'm going to give it a try.
Maybe a nice function to add (if possible) would be to print each page?
hitmark:
Does it also copy layout tags, or will it just convert the whole page into one "big" txt file?
What about an article with many pictures? Or one that uses frames or similar to put stuff on different sides of the main text?
And I take it that it will only work with IE...
jity2:
Hello,
In case you haven't tried it: first try printing the text using the print link inside the web page article (not the File/Print option of your browser). It often gathers all the text at once! This works for many articles found in online journals.
Hope this helps, :)
Jity