DonationCoder.com Software > Post New Requests Here

IDEA: Copy text of web page while following links

kab122:
When reading an article on the web, you often have to click
Next >
Next >
Next > Next > Next > Next > Next >

Bleh.

I would love a macro which would copy all the text on a page, append it to a text file, and then click "next" for me. It would repeat copying/clicking until "next" is not found. Don't need the images, just the text.

Sometimes the link is "next page", sometimes "continue to page 2", etc. Perhaps the macro could prompt for the target text, or use the currently highlighted text.

Thanks!

kab122:
No takers, eh?

With much mucking about, I wrote one myself. Any review comments are welcome.


; CONFIG: CHOOSE YOUR HOTKEY ( # = Winkey )
f_Hotkey = #q
f_TargetText = Next
f_Pause = Y
f_outfile = harvest.txt

; END OF CONFIGURATION SECTION

; -----------------------------------
; Documentation:
; Selects all text on web page,
; copies it to file,
; searches web page for "next" button and clicks it,
; repeat.

; -----------------------------------
; Entry script:

#SingleInstance  ; Needed since the hotkey is dynamically created.

Hotkey, %f_Hotkey%, f_Doit

MsgBox, Ready to start, hotkey is "%f_Hotkey%"

return

; -----------------------------------
; Hotkey script:

f_Doit:

WinGetActiveStats, Title, Width, Height, X, Y

; -- Header

InputBox, f_TargetText, Search Key, Enter text linking to next page (searching from bottom of page).
if ErrorLevel <> 0
return

f_Pause = Y
MsgBox, 4, , Pause per screen?
IfMsgBox, No
f_Pause = N
; Otherwise, the user picked yes.

FileDelete %f_outfile%

FileAppend Harvest of window:  , %f_outfile%
FileAppend %Title% `n, %f_outfile%

; -- Body
Loop  ; Since no number is specified with it, this is an infinite loop unless "break" or "return" is encountered inside.
{

Send, {CTRLDOWN}a{CTRLUP}
Sleep, 50

; copy (empty the clipboard first so ClipWait can tell when the new text has arrived)
clipboard =
Send, {CTRLDOWN}c{CTRLUP}
ClipWait, 2

; save
FileAppend `n, %f_outfile%
FileAppend ----- `n, %f_outfile%
FileAppend %clipboard%, %f_outfile%

; find
Send, {CTRLDOWN}f{CTRLUP}
WinWait, Find,
IfWinNotActive, Find, , WinActivate, Find,
WinWaitActive, Find,

; WinWait, Microsoft Internet Explorer,

;Target string:
Send, %f_TargetText%
Sleep, 50

Send, {ALTDOWN}u{ALTUP}{ENTER}
Sleep, 50

Send, {ESC}
Sleep, 50

; WinWait, autohotkey - Google Search - Microsoft Internet Explorer,
; IfWinNotActive, autohotkey - Google Search - Microsoft Internet Explorer, , WinActivate, autohotkey - Google Search - Microsoft Internet Explorer,
; WinWaitActive, autohotkey - Google Search - Microsoft Internet Explorer,

Send, {TAB}{ENTER}
Sleep, 250

; Stop if search displays not found dialog
IfWinExist, Microsoft Internet Explorer
{
WinActivate
break
}

If f_Pause = N
{
WinWaitActive, %Title%,
Sleep, 1500
}
else
{
; Let them see bottom of page
Send, {CTRLDOWN}{END}{CTRLUP}
MsgBox, 4, , Would you like to continue?
IfMsgBox, No
break
}
}
; end loop

FileAppend `n, %f_outfile%
FileAppend ----- `n, %f_outfile%
FileAppend EOF `n, %f_outfile%

MsgBox Done.
;return
ExitApp
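
The original request also suggested using the currently highlighted text as the search key, which the script above does not do (it always prompts). Below is a minimal sketch of that idea, assuming the same AutoHotkey v1 syntax as the script. The helper name f_GetSelection is hypothetical and not part of the original script, and the one-second timeout is an arbitrary choice.

```autohotkey
; Hypothetical helper (not in the original script): returns the text
; currently highlighted in the active window, or an empty string if
; nothing could be copied.
f_GetSelection()
{
    saved := ClipboardAll      ; back up the clipboard, including non-text data
    Clipboard =                ; empty it so ClipWait can detect the new copy
    Send, ^c
    ClipWait, 1                ; wait up to 1 second for text to arrive
    if ErrorLevel              ; timed out: nothing was selected
        sel =
    else
        sel := Clipboard
    Clipboard := saved         ; restore the user's clipboard
    saved =                    ; free the backup
    return sel
}

; Possible usage inside f_Doit, in place of the unconditional InputBox:
;   f_TargetText := f_GetSelection()
;   if f_TargetText =
;       InputBox, f_TargetText, Search Key, Enter text linking to next page.
```

With this in place, highlighting the word "Next" on the page before pressing the hotkey would skip the prompt; if nothing is highlighted, the script would fall back to asking for the text as before.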

mouser:
Wow, that's pretty cool. I'm going to give it a try.
Maybe a nice function to add (if possible) would be to print each page?

hitmark:
Does it also copy layout tags, or will it just convert the whole page into one "big" txt file?

What about an article with many pictures, or with frames or similar that add stuff to different sides of the main text?

And I take it that it will only work with IE...

jity2:
Hello,
In case you haven't tried it: first try printing the text using the print link inside the webpage article (not the EDIT/Print option of your browser). It often gathers all the text at once! This works for many articles found in online journals.
Hope this helps, :)
Jity
