DonationCoder.com Software > Coding Snacks

Page data harvest

jpcook:
Thank you also, TJ. I'm taking a look at Wget and HTTrack... great packages.
I do appreciate you guys helping me!  :Thmbsup:

crono:
Hi,

HTML is often poorly written. It can be hard to parse if, for example, end tags are missing. I highly recommend "sanitizing" it with HTML Tidy before you start parsing. Set the "output-xml" option to get well-formed XML, which can then be parsed with any XML parser library (DOM/SAX). This is often easier than using regex.
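
To illustrate the idea, here is a minimal sketch in Python. It is only one way to do it and rests on a few assumptions: the HTML Tidy command-line tool is installed and reachable as "tidy", the page is UTF-8, and the data you want to harvest is the links on the page. The helper names tidy_to_xml and extract_links are made up for this example.

```python
import shutil
import subprocess
import xml.etree.ElementTree as ET


def tidy_to_xml(html: str, tidy_exe: str = "tidy") -> bytes:
    """Pipe HTML through the HTML Tidy CLI and return well-formed XML.

    Assumes the `tidy` executable is installed; `tidy_exe` is a
    hypothetical parameter for pointing at a different binary.
    """
    result = subprocess.run(
        [
            tidy_exe,
            "-quiet",                     # suppress the banner text
            "-utf8",                      # treat input and output as UTF-8
            "--output-xml", "yes",        # the option mentioned above
            "--numeric-entities", "yes",  # &nbsp; etc. become numeric, so XML parsers accept them
            "--force-output", "yes",      # still emit output if Tidy reports errors
            "--show-warnings", "no",
        ],
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # Tidy exits with 1 for warnings and 2 for errors, so the exit code
    # alone is not a failure signal here; an empty result is.
    if not result.stdout:
        raise RuntimeError(result.stderr.decode("utf-8", errors="replace"))
    return result.stdout


def extract_links(xml_data):
    """Return (href, text) pairs for every <a href=...> in the XML."""
    root = ET.fromstring(xml_data)
    links = []
    for elem in root.iter():
        # Drop a namespace prefix such as "{http://www.w3.org/1999/xhtml}a".
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "a" and "href" in elem.attrib:
            text = "".join(elem.itertext()).strip()
            links.append((elem.get("href"), text))
    return links


if __name__ == "__main__" and shutil.which("tidy"):
    messy = '<p>Unclosed paragraph <a href="/page2">next page'
    for href, text in extract_links(tidy_to_xml(messy)):
        print(href, text)
```

The parsing step never has to deal with missing end tags, because Tidy has already closed them. The same cleaned-up output could just as well be fed to a SAX parser if the pages are large.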
