Topic: MwImporter php script - Batch import html site into MediaWiki - v2.0 - 5/26/10
mouser
First Author
Administrator
« on: May 12, 2010, 07:09:42 PM »

Official web page here: http://www.donationcoder....ser/mwimporter/index.html

JavaJones and I have been working on an Open Source php script that aids in batch converting and importing an entire directory of html files and images into a MediaWiki site.

Useful if you want to convert your static page site into MediaWiki.

This builds on a number of existing tools, including the HTML::WikiConverter perl scripts and the php importing tools that come with MediaWiki.

What it adds is a set of helper functions, built on php classes that are easy to extend, for massaging the html before conversion and the wiki text after conversion, handling filename clashes, fixing relative links between pages, and processing recursive directories of both static pages and images.

We will be posting a release soon for anyone who might find this useful, though I have my doubts as to whether that set of people isn't empty (let me know if it isn't!).



NOTE: I should add that this is really a generic set of php classes for "converting/importing" a recursive directory of files from one format into another. It includes derived classes specifically for converting and importing html files into a MediaWiki site, but the base classes could serve as a useful starting point for anyone who wants php code to recursively discover and batch process/convert a directory of files from any format to any other format, with helper functions for handling commandline options, temp files, file matching patterns, etc.
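
The base-class/derived-class split described in the note can be sketched roughly as follows. This is a minimal illustration in Python rather than the actual php code; the names BatchConverter, UppercaseTextConverter, discover, convert and run are all hypothetical and do not come from MwImporter.

```python
import fnmatch
import os


class BatchConverter:
    """Hypothetical base class: walk a directory tree and convert matching files."""

    # File matching patterns; derived classes narrow this down.
    patterns = ["*"]

    def discover(self, root):
        """Yield paths (relative to root) of every file matching self.patterns."""
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # keep the traversal order deterministic
            for name in sorted(filenames):
                if any(fnmatch.fnmatch(name.lower(), p) for p in self.patterns):
                    yield os.path.relpath(os.path.join(dirpath, name), root)

    def convert(self, text):
        """Convert the contents of one file; derived classes override this."""
        raise NotImplementedError

    def run(self, root):
        """Convert every discovered file and return {relative_path: converted_text}."""
        results = {}
        for rel in self.discover(root):
            with open(os.path.join(root, rel), encoding="utf-8") as handle:
                results[rel] = self.convert(handle.read())
        return results


class UppercaseTextConverter(BatchConverter):
    """Toy derived class standing in for a real html-to-wikitext converter."""

    patterns = ["*.txt"]

    def convert(self, text):
        return text.upper()
```

Calling UppercaseTextConverter().run("some/directory") would return the converted text of every .txt file under that directory, keyed by relative path. A real derived class would replace convert with the html-to-wikitext step.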



DOWNLOAD v2.0 (5/26/10):
http://www.donationcoder....mwimporter/MwImporter.zip

LICENSE: Open Source

AUDIENCE: This code is intended for experienced users who aren't afraid of getting their hands dirty; if you are expecting a super friendly idiot-proof tool you need to look elsewhere (bearing in mind there is nothing else at the current time that will do this stuff).
« Last Edit: June 02, 2010, 05:23:49 AM by mouser »
JavaJones
Review 2.0 Designer
Charter Member
« Reply #1 on: May 12, 2010, 07:17:35 PM »

Don't forget it can import image maps, too (targeting the MW imagemap plugin). Of course not many people use image maps these days I suppose... :D

- Oshyan

The New Adventures of Oshyan Greene - A life in pictures...
mouser
« Reply #2 on: May 17, 2010, 12:47:01 PM »

OK, I finished adding the features I wanted to add. You can now import a large, deep directory of pages and images, and it will import all of them properly, creating unique names when the file and page titles are not globally unique (necessary due to the flatness of a wiki), and fixing up all links between pages.

Hopefully this will be useful for someone who is interested in migrating a site from static to wiki format.  I will upload soon.
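
To illustrate the unique-naming and link-fixing idea, here is a minimal Python sketch (not the actual php code). The functions assign_unique_titles and resolve_link are hypothetical, and the clash rule used here (prefix the directory name, then fall back to a numeric suffix, comparing titles case-insensitively) is an assumption rather than what MwImporter necessarily does.

```python
import posixpath


def assign_unique_titles(relative_paths):
    """Map each relative file path to a title that is unique across the whole tree."""
    titles = {}
    used = set()
    # Shallow files go first so that top-level pages keep their plain names.
    for path in sorted(relative_paths, key=lambda p: (p.count("/"), p)):
        stem = posixpath.splitext(posixpath.basename(path))[0]
        candidate = stem
        if candidate.lower() in used:
            # Clash: qualify the title with the directory the file came from.
            parent = posixpath.dirname(path).replace("/", " ")
            candidate = f"{parent} {stem}".strip()
        base = candidate
        suffix = 2
        while candidate.lower() in used:
            # Still clashing: fall back to a numeric suffix.
            candidate = f"{base} {suffix}"
            suffix += 1
        used.add(candidate.lower())
        titles[path] = candidate
    return titles


def resolve_link(source_path, href, titles):
    """Resolve a relative href found in source_path to the target page's wiki title."""
    source_dir = posixpath.dirname(source_path)
    target = posixpath.normpath(posixpath.join(source_dir, href))
    return titles.get(target)  # None if the link points outside the imported tree
```

With the paths index.html, docs/index.html and docs/setup.html, the second index page becomes "docs index", and a link to ../index.html inside docs/setup.html resolves to the title "index".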
mouser
« Reply #3 on: May 26, 2010, 07:38:39 PM »

I have uploaded a first version. I don't anticipate much use for this, but it's there for those who want to try it.
RayOfLight
Participant
« Reply #4 on: August 06, 2010, 05:29:14 AM »

This will be very useful for us. We will be adding .chm files to our wiki after converting them to html.
I'll try to figure out how it works and then test it...crossing my fingers :D

I'm reading up on it and already have some questions.
First one: what does --mw_dircat_sepdir do?
It's not (yet) in the help file.

Also, because we're not running the wiki on our own server, I'm searching for a way to set everything up in a way that I can ask our webhost to run the necessary command to start it...if that is at all possible...
« Last Edit: August 06, 2010, 06:07:26 AM by RayOfLight »
mouser
« Reply #5 on: August 06, 2010, 08:15:30 AM »

I think i need to upload the latest version with some recent changes.

Here are the new readme lines for mw_dircat_sepdir and others:

Quote
--mw_category CATEGORYSTR: adds a [[CATEGORY CATEGORYSTR]] tag to EVERY page
--mw_dircat_top STR: if specified, STR is used as the first part of any directory-based category below
--mw_dircat_subdir: if specified, adds a [[CATEGORY something]] tag, where something is the most recent subdirectory (or $mw_dircat_top if at top)
--mw_dircat_fullpath: if specified, adds the full subdirectory path as a single category, like [[CATEGORY a.b.c]]
--mw_dircat_sepdir: if specified, adds a separate category for each subdirectory in the path, like [[CATEGORY a]] [[CATEGORY b]]
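
As a rough illustration of how the three directory-based options differ, here is a Python sketch of one way to read the readme lines above. The function directory_categories is hypothetical, not part of MwImporter. Two assumptions are made: the output uses MediaWiki's [[Category:Name]] link syntax rather than the readme's shorthand, and the mw_dircat_top string is prepended to the path for all three options.

```python
def directory_categories(rel_dir, top="", subdir=False, fullpath=False, sepdir=False):
    """Build category tags for a page found in rel_dir (relative to the import root)."""
    # Split the directory into its parts, ignoring empty and "." segments.
    parts = [p for p in rel_dir.replace("\\", "/").split("/") if p not in ("", ".")]
    if top:
        # Assumption: the mw_dircat_top string becomes the first path component.
        parts = [top] + parts

    names = []
    if subdir and parts:
        names.append(parts[-1])        # most recent subdirectory only
    if fullpath and parts:
        names.append(".".join(parts))  # whole path as one category, a.b.c
    if sepdir:
        names.extend(parts)            # one category per path component

    # Drop duplicates while keeping the order in which names were added.
    unique = list(dict.fromkeys(names))
    return ["[[Category:%s]]" % name for name in unique]
```

For a page in a/b/c, subdir gives one tag for c, fullpath gives one tag for a.b.c, and sepdir gives three tags for a, b and c.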

I will upload the latest version over the weekend.
JavaJones
« Reply #6 on: August 07, 2010, 04:50:48 PM »

Very glad to see someone has a use for this! Hopefully once we publicize it more widely (and document it more fully), it will be of even broader interest.

RayOfLight, your use sounds very similar to what this was originally developed for. I was converting from Dr. Explain based help files to a wiki to allow for collaborative editing. And for that it worked pretty well, though I ultimately had to trim down the import quite a lot for other reasons. But the system is capable of quite complex (and customizable) import if desired, including image maps.

- Oshyan
RayOfLight
« Reply #7 on: August 10, 2010, 06:07:42 AM »

I'm still working on a way to set this up so my webhost can do it with only one command.
It's one of many things on the todo-list for our freshly started wiki...it's more work than we anticipated...
mouser
« Reply #8 on: August 10, 2010, 08:39:49 AM »

I suggest that you set up a version on your local pc and test it there.

It doesn't matter if it's windows or linux or whatever: set up mediawiki, php, etc. on your local machine, and then you can experiment with importing at your leisure.

In fact, if you do that, then once you get the pages imported successfully you might be able to export the sql database from your local pc and import it into the server version of your wiki, without running the script on your server at all.
RayOfLight
« Reply #9 on: August 10, 2010, 08:51:13 AM »

That might be an idea too...maybe it's best if I look into that...it's been a while since I set up local php and so on.
steppin
Participant
« Reply #10 on: November 05, 2010, 01:48:27 PM »

Hi mouser - I'm interested in running your "batch html site into MediaWiki" script on my web server, but I am running into difficulties. In August you suggested to RayOfLight that he'd be better off running it on his home computer. Is there a technical reason why this script would not run on a commercially hosted web server (Yahoo, in my case)? When I attempt to run it, it seems unable to find the necessary MW files, specifically commandLine.inc in the /maintenance directory, even though the mwdir variable is pointed directly to the right place. Is there something obvious I'm missing about why precisely doing this on a web server is not going to be possible? Thanks in advance for your thoughts; would love to get this working for converting a bunch of my genealogy web pages.
mouser
« Reply #11 on: November 05, 2010, 02:10:36 PM »

If your web server lets you run the php files from a commandline then it should work.
You say it can't "find" the necessary MW files. Can you clarify whether you are sure it can't find them, or whether maybe the host is not letting them run?
steppin
« Reply #12 on: November 07, 2010, 08:49:52 AM »

Hi Mouser - thanks for the reply. Let me rephrase my initial question. Is it possible to modify this script so that I do not need to use the commandline to do so? My web service provider does not (as far as I can tell) allow commandline access. So I want to run it just from clicking on the script or pointing to the url where the script is found. I've been trying to modify where the script gets the information it needs, so that it does not need any input from the commandline's initial command. But maybe there's some obvious reason I'm missing that just makes this impossible. The exact error message I received was: FATAL ERROR: Could not find mediawiki file (please specify -mwdir= option): http://www.MYHOST.net/dev...intenance/commandLine.inc. But that URL is accurately where that file is stored. Thanks for any thoughts.
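
One possible reading of that error, offered as a guess rather than a confirmed diagnosis, is that the mwdir value is a web URL, while a file-existence check needs a filesystem path on the server. The Python sketch below illustrates that kind of check. The function check_mwdir is hypothetical, and only the maintenance/commandLine.inc location is taken from the error message above.

```python
import os
from urllib.parse import urlparse


def check_mwdir(mwdir):
    """Return (ok, message) for a candidate MediaWiki directory setting."""
    # A web address is not a location on disk, so a file check cannot succeed.
    if urlparse(mwdir).scheme in ("http", "https"):
        return False, "mwdir is a URL; a filesystem path on the server is needed"

    # The file named in the error message, relative to the MediaWiki directory.
    target = os.path.join(mwdir, "maintenance", "commandLine.inc")
    if not os.path.isfile(target):
        return False, "not found on disk: " + target
    return True, target
```

If this reading is right, pointing mwdir at the directory's path on the server's disk (whatever that is on the host in question) rather than at its web address would be the thing to try. It would not by itself remove the need for commandline access.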
mouser
« Reply #13 on: November 07, 2010, 09:34:18 AM »

Quote
Is it possible to modify this script so that I do not need to use the commandline to do so? My web service provider does not (as far as I can tell) allow commandline access. So I want to run it just from clicking on the script or pointing to the url where the script is found.


i think you are onto the problem.. and your question is the right one..

It's been a while since I worked on mwimporter, but my memory is that 1) I always intended to make it so you could do this, and 2) at some point I realized it wasn't going to be so simple to do. But I can't remember if #2 is really true, or why I have the vague memory of thinking it was going to be tricky. It seems like it should be doable.
JavaJones
« Reply #14 on: November 18, 2010, 12:39:56 AM »

I'd like to have a non-commandline version of this available at some point. Perhaps after UQ, mouser? :D

- Oshyan
nminh09
Participant
« Reply #15 on: August 21, 2011, 11:11:29 PM »

Hi all,
Please help me, step by step, to use the MWImporter tool. I have a directory of html files that link to each other, and I want to convert and import them into my MediaWiki.
Thanks so much
NGUYEN MINH
nminh09
« Reply #16 on: August 21, 2011, 11:12:26 PM »

P/S: E-mail: xxxxxxxxxxx

[email removed because we don't want you to end up getting spammed by automated email scrapers that scour the internet just looking for emails -- mouser]
« Last Edit: August 21, 2011, 11:17:35 PM by mouser »
mouser
« Reply #17 on: August 23, 2011, 12:23:30 AM »

Hi nguyen,

Sorry for the delay in responding.  It's been a while since I've used mwimporter, so I'm not going to be able to help too much.  Have you managed to get it working a little bit? Where are you stuck?
nminh09
« Reply #18 on: September 30, 2011, 11:42:18 AM »

Thanks Mouser for responding. I tried to run the MWImporter tool after installing the perl CPAN modules (I've read README.txt), but it doesn't respond at all (it just pauses), and I don't know why.
I have one big directory of html files, and it is very hard for me to convert it into my wiki. I need help. Can you show me clearly how to use MWImporter? Thanks a lot.
Have a nice day.
mouser
« Reply #19 on: October 01, 2011, 10:15:56 AM »

email me: mouser@donationcoder.com