[Israel.pm] Convert MS-Word to Wiki markup with Perl

Uri Bruck bruck at actcom.net.il
Tue Jun 26 00:25:22 PDT 2007


Ran Eilam wrote:
> On 6/25/07, oren maurer <meorero at gmail.com> wrote:
>   
>> Does anyone know of a Perl module that can convert MS-Word documents
>> to Wiki markup? (I mainly mean - MediaWiki )
>>     
>
> Crazy idea, but why not Save As HTML 
MS-Word's HTML is ugly and, if memory serves, it puts the style 
information in a separate file.
OpenOffice generates HTML that's less ugly, with most of the information 
more readily available.

> (perhaps with some Perl driven
> OLE automation), shake violently with HTML::Tidy -bare -xhtml,
> sanitize with XML::Twig to scrub away the span/font/width/margin/muck
> (i.e. almost everything), sprint to HTML, and finally squeeze through
> HTML::WikiConverter::MediaWiki?
>
>   
>>  #  Please avoid sending me Word |
>>  #   or PowerPoint attachments
>>     
>
> But with this tool you could convert them to MediaWiki syntax! ;)
>
>   
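For what it's worth, the XML::Twig and HTML::WikiConverter steps of the
pipeline quoted above might look roughly like this. It is only a minimal
sketch: it assumes the input is already well-formed XHTML (for example
after a pass through HTML::Tidy), with no named entities that an XML
parser would reject. The helper name xhtml_to_mediawiki and the list of
attributes to drop are my own assumptions, not anything prescribed by
either module.

```perl
use strict;
use warnings;
use XML::Twig;
use HTML::WikiConverter;   # the MediaWiki dialect module must be installed too

# Hypothetical helper: scrub presentational markup from well-formed XHTML,
# then convert what is left to MediaWiki syntax.
sub xhtml_to_mediawiki {
    my ($xhtml) = @_;

    my $twig = XML::Twig->new(
        twig_handlers => {
            # Called for every element once it has been fully parsed.
            _all_ => sub {
                my ($t, $elt) = @_;
                # Drop presentational attributes (an assumed, partial list).
                $elt->del_att(qw(style class lang width align));
                # Remove span/font wrappers but keep their children in place.
                $elt->erase if $elt->tag =~ /^(?:span|font)$/;
            },
        },
    );
    $twig->parse($xhtml);

    my $wc = HTML::WikiConverter->new( dialect => 'MediaWiki' );
    return $wc->html2wiki( html => $twig->sprint );
}

# Made-up sample input in the style of a word processor's HTML export.
my $sample = <<'END';
<html><body>
<h1 style="margin:0cm">Title</h1>
<p class="MsoNormal"><span style="font-size:12.0pt">Some <b>bold</b> text.</span></p>
</body></html>
END

print xhtml_to_mediawiki($sample), "\n";
```

Doing the attribute removal and the span/font erasing in one handler
avoids touching an element after it has been erased. Real Word or
OpenOffice output will almost certainly need a longer attribute list.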


-- 
Thanks,
Uri
http://translation.israel.net 

Si fractum non sit, noli id reficere.