./converters/doc2html, Perl external filter for htdig to convert numerous document formats to HTML



Branch: pkgsrc-2011Q2, Version: 3.0nb3, Package name: doc2html-3.0nb3, Maintainer: pkgsrc-users

External converter script for ht://Dig (version 3.1.4 and later) that
converts Microsoft Word, Excel, and PowerPoint files, as well as PDF,
PostScript, RTF, and WordPerfect files, to text (in HTML form) so they
can be indexed. It uses a variety of conversion programs:

wp2html - to convert WordPerfect and Word 7 and 97 documents to HTML
catdoc - to extract text from Word documents
rtf2html - to convert RTF documents to HTML
pdftotext - to extract text from Adobe PDFs
ps2ascii - to extract text from PostScript
pptHtml - to convert PowerPoint files to HTML
xlHtml - to convert Excel spreadsheets to HTML
or
xls2csv - to obtain data from Excel spreadsheets.
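
The list above amounts to a lookup from document type to helper program. A minimal Python sketch of that idea follows. It is not the doc2html script itself (which is written in Perl): the converter names come from the list above, while the MIME-type keys, the preference order, and the function name `pick_converter` are assumptions made for illustration.

```python
import shutil

# Converter names are taken from the package description. The MIME-type
# keys and the order of preference are assumptions made for this sketch.
CONVERTERS = {
    "application/msword": ["wp2html", "catdoc"],
    "application/wordperfect": ["wp2html"],
    "application/rtf": ["rtf2html"],
    "application/pdf": ["pdftotext"],
    "application/postscript": ["ps2ascii"],
    "application/vnd.ms-powerpoint": ["pptHtml"],
    "application/vnd.ms-excel": ["xlHtml", "xls2csv"],
}


def pick_converter(mime_type, which=shutil.which):
    """Return the first listed converter found on PATH, or None.

    `which` is injectable so the lookup can be exercised without the
    real programs being installed.
    """
    for name in CONVERTERS.get(mime_type, []):
        if which(name):
            return name
    return None
```

Calling `pick_converter("application/vnd.ms-excel")` on a machine that has only xls2csv installed would return `"xls2csv"`, mirroring the "xlHtml or xls2csv" alternative in the list.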

Written by David Adams (University of Southampton), and based on the
conv_doc.pl script by Gilles Detillieux.


Required to run:
[lang/perl5] [textproc/catdoc] [converters/xlhtml]

Master sites:

SHA1: 3681eb77c92be67e495826cca0b28ef7977993d6
RMD160: 8875b5a59237b07653e1265f50a92b3ebea3ed0e
Filesize: 14.701 KB
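
The SHA1 value above can be checked against a downloaded distfile. The following is a small Python sketch using the standard `hashlib` module. pkgsrc normally performs this check itself, and the function names and any file path passed in are assumptions for illustration, since the listing does not name the distfile.

```python
import hashlib

# SHA1 digest as published in the listing above.
EXPECTED_SHA1 = "3681eb77c92be67e495826cca0b28ef7977993d6"


def sha1_of(path, chunk_size=65536):
    """Compute the SHA1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path):
    """Return True if the file's SHA1 equals the published value."""
    return sha1_of(path) == EXPECTED_SHA1
```

For example, `matches_expected("distfile")` (with a hypothetical path to the downloaded file) returns `True` only if the file is byte-for-byte what the listing describes.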
