Thursday, 18 September 2008

How to convert CHM files under Linux

CHM files, known as Microsoft Compressed HTML Help files, are a common format for eBooks and online documentation. They are basically a collection of HTML files stored in a compressed archive with the added benefit of an index.

Under Linux, you can view a CHM file with the xchm viewer. But sometimes that’s not enough. Suppose you want to edit, republish, or convert the CHM file into another format such as the Plucker eBook format for viewing on your Palm. To do so, you first need to extract the original HTML files from the CHM archive.

This can be done with CHMLIB (the CHM library) and its bundled helper application, extract_chmLib.

In Debian or Ubuntu:

$ sudo apt-get install libchm-bin
$ extract_chmLib book.chm outdir

where book.chm is the path to your CHM file and outdir is a new directory that will be created to hold the HTML files extracted from it.
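
If you have several CHM files to unpack, the same command can be wrapped in a small loop. The following is only a sketch: the function names chm_outdir and extract_all are made up for illustration, and it assumes that extract_chmLib is on your PATH and that naming each output directory after its CHM file suits you.

```shell
#!/bin/sh

# Derive an output directory name from a CHM path by dropping the
# leading directories and the last extension: "docs/book.chm" -> "book".
chm_outdir() {
    base=$(basename "$1")
    printf '%s\n' "${base%.*}"
}

# Extract every .chm file in the current directory into its own directory.
# Assumes extract_chmLib is installed and on the PATH.
extract_all() {
    for f in ./*.chm; do
        [ -e "$f" ] || continue   # no matches: the glob pattern stays literal
        extract_chmLib "$f" "$(chm_outdir "$f")"
    done
}
```

To use it, save the functions in a file, source it, and run extract_all from the directory that holds your CHM files. Note that the glob is case-sensitive, so files ending in .CHM are skipped.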

On other Linux distributions, you can install it from source. First download the libchm source archive from the above website. I couldn’t get the extract_chmLib utility to compile under version 0.38, the latest at the time of writing, so I used version 0.35 instead.

$ tar xzf chmlib-0.35.tgz
$ cd chmlib-0.35/
$ ./configure
$ make
$ make install
$ make examples

After running “make examples”, you will have an executable extract_chmLib in your current directory. Here is an example of running the command with no arguments and the output it produces:

$ ./extract_chmLib
usage: ./extract_chmLib <chmfile> <outdir>

After running the utility to extract the HTML files from your CHM file, the extracted files will appear in the output directory you specified. Unfortunately, there won’t be an “index.html” file, so you’ll have to inspect the filenames and/or their contents to find the appropriate main page or Table of Contents.
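
One way to narrow down the search for the main page: many CHM archives carry their table of contents in a file ending in .hhc, which lists pages as param entries named "Local". The sketch below assumes your archive contains such a file with one entry per line, which is not guaranteed. The names find_toc and first_toc_page are illustrative and are not part of CHMLIB.

```shell
#!/bin/sh

# List table-of-contents files (*.hhc, any letter case) under a directory.
# -iname is supported by GNU and BSD find but is not strictly POSIX.
find_toc() {
    find "$1" -type f -iname '*.hhc'
}

# Print the first page referenced by a .hhc file, taken from lines like:
#   <param name="Local" value="intro.htm">
# Assumes one param entry per line and lower-case "value".
first_toc_page() {
    sed -n 's/.*[Nn]ame="Local"[[:space:]]*value="\([^"]*\)".*/\1/p' "$1" | head -n 1
}
```

For example, running find_toc outdir and then first_toc_page on the file it reports gives a candidate start page, with its path relative to the extracted files. If no .hhc file turns up, you are back to inspecting the filenames by hand.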

Now the HTML is yours to enjoy!

Resources

I got help in writing this article from here and here.

This entry was posted on Sunday, September 24th, 2006 at 0630 UTC and is filed under Palm, Tech, Ubuntu.

24 Responses to “How to convert CHM files under Linux”

  1. arnuld Says:
    September 27th, 2006 at 0930 UTC

Thanks, it really helped me get out of a rut.

  1. Ray Says:
    October 1st, 2006 at 2134 UTC

Thanks!

  1. pubcra Says:
    October 25th, 2006 at 1422 UTC

I found that the CHM extractor often breaks links. It extracts for a Windows filesystem, where upper and lower case are treated the same; on Linux, which is case-sensitive, such links break. Hence this little Perl script, which fixes them by adding symlinks. I used it to create the links for the Perl Best Practices book:

#!/usr/bin/perl

use strict;
use warnings;

# Usage: pass the directory holding the extracted HTML files as the first argument.
chdir $ARGV[0] or die "$!";
my @html_files = glob("*.html");

foreach (@html_files) {
    my $new_name = $_;
    $new_name =~ s/(.+)\.html/\U$1\E.html/; # put the file name in capitals
    $new_name =~ s/(PERLBP)(.+)/\L$1\E$2/;  # put the "perlbp" prefix back in lower case
    symlink $_, $new_name;                  # link the new name to the original file
}

  1. srbanator Says:
    November 21st, 2006 at 2316 UTC

thankx a lot for the tip

now i can convert chm2html on the terminal, and then read it with lynx!

  1. Dario Cesar Says:
    November 28th, 2006 at 2336 UTC

I’ve always said that mad people are wiser than the so-called “sane people”.

Thank you a lot for this article!

  1. Darren Says:
    December 9th, 2006 at 0536 UTC

Hey, thanks everyone for your comments. This chm conversion problem always bugged me, and Google wasn’t helping me too much. But when I finally found the right resources to help me, I decided to write it up so others could find out how to do this too. I’m glad I could help.

  1. Anator Says:
    January 6th, 2007 at 0521 UTC

I tried to install chmlib-0.38, but when I couldn’t get it to work I went for 0.35. Everything went OK until I tried to run extract_chmLib.
It gives the following error.
-------------------------------------------------------------------------------
./extract_chmLib: error while loading shared libraries: libchm.so.0: cannot open shared object file: No such file or directory
-------------------------------------------------------------------------------
Please help me if possible.

Thanks in advance

  1. jacobite Says:
    February 20th, 2007 at 1946 UTC

Is there anything in Linux that will do the reverse, i.e. convert a bunch of HTML files to CHM?

  1. unikuser Says:
    February 21st, 2007 at 1456 UTC

There is an easier method than the one specified here.
Open kchmviewer, installing it first if necessary (apt-get install kchmviewer).
Choose File->Extract chm content …, select the folder you want it extracted to, and you are done.

  1. diafanos Says:
    March 11th, 2007 at 1219 UTC

great tool….

And then you can use htmldoc (install from Synaptic) to convert it to pdf.
Really useful to create a pdf e-book

  1. Gabriel Says:
    March 22nd, 2007 at 0356 UTC

I got 0.39 to work on FC5 by passing the --enable-examples option to ./configure.
So, run it like this:
# ./configure --enable-examples
# make
# make install

No need to run “make examples” as ./configure --enable-examples takes care of that.

Make takes what ./configure had available. If you just do ./configure it runs plain, vanilla chmLib. However, by passing the examples option to ./configure, when make is run it puts examples in the program already. When make install is run, extract_chmLib is placed in your /usr/local/bin/ folder.

  1. gdunc Says:
    March 24th, 2007 at 1522 UTC

I’ve tried both methods mentioned here… CHMLIB (CHM library) and KChmViewer and both work equally well. (As you would expect since KChmViewer uses chmLib to extract the CHM files)

The only caveat I’d mention is that KChmViewer creates files that seem to be about 3 times larger than using chmLib alone. KChmViewer is a lot easier for those who might be afraid of the command line, but there is a price to pay.

  1. Jorge Says:
    April 9th, 2007 at 0418 UTC

Thank you for taking the time to write this article.

  1. Alias Says:
    June 21st, 2007 at 1420 UTC

I searched for a way to extract chm files under Linux for a long time… so thank you very much!

  1. Bruno Salvino Says:
    July 6th, 2007 at 0115 UTC

Thank you so much for this information!

One of the most useful articles I have ever read!

  1. David Robinson Says:
    July 7th, 2007 at 1412 UTC

Many thanks for this advice. I’ve been looking for a quick & easy way to do this for quite a while.

  1. nemo.x Says:
    July 8th, 2007 at 1413 UTC

Cool tool! Thanks to all who made it! OpenSource forever!

  1. Tomek Says:
    July 14th, 2007 at 1605 UTC

Thank You!

  1. Pat Tongco Says:
    July 28th, 2007 at 1914 UTC

I never noticed this tool sitting right there the whole time I was still using chm_http and then mirroring the file with every wget or pavuk switch you could possibly imagine, which caused all kinds of trouble. Imagine going through all that trouble from the first day you learn to appreciate chm documents, only to find out after SEVERAL YEARS that there was a better tool sitting right next to you. Thanks for the tip.

  1. mickro Says:
    December 23rd, 2007 at 1217 UTC

That’s nice but I need to extract the menu. Without it, it’s very hard to navigate into the extracted html files.

Anyone have an idea?

  1. enola Says:
    January 13th, 2008 at 2057 UTC

thank u very much for this one!
see u

(do not forget `life is too short for reboot` ;))

  1. John Hauser Says:
    February 16th, 2008 at 0253 UTC

if you need the “left side navigation index” provided by the CHM file and don’t want to mess with fixing broken links, try using archmage. it’s a different tool to use to turn CHM files back into html files.

http://archmage.sourceforge.net/

in Ubuntu all I had to do was “sudo apt-get install archmage”

to use it: archmage my.chm outdir

it created an “arch_contents.html” that reproduced the “left side navigation” in the CHM file

  1. Daniel Lelis Baggio Says:
    February 19th, 2008 at 0043 UTC

Thanks for the xchm!

  1. Morten Juhl-Johansen Zölde-Fejér Says:
    May 2nd, 2008 at 0700 UTC

Useful - thank you for the tips. Good to see that an older page keeps the ranking.

