[Israel.pm] convert old DOS Hebrew encoding

Omer Zak w1 at zak.co.il
Thu Oct 28 05:58:48 PDT 2010


On Thu, 2010-10-28 at 08:46 -0400, Tzadik Vanderhoof wrote:
> I have a binary data file, in a format used by a relatively ancient
> program, which I am trying to convert into something sane. With the
> help of a Hex editor I have basically worked out the file format
> except that it contains Hebrew characters with an odd encoding.
> 
> All characters are 8 bits. The "standard" 27 consonants (including
> "final" consonants) go from hex 80 to 9A. Then there are vowels that
> seem to start around hex 9B or so (I'm guessing right after the
> standard consonants end). Then there are "dotted" consonants that seem
> to start at hex E0.
> 
> If I remember correctly, I think this is some sort of DOS encoding
> (perhaps connected to the old "Hebrew chip"). Does anyone have a table
> of this character mapping or a tool that will translate this mapping
> into a more normal Hebrew encoding like Unicode?

The encoding looks like CP862
(http://en.wikipedia.org/wiki/Code_page_862 or
http://www.ascii-codes.com/cp862.html), modified to incorporate vowels
and "dotted" consonants.

The standard iconv tool doesn't seem to support this modified CP862
encoding. However, it should be simple to whip up a short tool or script
that converts from this encoding into one of the Unicode encodings,
such as UTF-8.
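
A minimal sketch of such a script, in Python, might look like the
following. It assumes that bytes 80 to 9A hex match plain CP862 (alef
through tav, as you describe) and that everything above them needs a
hand-built table. The function name decode_modified_cp862 and the
EXTRA table are hypothetical, and the table is deliberately left empty:
I don't know the actual byte values for the vowels and the "dotted"
consonants, so they would have to be filled in from your hex dumps.

```python
# Sketch only: decode a CP862 variant that adds vowels and dotted consonants.
# EXTRA is a hypothetical, user-supplied table mapping byte values to
# Unicode strings. It is empty because the real values are not known here.
EXTRA = {
    # byte value: Unicode string, to be filled in from the real file
}

def decode_modified_cp862(data, extra=None):
    """Return a str for `data`, using `extra` for the non-standard bytes."""
    if extra is None:
        extra = EXTRA
    out = []
    for b in data:
        if b in extra:
            # Explicit overrides take priority over everything else.
            out.append(extra[b])
        elif b <= 0x9A:
            # ASCII plus the 27 consonants at 0x80-0x9A: plain CP862.
            out.append(bytes([b]).decode("cp862"))
        else:
            # Unknown byte: emit U+FFFD so that gaps in the table are visible.
            out.append("\ufffd")
    return "".join(out)
```

Calling decode_modified_cp862(raw).encode("utf-8") on a field read from
the file would then give UTF-8 bytes. Emitting U+FFFD for unmapped bytes,
instead of guessing, makes it easy to see which table entries are still
missing.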

--- Omer


-- 
What happens if one mixes together evolution with time travel to the
past?  See: http://www.zak.co.il/a/stuff/opinions/eng/evol_tm.html
My own blog is at http://www.zak.co.il/tddpirate/

My opinions, as expressed in this E-mail message, are mine alone.
They do not represent the official policy of any organization with which
I may be affiliated in any way.
WARNING TO SPAMMERS:  at http://www.zak.co.il/spamwarning.html


