[Israel.pm] newline character as a portability issue

Jason Elbaum Jason.Elbaum at freescale.com
Wed Jul 7 23:17:49 PDT 2004


Shlomo Yona wrote:

> 	LF  eq  \012  eq  \x0A  eq  \cJ  eq  chr(10)  eq ASCII 10
> 	CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq ASCII 13

This is correct in ASCII. Historical note from teleprinter days:

LF == "linefeed" which means "advance the paper one line"
CR == "carriage return" which means "move the print head back to the 
leftmost column"

So, technically speaking, to start a new print line you had to do both 
(CRLF): Return to the leftmost column and advance one line.

Since this was redundant - every line ended with two characters - some 
systems decided to treat LF as if it were a CRLF, and other systems 
treated CR as CRLF.

That's why, by default, Unix ends lines with ASCII 10, classic Mac OS 
(before OS X) with ASCII 13, and DOS/Windows with 13-10.
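
If you have to read text that may come from any of these three systems, one way to cope is to fold all three conventions into the native "\n" before splitting into lines. The following is only a minimal sketch, assuming an ASCII-based platform; normalize_newlines is a made-up helper name, not something from perlport or the core distribution:

```perl
use strict;
use warnings;

# Hypothetical helper (not from perlport): map CRLF, lone CR, and lone LF
# to the platform's native "\n". CRLF is listed first in the alternation
# so that it is consumed as one line ending rather than two.
# The octal values are the ASCII ones, so this assumes an ASCII platform.
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\015\012|\015|\012/\n/g;
    return $text;
}

# One line each in Unix, DOS, and classic Mac style.
my $mixed = "unix\012dos\015\012mac\015";
my @lines = split /\n/, normalize_newlines($mixed);
print scalar(@lines), " lines: @lines\n";
```

For this to see the raw bytes, the data would have to be read from a filehandle in binmode; otherwise the I/O layer on DOS-like systems has already translated CRLF for you.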


Keep in mind, though, that not all systems use ASCII! Quoting perlport 
again:

> These are just the most common definitions of \n and \r in Perl. There may well be others. For example, on an EBCDIC implementation such as z/OS or OS/400 the above material is similar to "Unix" but the code numbers change:
> 
>     LF  eq  \025  eq  \x15  eq  \cU  eq  chr(21)  eq  CP-1047 21
>     LF  eq  \045  eq  \x25  eq           chr(37)  eq  CP-0037 37
>     CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq  CP-1047 13
>     CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq  CP-0037 13

You're not likely to encounter EBCDIC systems in the wild these days, 
but you do need to consider just how portable you want your code to be.
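
If you want to see what your own perl does, you can simply ask it, and where the exact bytes matter you can name them by number instead of writing "\r\n". A rough sketch, not a definitive recipe; the variable names and the sample record are made up for illustration:

```perl
use strict;
use warnings;

# Report which character codes this perl uses for "\n" and "\r".
printf "native \\n is chr(%d), native \\r is chr(%d)\n", ord("\n"), ord("\r");

# Spell out the characters by number where the exact values matter,
# rather than trusting "\r\n" to mean the same thing everywhere.
# (Illustrative names; these octal values are the ASCII codes.)
my $CR   = "\015";
my $LF   = "\012";
my $CRLF = $CR . $LF;

# Hypothetical record that must end in exactly CR followed by LF.
my $record = "some data" . $CRLF;
printf "record ends with codes %s\n",
    join ", ", map { ord } split //, substr($record, -2);
```

On an ASCII-based Unix or Windows perl the first line should report 10 and 13; on other platforms the numbers may well differ, which is exactly the point.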


Jason Elbaum
Freescale Semiconductor Israel



